Models, code, and papers for "Eric Wu":

Learning Scene Gist with Convolutional Neural Networks to Improve Object Recognition

Jun 09, 2018
Kevin Wu, Eric Wu, Gabriel Kreiman

Advancements in convolutional neural networks (CNNs) have made significant strides toward achieving high performance levels on multiple object recognition tasks. While some approaches utilize information from the entire scene to propose regions of interest, the task of interpreting a particular region or object is still performed independently of other objects and features in the image. Here we demonstrate that a scene's 'gist' can significantly contribute to how well humans can recognize objects. These findings are consistent with the notion that humans foveate on an object and incorporate information from the periphery to aid in recognition. We use a biologically inspired two-part convolutional neural network ('GistNet') that models the fovea and periphery to provide a proof-of-principle demonstration that computational object recognition can significantly benefit from the gist of the scene as contextual information. Our model yields accuracy improvements of up to 50% in certain object categories when incorporating contextual gist, while only increasing the original model size by 5%. This proposed model mirrors our intuition about how the human visual system recognizes objects, suggesting specific biologically plausible constraints to improve machine vision and building initial steps towards the challenge of scene understanding.


  Click for Model/Code and Paper
Parameterized Exploration

Jul 13, 2019
Jesse Clifton, Lili Wu, Eric Laber

We introduce Parameterized Exploration (PE), a simple family of methods for model-based tuning of the exploration schedule in sequential decision problems. Unlike common heuristics for exploration, our method accounts for the time horizon of the decision problem as well as the agent's current state of knowledge of the dynamics of the decision problem. We show our method as applied to several common exploration techniques has superior performance relative to un-tuned counterparts in Bernoulli and Gaussian multi-armed bandits, contextual bandits, and a Markov decision process based on a mobile health (mHealth) study. We also examine the effects of the accuracy of the estimated dynamics model on the performance of PE.


  Click for Model/Code and Paper
Audio-Visual Kinship Verification

Jun 24, 2019
Xiaoting Wu, Eric Granger, Xiaoyi Feng

Visual kinship verification entails confirming whether or not two individuals in a given pair of images or videos share a hypothesized kin relation. As a generalized face verification task, visual kinship verification is particularly difficult with low-quality found Internet data. Due to uncontrolled variations in background, pose, facial expression, blur, illumination and occlusion, state-of-the-art methods fail to provide high level of recognition accuracy. As with many other visual recognition tasks, kinship verification may benefit from combining visual and audio signals. However, voice-based kinship verification has received very little prior attention. We hypothesize that the human voice contains kin-related cues that are complementary to visual cues. In this paper, we address, for the first time, the use of audio-visual information from face and voice modalities to perform kinship verification. We first propose a new multi-modal kinship dataset, called TALking KINship (TALKIN), that contains several pairs of Internet-quality video sequences. Using TALKIN, we then study the utility of various kinship verification methods including traditional local feature based methods, statistical methods and more recent deep learning approaches. Then, early and late fusion methods are evaluated on the TALKIN dataset for the study of kinship verification with both face and voice modalities. Finally, we propose a deep Siamese fusion network with contrastive loss for multi-modal fusion of kinship relations. Extensive experiments on the TALKIN dataset indicate that by combining face and voice modalities, the proposed Siamese network can provide a significantly higher level of accuracy compared to baseline uni-modal and multi-modal fusion techniques. Experimental results also indicate that audio (vocal) information is complementary (to facial information) and useful for kinship verification.

* 12 pages 

  Click for Model/Code and Paper
Solving Multistage Influence Diagrams using Branch-and-Bound Search

Mar 15, 2012
Changhe Yuan, Xiaojian Wu, Eric A. Hansen

A branch-and-bound approach to solving influ- ence diagrams has been previously proposed in the literature, but appears to have never been implemented and evaluated - apparently due to the difficulties of computing effective bounds for the branch-and-bound search. In this paper, we describe how to efficiently compute effective bounds, and we develop a practical implementa- tion of depth-first branch-and-bound search for influence diagram evaluation that outperforms existing methods for solving influence diagrams with multiple stages.

* Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010) 

  Click for Model/Code and Paper
Removing Confounding Factors Associated Weights in Deep Neural Networks Improves the Prediction Accuracy for Healthcare Applications

Aug 31, 2018
Haohan Wang, Zhenglin Wu, Eric P. Xing

The proliferation of healthcare data has brought the opportunities of applying data-driven approaches, such as machine learning methods, to assist diagnosis. Recently, many deep learning methods have been shown with impressive successes in predicting disease status with raw input data. However, the "black-box" nature of deep learning and the high-reliability requirement of biomedical applications have created new challenges regarding the existence of confounding factors. In this paper, with a brief argument that inappropriate handling of confounding factors will lead to models' sub-optimal performance in real-world applications, we present an efficient method that can remove the influences of confounding factors such as age or gender to improve the across-cohort prediction accuracy of neural networks. One distinct advantage of our method is that it only requires minimal changes of the baseline model's architecture so that it can be plugged into most of the existing neural networks. We conduct experiments across CT-scan, MRA, and EEG brain wave with convolutional neural networks and LSTM to verify the efficiency of our method.

* 12 pages, 5 figures 

  Click for Model/Code and Paper
Conditional Infilling GANs for Data Augmentation in Mammogram Classification

Aug 24, 2018
Eric Wu, Kevin Wu, David Cox, William Lotter

Deep learning approaches to breast cancer detection in mammograms have recently shown promising results. However, such models are constrained by the limited size of publicly available mammography datasets, in large part due to privacy concerns and the high cost of generating expert annotations. Limited dataset size is further exacerbated by substantial class imbalance since "normal" images dramatically outnumber those with findings. Given the rapid progress of generative models in synthesizing realistic images, and the known effectiveness of simple data augmentation techniques (e.g. horizontal flipping), we ask if it is possible to synthetically augment mammogram datasets using generative adversarial networks (GANs). We train a class-conditional GAN to perform contextual in-filling, which we then use to synthesize lesions onto healthy screening mammograms. First, we show that GANs are capable of generating high-resolution synthetic mammogram patches. Next, we experimentally evaluate using the augmented dataset to improve breast cancer classification performance. We observe that a ResNet-50 classifier trained with GAN-augmented training data produces a higher AUROC compared to the same model trained only on traditionally augmented data, demonstrating the potential of our approach.

* To appear in MICCAI 2018, Breast Image Analysis Workshop 

  Click for Model/Code and Paper
Explaining a black-box using Deep Variational Information Bottleneck Approach

Feb 19, 2019
Seojin Bang, Pengtao Xie, Wei Wu, Eric Xing

Briefness and comprehensiveness are necessary in order to give a lot of information concisely in explaining a black-box decision system. However, existing interpretable machine learning methods fail to consider briefness and comprehensiveness simultaneously, which may lead to redundant explanations. We propose a system-agnostic interpretable method that provides a brief but comprehensive explanation by adopting the inspiring information theoretic principle, information bottleneck principle. Using an information theoretic objective, VIBI selects instance-wise key features that are maximally compressed about an input (briefness), and informative about a decision made by a black-box on that input (comprehensive). The selected key features act as an information bottleneck that serves as a concise explanation for each black-box decision. We show that VIBI outperforms other interpretable machine learning methods in terms of both interpretability and fidelity evaluated by human and quantitative metrics.


  Click for Model/Code and Paper
High Frequency Component Helps Explain the Generalization of Convolutional Neural Networks

May 28, 2019
Haohan Wang, Xindi Wu, Pengcheng Yin, Eric P. Xing

We investigate the relationship between the frequency spectrum of image data and the generalization behavior of convolutional neural networks (CNN). We first notice CNN's ability in capturing the high-frequency components of images. These high-frequency components are almost imperceptible to a human. Thus the observation can serve as one of the explanations of the existence of adversarial examples, and can also help verify CNN's trade-off between robustness and accuracy. Our observation also immediately leads to methods that can improve the adversarial robustness of trained CNN. Finally, we also utilize this observation to design a (semi) black-box adversarial attack method.


  Click for Model/Code and Paper
Orthogonality-Promoting Distance Metric Learning: Convex Relaxation and Theoretical Analysis

Feb 16, 2018
Pengtao Xie, Wei Wu, Yichen Zhu, Eric P. Xing

Distance metric learning (DML), which learns a distance metric from labeled "similar" and "dissimilar" data pairs, is widely utilized. Recently, several works investigate orthogonality-promoting regularization (OPR), which encourages the projection vectors in DML to be close to being orthogonal, to achieve three effects: (1) high balancedness -- achieving comparable performance on both frequent and infrequent classes; (2) high compactness -- using a small number of projection vectors to achieve a "good" metric; (3) good generalizability -- alleviating overfitting to training data. While showing promising results, these approaches suffer three problems. First, they involve solving non-convex optimization problems where achieving the global optimal is NP-hard. Second, it lacks a theoretical understanding why OPR can lead to balancedness. Third, the current generalization error analysis of OPR is not directly on the regularizer. In this paper, we address these three issues by (1) seeking convex relaxations of the original nonconvex problems so that the global optimal is guaranteed to be achievable; (2) providing a formal analysis on OPR's capability of promoting balancedness; (3) providing a theoretical analysis that directly reveals the relationship between OPR and generalization performance. Experiments on various datasets demonstrate that our convex methods are more effective in promoting balancedness, compactness, and generalization, and are computationally more efficient, compared with the nonconvex methods.


  Click for Model/Code and Paper
Validation of a deep learning mammography model in a population with low screening rates

Nov 01, 2019
Kevin Wu, Eric Wu, Yaping Wu, Hongna Tan, Greg Sorensen, Meiyun Wang, Bill Lotter

A key promise of AI applications in healthcare is in increasing access to quality medical care in under-served populations and emerging markets. However, deep learning models are often only trained on data from advantaged populations that have the infrastructure and resources required for large-scale data collection. In this paper, we aim to empirically investigate the potential impact of such biases on breast cancer detection in mammograms. We specifically explore how a deep learning algorithm trained on screening mammograms from the US and UK generalizes to mammograms collected at a hospital in China, where screening is not widely implemented. For the evaluation, we use a top-scoring model developed for the Digital Mammography DREAM Challenge. Despite the change in institution and population composition, we find that the model generalizes well, exhibiting similar performance to that achieved in the DREAM Challenge, even when controlling for tumor size. We also illustrate a simple but effective method for filtering predictions based on model variance, which can be particularly useful for deployment in new settings. While there are many components in developing a clinically effective system, these results represent a promising step towards increasing access to life-saving screening mammography in populations where screening rates are currently low.

* NeurIPS 2019. Fair ML for Health Workshop 

  Click for Model/Code and Paper
Learning a Unified Embedding for Visual Search at Pinterest

Aug 05, 2019
Andrew Zhai, Hao-Yu Wu, Eric Tzeng, Dong Huk Park, Charles Rosenberg

At Pinterest, we utilize image embeddings throughout our search and recommendation systems to help our users navigate through visual content by powering experiences like browsing of related content and searching for exact products for shopping. In this work we describe a multi-task deep metric learning system to learn a single unified image embedding which can be used to power our multiple visual search products. The solution we present not only allows us to train for multiple application objectives in a single deep neural network architecture, but takes advantage of correlated information in the combination of all training data from each application to generate a unified embedding that outperforms all specialized embeddings previously deployed for each product. We discuss the challenges of handling images from different domains such as camera photos, high quality web images, and clean product catalog images. We also detail how to jointly train for multiple product objectives and how to leverage both engagement data and human labeled data. In addition, our trained embeddings can also be binarized for efficient storage and retrieval without compromising precision and recall. Through comprehensive evaluations on offline metrics, user studies, and online A/B experiments, we demonstrate that our proposed unified embedding improves both relevance and engagement of our visual search products for both browsing and searching purposes when compared to existing specialized embeddings. Finally, the deployment of the unified embedding at Pinterest has drastically reduced the operation and engineering cost of maintaining multiple embeddings while improving quality.

* in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge and Discovery and Data Mining, 2019 

  Click for Model/Code and Paper
What evidence does deep learning model use to classify Skin Lesions?

Nov 02, 2018
Junyan Wu, Xiaoxiao Li, Eric Z. Chen, Hongda Jiang, Xu Dong, Ruichen Rong

Melanoma is a type of skin cancer with the most rapidly increasing incidence. Early detection of melanoma using dermoscopy images significantly increases patients' survival rate. However, accurately classifying skin lesions, especially in the early stage, is extremely challenging via dermatologists' observation. Hence, the discovery of reliable biomarkers for melanoma diagnosis will be meaningful. Recent years, deep learning empowered computer-assisted diagnosis has been shown its value in medical imaging-based decision making. However, lots of research focus on improving disease detection accuracy but not exploring the evidence of pathology. In this paper, we propose a method to interpret the deep learning classification findings. Firstly, we propose an accurate neural network architecture to classify skin lesion. Secondly, we utilize a prediction difference analysis method that examining each patch on the image through patch wised corrupting for detecting the biomarkers. Lastly, we validate that our biomarker findings are corresponding to the patterns in the literature. The findings might be significant to guide clinical diagnosis.

* 4 pages, submitted to ISBI 2019 

  Click for Model/Code and Paper
A Leaf Recognition Algorithm for Plant Classification Using Probabilistic Neural Network

Jul 29, 2007
Stephen Gang Wu, Forrest Sheng Bao, Eric You Xu, Yu-Xuan Wang, Yi-Fan Chang, Qiao-Liang Xiang

In this paper, we employ Probabilistic Neural Network (PNN) with image and data processing techniques to implement a general purpose automated leaf recognition algorithm. 12 leaf features are extracted and orthogonalized into 5 principal variables which consist the input vector of the PNN. The PNN is trained by 1800 leaves to classify 32 kinds of plants with an accuracy greater than 90%. Compared with other approaches, our algorithm is an accurate artificial intelligence approach which is fast in execution and easy in implementation.

* 6 pages, 3 figures, 2 tables 

  Click for Model/Code and Paper
Knee menisci segmentation and relaxometry of 3D ultrashort echo time (UTE) cones MR imaging using attention U-Net with transfer learning

Aug 05, 2019
Michal Byra, Mei Wu, Xiaodong Zhang, Hyungseok Jang, Ya-Jun Ma, Eric Y Chang, Sameer Shah, Jiang Du

The purpose of this work is to develop a deep learning-based method for knee menisci segmentation in 3D ultrashort echo time (UTE) cones magnetic resonance (MR) imaging, and to automatically determine MR relaxation times, namely the T1, T1$\rho$, and T2* parameters, which can be used to assess knee osteoarthritis (OA). Whole knee joint imaging was performed using 3D UTE cones sequences to collect data from 61 human subjects. Regions of interest (ROIs) were outlined by two experienced radiologists based on subtracted T1$\rho$-weighted MR images. Transfer learning was applied to develop 2D attention U-Net convolutional neural networks for the menisci segmentation based on each radiologist's ROIs separately. Dice scores were calculated to assess segmentation performance. Next, the T1, T1$\rho$, T2* relaxations, and ROI areas were determined for the manual and automatic segmentations, then compared.The models developed using ROIs provided by two radiologists achieved high Dice scores of 0.860 and 0.833, while the radiologists' manual segmentations achieved a Dice score of 0.820. Linear correlation coefficients for the T1, T1$\rho$, and T2* relaxations calculated using the automatic and manual segmentations ranged between 0.90 and 0.97, and there were no associated differences between the estimated average meniscal relaxation parameters. The deep learning models achieved segmentation performance equivalent to the inter-observer variability of two radiologists. The proposed deep learning-based approach can be used to efficiently generate automatic segmentations and determine meniscal relaxations times. The method has the potential to help radiologists with the assessment of meniscal diseases, such as OA.

* 30 pages, 7 figures 

  Click for Model/Code and Paper
Breast density classification with deep convolutional neural networks

Nov 10, 2017
Nan Wu, Krzysztof J. Geras, Yiqiu Shen, Jingyi Su, S. Gene Kim, Eric Kim, Stacey Wolfson, Linda Moy, Kyunghyun Cho

Breast density classification is an essential part of breast cancer screening. Although a lot of prior work considered this problem as a task for learning algorithms, to our knowledge, all of them used small and not clinically realistic data both for training and evaluation of their models. In this work, we explore the limits of this task with a data set coming from over 200,000 breast cancer screening exams. We use this data to train and evaluate a strong convolutional neural network classifier. In a reader study, we find that our model can perform this task comparably to a human expert.


  Click for Model/Code and Paper
Selectivity Estimation with Deep Likelihood Models

May 10, 2019
Zongheng Yang, Eric Liang, Amog Kamsetty, Chenggang Wu, Yan Duan, Xi Chen, Pieter Abbeel, Joseph M. Hellerstein, Sanjay Krishnan, Ion Stoica

Selectivity estimation has long been grounded in statistical tools for density estimation. To capture the rich multivariate distributions of relational tables, we propose the use of a new type of high-capacity statistical model: deep likelihood models. However, direct application of these models leads to a limited estimator that is prohibitively expensive to evaluate for range and wildcard predicates. To make a truly usable estimator, we develop a Monte Carlo integration scheme on top of likelihood models that can efficiently handle range queries with dozens of filters or more. Like classical synopses, our estimator summarizes the data without supervision. Unlike previous solutions, our estimator approximates the joint data distribution without any independence assumptions. When evaluated on real-world datasets and compared against real systems and dominant families of techniques, our likelihood model based estimator achieves single-digit multiplicative error at tail, a 40-200$\times$ accuracy improvement over the second best method, and is space- and runtime-efficient.


  Click for Model/Code and Paper
High-Resolution Breast Cancer Screening with Multi-View Deep Convolutional Neural Networks

Jun 28, 2018
Krzysztof J. Geras, Stacey Wolfson, Yiqiu Shen, Nan Wu, S. Gene Kim, Eric Kim, Laura Heacock, Ujas Parikh, Linda Moy, Kyunghyun Cho

Advances in deep learning for natural images have prompted a surge of interest in applying similar techniques to medical images. The majority of the initial attempts focused on replacing the input of a deep convolutional neural network with a medical image, which does not take into consideration the fundamental differences between these two types of images. Specifically, fine details are necessary for detection in medical images, unlike in natural images where coarse structures matter most. This difference makes it inadequate to use the existing network architectures developed for natural images, because they work on heavily downscaled images to reduce the memory requirements. This hides details necessary to make accurate predictions. Additionally, a single exam in medical imaging often comes with a set of views which must be fused in order to reach a correct conclusion. In our work, we propose to use a multi-view deep convolutional neural network that handles a set of high-resolution medical images. We evaluate it on large-scale mammography-based breast cancer screening (BI-RADS prediction) using 886,000 images. We focus on investigating the impact of the training set size and image size on the prediction accuracy. Our results highlight that performance increases with the size of training set, and that the best performance can only be achieved using the original resolution. In the reader study, performed on a random subset of the test set, we confirmed the efficacy of our model, which achieved performance comparable to a committee of radiologists when presented with the same data.


  Click for Model/Code and Paper
Residual Attention based Network for Hand Bone Age Assessment

Dec 21, 2018
Eric Wu, Bin Kong, Xin Wang, Junjie Bai, Yi Lu, Feng Gao, Shaoting Zhang, Kunlin Cao, Qi Song, Siwei Lyu, Youbing Yin

Computerized automatic methods have been employed to boost the productivity as well as objectiveness of hand bone age assessment. These approaches make predictions according to the whole X-ray images, which include other objects that may introduce distractions. Instead, our framework is inspired by the clinical workflow (Tanner-Whitehouse) of hand bone age assessment, which focuses on the key components of the hand. The proposed framework is composed of two components: a Mask R-CNN subnet of pixelwise hand segmentation and a residual attention network for hand bone age assessment. The Mask R-CNN subnet segments the hands from X-ray images to avoid the distractions of other objects (e.g., X-ray tags). The hierarchical attention components of the residual attention subnet force our network to focus on the key components of the X-ray images and generate the final predictions as well as the associated visual supports, which is similar to the assessment procedure of clinicians. We evaluate the performance of the proposed pipeline on the RSNA pediatric bone age dataset and the results demonstrate its superiority over the previous methods.


  Click for Model/Code and Paper
Comparing Rule-Based and Deep Learning Models for Patient Phenotyping

Mar 25, 2017
Sebastian Gehrmann, Franck Dernoncourt, Yeran Li, Eric T. Carlson, Joy T. Wu, Jonathan Welt, John Foote Jr., Edward T. Moseley, David W. Grant, Patrick D. Tyler, Leo Anthony Celi

Objective: We investigate whether deep learning techniques for natural language processing (NLP) can be used efficiently for patient phenotyping. Patient phenotyping is a classification task for determining whether a patient has a medical condition, and is a crucial part of secondary analysis of healthcare data. We assess the performance of deep learning algorithms and compare them with classical NLP approaches. Materials and Methods: We compare convolutional neural networks (CNNs), n-gram models, and approaches based on cTAKES that extract pre-defined medical concepts from clinical notes and use them to predict patient phenotypes. The performance is tested on 10 different phenotyping tasks using 1,610 discharge summaries extracted from the MIMIC-III database. Results: CNNs outperform other phenotyping algorithms in all 10 tasks. The average F1-score of our model is 76 (PPV of 83, and sensitivity of 71) with our model having an F1-score up to 37 points higher than alternative approaches. We additionally assess the interpretability of our model by presenting a method that extracts the most salient phrases for a particular prediction. Conclusion: We show that NLP methods based on deep learning improve the performance of patient phenotyping. Our CNN-based algorithm automatically learns the phrases associated with each patient phenotype. As such, it reduces the annotation complexity for clinical domain experts, who are normally required to develop task-specific annotation rules and identify relevant phrases. Our method performs well in terms of both performance and interpretability, which indicates that deep learning is an effective approach to patient phenotyping based on clinicians' notes.


  Click for Model/Code and Paper
Robust breast cancer detection in mammography and digital breast tomosynthesis using annotation-efficient deep learning approach

Dec 27, 2019
William Lotter, Abdul Rahman Diab, Bryan Haslam, Jiye G. Kim, Giorgia Grisot, Eric Wu, Kevin Wu, Jorge Onieva Onieva, Jerrold L. Boxerman, Meiyun Wang, Mack Bandler, Gopal Vijayaraghavan, A. Gregory Sorensen

Breast cancer remains a global challenge, causing over 1 million deaths globally in 2018. To achieve earlier breast cancer detection, screening x-ray mammography is recommended by health organizations worldwide and has been estimated to decrease breast cancer mortality by 20-40%. Nevertheless, significant false positive and false negative rates, as well as high interpretation costs, leave opportunities for improving quality and access. To address these limitations, there has been much recent interest in applying deep learning to mammography; however, obtaining large amounts of annotated data poses a challenge for training deep learning models for this purpose, as does ensuring generalization beyond the populations represented in the training dataset. Here, we present an annotation-efficient deep learning approach that 1) achieves state-of-the-art performance in mammogram classification, 2) successfully extends to digital breast tomosynthesis (DBT; "3D mammography"), 3) detects cancers in clinically-negative prior mammograms of cancer patients, 4) generalizes well to a population with low screening rates, and 5) outperforms five-out-of-five full-time breast imaging specialists by improving absolute sensitivity by an average of 14%. Our results demonstrate promise towards software that can improve the accuracy of and access to screening mammography worldwide.


  Click for Model/Code and Paper