Models, code, and papers for "Jun Huan":

Evaluating Robustness of Deep Image Super-Resolution against Adversarial Attacks

Apr 12, 2019
Jun-Ho Choi, Huan Zhang, Jun-Hyuk Kim, Cho-Jui Hsieh, Jong-Seok Lee

Single-image super-resolution aims to generate a high-resolution version of a low-resolution image, which serves as an essential component in many computer vision applications. This paper investigates the robustness of deep learning-based super-resolution methods against adversarial attacks, which can significantly deteriorate the super-resolved images without noticeable distortion in the attacked low-resolution images. It is demonstrated that state-of-the-art deep super-resolution methods are highly vulnerable to adversarial attacks. Different levels of robustness of different methods are analyzed theoretically and experimentally. We also present analysis on transferability of attacks, and feasibility of targeted attacks and universal attacks.


  Click for Model/Code and Paper
AGAN: Towards Automated Design of Generative Adversarial Networks

Jun 25, 2019
Hanchao Wang, Jun Huan

Recent progress in Generative Adversarial Networks (GANs) has shown promising signs of improving GAN training via architectural change. Despite some early success, at present the design of GAN architectures requires human expertise, laborious trial-and-error testings, and often draws inspiration from its image classification counterpart. In the current paper, we present the first neural architecture search algorithm, automated neural architecture search for deep generative models, or AGAN for abbreviation, that is specifically suited for GAN training. For unsupervised image generation tasks on CIFAR-10, our algorithm finds architecture that outperforms state-of-the-art models under same regularization techniques. For supervised tasks, the automatically searched architectures also achieve highly competitive performance, outperforming best human-invented architectures at resolution $32\times32$. Moreover, we empirically demonstrate that the modules learned by AGAN are transferable to other image generation tasks such as STL-10.


  Click for Model/Code and Paper
Discriminatory Transfer

Aug 07, 2017
Chao Lan, Jun Huan

We observe standard transfer learning can improve prediction accuracies of target tasks at the cost of lowering their prediction fairness -- a phenomenon we named discriminatory transfer. We examine prediction fairness of a standard hypothesis transfer algorithm and a standard multi-task learning algorithm, and show they both suffer discriminatory transfer on the real-world Communities and Crime data set. The presented case study introduces an interaction between fairness and transfer learning, as an extension of existing fairness studies that focus on single task learning.

* Presented as a poster at the 2017 Workshop on Fairness, Accountability, and Transparency in Machine Learning (FAT/ML 2017) 

  Click for Model/Code and Paper
An Empirical Study on Regularization of Deep Neural Networks by Local Rademacher Complexity

Feb 14, 2019
Yingzhen Yang, Xingjian Li, Jun Huan

Regularization of Deep Neural Networks (DNNs) for the sake of improving their generalization capability is important and challenging. The development in this line benefits theoretical foundation of DNNs and promotes their usability in different areas of artificial intelligence. In this paper, we investigate the role of Rademacher complexity in improving generalization of DNNs and propose a novel regularizer rooted in Local Rademacher Complexity (LRC). While Rademacher complexity is well known as a distribution-free complexity measure of function class that help boost generalization of statistical learning methods, extensive study shows that LRC, its counterpart focusing on a restricted function class, leads to sharper convergence rates and potential better generalization given finite training sample. Our LRC based regularizer is developed by estimating the complexity of the function class centered at the minimizer of the empirical loss of DNNs. Experiments on various types of network architecture demonstrate the effectiveness of LRC regularization in improving generalization. Moreover, our method features the state-of-the-art result on the CIFAR-$10$ dataset with network architecture found by neural architecture search.

* Updated the link to the open source PaddlePaddle code of LRC Regularization 

  Click for Model/Code and Paper
FSNet: Compression of Deep Convolutional Neural Networks by Filter Summary

Feb 13, 2019
Yingzhen Yang, Nebojsa Jojic, Jun Huan

We present a novel method of compression of deep Convolutional Neural Networks (CNNs). Our method reduces the number of parameters of each convolutional layer by learning a 3D tensor termed Filter Summary (FS). The convolutional filters are extracted from FS as overlapping 3D blocks, and nearby filters in FS share weights in their overlapping regions in a natural way. The resultant neural network based on such weight sharing scheme, termed Filter Summary CNNs or FSNet, has a FS in each convolution layer instead of a set of independent filters in the conventional convolution layer. FSNet has the same architecture as that of the baseline CNN to be compressed, and each convolution layer of FSNet generates the same number of filters from FS as that of the basline CNN in the forward process. Without hurting the inference speed, the parameter space of FSNet is much smaller than that of the baseline CNN. In addition, FSNet is compatible with weight quantization, leading to even higher compression ratio when combined with weight quantization. Experiments demonstrate the effectiveness of FSNet in compression of CNNs for computer vision tasks including image classification and object detection. For classification task, FSNet of 0.22M effective parameters has prediction accuracy of 93.91% on the CIFAR-10 dataset with less than 0.3% accuracy drop, using ResNet-18 of 11.18M parameters as baseline. Furthermore, FSNet version of ResNet-50 with 2.75M effective parameters achieves the top-1 and top-5 accuracy of 63.80% and 85.72% respectively on ILSVRC-12 benchmark. For object detection task, FSNet is used to compress the Single Shot MultiBox Detector (SSD300) of 26.32M parameters. FSNet of 0.45M effective parameters achieves mAP of 67.63% on the VOC2007 test data with weight quantization, and FSNet of 0.68M effective parameters achieves mAP of 70.00% with weight quantization on the same test data.


  Click for Model/Code and Paper
Instance-based Deep Transfer Learning

Sep 08, 2018
Tianyang Wang, Jun Huan, Michelle Zhu

Deep transfer learning has acquired significant research interest. It makes use of pre-trained models that are learned from a source domain, and utilizes these models for the tasks in a target domain. Model-based deep transfer learning is arguably the most frequently used method. However, very little work has been devoted to enhancing deep transfer learning by focusing on the influence of data. In this work, we propose an instance-based approach to improve deep transfer learning in target domain. Specifically, we choose a pre-trained model which is learned from a source domain, and utilize this model to estimate the influence of each training sample in a target domain. Then we optimize training data of the target domain by removing the training samples that will lower the performance of the pre-trained model. We then fine-tune the pre-trained model with the optimized training data in the target domain, or build a new model which can be initialized partially based on the pre-trained model, and fine-tune it with the optimized training data in the target domain. Using this approach, transfer learning can help deep learning models to learn more useful features. Extensive experiments demonstrate the effectiveness of our approach on further boosting deep learning models for typical high-level computer vision tasks, such as image classification.


  Click for Model/Code and Paper
Data Dropout: Optimizing Training Data for Convolutional Neural Networks

Sep 07, 2018
Tianyang Wang, Jun Huan, Bo Li

Deep learning models learn to fit training data while they are highly expected to generalize well to testing data. Most works aim at finding such models by creatively designing architectures and fine-tuning parameters. To adapt to particular tasks, hand-crafted information such as image prior has also been incorporated into end-to-end learning. However, very little progress has been made on investigating how an individual training sample will influence the generalization ability of a model. In other words, to achieve high generalization accuracy, do we really need all the samples in a training dataset? In this paper, we demonstrate that deep learning models such as convolutional neural networks may not favor all training samples, and generalization accuracy can be further improved by dropping those unfavorable samples. Specifically, the influence of removing a training sample is quantifiable, and we propose a Two-Round Training approach, aiming to achieve higher generalization accuracy. We locate unfavorable samples after the first round of training, and then retrain the model from scratch with the reduced training dataset in the second round. Since our approach is essentially different from fine-tuning or further training, the computational cost should not be a concern. Our extensive experimental results indicate that, with identical settings, the proposed approach can boost performance of the well-known networks on both high-level computer vision problems such as image classification, and low-level vision problems such as image denoising.


  Click for Model/Code and Paper
On the Unreported-Profile-is-Negative Assumption for Predictive Cheminformatics

Aug 07, 2017
Chao Lan, Sai Nivedita Chandrasekaran, Jun Huan

In cheminformatics, compound-target binding profiles has been a main source of data for research. For data repositories that only provide positive profiles, a popular assumption is that unreported profiles are all negative. In this paper, we caution audience not to take this assumption for granted, and present empirical evidence of its ineffectiveness from a machine learning perspective. Our examination is based on a setting where binding profiles are used as features to train predictive models; we show (1) prediction performance degrades when the assumption fails and (2) explicit recovery of unreported profiles improves prediction performance. In particular, we propose a framework that jointly recovers profiles and learns predictive model, and show it achieves further performance improvement. The presented study not only suggests applying matrix recovery methods to recover unreported profiles, but also initiates a new missing feature problem which we called Learning with Positive and Unknown Features.

* the quality of the current version is unsatisfactory. we decide to withdraw the manuscript. thank you 

  Click for Model/Code and Paper
Improving Adversarial Robustness via Attention and Adversarial Logit Pairing

Aug 23, 2019
Dou Goodman, Xingjian Li, Jun Huan, Tao Wei

Though deep neural networks have achieved the state of the art performance in visual classification, recent studies have shown that they are all vulnerable to the attack of adversarial examples. In this paper, we develop improved techniques for defending against adversarial examples.First, we introduce enhanced defense using a technique we call \textbf{Attention and Adversarial Logit Pairing(AT+ALP)}, a method that encourages both attention map and logit for pairs of examples to be similar. When applied to clean examples and their adversarial counterparts, \textbf{AT+ALP} improves accuracy on adversarial examples over adversarial training.Next,We show that our \textbf{AT+ALP} can effectively increase the average activations of adversarial examples in the key area and demonstrate that it focuse on more discriminate features to improve the robustness of the model.Finally,we conducte extensive experiments using a wide range of datasets and the experiment results show that our \textbf{AT+ALP} achieves \textbf{the state of the art} defense.For example,on \textbf{17 Flower Category Database}, under strong 200-iteration \textbf{PGD} gray-box and black-box attacks where prior art has 34\% and 39\% accuracy, our method achieves \textbf{50\%} and \textbf{51\%}.Compared with previous work,our work is evaluated under highly challenging PGD attack:the maximum perturbation $\epsilon \in \{0.25,0.5\}$ i.e. $L_\infty \in \{0.25,0.5\}$ with 10 to 200 attack iterations.To our knowledge, such a strong attack has not been previously explored on a wide range of datasets.


  Click for Model/Code and Paper
Quasi-potential as an implicit regularizer for the loss function in the stochastic gradient descent

Jan 18, 2019
Wenqing Hu, Zhanxing Zhu, Haoyi Xiong, Jun Huan

We interpret the variational inference of the Stochastic Gradient Descent (SGD) as minimizing a new potential function named the \textit{quasi-potential}. We analytically construct the quasi-potential function in the case when the loss function is convex and admits only one global minimum point. We show in this case that the quasi-potential function is related to the noise covariance structure of SGD via a partial differential equation of Hamilton-Jacobi type. This relation helps us to show that anisotropic noise leads to faster escape than isotropic noise. We then consider the dynamics of SGD in the case when the loss function is non-convex and admits several different local minima. In this case, we demonstrate an example that shows how the noise covariance structure plays a role in "implicit regularization", a phenomenon in which SGD favors some particular local minimum points. This is done through the relation between the noise covariance structure and the quasi-potential function. Our analysis is based on Large Deviations Theory (LDT), and they are validated by numerical experiments.

* first and preliminary version 

  Click for Model/Code and Paper
Fast Universal Style Transfer for Artistic and Photorealistic Rendering

Jul 06, 2019
Jie An, Haoyi Xiong, Jiebo Luo, Jun Huan, Jinwen Ma

Universal style transfer is an image editing task that renders an input content image using the visual style of arbitrary reference images, including both artistic and photorealistic stylization. Given a pair of images as the source of content and the reference of style, existing solutions usually first train an auto-encoder (AE) to reconstruct the image using deep features and then embeds pre-defined style transfer modules into the AE reconstruction procedure to transfer the style of the reconstructed image through modifying the deep features. While existing methods typically need multiple rounds of time-consuming AE reconstruction for better stylization, our work intends to design novel neural network architectures on top of AE for fast style transfer with fewer artifacts and distortions all in one pass of end-to-end inference. To this end, we propose two network architectures named ArtNet and PhotoNet to improve artistic and photo-realistic stylization, respectively. Extensive experiments demonstrate that ArtNet generates images with fewer artifacts and distortions against the state-of-the-art artistic transfer algorithms, while PhotoNet improves the photorealistic stylization results by creating sharp images faithfully preserving rich details of the input content. Moreover, ArtNet and PhotoNet can achieve 3X to 100X speed-up over the state-of-the-art algorithms, which is a major advantage for large content images.


  Click for Model/Code and Paper
StyleNAS: An Empirical Study of Neural Architecture Search to Uncover Surprisingly Fast End-to-End Universal Style Transfer Networks

Jun 06, 2019
Jie An, Haoyi Xiong, Jinwen Ma, Jiebo Luo, Jun Huan

Neural Architecture Search (NAS) has been widely studied for designing discriminative deep learning models such as image classification, object detection, and semantic segmentation. As a large number of priors have been obtained through the manual design of architectures in the fields, NAS is usually considered as a supplement approach. In this paper, we have significantly expanded the application areas of NAS by performing an empirical study of NAS to search generative models, or specifically, auto-encoder based universal style transfer, which lacks systematic exploration, if any, from the architecture search aspect. In our work, we first designed a search space where common operators for image style transfer such as VGG-based encoders, whitening and coloring transforms (WCT), convolution kernels, instance normalization operators, and skip connections were searched in a combinatorial approach. With a simple yet effective parallel evolutionary NAS algorithm with multiple objectives, we derived the first group of end-to-end deep networks for universal photorealistic style transfer. Comparing to random search, a NAS method that is gaining popularity recently, we demonstrated that carefully designed search strategy leads to much better architecture design. Finally compared to existing universal style transfer networks for photorealistic rendering such as PhotoWCT that stacks multiple well-trained auto-encoders and WCT transforms in a non-end-to-end manner, the architectures designed by StyleNAS produce better style-transferred images with details preserving, using a tiny number of operators/parameters, and enjoying around 500x inference time speed-up.


  Click for Model/Code and Paper
The Multiplicative Noise in Stochastic Gradient Descent: Data-Dependent Regularization, Continuous and Discrete Approximation

Jun 18, 2019
Jingfeng Wu, Wenqing Hu, Haoyi Xiong, Jun Huan, Zhanxing Zhu

The randomness in Stochastic Gradient Descent (SGD) is considered to play a central role in the observed strong generalization capability of deep learning. In this work, we re-interpret the stochastic gradient of vanilla SGD as a matrix-vector product of the matrix of gradients and a random noise vector (namely multiplicative noise, M-Noise). Comparing to the existing theory that explains SGD using additive noise, the M-Noise helps establish a general case of SGD, namely Multiplicative SGD (M-SGD). The advantage of M-SGD is that it decouples noise from parameters, providing clear insights at the inherent randomness in SGD. Our analysis shows that 1) the M-SGD family, including the vanilla SGD, can be viewed as an minimizer with a data-dependent regularizer resemble of Rademacher complexity, which contributes to the implicit bias of M-SGD; 2) M-SGD holds a strong convergence to a continuous stochastic differential equation under the Gaussian noise assumption, ensuring the path-wise closeness of the discrete and continuous dynamics. For applications, based on M-SGD we design a fast algorithm to inject noise of different types (e.g., Gaussian and Bernoulli) into gradient descent. Based on the algorithm, we further demonstrate that M-SGD can approximate SGD with various noise types and recover the generalization performance, which reveals the potential of M-SGD to solve practical deep learning problems, e.g., large batch training with strong generalization performance. We have validated our observations on multiple practical deep learning scenarios.


  Click for Model/Code and Paper
Fast Interactive Object Annotation with Curve-GCN

Mar 16, 2019
Huan Ling, Jun Gao, Amlan Kar, Wenzheng Chen, Sanja Fidler

Manually labeling objects by tracing their boundaries is a laborious process. In Polygon-RNN++ the authors proposed Polygon-RNN that produces polygonal annotations in a recurrent manner using a CNN-RNN architecture, allowing interactive correction via humans-in-the-loop. We propose a new framework that alleviates the sequential nature of Polygon-RNN, by predicting all vertices simultaneously using a Graph Convolutional Network (GCN). Our model is trained end-to-end. It supports object annotation by either polygons or splines, facilitating labeling efficiency for both line-based and curved objects. We show that Curve-GCN outperforms all existing approaches in automatic mode, including the powerful PSP-DeepLab and is significantly more efficient in interactive mode than Polygon-RNN++. Our model runs at 29.3ms in automatic, and 2.6ms in interactive mode, making it 10x and 100x faster than Polygon-RNN++.

* In Computer Vision and Pattern Recognition (CVPR), Long Beach, US, 2019 

  Click for Model/Code and Paper
Learning Social Circles in Ego Networks based on Multi-View Social Graphs

Dec 24, 2016
Chao Lan, Yuhao Yang, Xiaoli Li, Bo Luo, Jun Huan

In social network analysis, automatic social circle detection in ego-networks is becoming a fundamental and important task, with many potential applications such as user privacy protection or interest group recommendation. So far, most studies have focused on addressing two questions, namely, how to detect overlapping circles and how to detect circles using a combination of network structure and network node attributes. This paper asks an orthogonal research question, that is, how to detect circles based on network structures that are (usually) described by multiple views? Our investigation begins with crawling ego-networks from Twitter and employing classic techniques to model their structures by six views, including user relationships, user interactions and user content. We then apply both standard and our modified multi-view spectral clustering techniques to detect social circles in these ego-networks. Based on extensive automatic and manual experimental evaluations, we deliver two major findings: first, multi-view clustering techniques perform better than common single-view clustering techniques, which only use one view or naively integrate all views for detection, second, the standard multi-view clustering technique is less robust than our modified technique, which selectively transfers information across views based on an assumption that sparse network structures are (potentially) incomplete. In particular, the second finding makes us believe a direct application of standard clustering on potentially incomplete networks may yield biased results. We lightly examine this issue in theory, where we derive an upper bound for such bias by integrating theories of spectral clustering and matrix perturbation, and discuss how it may be affected by several network characteristics.

* This paper has been withdrawn by the author due to its current unsatisfactory quality 

  Click for Model/Code and Paper
DELTA: DEep Learning Transfer using Feature Map with Attention for Convolutional Networks

Jan 26, 2019
Xingjian Li, Haoyi Xiong, Hanchao Wang, Yuxuan Rao, Liping Liu, Jun Huan

Transfer learning through fine-tuning a pre-trained neural network with an extremely large dataset, such as ImageNet, can significantly accelerate training while the accuracy is frequently bottlenecked by the limited dataset size of the new target task. To solve the problem, some regularization methods, constraining the outer layer weights of the target network using the starting point as references (SPAR), have been studied. In this paper, we propose a novel regularized transfer learning framework DELTA, namely DEep Learning Transfer using Feature Map with Attention. Instead of constraining the weights of neural network, DELTA aims to preserve the outer layer outputs of the target network. Specifically, in addition to minimizing the empirical loss, DELTA intends to align the outer layer outputs of two networks, through constraining a subset of feature maps that are precisely selected by attention that has been learned in an supervised learning manner. We evaluate DELTA with the state-of-the-art algorithms, including L2 and L2-SP. The experiment results show that our proposed method outperforms these baselines with higher accuracy for new tasks.

* Accepted at ICLR 2019 

  Click for Model/Code and Paper
NormLime: A New Feature Importance Metric for Explaining Deep Neural Networks

Oct 15, 2019
Isaac Ahern, Adam Noack, Luis Guzman-Nateras, Dejing Dou, Boyang Li, Jun Huan

The problem of explaining deep learning models, and model predictions generally, has attracted intensive interest recently. Many successful approaches forgo global approximations in order to provide more faithful local interpretations of the model's behavior. LIME develops multiple interpretable models, each approximating a large neural network on a small region of the data manifold and SP-LIME aggregates the local models to form a global interpretation. Extending this line of research, we propose a simple yet effective method, NormLIME for aggregating local models into global and class-specific interpretations. A human user study strongly favored class-specific interpretations created by NormLIME to other feature importance metrics. Numerical experiments confirm that NormLIME is effective at recognizing important features.


  Click for Model/Code and Paper
Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer

Aug 03, 2019
Wenzheng Chen, Jun Gao, Huan Ling, Edward J. Smith, Jaakko Lehtinen, Alec Jacobson, Sanja Fidler

Many machine learning models operate on images, but ignore the fact that images are 2D projections formed by 3D geometry interacting with light, in a process called rendering. Enabling ML models to understand image formation might be key for generalization. However, due to an essential rasterization step involving discrete assignment operations, rendering pipelines are non-differentiable and thus largely inaccessible to gradient-based ML techniques. In this paper, we present DIB-R, a differentiable rendering framework which allows gradients to be analytically computed for all pixels in an image. Key to our approach is to view foreground rasterization as a weighted interpolation of local properties and background rasterization as an distance-based aggregation of global geometry. Our approach allows for accurate optimization over vertex positions, colors, normals, light directions and texture coordinates through a variety of lighting models. We showcase our approach in two ML applications: single-image 3D object prediction, and 3D textured object generation, both trained using exclusively using 2D supervision. Our project website is: https://nv-tlabs.github.io/DIB-R/

* https://nv-tlabs.github.io/DIB-R/ 

  Click for Model/Code and Paper