Exponential Convergence of the Deep Neural Network Approximation for Analytic Functions

Jul 01, 2018

Weinan E, Qingcan Wang

We prove that for analytic functions in low dimension, the convergence rate of the deep neural network approximation is exponential.

A Priori Estimates of the Population Risk for Residual Networks

Mar 06, 2019

Weinan E, Chao Ma, Qingcan Wang

Optimal a priori estimates are derived for the population risk of a regularized residual network model. The key lies in the design of a new path norm, called the weighted path norm, which serves as the regularization term in the regularized model. The weighted path norm treats the skip connections and the nonlinearities differently, so that paths with more nonlinearities carry larger weights. The error estimates are a priori in the sense that they depend only on the target function and not on the parameters obtained during training. The estimates are optimal in the sense that the bound scales as O(1/L) with the network depth and the estimation error is comparable to Monte Carlo error rates. In particular, optimal error bounds are obtained, for the first time, in terms of the depth of the network model. Comparisons are made with existing norm-based generalization error bounds.
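The depth-weighting idea can be sketched in a few lines. Assuming a residual block of the form h_{l+1} = h_l + V_l σ(W_l h_l), each input-to-output path either passes through a block's nonlinearity (picking up a penalty factor c) or rides the skip connection (picking up none). The constant c = 3 and the sum aggregation below are illustrative assumptions, not necessarily the paper's exact definition:

```python
import numpy as np

def weighted_path_norm(Vs, Ws, c=3.0):
    """Toy weighted path norm for a residual network with blocks
    h_{l+1} = h_l + V_l * relu(W_l h_l).

    Each path multiplies the magnitudes of the weights it traverses and
    picks up a factor c per nonlinearity it passes through, so paths
    with more nonlinearities are penalized more heavily.  Skip-connection
    segments contribute the identity (factor 1, no penalty).
    """
    d = Ws[0].shape[1]
    a = np.ones(d)  # accumulated |path weight| per coordinate entering layer l
    for V, W in zip(Vs, Ws):
        # identity path (skip) + penalized path through the nonlinear block
        a = a + c * np.abs(V) @ (np.abs(W) @ a)
    return float(a.sum())
```

For a single block with all-ones 2x2 weight matrices and c = 3, each coordinate accumulates 1 + 3·4 = 13, giving a norm of 26; raising c increases the norm, reflecting the heavier weighting of nonlinear paths.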

Featurized Bidirectional GAN: Adversarial Defense via Adversarially Learned Semantic Inference

Sep 29, 2018

Ruying Bao, Sihang Liang, Qingcan Wang

Deep neural networks have been demonstrated to be vulnerable to adversarial attacks, where small perturbations intentionally added to the original inputs can fool the classifier. In this paper, we propose a defense method, Featurized Bidirectional Generative Adversarial Networks (FBGAN), to extract the semantic features of the input and filter out the non-semantic perturbation. FBGAN is pre-trained on the clean dataset in an unsupervised manner, adversarially learning a bidirectional mapping between the high-dimensional data space and the low-dimensional semantic space; mutual information is also applied to disentangle the semantically meaningful features. After the bidirectional mapping, the adversarial data can be reconstructed into denoised data, which can be fed into any pre-trained classifier. We empirically show the quality of the reconstructed images and the effectiveness of the defense.
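The defense pipeline amounts to a round trip through the learned bidirectional mapping: encode the (possibly adversarial) input into the low-dimensional semantic space, then regenerate it. The linear encoder/generator pair below is a stand-in for the pre-trained FBGAN networks, purely to illustrate how projecting through a low-dimensional bottleneck suppresses non-semantic perturbations; the shapes and the projection-based "denoising" are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the pre-trained FBGAN encoder E and generator G; the
# real models are deep networks trained adversarially on clean data.
W_enc = rng.standard_normal((8, 64)) / 8.0   # data space -> 8-dim semantic space
W_gen = W_enc.T                              # semantic space -> data space

def defend(x_adv):
    """Reconstruct a denoised input by round-tripping through the
    bidirectional mapping; the result can be fed to any classifier."""
    z = W_enc @ x_adv      # E: project onto low-dim semantic features
    return W_gen @ z       # G: regenerate; off-subspace noise is dropped

x = rng.standard_normal(64)
noise = 0.1 * rng.standard_normal(64)        # stand-in for an adversarial perturbation
x_clean_recon = defend(x)
x_adv_recon = defend(x + noise)
```

Because the perturbation is mostly orthogonal to the low-dimensional semantic subspace, the gap between the two reconstructions is much smaller than the original perturbation.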

Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections

Apr 14, 2019

Weinan E, Chao Ma, Qingcan Wang, Lei Wu

The behavior of the gradient descent (GD) algorithm is analyzed for a deep neural network model with skip-connections. It is proved that in the over-parametrized regime, for a suitable initialization, with high probability GD can find a global minimum exponentially fast. Generalization error estimates along the GD path are also established. As a consequence, it is shown that when the target function is in the reproducing kernel Hilbert space (RKHS) with a kernel defined by the initialization, there exist generalizable early-stopping solutions along the GD path. In addition, it is also shown that the GD path is uniformly close to the functions given by the related random feature model. Consequently, in this "implicit regularization" setting, the deep neural network model deteriorates to a random feature model. Our results hold for neural networks of any width larger than the input dimension.

* 29 pages, 4 figures


Toward Interpretable Deep Reinforcement Learning with Linear Model U-Trees

Jul 16, 2018

Guiliang Liu, Oliver Schulte, Wang Zhu, Qingcan Li

Deep Reinforcement Learning (DRL) has achieved impressive success in many applications. A key component of many DRL models is a neural network representing a Q function, to estimate the expected cumulative reward following a state-action pair. The Q function neural network contains a lot of implicit knowledge about the RL problems, but often remains unexamined and uninterpreted. To our knowledge, this work develops the first mimic learning framework for Q functions in DRL. We introduce Linear Model U-trees (LMUTs) to approximate neural network predictions. An LMUT is learned using a novel on-line algorithm that is well-suited for an active play setting, where the mimic learner observes an ongoing interaction between the neural net and the environment. Empirical evaluation shows that an LMUT mimics a Q function substantially better than five baseline methods. The transparent tree structure of an LMUT facilitates understanding the network's learned knowledge by analyzing feature influence, extracting rules, and highlighting the super-pixels in image inputs.
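The mimic-learning idea can be illustrated with a toy version: query a black-box Q function for (state, value) pairs, then fit a tree whose leaves hold linear models. The depth-1 tree and the stand-in Q function below are illustrative simplifications; the actual LMUT is grown online to greater depth with its own splitting criteria:

```python
import numpy as np

def fit_leaf(X, y):
    """Ordinary least-squares linear model (with bias) for one leaf."""
    A = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def leaf_sse(X, y):
    A = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ w
    return float(r @ r)

def fit_model_tree(X, y):
    """Depth-1 model tree: pick the (feature, threshold) split that
    minimizes the summed SSE of the two leaf linear models."""
    best = (None, None, np.inf)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[1:]:
            left = X[:, j] < t
            if left.sum() < 2 or (~left).sum() < 2:
                continue
            sse = leaf_sse(X[left], y[left]) + leaf_sse(X[~left], y[~left])
            if sse < best[2]:
                best = (j, t, sse)
    j, t, _ = best
    left = X[:, j] < t
    return j, t, fit_leaf(X[left], y[left]), fit_leaf(X[~left], y[~left])

def predict(tree, x):
    j, t, w_left, w_right = tree
    w = w_left if x[j] < t else w_right
    return float(np.append(x, 1.0) @ w)

def q_black_box(s):
    # Hypothetical stand-in for a trained Q network (fixed action):
    # piecewise linear in the state, so a small model tree can mimic it.
    return 2.0 * s[0] if s[1] < 0.5 else -3.0 * s[0]

# "Active play": sample states and query the mimicked Q function.
rng = np.random.default_rng(1)
S = rng.uniform(size=(200, 2))
q = np.array([q_black_box(s) for s in S])
tree = fit_model_tree(S, q)
```

The fitted tree is transparent: the chosen split feature and threshold, and the leaf coefficients, can be read off directly, which is the interpretability benefit the abstract describes.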

* This paper is accepted by ECML-PKDD 2018
