Models, code, and papers for "Di Wang":

Differentially Private High Dimensional Sparse Covariance Matrix Estimation

Jan 18, 2019
Di Wang, Jinhui Xu

In this paper, we study the problem of estimating the covariance matrix under differential privacy, where the underlying covariance matrix is assumed to be sparse and high dimensional. We propose a new method, called DP-Thresholding, that achieves a non-trivial $\ell_2$-norm based error bound, which is significantly better than the existing bounds obtained by adding noise directly to the empirical covariance matrix. We also extend the $\ell_2$-norm based error bound to a general $\ell_w$-norm based one for any $1\leq w\leq \infty$, and show that they share the same upper bound asymptotically. Our approach can be easily extended to local differential privacy. Experiments on synthetic datasets are consistent with our theoretical claims.

* A short version will appear in CISS 2019
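
The abstract above contrasts DP-Thresholding with adding noise directly to the empirical covariance matrix. The following is a minimal sketch of one plausible reading of the name: perturb the empirical covariance with symmetric Gaussian noise, then hard-threshold small entries to exploit sparsity. The function names, `sigma`, and `threshold` are illustrative assumptions; the privacy-calibrated noise scale and threshold are given in the paper, not here.

```python
import random

def empirical_covariance(samples):
    """Empirical covariance (1/n) * sum_i x_i x_i^T, assuming zero-mean samples."""
    n, p = len(samples), len(samples[0])
    cov = [[0.0] * p for _ in range(p)]
    for x in samples:
        for i in range(p):
            for j in range(p):
                cov[i][j] += x[i] * x[j] / n
    return cov

def noisy_threshold_covariance(samples, sigma, threshold, rng=None):
    """Add symmetric Gaussian noise to the empirical covariance, then hard-threshold.

    sigma and threshold are placeholders (assumptions), not values calibrated
    to any privacy budget.
    """
    rng = rng or random.Random(0)
    cov = empirical_covariance(samples)
    p = len(cov)
    out = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i, p):
            # One noise draw per upper-triangular entry keeps the result symmetric.
            noisy = cov[i][j] + rng.gauss(0.0, sigma)
            kept = noisy if abs(noisy) > threshold else 0.0
            out[i][j] = out[j][i] = kept
    return out

samples = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
estimate = noisy_threshold_covariance(samples, sigma=0.05, threshold=0.2)
```

Thresholding zeroes out entries that are dominated by noise, which is why a sparse target can tolerate the perturbation better than a dense one.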

Large Scale Constrained Linear Regression Revisited: Faster Algorithms via Preconditioning

Feb 09, 2018
Di Wang, Jinhui Xu

In this paper, we revisit the large-scale constrained linear regression problem and propose faster methods based on some recent developments in sketching and optimization. Our algorithms combine (accelerated) mini-batch SGD with a new method called two-step preconditioning to achieve an approximate solution with a time complexity lower than that of the state-of-the-art techniques for the low precision case. Our idea can also be extended to the high precision case, which gives an alternative implementation to the Iterative Hessian Sketch (IHS) method with significantly improved time complexity. Experiments on benchmark and synthetic datasets suggest that our methods indeed outperform existing ones considerably in both the low and high precision cases.

* Appeared in AAAI-18

Learning Spatial Fusion for Single-Shot Object Detection

Nov 25, 2019
Songtao Liu, Di Huang, Yunhong Wang

Pyramidal feature representation is a common practice for addressing the challenge of scale variation in object detection. However, inconsistency across different feature scales is a primary limitation for single-shot detectors based on feature pyramids. In this work, we propose a novel, data-driven strategy for pyramidal feature fusion, referred to as adaptively spatial feature fusion (ASFF). It learns to spatially filter conflicting information to suppress the inconsistency, thus improving the scale-invariance of features, and adds almost no inference overhead. With the ASFF strategy and a solid baseline of YOLOv3, we achieve the best speed-accuracy trade-off on the MS COCO dataset, reporting 38.1% AP at 60 FPS, 42.4% AP at 45 FPS and 43.9% AP at 29 FPS. The code is available at
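
As a rough illustration of learned spatial fusion, the sketch below blends feature maps from several pyramid levels using per-location softmax weights. It assumes the maps are already resized to a common resolution and reduced to one channel, and it takes the weight logits as given inputs rather than learning them; the function name and these simplifications are assumptions, not the released ASFF code.

```python
import math

def spatial_softmax_fusion(features, logits):
    """Fuse L same-sized feature maps with per-location softmax weights.

    features, logits: lists of L maps, each an H x W nested list of floats.
    At every (y, x) the L logits are turned into weights that sum to 1.
    """
    levels = len(features)
    height, width = len(features[0]), len(features[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Subtract the max logit for numerical stability.
            top = max(logits[l][y][x] for l in range(levels))
            exps = [math.exp(logits[l][y][x] - top) for l in range(levels)]
            total = sum(exps)
            fused[y][x] = sum(
                e / total * features[l][y][x] for l, e in enumerate(exps)
            )
    return fused

level_a = [[1.0, 1.0]]
level_b = [[3.0, 3.0]]
weights_a = [[0.0, 5.0]]
weights_b = [[0.0, -5.0]]
fused = spatial_softmax_fusion([level_a, level_b], [weights_a, weights_b])
```

Because the weights vary by location, a level can dominate where its features are reliable and be suppressed where they conflict with other levels.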

Knowledge-Enriched Transformer for Emotion Detection in Textual Conversations

Oct 01, 2019
Peixiang Zhong, Di Wang, Chunyan Miao

Messages in human conversations inherently convey emotions. The task of detecting emotions in textual conversations leads to a wide range of applications such as opinion mining in social networks. However, enabling machines to analyze emotions in conversations is challenging, partly because humans often rely on the context and commonsense knowledge to express emotions. In this paper, we address these challenges by proposing a Knowledge-Enriched Transformer (KET), where contextual utterances are interpreted using hierarchical self-attention and external commonsense knowledge is dynamically leveraged using a context-aware affective graph attention mechanism. Experiments on multiple textual conversation datasets demonstrate that both context and commonsense knowledge are consistently beneficial to the emotion detection performance. In addition, the experimental results show that our KET model outperforms the state-of-the-art models on most of the tested datasets in F1 score.

* EMNLP 2019 

Faster width-dependent algorithm for mixed packing and covering LPs

Sep 26, 2019
Digvijay Boob, Saurabh Sawlani, Di Wang

In this paper, we give a faster width-dependent algorithm for mixed packing-covering LPs. Mixed packing-covering LPs are fundamental to combinatorial optimization in computer science and operations research. Our algorithm finds a $(1+\epsilon)$-approximate solution in time $O(Nw/\epsilon)$, where $N$ is the number of nonzero entries in the constraint matrix and $w$ is the maximum number of nonzeros in any constraint. This run-time is better than that of Nesterov's smoothing algorithm, which requires $O(N\sqrt{n}w/\epsilon)$, where $n$ is the dimension of the problem. Our work utilizes the framework of area convexity introduced in [Sherman-FOCS'17] to obtain the best dependence on $\epsilon$ while breaking the infamous $\ell_{\infty}$ barrier to eliminate the factor of $\sqrt{n}$. The current best width-independent algorithm for this problem runs in time $O(N/\epsilon^2)$ [Young-arXiv-14] and hence has a worse running time dependence on $\epsilon$. Many real-life instances of mixed packing-covering problems exhibit small width, and for such cases our algorithm can report higher precision results than width-independent algorithms. As a special case of our result, we report a $(1+\epsilon)$-approximation algorithm for the densest subgraph problem which runs in time $O(md/\epsilon)$, where $m$ is the number of edges in the graph and $d$ is the maximum graph degree.

* Accepted for oral presentation at NeurIPS 2019 
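
To make the comparison of running times in the abstract concrete, the sketch below computes $N$ and $w$ for a small constraint matrix and evaluates the three stated bounds with all hidden constants set to 1. The function names and the example matrix are assumptions for illustration; the numbers indicate scaling only, not measured run-times.

```python
import math

def sparsity_stats(rows):
    """Return (N, w): total nonzeros and the max nonzeros in any constraint row."""
    nonzeros_per_row = [sum(1 for v in row if v != 0) for row in rows]
    return sum(nonzeros_per_row), max(nonzeros_per_row)

def runtime_bounds(rows, n, eps):
    """Evaluate the big-O expressions from the abstract with constants set to 1."""
    N, w = sparsity_stats(rows)
    return {
        "width_dependent": N * w / eps,                   # O(N w / eps)
        "nesterov_smoothing": N * math.sqrt(n) * w / eps, # O(N sqrt(n) w / eps)
        "width_independent": N / eps ** 2,                # O(N / eps^2)
    }

constraints = [
    [1, 0, 2],
    [0, 3, 0],
    [4, 5, 0],
]
bounds = runtime_bounds(constraints, n=3, eps=0.1)
```

Ignoring constants, the width-dependent bound is smaller than the width-independent one exactly when $w < 1/\epsilon$, which matches the abstract's point about small-width instances and high precision.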

EEG-Based Emotion Recognition Using Regularized Graph Neural Networks

Aug 26, 2019
Peixiang Zhong, Di Wang, Chunyan Miao

EEG signals measure neuronal activities in different brain regions via electrodes. Many existing studies on EEG-based emotion recognition do not exploit the topological structure of EEG signals. In this paper, we propose a regularized graph neural network (RGNN) for EEG-based emotion recognition, which is biologically supported and captures both local and global inter-channel relations. Specifically, we model the inter-channel relations in EEG signals via an adjacency matrix in our graph neural network, where the connection and sparseness of the adjacency matrix are supported by neuroscience theories of human brain organization. In addition, we propose two regularizers, namely node-wise domain adversarial training (NodeDAT) and emotion-aware distribution learning (EmotionDL), to improve the robustness of our model against cross-subject EEG variations and noisy labels, respectively. To thoroughly evaluate our model, we conduct extensive experiments in both subject-dependent and subject-independent classification settings on two public datasets: SEED and SEED-IV. Our model obtains better performance than competitive baselines such as SVM, DBN, DGCNN, BiDANN, and the state-of-the-art BiHDM in most experimental settings. Our model analysis demonstrates that the proposed biologically supported adjacency matrix and two regularizers contribute consistent and significant gains to the performance. Investigations of the neuronal activities reveal that pre-frontal, parietal and occipital regions may be the most informative regions for emotion recognition, which is consistent with relevant prior studies. In addition, experimental results suggest that global inter-channel relations between the left and right hemispheres are important for emotion recognition and local inter-channel relations between (FP1, AF3), (F6, F8) and (FP2, AF4) may also provide useful information.

* 13 pages 

Neural Learning of Online Consumer Credit Risk

Jun 05, 2019
Di Wang, Qi Wu, Wen Zhang

This paper takes a deep learning approach to understanding consumer credit risk when e-commerce platforms issue unsecured credit to finance customers' purchases. The "NeuCredit" model can capture serial dependences in multi-dimensional time series data when event frequencies in each dimension differ. It also captures nonlinear cross-sectional interactions among different time-evolving features. In addition, the predicted default probability is designed to be interpretable, such that risks can be decomposed into three components: the subjective risk indicating the consumers' willingness to repay, the objective risk indicating their ability to repay, and the behavioral risk indicating consumers' behavioral differences. Using a unique dataset from one of the largest global e-commerce platforms, we show that the inclusion of shopping behavioral data, besides conventional payment records, requires a deep learning approach to extract the information content of these data, which turns out to significantly enhance forecasting performance over traditional machine learning methods.

* 49 pages, 11 tables, 7 figures 

Adaptive NMS: Refining Pedestrian Detection in a Crowd

Apr 07, 2019
Songtao Liu, Di Huang, Yunhong Wang

Pedestrian detection in a crowd is a very challenging issue. This paper addresses this problem with a novel Non-Maximum Suppression (NMS) algorithm that better refines the bounding boxes given by detectors. The contributions are threefold: (1) we propose adaptive-NMS, which applies a dynamic suppression threshold to an instance, according to the target density; (2) we design an efficient subnetwork to learn density scores, which can be conveniently embedded into both single-stage and two-stage detectors; and (3) we achieve state-of-the-art results on the CityPersons and CrowdHuman benchmarks.

* To appear at CVPR 2019 (Oral) 
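
The idea of a density-dependent suppression threshold can be sketched with a greedy NMS loop. The example assumes each box comes with a density score in [0, 1] (in the paper this is predicted by a subnetwork; here it is simply an input) and uses the rule threshold = max(base threshold, density of the kept box). That rule, the function names, and the toy boxes are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def adaptive_nms(boxes, scores, densities, base_threshold=0.5):
    """Greedy NMS in which each kept box uses its own suppression threshold.

    Assumed rule: threshold = max(base_threshold, density of the kept box).
    Returns the indices of kept boxes, highest score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        threshold = max(base_threshold, densities[best])
        # Drop remaining boxes that overlap the kept box more than its threshold.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= threshold]
    return keep

boxes = [(0, 0, 10, 10), (2, 0, 12, 10)]
scores = [0.9, 0.8]
sparse_scene = adaptive_nms(boxes, scores, densities=[0.1, 0.1])
crowded_scene = adaptive_nms(boxes, scores, densities=[0.8, 0.8])
```

With the same two overlapping boxes, the low-density call suppresses the second box while the high-density call keeps both, which is the behavior wanted for neighboring pedestrians in a crowd.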

Differentially Private Empirical Risk Minimization in Non-interactive Local Model via Polynomial of Inner Product Approximation

Dec 17, 2018
Di Wang, Adam Smith, Jinhui Xu

In this paper, we study the Empirical Risk Minimization problem in the non-interactive Local Differential Privacy (LDP) model. First, we show that for the hinge loss function, there is an $(\epsilon, \delta)$-LDP algorithm whose sample complexity for achieving an error of $\alpha$ is only linear in the dimensionality $p$ and quasi-polynomial in other terms. Then, we extend the result to any $1$-Lipschitz generalized linear convex loss function by showing that every such function can be approximated by a linear combination of hinge loss functions and some linear functions. Finally, we apply our technique to the Euclidean median problem and show that its sample complexity needs only to be quasi-polynomial in $p$, which is the first result with a sub-exponential sample complexity in $p$ for non-generalized linear loss functions. Our results are based on a technique, called polynomial of inner product approximation, which may be applicable to other problems.

* To appear in Algorithmic Learning Theory 2019 (ALT 2019), draft version 

An Affect-Rich Neural Conversational Model with Biased Attention and Weighted Cross-Entropy Loss

Nov 17, 2018
Peixiang Zhong, Di Wang, Chunyan Miao

Affect conveys important implicit information in human communication. Having the capability to correctly express affect during human-machine conversations is one of the major milestones in artificial intelligence. In recent years, extensive research on open-domain neural conversational models has been conducted. However, embedding affect into such models is still underexplored. In this paper, we propose an end-to-end affect-rich open-domain neural conversational model that produces responses that are not only appropriate in syntax and semantics, but also rich in affect. Our model extends the Seq2Seq model and adopts VAD (Valence, Arousal and Dominance) affective notations to embed each word with affects. In addition, our model considers the effect of negators and intensifiers via a novel affective attention mechanism, which biases attention towards affect-rich words in input sentences. Lastly, we train our model with an affect-incorporated objective function to encourage the generation of affect-rich words in the output responses. Both perplexity and human evaluations show that our model outperforms the state-of-the-art baseline model of comparable size in producing natural and affect-rich responses.

* AAAI-19 

Query-Free Attacks on Industry-Grade Face Recognition Systems under Resource Constraints

Aug 22, 2018
Di Tang, XiaoFeng Wang, Kehuan Zhang

To launch black-box attacks against a Deep Neural Network (DNN) based Face Recognition (FR) system, one needs to build substitute models to simulate the target model, so that the adversarial examples discovered from the substitute models can also mislead the target model. Such transferability is achieved in recent studies by querying the target model to obtain data for training the substitute models. A real-world target, like the FR system of law enforcement, however, is less accessible to the adversary. To attack such a system, a substitute model of similar quality to the target model is needed to identify their common defects. This is hard since the adversary often does not have enough resources to train such a powerful model (hundreds of millions of images and rooms of GPUs are needed to train a commercial FR system). We found in our research, however, that a resource-constrained adversary could still effectively approximate the target model's capability to recognize specific individuals, by training biased substitute models on additional images of those victims whose identities the attacker wants to cover or impersonate. This is made possible by a new property we discovered, called Nearly Local Linearity (NLL), which models the observation that an ideal DNN model produces image representations (embeddings) whose distances among themselves truthfully describe the human perception of the differences among the input images. By simulating this property around the victim's images, we significantly improve the transferability of black-box impersonation attacks by nearly 50%. In particular, we successfully attacked a commercial system trained over 20 million images, using 4 million images and 1/5 of the training time but achieving 62% transferability in an impersonation attack and 89% in a dodging attack.

Bike Flow Prediction with Multi-Graph Convolutional Networks

Jul 28, 2018
Di Chai, Leye Wang, Qiang Yang

One fundamental issue in managing bike sharing systems is bike flow prediction. Because predicting the flow for a single station is hard, recent research often predicts the bike flow at cluster-level. While such studies achieve satisfactory prediction accuracy, they cannot directly guide fine-grained bike sharing system management at station-level. In this paper, we revisit the problem of station-level bike flow prediction, aiming to boost the prediction accuracy by leveraging the breakthroughs of deep learning techniques. We propose a new multi-graph convolutional neural network model to predict the bike flow at station-level, where the key novelty is viewing the bike sharing system from the graph perspective. More specifically, we construct multiple inter-station graphs for a bike sharing system. In each graph, nodes are stations, and edges are a certain type of relation between stations. The multiple graphs reflect heterogeneous relationships (e.g., distance, ride record correlation). Afterward, we fuse the multiple graphs and then apply convolutional layers on the fused graph to predict station-level future bike flow. In addition to the estimated bike flow value, our model also gives a prediction confidence interval to help bike sharing system managers make decisions. In experiments on New York City and Chicago bike sharing data, our model outperforms state-of-the-art station-level prediction models, reducing prediction error by 25.1% and 17.0% in New York City and Chicago, respectively.
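
The fuse-then-convolve pipeline described above can be sketched in a few lines. The abstract does not specify the fusion rule or the layer design, so this example assumes a weighted sum of adjacency matrices followed by one row-normalized neighborhood-averaging step with no learned parameters; the function names, weights, and toy graphs are all hypothetical.

```python
def fuse_graphs(adjacencies, weights):
    """Weighted sum of same-sized adjacency matrices (assumed fusion rule)."""
    n = len(adjacencies[0])
    fused = [[0.0] * n for _ in range(n)]
    for adjacency, weight in zip(adjacencies, weights):
        for i in range(n):
            for j in range(n):
                fused[i][j] += weight * adjacency[i][j]
    return fused

def propagate(adjacency, flows):
    """One row-normalized averaging step over the fused graph.

    This stands in for a graph convolution layer without learned weights.
    """
    out = []
    for row in adjacency:
        total = sum(row)
        mixed = sum(a * f for a, f in zip(row, flows))
        out.append(mixed / total if total else 0.0)
    return out

# Three stations; self-loops are included so each station keeps its own flow.
distance_graph = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
ride_correlation_graph = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
fused = fuse_graphs([distance_graph, ride_correlation_graph], [0.5, 0.5])
smoothed = propagate(fused, flows=[10.0, 20.0, 30.0])
```

Station 0 is linked to station 1 by distance and to station 2 by ride correlation, so after fusion its output mixes information from both neighbors.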

Receptive Field Block Net for Accurate and Fast Object Detection

Jul 26, 2018
Songtao Liu, Di Huang, Yunhong Wang

Current top-performing object detectors depend on deep CNN backbones, such as ResNet-101 and Inception, benefiting from their powerful feature representations but suffering from high computational costs. Conversely, some detectors based on lightweight models achieve real-time processing, but their accuracy is often criticized. In this paper, we explore an alternative way to build a fast and accurate detector by strengthening lightweight features using a hand-crafted mechanism. Inspired by the structure of Receptive Fields (RFs) in human visual systems, we propose a novel RF Block (RFB) module, which takes the relationship between the size and eccentricity of RFs into account, to enhance feature discriminability and robustness. We further assemble RFB on top of SSD, constructing the RFB Net detector. To evaluate its effectiveness, experiments are conducted on two major benchmarks, and the results show that RFB Net is able to reach the performance of advanced very deep detectors while keeping real-time speed. Code is available at

* Accepted by ECCV 2018 

Empirical Risk Minimization in Non-interactive Local Differential Privacy: Efficiency and High Dimensional Case

May 16, 2018
Di Wang, Marco Gaboardi, Jinhui Xu

In this paper, we study the Empirical Risk Minimization problem in the non-interactive local model of differential privacy. In the case of constant or low dimensionality ($p\ll n$), we first show that if the ERM loss function is $(\infty, T)$-smooth, then the sample complexity needed to achieve error $\alpha$ can avoid an exponential dependence on the dimensionality $p$ with base $1/\alpha$ (i.e., $\alpha^{-p}$), which answers a question in [smith 2017 interaction]. Our approach is based on polynomial approximation. Then, we propose player-efficient algorithms with $1$-bit communication complexity and $O(1)$ computation cost for each player. The error bound is asymptotically the same as the original one. With additional assumptions, we also show a server-efficient algorithm. Next, we consider the high dimensional case ($n\ll p$) and show that if the loss function is a convex generalized linear function, then we can obtain an error bound that depends on the Gaussian width of the underlying constrained set instead of $p$, which is lower than that in [smith 2017 interaction].

* Add a new section on high dimensional case 

Differentially Private Empirical Risk Minimization Revisited: Faster and More General

Feb 14, 2018
Di Wang, Minwei Ye, Jinhui Xu

In this paper, we study the differentially private Empirical Risk Minimization (ERM) problem in different settings. For smooth (strongly) convex loss functions with or without (non)-smooth regularization, we give algorithms that achieve either optimal or near optimal utility bounds with less gradient complexity compared with previous work. For ERM with smooth convex loss functions in the high-dimensional ($p\gg n$) setting, we give an algorithm which achieves the upper bound with less gradient complexity than previous ones. Finally, we generalize the expected excess empirical risk from convex loss functions to non-convex ones satisfying the Polyak-Lojasiewicz condition and give a tighter upper bound on the utility than the one in \cite{ijcai2017-548}.

* Thirty-first Annual Conference on Neural Information Processing Systems (NIPS-2017) 

Self-supervised Image Enhancement Network: Training with Low Light Images Only

Feb 26, 2020
Yu Zhang, Xiaoguang Di, Bin Zhang, Chunhui Wang

This paper proposes a self-supervised low light image enhancement method based on deep learning. Inspired by information entropy theory and the Retinex model, we propose a maximum entropy based Retinex model. With this model, a very simple network can separate the illumination and reflectance, and the network can be trained with low light images only. To achieve self-supervised learning, we introduce a constraint that the maximum channel of the reflectance conforms to the maximum channel of the low light image and that its entropy should be the largest in our model. Our model is very simple and does not rely on any well-designed dataset (even one low light image can complete the training). The network only needs minute-level training to achieve image enhancement. Experiments show that the proposed method reaches the state of the art in terms of processing speed and enhancement quality.

* 14 pages, 13 figures
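
The two quantities named in the constraint above, the maximum channel of an image and its entropy, can be computed as in the sketch below. The histogram binning, the 0 to 255 value range, and the function names are assumptions for illustration; how the paper turns these quantities into a training loss is not stated in the abstract.

```python
import math
from collections import Counter

def max_channel(image):
    """image: H x W nested list of (r, g, b) tuples -> H x W per-pixel maximum."""
    return [[max(pixel) for pixel in row] for row in image]

def shannon_entropy(channel, bins=8, max_value=255):
    """Entropy in bits of the intensity histogram of a single-channel image."""
    values = [v for row in channel for v in row]
    counts = Counter(
        min(int(v * bins / (max_value + 1)), bins - 1) for v in values
    )
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

low_light = [[(10, 20, 30), (12, 25, 28)], [(8, 18, 31), (9, 22, 27)]]
brightest = max_channel(low_light)
entropy_bits = shannon_entropy(brightest)
```

A dark image concentrates its intensities in a few histogram bins and so has low entropy; spreading the intensities over more bins raises it, which gives an intuition for why a maximum-entropy target encourages enhancement.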

Robust Feature-Based Point Registration Using Directional Mixture Model

Nov 25, 2019
Saman Fahandezh-Saadi, Di Wang, Masayoshi Tomizuka

This paper presents a robust probabilistic point registration method for estimating the rigid transformation (i.e., rotation matrix and translation vector) between two point cloud datasets. The method improves the robustness of point registration, and consequently of robot localization, in the presence of outliers in the point clouds, which occur due to occlusion, dynamic objects, and sensor errors. The framework models the point registration task based on directional statistics on a unit sphere. In particular, a Kent distribution mixture model is adopted, and point registration is carried out in the two phases of the Expectation-Maximization algorithm. The proposed method has been evaluated on a point cloud dataset from LiDAR sensors in an indoor environment.
