2D fully convolutional network has been recently successfully applied to object detection from images. In this paper, we extend the fully convolutional network based detection techniques to 3D and apply it to point cloud data. The proposed approach is verified on the task of vehicle detection from lidar point cloud for autonomous driving. Experiments on the KITTI dataset shows a significant performance improvement over the previous point cloud based detection approaches.

**Click to Read Paper and Get Code**
Adversarial adaptive 1-D convolutional neural networks for bearing fault diagnosis under varying working condition

May 09, 2018

Bo Zhang, Wei Li, Jie Hao, Xiao-Li Li, Meng Zhang

Traditional intelligent fault diagnosis of rolling bearings work well only under a common assumption that the labeled training data (source domain) and unlabeled testing data (target domain) are drawn from the same distribution. However, in many real-world applications, this assumption does not hold, especially when the working condition varies. In this paper, a new adversarial adaptive 1-D CNN called A2CNN is proposed to address this problem. A2CNN consists of four parts, namely, a source feature extractor, a target feature extractor, a label classifier and a domain discriminator. The layers between the source and target feature extractor are partially untied during the training stage to take both training efficiency and domain adaptation into consideration. Experiments show that A2CNN has strong fault-discriminative and domain-invariant capacity, and therefore can achieve high accuracy under different working conditions. We also visualize the learned features and the networks to explore the reasons behind the high performance of our proposed model.
May 09, 2018

Bo Zhang, Wei Li, Jie Hao, Xiao-Li Li, Meng Zhang

**Click to Read Paper and Get Code**

Efficient Relative Pose Estimation for Cameras and Generalized Cameras in Case of Known Relative Rotation Angle

Jan 31, 2019

Evgeniy Martyushev, Bo Li

We propose two minimal solutions to the problem of relative pose estimation of (i) a calibrated camera from four points in two views and (ii) a calibrated generalized camera from five points in two views. In both cases, the relative rotation angle between the views is assumed to be known. In practice, such angle can be derived from the readings of a 3d gyroscope. We represent the rotation part of the motion in terms of unit quaternions in order to construct polynomial equations encoding the epipolar constraints. The Gr\"{o}bner basis technique is then used to efficiently derive the solutions. Our first solver for regular cameras significantly improves the existing state-of-the-art solution. The second solver for generalized cameras is novel. The presented minimal solvers can be used in a hypothesize-and-test architecture such as RANSAC for reliable pose estimation. Experiments on synthetic and real datasets confirm that our algorithms are numerically stable, fast and robust.
Jan 31, 2019

Evgeniy Martyushev, Bo Li

* 15 pages, 9 eps-figures

**Click to Read Paper and Get Code**

The function space of deep-learning machines is investigated by studying growth in the entropy of functions of a given error with respect to a reference function, realized by a deep-learning machine. Using physics-inspired methods we study both sparsely and densely-connected architectures to discover a layer-wise convergence of candidate functions, marked by a corresponding reduction in entropy when approaching the reference function, gain insight into the importance of having a large number of layers, and observe phase transitions as the error increases.

* Phys. Rev. Lett. 120, 248301 (2018)

* New examples of networks with ReLU activation and convolutional networks are included

* Phys. Rev. Lett. 120, 248301 (2018)

* New examples of networks with ReLU activation and convolutional networks are included

**Click to Read Paper and Get Code**
3D model retrieval techniques can be classified as histogram-based, view-based and graph-based approaches. We propose a hybrid shape descriptor which combines the global and local radial distance features by utilizing the histogram-based and view-based approaches respectively. We define an area-weighted global radial distance with respect to the center of the bounding sphere of the model and encode its distribution into a 2D histogram as the global radial distance shape descriptor. We then uniformly divide the bounding cube of a 3D model into a set of small cubes and define their centers as local centers. Then, we compute the local radial distance of a point based on the nearest local center. By sparsely sampling a set of views and encoding the local radial distance feature on the rendered views by color coding, we extract the local radial distance shape descriptor. Based on these two shape descriptors, we develop a hybrid radial distance shape descriptor for 3D model retrieval. Experiment results show that our hybrid shape descriptor outperforms several typical histogram-based and view-based approaches.

* The International Workshop on Advanced Image Technology (IWAIT2010), 2010

* 6

* The International Workshop on Advanced Image Technology (IWAIT2010), 2010

* 6

**Click to Read Paper and Get Code**
Pairwise Teacher-Student Network for Semi-Supervised Hashing

Feb 02, 2019

Shifeng Zhang, Jianmin Li, Bo Zhang

Hashing method maps similar high-dimensional data to binary hashcodes with smaller hamming distance, and it has received broad attention due to its low storage cost and fast retrieval speed. Pairwise similarity is easily obtained and widely used for retrieval, and most supervised hashing algorithms are carefully designed for the pairwise supervisions. As labeling all data pairs is difficult, semi-supervised hashing is proposed which aims at learning efficient codes with limited labeled pairs and abundant unlabeled ones. Existing methods build graphs to capture the structure of dataset, but they are not working well for complex data as the graph is built based on the data representations and determining the representations of complex data is difficult. In this paper, we propose a novel teacher-student semi-supervised hashing framework in which the student is trained with the pairwise information produced by the teacher network. The network follows the smoothness assumption, which achieves consistent distances for similar data pairs so that the retrieval results are similar for neighborhood queries. Experiments on large-scale datasets show that the proposed method reaches impressive gain over the supervised baselines and is superior to state-of-the-art semi-supervised hashing methods.
Feb 02, 2019

Shifeng Zhang, Jianmin Li, Bo Zhang

**Click to Read Paper and Get Code**

Semantic Cluster Unary Loss for Efficient Deep Hashing

May 15, 2018

Shifeng Zhang, Jianmin Li, Bo Zhang

Hashing method maps similar data to binary hashcodes with smaller hamming distance, which has received a broad attention due to its low storage cost and fast retrieval speed. With the rapid development of deep learning, deep hashing methods have achieved promising results in efficient information retrieval. Most of the existing deep hashing methods adopt pairwise or triplet losses to deal with similarities underlying the data, but the training is difficult and less efficient because $O(n^2)$ data pairs and $O(n^3)$ triplets are involved. To address these issues, we propose a novel deep hashing algorithm with unary loss which can be trained very efficiently. We first of all introduce a Unary Upper Bound of the traditional triplet loss, thus reducing the complexity to $O(n)$ and bridging the classification-based unary loss and the triplet loss. Second, we propose a novel Semantic Cluster Deep Hashing (SCDH) algorithm by introducing a modified Unary Upper Bound loss, named Semantic Cluster Unary Loss (SCUL). The resultant hashcodes form several compact clusters, which means hashcodes in the same cluster have similar semantic information. We also demonstrate that the proposed SCDH is easy to be extended to semi-supervised settings by incorporating the state-of-the-art semi-supervised learning algorithms. Experiments on large-scale datasets show that the proposed method is superior to state-of-the-art hashing algorithms.
May 15, 2018

Shifeng Zhang, Jianmin Li, Bo Zhang

* 12 pages

**Click to Read Paper and Get Code**

Monocular Depth Estimation with Hierarchical Fusion of Dilated CNNs and Soft-Weighted-Sum Inference

Aug 02, 2017

Bo Li, Yuchao Dai, Mingyi He

Monocular depth estimation is a challenging task in complex compositions depicting multiple objects of diverse scales. Albeit the recent great progress thanks to the deep convolutional neural networks (CNNs), the state-of-the-art monocular depth estimation methods still fall short to handle such real-world challenging scenarios. In this paper, we propose a deep end-to-end learning framework to tackle these challenges, which learns the direct mapping from a color image to the corresponding depth map. First, we represent monocular depth estimation as a multi-category dense labeling task by contrast to the regression based formulation. In this way, we could build upon the recent progress in dense labeling such as semantic segmentation. Second, we fuse different side-outputs from our front-end dilated convolutional neural network in a hierarchical way to exploit the multi-scale depth cues for depth estimation, which is critical to achieve scale-aware depth estimation. Third, we propose to utilize soft-weighted-sum inference instead of the hard-max inference, transforming the discretized depth score to continuous depth value. Thus, we reduce the influence of quantization error and improve the robustness of our method. Extensive experiments on the NYU Depth V2 and KITTI datasets show the superiority of our method compared with current state-of-the-art methods. Furthermore, experiments on the NYU V2 dataset reveal that our model is able to learn the probability distribution of depth.
Aug 02, 2017

Bo Li, Yuchao Dai, Mingyi He

**Click to Read Paper and Get Code**

A Critical Note on the Evaluation of Clustering Algorithms

Aug 10, 2019

Li Zhong, Tiantian Zhang, Bo Yuan

Experimental evaluation is a major research methodology for investigating clustering algorithms. For this purpose, a number of benchmark datasets have been widely used in the literature and their quality plays an important role on the value of the research work. However, in most of the existing studies, little attention has been paid to the specific properties of the datasets and they are often regarded as black-box problems. In our work, with the help of advanced visualization and dimension reduction techniques, we show that there are potential issues with some of the popular benchmark datasets used to evaluate clustering algorithms that may seriously compromise the research quality and even may produce completely misleading results. We suggest that significant efforts need to be devoted to improving the current practice of experimental evaluation of clustering algorithms by having a principled analysis of each benchmark dataset of interest.
Aug 10, 2019

Li Zhong, Tiantian Zhang, Bo Yuan

* 4 pages, 9 figures

**Click to Read Paper and Get Code**

Efficient Navigation of Colloidal Robots in an Unknown Environment via Deep Reinforcement Learning

Jul 31, 2019

Yuguang Yang, Michael A. Bevan, Bo Li

Equipping active colloidal robots with intelligence such that they can efficiently navigate in unknown complex environments could dramatically impact their use in emerging applications like precision surgery and targeted drug delivery. Here we develop a model-free deep reinforcement learning that can train colloidal robots to learn effective navigation strategies in unknown environments with random obstacles. We show that trained robot agents learn to make navigation decisions regarding both obstacle avoidance and travel time minimization, based solely on local sensory inputs without prior knowledge of the global environment. Such agents with biologically inspired mechanisms can acquire competitive navigation capabilities in large-scale, complex environments containing obstacles of diverse shapes, sizes, and configurations. This study illustrates the potential of artificial intelligence in engineering active colloidal systems for future applications and constructing complex active systems with visual and learning capability.
Jul 31, 2019

Yuguang Yang, Michael A. Bevan, Bo Li

* 6 pages, 5 figures

**Click to Read Paper and Get Code**

Efficient Navigation of Active Particles in an Unseen Environment via Deep Reinforcement Learning

Jun 26, 2019

Yuguang Yang, Michael A. Bevan, Bo Li

Equipping active particles with intelligence such that they can efficiently navigate in an unknown complex environment is essential for emerging applications like precision surgery and targeted drug delivery. Here we develop a deep reinforcement learning algorithm that can train active particles to navigate in environments with random obstacles. Through numerical experiments, we show that the trained particle agent learns to make navigation decision regarding both obstacle avoidance and travel time minimization, relying only on local pixel-level sensory inputs but not on pre-knowledge of the entire environment. In unseen complex obstacle environments, the trained particle agent can navigate nearly optimally in arbitrarily long distance nearly optimally at a fixed computational cost. This study illustrates the potentials of employing artificial intelligence to bridge the gap between active particle engineering and emerging real-world applications.
Jun 26, 2019

Yuguang Yang, Michael A. Bevan, Bo Li

* 14 pages, 4 figures

**Click to Read Paper and Get Code**

Computing Committor Functions for the Study of Rare Events Using Deep Learning

Jun 14, 2019

Qianxiao Li, Bo Lin, Weiqing Ren

The committor function is a central object of study in understanding transitions between metastable states in complex systems. However, computing the committor function for realistic systems at low temperatures is a challenging task, due to the curse of dimensionality and the scarcity of transition data. In this paper, we introduce a computational approach that overcomes these issues and achieves good performance on complex benchmark problems with rough energy landscapes. The new approach combines deep learning, data sampling and feature engineering techniques. This establishes an alternative practical method for studying rare transition events between metastable states in complex, high dimensional systems.
Jun 14, 2019

Qianxiao Li, Bo Lin, Weiqing Ren

**Click to Read Paper and Get Code**

Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression

Apr 16, 2019

Xinyao Wang, Liefeng Bo, Li Fuxin

Heatmap regression has became one of the mainstream approaches to localize facial landmarks. As Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) are becoming popular in solving computer vision tasks, extensive research has been done on these architectures. However, the loss function for heatmap regression is rarely studied. In this paper, we analyze the ideal loss function properties for heatmap regression in face alignment problems. Then we propose a novel loss function, named Adaptive Wing loss, that is able to adapt its shape to different types of ground truth heatmap pixels. This adaptability decreases the loss to zero on foreground pixels while leaving some loss on background pixels. To address the imbalance between foreground and background pixels, we also propose Weighted Loss Map, which assigns high weights on foreground and difficult background pixels to help training process focus more on pixels that are crucial to landmark localization. To further improve face alignment accuracy, we introduce boundary prediction and CoordConv with boundary coordinates. Extensive experiments on different benchmarks, including COFW, 300W and WFLW, show our approach outperforms the state-of-the-art by a significant margin on various evaluation metrics. Besides, the Adaptive Wing loss also helps other heatmap regression tasks. Code will be made publicly available.
Apr 16, 2019

Xinyao Wang, Liefeng Bo, Li Fuxin

**Click to Read Paper and Get Code**

The multi-armed bandit (MAB) model has been widely adopted for studying many practical optimization problems (network resource allocation, ad placement, crowdsourcing, etc.) with unknown parameters. The goal of the player here is to maximize the cumulative reward in the face of uncertainty. However, the basic MAB model neglects several important factors of the system in many real-world applications, where multiple arms can be simultaneously played and an arm could sometimes be "sleeping". Besides, ensuring fairness is also a key design concern in practice. To that end, we propose a new Combinatorial Sleeping MAB model with Fairness constraints, called CSMAB-F, aiming to address the aforementioned crucial modeling issues. The objective is now to maximize the reward while satisfying the fairness requirement of a minimum selection fraction for each individual arm. To tackle this new problem, we extend an online learning algorithm, UCB, to deal with a critical tradeoff between exploitation and exploration and employ the virtual queue technique to properly handle the fairness constraints. By carefully integrating these two techniques, we develop a new algorithm, called Learning with Fairness Guarantee (LFG), for the CSMAB-F problem. Further, we rigorously prove that not only LFG is feasibility-optimal, but it also has a time-average regret upper bounded by $\frac{N}{2\eta}+\frac{\beta_1\sqrt{mNT\log{T}}+\beta_2 N}{T}$, where N is the total number of arms, m is the maximum number of arms that can be simultaneously played, T is the time horizon, $\beta_1$ and $\beta_2$ are constants, and $\eta$ is a design parameter that we can tune. Finally, we perform extensive simulations to corroborate the effectiveness of the proposed algorithm. Interestingly, the simulation results reveal an important tradeoff between the regret and the speed of convergence to a point satisfying the fairness constraints.

* 13 pages, 9 figures

* 13 pages, 9 figures

**Click to Read Paper and Get Code**
A Weakly Supervised Adaptive DenseNet for Classifying Thoracic Diseases and Identifying Abnormalities

Nov 05, 2018

Bo Zhou, Yuemeng Li, Jiangcong Wang

We present a weakly supervised deep learning model for classifying thoracic diseases and identifying abnormalities in chest radiography. In this work, instead of learning from medical imaging data with region-level annotations, our model was merely trained on imaging data with image-level labels to classify diseases, and is able to identify abnormal image regions simultaneously. Our model consists of a customized pooling structure and an adaptive DenseNet front-end, which can effectively recognize possible disease features for classification and localization tasks. Our method has been validated on the publicly available ChestX-ray14 dataset. Experimental results have demonstrated that our classification and localization prediction performance achieved significant improvement over the previous models on the ChestX-ray14 dataset. In summary, our network can produce accurate disease classification and localization, which can potentially support clinical decisions.
Nov 05, 2018

Bo Zhou, Yuemeng Li, Jiangcong Wang

* 10 pages, 6 figures; accepted by IEEE Winter Conference on Applications of Computer Vision (2019 WACV)

**Click to Read Paper and Get Code**

Exploring Speech Enhancement with Generative Adversarial Networks for Robust Speech Recognition

Oct 31, 2018

Chris Donahue, Bo Li, Rohit Prabhavalkar

We investigate the effectiveness of generative adversarial networks (GANs) for speech enhancement, in the context of improving noise robustness of automatic speech recognition (ASR) systems. Prior work demonstrates that GANs can effectively suppress additive noise in raw waveform speech signals, improving perceptual quality metrics; however this technique was not justified in the context of ASR. In this work, we conduct a detailed study to measure the effectiveness of GANs in enhancing speech contaminated by both additive and reverberant noise. Motivated by recent advances in image processing, we propose operating GANs on log-Mel filterbank spectra instead of waveforms, which requires less computation and is more robust to reverberant noise. While GAN enhancement improves the performance of a clean-trained ASR system on noisy speech, it falls short of the performance achieved by conventional multi-style training (MTR). By appending the GAN-enhanced features to the noisy inputs and retraining, we achieve a 7% WER improvement relative to the MTR system.
Oct 31, 2018

Chris Donahue, Bo Li, Rohit Prabhavalkar

* Published as a conference paper at ICASSP 2018

**Click to Read Paper and Get Code**

Recent studies have shown the vulnerability of reinforcement learning (RL) models in noisy settings. The sources of noises differ across scenarios. For instance, in practice, the observed reward channel is often subject to noise (e.g., when observed rewards are collected through sensors), and thus observed rewards may not be credible as a result. Also, in applications such as robotics, a deep reinforcement learning (DRL) algorithm can be manipulated to produce arbitrary errors. In this paper, we consider noisy RL problems where observed rewards by RL agents are generated with a reward confusion matrix. We call such observed rewards as perturbed rewards. We develop an unbiased reward estimator aided robust RL framework that enables RL agents to learn in noisy environments while observing only perturbed rewards. Our framework draws upon approaches for supervised learning with noisy data. The core ideas of our solution include estimating a reward confusion matrix and defining a set of unbiased surrogate rewards. We prove the convergence and sample complexity of our approach. Extensive experiments on different DRL platforms show that policies based on our estimated surrogate reward can achieve higher expected rewards, and converge faster than existing baselines. For instance, the state-of-the-art PPO algorithm is able to obtain 67.5% and 46.7% improvements in average on five Atari games, when the error rates are 10% and 30% respectively.

**Click to Read Paper and Get Code**
Data Dropout: Optimizing Training Data for Convolutional Neural Networks

Sep 07, 2018

Tianyang Wang, Jun Huan, Bo Li

Deep learning models learn to fit training data while they are highly expected to generalize well to testing data. Most works aim at finding such models by creatively designing architectures and fine-tuning parameters. To adapt to particular tasks, hand-crafted information such as image prior has also been incorporated into end-to-end learning. However, very little progress has been made on investigating how an individual training sample will influence the generalization ability of a model. In other words, to achieve high generalization accuracy, do we really need all the samples in a training dataset? In this paper, we demonstrate that deep learning models such as convolutional neural networks may not favor all training samples, and generalization accuracy can be further improved by dropping those unfavorable samples. Specifically, the influence of removing a training sample is quantifiable, and we propose a Two-Round Training approach, aiming to achieve higher generalization accuracy. We locate unfavorable samples after the first round of training, and then retrain the model from scratch with the reduced training dataset in the second round. Since our approach is essentially different from fine-tuning or further training, the computational cost should not be a concern. Our extensive experimental results indicate that, with identical settings, the proposed approach can boost performance of the well-known networks on both high-level computer vision problems such as image classification, and low-level vision problems such as image denoising.
Sep 07, 2018

Tianyang Wang, Jun Huan, Bo Li

**Click to Read Paper and Get Code**

Context-aware Data Aggregation with Localized Information Privacy

Jul 31, 2018

Bo Jiang, Ming Li, Ravi Tandon

In this paper, localized information privacy (LIP) is proposed, as a new privacy definition, which allows statistical aggregation while protecting users' privacy without relying on a trusted third party. The notion of context-awareness is incorporated in LIP by the introduction of priors, which enables the design of privacy-preserving data aggregation with knowledge of priors. We show that LIP relaxes the Localized Differential Privacy (LDP) notion by explicitly modeling the adversary's knowledge. However, it is stricter than $2\epsilon$-LDP and $\epsilon$-mutual information privacy. The incorporation of local priors allows LIP to achieve higher utility compared to other approaches. We then present an optimization framework for privacy-preserving data aggregation, with the goal of minimizing the expected squared error while satisfying the LIP privacy constraints. Utility-privacy tradeoffs are obtained under several models in closed-form. We then validate our analysis by {numerical analysis} using both synthetic and real-world data. Results show that our LIP mechanism provides better utility-privacy tradeoffs than LDP and when the prior is not uniformly distributed, the advantage of LIP is even more significant.
Jul 31, 2018

Bo Jiang, Ming Li, Ravi Tandon

* 17 pages, 15 figures, To appear in the processing of the IEEE Conference on Communications and Network Security, 30 May-1 June , 2018, Beijing, China

**Click to Read Paper and Get Code**

A General Retraining Framework for Scalable Adversarial Classification

Nov 26, 2016

Bo Li, Yevgeniy Vorobeychik, Xinyun Chen

Traditional classification algorithms assume that training and test data come from similar distributions. This assumption is violated in adversarial settings, where malicious actors modify instances to evade detection. A number of custom methods have been developed for both adversarial evasion attacks and robust learning. We propose the first systematic and general-purpose retraining framework which can: a) boost robustness of an \emph{arbitrary} learning algorithm, in the face of b) a broader class of adversarial models than any prior methods. We show that, under natural conditions, the retraining framework minimizes an upper bound on optimal adversarial risk, and show how to extend this result to account for approximations of evasion attacks. Extensive experimental evaluation demonstrates that our retraining methods are nearly indistinguishable from state-of-the-art algorithms for optimizing adversarial risk, but are more general and far more scalable. The experiments also confirm that without retraining, our adversarial framework dramatically reduces the effectiveness of learning. In contrast, retraining significantly boosts robustness to evasion attacks without significantly compromising overall accuracy.
Nov 26, 2016

Bo Li, Yevgeniy Vorobeychik, Xinyun Chen

**Click to Read Paper and Get Code**