Models, code, and papers for "Hong Zhang":

Tactics of Adversarial Attack on Deep Reinforcement Learning Agents

May 23, 2017
Yen-Chen Lin, Zhang-Wei Hong, Yuan-Hong Liao, Meng-Li Shih, Ming-Yu Liu, Min Sun

We introduce two tactics to attack agents trained by deep reinforcement learning algorithms using adversarial examples, namely the strategically-timed attack and the enchanting attack. In the strategically-timed attack, the adversary aims at minimizing the agent's reward by only attacking the agent at a small subset of time steps in an episode. Limiting the attack activity to this subset helps prevent detection of the attack by the agent. We propose a novel method to determine when an adversarial example should be crafted and applied. In the enchanting attack, the adversary aims at luring the agent to a designated target state. This is achieved by combining a generative model and a planning algorithm: while the generative model predicts the future states, the planning algorithm generates a preferred sequence of actions for luring the agent. A sequence of adversarial examples is then crafted to lure the agent to take the preferred sequence of actions. We apply the two tactics to the agents trained by the state-of-the-art deep reinforcement learning algorithm including DQN and A3C. In 5 Atari games, our strategically timed attack reduces as much reward as the uniform attack (i.e., attacking at every time step) does by attacking the agent 4 times less often. Our enchanting attack lures the agent toward designated target states with a more than 70% success rate. Videos are available at

* To Appear at IJCAI 2017. Project website: 

  Click for Model/Code and Paper
A Multi-State Diagnosis and Prognosis Framework with Feature Learning for Tool Condition Monitoring

Apr 30, 2018
Chong Zhang, Geok Soon Hong, Jun-Hong Zhou, Kay Chen Tan, Haizhou Li, Huan Xu, Jihoon Hong, Hian-Leng Chan

In this paper, a multi-state diagnosis and prognosis (MDP) framework is proposed for tool condition monitoring via a deep belief network based multi-state approach (DBNMS). For fault diagnosis, a cost-sensitive deep belief network (namely ECS-DBN) is applied to deal with the imbalanced data problem for tool state estimation. An appropriate prognostic degradation model is then applied for tool wear estimation based on the different tool states. The proposed framework has the advantage of automatic feature representation learning and shows better performance in accuracy and robustness. The effectiveness of the proposed DBNMS is validated using a real-world dataset obtained from the gun drilling process. This dataset contains a large amount of measured signals involving different tool geometries under various operating conditions. The DBNMS is examined for both the tool state estimation and tool wear estimation tasks. In the experimental studies, the prediction results are evaluated and compared with popular machine learning approaches, which show the superior performance of the proposed DBNMS approach.

* 14 pages, 12 figures, 10 tables, submitted to IEEE Transactions on Cybernetics 

  Click for Model/Code and Paper
MANELA: A Multi-Agent Algorithm for Learning Network Embeddings

Dec 01, 2019
Han Zhang, Hong Xu

Playing an essential role in data mining, machine learning has a long history of being applied to networks on multifarious tasks and has played an essential role in data mining. However, the discrete and sparse natures of networks often render it difficult to apply machine learning directly to networks. To circumvent this difficulty, one major school of thought to approach networks using machine learning is via network embeddings. On the one hand, this network embeddings have achieved huge success on aggregated network data in recent years. On the other hand, learning network embeddings on distributively stored networks still remained understudied: To the best of our knowledge, all existing algorithms for learning network embeddings have hitherto been exclusively centralized and thus cannot be applied to these networks. To accommodate distributively stored networks, in this paper, we proposed a multi-agent model. Under this model, we developed the multi-agent network embedding learning algorithm (MANELA) for learning network embeddings. We demonstrate MANELA's advantages over other existing centralized network embedding learning algorithms both theoretically and experimentally. Finally, we further our understanding in MANELA via visualization and exploration of its relationship to DeepWalk.

* 11 pages, 4 figures, 5 tables 

  Click for Model/Code and Paper
Moving Object Detection under Discontinuous Change in Illumination Using Tensor Low-Rank and Invariant Sparse Decomposition

Apr 08, 2019
Moein Shakeri, Hong Zhang

Although low-rank and sparse decomposition based methods have been successfully applied to the problem of moving object detection using structured sparsity-inducing norms, they are still vulnerable to significant illumination changes that arise in certain applications. We are interested in moving object detection in applications involving time-lapse image sequences for which current methods mistakenly group moving objects and illumination changes into foreground. Our method relies on the multilinear (tensor) data low-rank and sparse decomposition framework to address the weaknesses of existing methods. The key to our proposed method is to create first a set of prior maps that can characterize the changes in the image sequence due to illumination. We show that they can be detected by a k-support norm. To deal with concurrent, two types of changes, we employ two regularization terms, one for detecting moving objects and the other for accounting for illumination changes, in the tensor low-rank and sparse decomposition formulation. Through comprehensive experiments using challenging datasets, we show that our method demonstrates a remarkable ability to detect moving objects under discontinuous change in illumination, and outperforms the state-of-the-art solutions to this challenging problem.

* 14 pages, 14 figures, including supplementary materials 

  Click for Model/Code and Paper
On The Stability of Video Detection and Tracking

Apr 04, 2017
Hong Zhang, Naiyan Wang

In this paper, we study an important yet less explored aspect in video detection and tracking -- stability. Surprisingly, there is no prior work that tried to study it. As a result, we start our work by proposing a novel evaluation metric for video detection which considers both stability and accuracy. For accuracy, we extend the existing accuracy metric mean Average Precision (mAP). For stability, we decompose it into three terms: fragment error, center position error, scale and ratio error. Each error represents one aspect of stability. Furthermore, we demonstrate that the stability metric has low correlation with accuracy metric. Thus, it indeed captures a different perspective of quality. Lastly, based on this metric, we evaluate several existing methods for video detection and show how they affect accuracy and stability. We believe our work can provide guidance and solid baselines for future researches in the related areas.

  Click for Model/Code and Paper
COROLA: A Sequential Solution to Moving Object Detection Using Low-rank Approximation

Jan 28, 2016
Moein Shakeri, Hong Zhang

Extracting moving objects from a video sequence and estimating the background of each individual image are fundamental issues in many practical applications such as visual surveillance, intelligent vehicle navigation, and traffic monitoring. Recently, some methods have been proposed to detect moving objects in a video via low-rank approximation and sparse outliers where the background is modeled with the computed low-rank component of the video and the foreground objects are detected as the sparse outliers in the low-rank approximation. All of these existing methods work in a batch manner, preventing them from being applied in real time and long duration tasks. In this paper, we present an online sequential framework, namely contiguous outliers representation via online low-rank approximation (COROLA), to detect moving objects and learn the background model at the same time. We also show that our model can detect moving objects with a moving camera. Our experimental evaluation uses simulated data and real public datasets and demonstrates the superior performance of COROLA in terms of both accuracy and execution time.

* 37 pages, 10 figures 

  Click for Model/Code and Paper
Stopping Rules for Bag-of-Words Image Search and Its Application in Appearance-Based Localization

Dec 28, 2013
Kiana Hajebi, Hong Zhang

We propose a technique to improve the search efficiency of the bag-of-words (BoW) method for image retrieval. We introduce a notion of difficulty for the image matching problems and propose methods that reduce the amount of computations required for the feature vector-quantization task in BoW by exploiting the fact that easier queries need less computational resources. Measuring the difficulty of a query and stopping the search accordingly is formulated as a stopping problem. We introduce stopping rules that terminate the image search depending on the difficulty of each query, thereby significantly reducing the computational cost. Our experimental results show the effectiveness of our approach when it is applied to appearance-based localization problem.

  Click for Model/Code and Paper
An Efficient Index for Visual Search in Appearance-based SLAM

Sep 27, 2013
Kiana Hajebi, Hong Zhang

Vector-quantization can be a computationally expensive step in visual bag-of-words (BoW) search when the vocabulary is large. A BoW-based appearance SLAM needs to tackle this problem for an efficient real-time operation. We propose an effective method to speed up the vector-quantization process in BoW-based visual SLAM. We employ a graph-based nearest neighbor search (GNNS) algorithm to this aim, and experimentally show that it can outperform the state-of-the-art. The graph-based search structure used in GNNS can efficiently be integrated into the BoW model and the SLAM framework. The graph-based index, which is a k-NN graph, is built over the vocabulary words and can be extracted from the BoW's vocabulary construction procedure, by adding one iteration to the k-means clustering, which adds small extra cost. Moreover, exploiting the fact that images acquired for appearance-based SLAM are sequential, GNNS search can be initiated judiciously which helps increase the speedup of the quantization process considerably.

  Click for Model/Code and Paper
Low-Depth Optical Neural Networks

May 18, 2019
Xiao-Ming Zhang, Man-Hong Yung

Optical neural network (ONN) is emerging as an attractive proposal for machine-learning applications, enabling high-speed computation with low-energy consumption. However, there are several challenges in applying ONN for industrial applications, including the realization of activation functions and maintaining stability. In particular, the stability of ONNs decrease with the circuit depth, limiting the scalability of the ONNs for practical uses. Here we demonstrate how to compress the circuit depth of ONN to scale only logarithmically, leading to an exponential gain in terms of noise robustness. Our low-depth (LD) ONN is based on an architecture, called Optical CompuTing Of dot-Product UnitS (OCTOPUS), which can also be applied individually as a linear perceptron for solving classification problems. Using the standard data set of Letter Recognition, we present numerical evidence showing that LD-ONN can exhibit a significant gain in noise robustness, compared with a previous ONN proposal based on singular-value decomposition [Nature Photonics 11, 441 (2017)].

* 16 pages, 6 figures 

  Click for Model/Code and Paper
Lightweight Monocular Depth Estimation Model by Joint End-to-End Filter pruning

May 13, 2019
Sara Elkerdawy, Hong Zhang, Nilanjan Ray

Convolutional neural networks (CNNs) have emerged as the state-of-the-art in multiple vision tasks including depth estimation. However, memory and computing power requirements remain as challenges to be tackled in these models. Monocular depth estimation has significant use in robotics and virtual reality that requires deployment on low-end devices. Training a small model from scratch results in a significant drop in accuracy and it does not benefit from pre-trained large models. Motivated by the literature of model pruning, we propose a lightweight monocular depth model obtained from a large trained model. This is achieved by removing the least important features with a novel joint end-to-end filter pruning. We propose to learn a binary mask for each filter to decide whether to drop the filter or not. These masks are trained jointly to exploit relations between filters at different layers as well as redundancy within the same layer. We show that we can achieve around 5x compression rate with small drop in accuracy on the KITTI driving dataset. We also show that masking can improve accuracy over the baseline with fewer parameters, even without enforcing compression loss.

  Click for Model/Code and Paper
Reinforcing Classical Planning for Adversary Driving Scenarios

Mar 20, 2019
Nazmus Sakib, Hengshuai Yao, Hong Zhang

Adversary scenarios in driving, where the other vehicles may make mistakes or have a competing or malicious intent, have to be studied not only for our safety but also for addressing the concerns from public in order to push the technology forward. Classical planning solutions for adversary driving do not exist so far, especially when the vehicles do not communicate their intent. Given recent success in solving hard problems in artificial intelligence (AI), it is worth studying the potential of reinforcement learning for safety driving in adversary settings. In most recent reinforcement learning applications, there is a deep neural networks that maps an input state to an optimal policy over primitive actions. However, learning a policy over primitive actions is very difficult and inefficient. On the other hand, the knowledge already learned in classical planning methods should be inherited and reused. In order to take advantage of reinforcement learning good at exploring the action space for safety and classical planning skill models good at handling most driving scenarios, we propose to learn a policy over an action space of primitive actions augmented with classical planning methods. We show two advantages by doing so. First, training this reinforcement learning agent is easier and faster than training the primitive-action agent. Second, our new agent outperforms the primitive-action reinforcement learning agent, human testers as well as the classical planning methods that our agent queries as skills.

  Click for Model/Code and Paper
Fast Shadow Detection from a Single Image Using a Patched Convolutional Neural Network

Mar 16, 2018
Sepideh Hosseinzadeh, Moein Shakeri, Hong Zhang

In recent years, various shadow detection methods from a single image have been proposed and used in vision systems; however, most of them are not appropriate for the robotic applications due to the expensive time complexity. This paper introduces a fast shadow detection method using a deep learning framework, with a time cost that is appropriate for robotic applications. In our solution, we first obtain a shadow prior map with the help of multi-class support vector machine using statistical features. Then, we use a semantic- aware patch-level Convolutional Neural Network that efficiently trains on shadow examples by combining the original image and the shadow prior map. Experiments on benchmark datasets demonstrate the proposed method significantly decreases the time complexity of shadow detection, by one or two orders of magnitude compared with state-of-the-art methods, without losing accuracy.

* 6 pages, 5 figures, Submitted to IROS 2018 

  Click for Model/Code and Paper
Joint Projection and Dictionary Learning using Low-rank Regularization and Graph Constraints

Dec 06, 2016
Homa Foroughi, Nilanjan Ray, Hong Zhang

In this paper, we aim at learning simultaneously a discriminative dictionary and a robust projection matrix from noisy data. The joint learning, makes the learned projection and dictionary a better fit for each other, so a more accurate classification can be obtained. However, current prevailing joint dimensionality reduction and dictionary learning methods, would fail when the training samples are noisy or heavily corrupted. To address this issue, we propose a joint projection and dictionary learning using low-rank regularization and graph constraints (JPDL-LR). Specifically, the discrimination of the dictionary is achieved by imposing Fisher criterion on the coding coefficients. In addition, our method explicitly encodes the local structure of data by incorporating a graph regularization term, that further improves the discriminative ability of the projection matrix. Inspired by recent advances of low-rank representation for removing outliers and noise, we enforce a low-rank constraint on sub-dictionaries of all classes to make them more compact and robust to noise. Experimental results on several benchmark datasets verify the effectiveness and robustness of our method for both dimensionality reduction and image classification, especially when the data contains considerable noise or variations.

  Click for Model/Code and Paper
Object Classification with Joint Projection and Low-rank Dictionary Learning

Dec 05, 2016
Homa Foroughi, Nilanjan Ray, Hong Zhang

For an object classification system, the most critical obstacles towards real-world applications are often caused by large intra-class variability, arising from different lightings, occlusion and corruption, in limited sample sets. Most methods in the literature would fail when the training samples are heavily occluded, corrupted or have significant illumination or viewpoint variations. Besides, most of the existing methods and especially deep learning-based methods, need large training sets to achieve a satisfactory recognition performance. Although using the pre-trained network on a generic large-scale dataset and fine-tune it to the small-sized target dataset is a widely used technique, this would not help when the content of base and target datasets are very different. To address these issues, we propose a joint projection and low-rank dictionary learning method using dual graph constraints (JP-LRDL). The proposed joint learning method would enable us to learn the features on top of which dictionaries can be better learned, from the data with large intra-class variability. Specifically, a structured class-specific dictionary is learned and the discrimination is further improved by imposing a graph constraint on the coding coefficients, that maximizes the intra-class compactness and inter-class separability. We also enforce low-rank and structural incoherence constraints on sub-dictionaries to make them more compact and robust to variations and outliers and reduce the redundancy among them, respectively. To preserve the intrinsic structure of data and penalize unfavourable relationship among training samples simultaneously, we introduce a projection graph into the framework, which significantly enhances the discriminative ability of the projection matrix and makes the method robust to small-sized and high-dimensional datasets.

* arXiv admin note: text overlap with arXiv:1603.07697; text overlap with arXiv:1404.3606 by other authors 

  Click for Model/Code and Paper
Effective Deterministic Initialization for $k$-Means-Like Methods via Local Density Peaks Searching

Nov 21, 2016
Fengfu Li, Hong Qiao, Bo Zhang

The $k$-means clustering algorithm is popular but has the following main drawbacks: 1) the number of clusters, $k$, needs to be provided by the user in advance, 2) it can easily reach local minima with randomly selected initial centers, 3) it is sensitive to outliers, and 4) it can only deal with well separated hyperspherical clusters. In this paper, we propose a Local Density Peaks Searching (LDPS) initialization framework to address these issues. The LDPS framework includes two basic components: one of them is the local density that characterizes the density distribution of a data set, and the other is the local distinctiveness index (LDI) which we introduce to characterize how distinctive a data point is compared with its neighbors. Based on these two components, we search for the local density peaks which are characterized with high local densities and high LDIs to deal with 1) and 2). Moreover, we detect outliers characterized with low local densities but high LDIs, and exclude them out before clustering begins. Finally, we apply the LDPS initialization framework to $k$-medoids, which is a variant of $k$-means and chooses data samples as centers, with diverse similarity measures other than the Euclidean distance to fix the last drawback of $k$-means. Combining the LDPS initialization framework with $k$-means and $k$-medoids, we obtain two novel clustering methods called LDPS-means and LDPS-medoids, respectively. Experiments on synthetic data sets verify the effectiveness of the proposed methods, especially when the ground truth of the cluster number $k$ is large. Further, experiments on several real world data sets, Handwritten Pendigits, Coil-20, Coil-100 and Olivetti Face Database, illustrate that our methods give a superior performance than the analogous approaches on both estimating $k$ and unsupervised object categorization.

* 16 pages, 9 figures, journal paper 

  Click for Model/Code and Paper
Convolutional Neural Network-Based Image Representation for Visual Loop Closure Detection

Apr 20, 2015
Yi Hou, Hong Zhang, Shilin Zhou

Deep convolutional neural networks (CNN) have recently been shown in many computer vision and pattern recog- nition applications to outperform by a significant margin state- of-the-art solutions that use traditional hand-crafted features. However, this impressive performance is yet to be fully exploited in robotics. In this paper, we focus one specific problem that can benefit from the recent development of the CNN technology, i.e., we focus on using a pre-trained CNN model as a method of generating an image representation appropriate for visual loop closure detection in SLAM (simultaneous localization and mapping). We perform a comprehensive evaluation of the outputs at the intermediate layers of a CNN as image descriptors, in comparison with state-of-the-art image descriptors, in terms of their ability to match images for detecting loop closures. The main conclusions of our study include: (a) CNN-based image representations perform comparably to state-of-the-art hand- crafted competitors in environments without significant lighting change, (b) they outperform state-of-the-art competitors when lighting changes significantly, and (c) they are also significantly faster to extract than the state-of-the-art hand-crafted features even on a conventional CPU and are two orders of magnitude faster on an entry-level GPU.

* 8 pages, 4 figures 

  Click for Model/Code and Paper
Isometric Multi-Manifolds Learning

Dec 03, 2009
Mingyu Fan, Hong Qiao, Bo Zhang

Isometric feature mapping (Isomap) is a promising manifold learning method. However, Isomap fails to work on data which distribute on clusters in a single manifold or manifolds. Many works have been done on extending Isomap to multi-manifolds learning. In this paper, we first proposed a new multi-manifolds learning algorithm (M-Isomap) with help of a general procedure. The new algorithm preserves intra-manifold geodesics and multiple inter-manifolds edges precisely. Compared with previous methods, this algorithm can isometrically learn data distributed on several manifolds. Secondly, the original multi-cluster manifold learning algorithm first proposed in \cite{DCIsomap} and called D-C Isomap has been revised so that the revised D-C Isomap can learn multi-manifolds data. Finally, the features and effectiveness of the proposed multi-manifolds learning algorithms are demonstrated and compared through experiments.

  Click for Model/Code and Paper
Model-based Lookahead Reinforcement Learning

Aug 15, 2019
Zhang-Wei Hong, Joni Pajarinen, Jan Peters

Model-based Reinforcement Learning (MBRL) allows data-efficient learning which is required in real world applications such as robotics. However, despite the impressive data-efficiency, MBRL does not achieve the final performance of state-of-the-art Model-free Reinforcement Learning (MFRL) methods. We leverage the strengths of both realms and propose an approach that obtains high performance with a small amount of data. In particular, we combine MFRL and Model Predictive Control (MPC). While MFRL's strength in exploration allows us to train a better forward dynamics model for MPC, MPC improves the performance of the MFRL policy by sampling-based planning. The experimental results in standard continuous control benchmarks show that our approach can achieve MFRL`s level of performance while being as data-efficient as MBRL.

  Click for Model/Code and Paper
Ranking and Selection with Covariates for Personalized Decision Making

Oct 07, 2017
Haihui Shen, L. Jeff Hong, Xiaowei Zhang

We consider a ranking and selection problem in the context of personalized decision making, where the best alternative is not universal but varies as a function of observable covariates. The goal of ranking and selection with covariates (R&S-C) is to use sampling to compute a decision rule that can specify the best alternative with certain statistical guarantee for each subsequent individual after observing his or her covariates. A linear model is proposed to capture the relationship between the mean performance of an alternative and the covariates. Under the indifference-zone formulation, we develop two-stage procedures for both homoscedastic and heteroscedastic sampling errors, respectively, and prove their statistical validity, which is defined in terms of probability of correct selection. We also generalize the well-known slippage configuration, and prove that the generalized slippage configuration is the least favorable configuration of our procedures. Extensive numerical experiments are conducted to investigate the performance of the proposed procedures. Finally, we demonstrate the usefulness of R&S-C via a case study of selecting the best treatment regimen in the prevention of esophageal cancer. We find that by leveraging disease-related personal information, R&S-C can improve substantially the expected quality-adjusted life years for some groups of patients through providing patient-specific treatment regimen.

  Click for Model/Code and Paper