Models, code, and papers for "Jin Zhou":

##### Experimentally detecting a quantum change point via Bayesian inference

Detecting a change point is a crucial task in statistics that has been recently extended to the quantum realm. A source state generator that emits a series of single photons in a default state suffers an alteration at some point and starts to emit photons in a mutated state. The problem consists in identifying the point where the change took place. In this work, we consider a learning agent that applies Bayesian inference on experimental data to solve this problem. This learning machine adjusts the measurement over each photon according to the past experimental results finds the change position in an online fashion. Our results show that the local-detection success probability can be largely improved by using such a machine learning technique. This protocol provides a tool for improvement in many applications where a sequence of identical quantum states is required.

* Phys. Rev. A 98, 040301 (2018)
##### Combining tabu search and graph reduction to solve the maximum balanced biclique problem

May 20, 2017
Yi Zhou, Jin-Kao Hao

The Maximum Balanced Biclique Problem is a well-known graph model with relevant applications in diverse domains. This paper introduces a novel algorithm, which combines an effective constraint-based tabu search procedure and two dedicated graph reduction techniques. We verify the effectiveness of the algorithm on 30 classical random benchmark graphs and 25 very large real-life sparse graphs from the popular Koblenz Network Collection (KONECT). The results show that the algorithm improves the best-known results (new lower bounds) for 10 classical benchmarks and obtains the optimal solutions for 14 KONECT instances.

##### Distributed estimation of principal support vector machines for sufficient dimension reduction

Nov 28, 2019
Jun Jin, Chao Ying, Zhou Yu

The principal support vector machines method (Li et al., 2011) is a powerful tool for sufficient dimension reduction that replaces original predictors with their low-dimensional linear combinations without loss of information. However, the computational burden of the principal support vector machines method constrains its use for massive data. To address this issue, we in this paper propose two distributed estimation algorithms for fast implementation when the sample size is large. Both the two distributed sufficient dimension reduction estimators enjoy the same statistical efficiency as merging all the data together, which provides rigorous statistical guarantees for their application to large scale datasets. The two distributed algorithms are further adapt to principal weighted support vector machines (Shin et al., 2016) for sufficient dimension reduction in binary classification. The statistical accuracy and computational complexity of our proposed methods are examined through comprehensive simulation studies and a real data application with more than 600000 samples.

##### Predicting Opioid Relapse Using Social Media Data

Nov 14, 2018
Zhou Yang, Long Nguyen, Fang Jin

Opioid addiction is a severe public health threat in the U.S, causing massive deaths and many social problems. Accurate relapse prediction is of practical importance for recovering patients since relapse prediction promotes timely relapse preventions that help patients stay clean. In this paper, we introduce a Generative Adversarial Networks (GAN) model to predict the addiction relapses based on sentiment images and social influences. Experimental results on real social media data from Reddit.com demonstrate that the GAN model delivers a better performance than comparable alternative techniques. The sentiment images generated by the model show that relapse is closely connected with two emotions joy' and negative'. This work is one of the first attempts to predict relapses using massive social media data and generative adversarial nets. The proposed method, combined with knowledge of social media mining, has the potential to revolutionize the practice of opioid addiction prevention and treatment.

##### A Fast Sampling Gradient Tree Boosting Framework

Nov 20, 2019
Daniel Chao Zhou, Zhongming Jin, Tong Zhang

As an adaptive, interpretable, robust, and accurate meta-algorithm for arbitrary differentiable loss functions, gradient tree boosting is one of the most popular machine learning techniques, though the computational expensiveness severely limits its usage. Stochastic gradient boosting could be adopted to accelerates gradient boosting by uniformly sampling training instances, but its estimator could introduce a high variance. This situation arises motivation for us to optimize gradient tree boosting. We combine gradient tree boosting with importance sampling, which achieves better performance by reducing the stochastic variance. Furthermore, we use a regularizer to improve the diagonal approximation in the Newton step of gradient boosting. The theoretical analysis supports that our strategies achieve a linear convergence rate on logistic loss. Empirical results show that our algorithm achieves a 2.5x--18x acceleration on two different gradient boosting algorithms (LogitBoost and LambdaMART) without appreciable performance loss.

##### Memetic search for identifying critical nodes in sparse graphs

Oct 07, 2017
Yangming Zhou, Jin-Kao Hao, Fred Glover

Critical node problems involve identifying a subset of critical nodes from an undirected graph whose removal results in optimizing a pre-defined measure over the residual graph. As useful models for a variety of practical applications, these problems are computational challenging. In this paper, we study the classic critical node problem (CNP) and introduce an effective memetic algorithm for solving CNP. The proposed algorithm combines a double backbone-based crossover operator (to generate promising offspring solutions), a component-based neighborhood search procedure (to find high-quality local optima) and a rank-based pool updating strategy (to guarantee a healthy population). Specially, the component-based neighborhood search integrates two key techniques, i.e., two-phase node exchange strategy and node weighting scheme. The double backbone-based crossover extends the idea of general backbone-based crossovers. Extensive evaluations on 42 synthetic and real-world benchmark instances show that the proposed algorithm discovers 21 new upper bounds and matches 18 previous best-known upper bounds. We also demonstrate the relevance of our algorithm for effectively solving a variant of the classic CNP, called the cardinality-constrained critical node problem. Finally, we investigate the usefulness of each key algorithmic component.

##### Reinforcement learning based local search for grouping problems: A case study on graph coloring

Apr 01, 2016
Yangming Zhou, Jin-Kao Hao, Béatrice Duval

Grouping problems aim to partition a set of items into multiple mutually disjoint subsets according to some specific criterion and constraints. Grouping problems cover a large class of important combinatorial optimization problems that are generally computationally difficult. In this paper, we propose a general solution approach for grouping problems, i.e., reinforcement learning based local search (RLS), which combines reinforcement learning techniques with descent-based local search. The viability of the proposed approach is verified on a well-known representative grouping problem (graph coloring) where a very simple descent-based coloring algorithm is applied. Experimental studies on popular DIMACS and COLOR02 benchmark graphs indicate that RLS achieves competitive performances compared to a number of well-known coloring algorithms.

##### Differentially Private Online Learning for Cloud-Based Video Recommendation with Multimedia Big Data in Social Networks

Feb 01, 2016
Pan Zhou, Yingxue Zhou, Dapeng Wu, Hai Jin

With the rapid growth in multimedia services and the enormous offers of video contents in online social networks, users have difficulty in obtaining their interests. Therefore, various personalized recommendation systems have been proposed. However, they ignore that the accelerated proliferation of social media data has led to the big data era, which has greatly impeded the process of video recommendation. In addition, none of them has considered both the privacy of users' contexts (e,g., social status, ages and hobbies) and video service vendors' repositories, which are extremely sensitive and of significant commercial value. To handle the problems, we propose a cloud-assisted differentially private video recommendation system based on distributed online learning. In our framework, service vendors are modeled as distributed cooperative learners, recommending videos according to user's context, while simultaneously adapting the video-selection strategy based on user-click feedback to maximize total user clicks (reward). Considering the sparsity and heterogeneity of big social media data, we also propose a novel geometric differentially private model, which can greatly reduce the performance (recommendation accuracy) loss. Our simulation shows the proposed algorithms outperform other existing methods and keep a delicate balance between computing accuracy and privacy preserving level.

##### CUR Algorithm for Partially Observed Matrices

Nov 04, 2014
Miao Xu, Rong Jin, Zhi-Hua Zhou

CUR matrix decomposition computes the low rank approximation of a given matrix by using the actual rows and columns of the matrix. It has been a very useful tool for handling large matrices. One limitation with the existing algorithms for CUR matrix decomposition is that they need an access to the {\it full} matrix, a requirement that can be difficult to fulfill in many real world applications. In this work, we alleviate this limitation by developing a CUR decomposition algorithm for partially observed matrices. In particular, the proposed algorithm computes the low rank approximation of the target matrix based on (i) the randomly sampled rows and columns, and (ii) a subset of observed entries that are randomly sampled from the matrix. Our analysis shows the relative error bound, measured by spectral norm, for the proposed algorithm when the target matrix is of full rank. We also show that only $O(n r\ln r)$ observed entries are needed by the proposed algorithm to perfectly recover a rank $r$ matrix of size $n\times n$, which improves the sample complexity of the existing algorithms for matrix completion. Empirical studies on both synthetic and real-world datasets verify our theoretical claims and demonstrate the effectiveness of the proposed algorithm.

##### Top Rank Optimization in Linear Time

Oct 06, 2014
Nan Li, Rong Jin, Zhi-Hua Zhou

Bipartite ranking aims to learn a real-valued ranking function that orders positive instances before negative instances. Recent efforts of bipartite ranking are focused on optimizing ranking accuracy at the top of the ranked list. Most existing approaches are either to optimize task specific metrics or to extend the ranking loss by emphasizing more on the error associated with the top ranked instances, leading to a high computational cost that is super-linear in the number of training instances. We propose a highly efficient approach, titled TopPush, for optimizing accuracy at the top that has computational complexity linear in the number of training instances. We present a novel analysis that bounds the generalization error for the top ranked instances for the proposed approach. Empirical study shows that the proposed approach is highly competitive to the state-of-the-art approaches and is 10-100 times faster.

##### \emph{cm}SalGAN: RGB-D Salient Object Detection with Cross-View Generative Adversarial Networks

Dec 21, 2019
Bo Jiang, Zitai Zhou, Xiao Wang, Jin Tang

Image salient object detection (SOD) is an active research topic in computer vision and multimedia area. Fusing complementary information of RGB and depth has been demonstrated to be effective for image salient object detection which is known as RGB-D salient object detection problem. The main challenge for RGB-D salient object detection is how to exploit the salient cues of both intra-modality (RGB, depth) and cross-modality simultaneously which is known as cross-modality detection problem. In this paper, we tackle this challenge by designing a novel cross-modality Saliency Generative Adversarial Network (\emph{cm}SalGAN). \emph{cm}SalGAN aims to learn an optimal view-invariant and consistent pixel-level representation for RGB and depth images via a novel adversarial learning framework, which thus incorporates both information of intra-view and correlation information of cross-view images simultaneously for RGB-D saliency detection problem. To further improve the detection results, the attention mechanism and edge detection module are also incorporated into \emph{cm}SalGAN. The entire \emph{cm}SalGAN can be trained in an end-to-end manner by using the standard deep neural network framework. Experimental results show that \emph{cm}SalGAN achieves the new state-of-the-art RGB-D saliency detection performance on several benchmark datasets.

* Submitted to IEEE Transactions on Multimedia
##### Discovering Opioid Use Patterns from Social Media for Relapse Prevention

Dec 02, 2019
Zhou Yang, Spencer Bradshaw, Rattikorn Hewett, Fang Jin

The United States is currently experiencing an unprecedented opioid crisis, and opioid overdose has become a leading cause of injury and death. Effective opioid addiction recovery calls for not only medical treatments, but also behavioral interventions for impacted individuals. In this paper, we study communication and behavior patterns of patients with opioid use disorder (OUD) from social media, intending to demonstrate how existing information from common activities, such as online social networking, might lead to better prediction, evaluation, and ultimately prevention of relapses. Through a multi-disciplinary and advanced novel analytic perspective, we characterize opioid addiction behavior patterns by analyzing opioid groups from Reddit.com - including modeling online discussion topics, analyzing text co-occurrence and correlations, and identifying emotional states of people with OUD. These quantitative analyses are of practical importance and demonstrate innovative ways to use information from online social media, to create technology that can assist in relapse prevention.

* 7 pages, and 5 figures
##### Job Scheduling on Data Centers with Deep Reinforcement Learning

Sep 16, 2019
Sisheng Liang, Zhou Yang, Fang Jin, Yong Chen

Efficient job scheduling on data centers under heterogeneous complexity is crucial but challenging since it involves the allocation of multi-dimensional resources over time and space. To adapt the complex computing environment in data centers, we proposed an innovative Advantage Actor-Critic (A2C) deep reinforcement learning based approach called DeepScheduler for job scheduling. DeepScheduler consists of two agents, one of which, dubbed the actor, is responsible for learning the scheduling policy automatically and the other one, the critic, reduces the estimation error. Unlike previous policy gradient approaches, DeepScheduler is designed to reduce the gradient estimation variance and to update parameters efficiently. We show that the DeepScheduler can achieve competitive scheduling performance using both simulated workloads and real data collected from an academic data center.

* 6 pages
##### Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment

Sep 07, 2019
Di Jin, Zhijing Jin, Joey Tianyi Zhou, Peter Szolovits

Machine learning algorithms are often vulnerable to adversarial examples that have imperceptible alterations from the original counterparts but can fool the state-of-the-art models. It is helpful to evaluate or even improve the robustness of these models by exposing the maliciously crafted adversarial examples. In this paper, we present TextFooler, a simple but strong baseline to generate natural adversarial text. By applying it to two fundamental natural language tasks, text classification and textual entailment, we successfully attacked three target models, including the powerful pre-trained BERT, and the widely used convolutional and recurrent neural networks. We demonstrate the advantages of this framework in three ways: (1) effective---it outperforms state-of-the-art attacks in terms of success rate and perturbation rate, (2) utility-preserving---it preserves semantic content and grammaticality, and remains correctly classified by humans, and (3) efficient---it generates adversarial text with computational complexity linear to the text length.

##### Is BERT Really Robust? Natural Language Attack on Text Classification and Entailment

Jul 27, 2019
Di Jin, Zhijing Jin, Joey Tianyi Zhou, Peter Szolovits

Machine learning algorithms are often vulnerable to adversarial examples that have imperceptible alterations from the original counterparts but can fool the state-of-the-art models. It is helpful to evaluate or even improve the robustness of these models by exposing the maliciously crafted adversarial examples. In this paper, we present the TextFooler, a general attack framework, to generate natural adversarial texts. By successfully applying it to two fundamental natural language tasks, text classification and textual entailment, against various target models, convolutional and recurrent neural networks as well as the most powerful pre-trained BERT, we demonstrate the advantages of this framework in three ways: (i) effective---it outperforms state-of-the-art attacks in terms of success rate and perturbation rate; (ii) utility-preserving---it preserves semantic content and grammaticality, and remains correctly classified by humans; and (iii) efficient---it generates adversarial text with computational complexity linear in the text length.

##### Fast Rigid 3D Registration Solution: A Simple Method Free of SVD and Eigen-Decomposition

Jul 02, 2018
Jin Wu, Ming Liu, Zebo Zhou, Rui Li

A novel solution is obtained to solve the rigid 3D registration problem, motivated by previous eigen-decomposition approaches. Different from existing solvers, the proposed algorithm does not require sophisticated matrix operations e.g. singular value decomposition or eigenvalue decomposition. Instead, the optimal eigenvector of the point cross-covariance matrix can be computed within several iterations. It is also proven that the optimal rotation matrix can be directly computed for cases without need of quaternion. The simple framework provides very easy approach of integer-implementation on embedded platforms. Simulations on noise-corrupted point clouds have verified the robustness and computation speed of the proposed method. The final results indicate that the proposed algorithm is accurate, robust and owns over $60\% \sim 80\%$ less computation time than representatives. It has also been applied to real-world applications for faster relative robotic navigation.

##### Fast Symbolic 3D Registration Solution

May 12, 2018
Jin Wu, Ming Liu, Zebo Zhou, Rui Li

3D registration has always been performed invoking singular value decomposition (SVD) or eigenvalue decomposition (EIG) in real engineering practices. However, numerical algorithms suffer from uncertainty of convergence in many cases. A novel fast symbolic solution is proposed in this paper by following our recent publication in this journal. The equivalence analysis shows that our previous solver can be converted to deal with the 3D registration problem. Rather, the computation procedure is studied for further simplification of computing without complex numbers support. Experimental results show that the proposed solver does not loose accuracy and robustness but improves the execution speed to a large extent by almost \%50 to \%80, on both personal computer and embedded processor.

* 7 pages, 2 figures, 4 tables
##### A novel total variation model based on kernel functions and its application

Nov 19, 2017
Zhizheng Liang, Lei Zhang, Jin Liu, Yong Zhou

The total variation (TV) model and its related variants have already been proposed for image processing in previous literature. In this paper a novel total variation model based on kernel functions is proposed. In this novel model, we first map each pixel value of an image into a Hilbert space by using a nonlinear map, and then define a coupled image of an original image in order to construct a kernel function. Finally, the proposed model is solved in a kernel function space instead of in the projecting space from a nonlinear map. For the proposed model, we theoretically show under what conditions the mapping image is in the space of bounded variation when the original image is in the space of bounded variation. It is also found that the proposed model further extends the generalized TV model and the information from three different channels of color images can be fused by adopting various kernel functions. A series of experiments on some gray and color images are carried out to demonstrate the effectiveness of the proposed model.

* 22 pages, 5 figures, 2 tables
##### Global and Local Information Based Deep Network for Skin Lesion Segmentation

Mar 16, 2017
Jin Qi, Miao Le, Chunming Li, Ping Zhou

With a large influx of dermoscopy images and a growing shortage of dermatologists, automatic dermoscopic image analysis plays an essential role in skin cancer diagnosis. In this paper, a new deep fully convolutional neural network (FCNN) is proposed to automatically segment melanoma out of skin images by end-to-end learning with only pixels and labels as inputs. Our proposed FCNN is capable of using both local and global information to segment melanoma by adopting skipping layers. The public benchmark database consisting of 150 validation images, 600 test images and 2000 training images in the melanoma detection challenge 2017 at International Symposium Biomedical Imaging 2017 is used to test the performance of our algorithm. All large size images (for example, $4000\times 6000$ pixels) are reduced to much smaller images with $384\times 384$ pixels (more than 10 times smaller). We got and submitted preliminary results to the challenge without any pre or post processing. The performance of our proposed method could be further improved by data augmentation and by avoiding image size reduction.

* 4 pages, 3 figures. ISIC2017
##### Addict Free -- A Smart and Connected Relapse Intervention Mobile App

Dec 02, 2019
Zhou Yang, Vinay Jayachandra Reddy, Rashmi Kesidi, Fang Jin

It is widely acknowledged that addiction relapse is highly associated with spatial-temporal factors such as some specific places or time periods. Current studies suggest that those factors can be utilized for better relapse interventions, however, there is no relapse prevention application that makes use of those factors. In this paper, we introduce a mobile app called "Addict Free", which records user profiles, tracks relapse history and summarizes recovering statistics to help users better understand their recovering situations. Also, this app builds a relapse recovering community, which allows users to ask for advice and encouragement, and share relapse prevention experience. Moreover, machine learning algorithms that ingest spatial and temporal factors are utilized to predict relapse, based on which helpful addiction diversion activities are recommended by a recovering recommendation algorithm. By interacting with users, this app targets at providing smart suggestions that aim to stop relapse, especially for alcohol and tobacco addiction users.

* 4 pages