Research papers and code for "Jun Yue":
In this paper, we present GASG21 (Grassmannian Adaptive Stochastic Gradient for $L_{2,1}$ norm minimization), an adaptive stochastic gradient algorithm to robustly recover the low-rank subspace from a large matrix. In the presence of column outliers, we reformulate the batch mode matrix $L_{2,1}$ norm minimization with rank constraint problem as a stochastic optimization approach constrained on Grassmann manifold. For each observed data vector, the low-rank subspace $\mathcal{S}$ is updated by taking a gradient step along the geodesic of Grassmannian. In order to accelerate the convergence rate of the stochastic gradient method, we choose to adaptively tune the constant step-size by leveraging the consecutive gradients. Furthermore, we demonstrate that with proper initialization, the K-subspaces extension, K-GASG21, can robustly cluster a large number of corrupted data vectors into a union of subspaces. Numerical experiments on synthetic and real data demonstrate the efficiency and accuracy of the proposed algorithms even with heavy column outliers corruption.
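
To make the $L_{2,1}$ objective in the abstract above concrete, here is a minimal sketch of the norm itself. It assumes the convention that sums the Euclidean norms of the columns (one column per data vector); the function names and the toy residual matrix are illustrative and not taken from the paper. An outlier column enters the $L_{2,1}$ norm linearly, whereas it enters a squared Frobenius norm quadratically.

```python
import math

def l21_norm(matrix):
    """Sum of the Euclidean norms of the columns of a row-major matrix."""
    n_cols = len(matrix[0])
    return sum(
        math.sqrt(sum(row[j] ** 2 for row in matrix))
        for j in range(n_cols)
    )

def squared_frobenius(matrix):
    """Sum of squared entries, shown only for comparison."""
    return sum(v ** 2 for row in matrix for v in row)

# Toy residual matrix (assumed data): the third column plays the outlier.
residual = [[3.0, 0.0, 30.0],
            [4.0, 0.0, 40.0]]

l21 = l21_norm(residual)            # column norms 5 + 0 + 50
fro2 = squared_frobenius(residual)  # 9 + 16 + 900 + 1600
print(l21, fro2)
```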

* 13 pages, 12 figures and 6 tables
Because existing traffic facilities can hardly be extended, developing traffic signal control methods is the most important way to improve the traffic efficiency of modern roundabouts. This paper proposes a novel traffic signal controller with two fuzzy layers for signalizing the roundabout. The outer layer of the controller computes the urgency degrees of all the phase subsets and then activates the most urgent subset. This mechanism helps the controller respond instantly to the current traffic condition of the roundabout and so improves real-time performance. The inner layer of the controller computes the extension time of the current phase. If the extension value is larger than a threshold value, the current phase is maintained; otherwise the next phase in the running phase subset (selected by the outer layer) is activated. The inner layer adopts well-designed phase sequences, which helps to smooth the traffic flows and to avoid traffic jams. In general, the proposed traffic signal controller is capable of improving real-time performance as well as reducing traffic congestion. Moreover, an offline particle swarm optimization (PSO) algorithm is developed to optimize the membership functions adopted in the proposed controller. By using optimal membership functions, the performance of the controller can be further improved. Simulation results demonstrate that the proposed controller outperforms previous traffic signal controllers in terms of improving the traffic efficiency of modern roundabouts.
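
The two-layer decision logic described above can be sketched as follows. This is only an outline under stated assumptions: the urgency degrees and the extension value are passed in as plain numbers rather than computed by fuzzy inference, the function names are hypothetical, and wrapping around to the first phase at the end of a subset is an assumption the abstract does not state.

```python
def outer_layer(urgency_by_subset):
    """Return the phase subset with the highest urgency degree."""
    return max(urgency_by_subset, key=urgency_by_subset.get)

def inner_layer(current_index, n_phases, extension, threshold):
    """Keep the current phase if its extension exceeds the threshold,
    otherwise move to the next phase (wrap-around is an assumption)."""
    if extension > threshold:
        return current_index
    return (current_index + 1) % n_phases

# Hypothetical urgency degrees for two phase subsets.
active_subset = outer_layer({"subset_A": 0.2, "subset_B": 0.7})
# Extension above the threshold: the current phase is maintained.
kept = inner_layer(current_index=0, n_phases=3, extension=5.0, threshold=3.0)
# Extension below the threshold: the next phase is activated.
advanced = inner_layer(current_index=2, n_phases=3, extension=1.0, threshold=3.0)
print(active_subset, kept, advanced)
```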

Cluster analysis and outlier detection are strongly coupled tasks in data mining. Cluster structure can be easily destroyed by a few outliers; conversely, outliers are defined in terms of clusters, as the points belonging to none of the clusters. However, most existing studies handle the two tasks separately. In light of this, we consider the joint cluster analysis and outlier detection problem, and propose the Clustering with Outlier Removal (COR) algorithm. Generally speaking, the original space is transformed into a binary space by generating basic partitions in order to define clusters. Then an objective function based on Holoentropy is designed to enhance the compactness of each cluster with a few outliers removed. Further analysis of the objective function shows that only part of the problem can be handled by K-means optimization. To provide an integrated solution, an auxiliary binary matrix is nontrivially introduced so that COR completely and efficiently solves the challenging problem via a unified K-means-- with theoretical support. Extensive experimental results on numerous data sets in various domains demonstrate that COR is significantly more effective and efficient than its rivals, including K-means-- and other state-of-the-art outlier detection methods, in terms of cluster validity and outlier detection. Some key factors in COR are further analyzed for practical use. Finally, an application to flight trajectories is provided to demonstrate the effectiveness of COR in a real-world scenario.
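
For readers unfamiliar with K-means--, the baseline named above, the following is a rough one-dimensional sketch of its core loop: assign points to the nearest center, set aside the farthest points as outliers, and recompute centers from the rest. It is not COR itself (there are no basic partitions, no Holoentropy objective and no auxiliary matrix), and the function name, data and initial centers are illustrative assumptions.

```python
def kmeans_minus_minus(points, centers, n_outliers, n_iters=10):
    """1-D K-means-- sketch: returns (centers, sorted outlier indices)."""
    centers = list(centers)
    outliers = []
    for _ in range(n_iters):
        # For each point: (distance to nearest center, index of that center).
        nearest = [min((abs(p - c), k) for k, c in enumerate(centers))
                   for p in points]
        # Rank points by distance, farthest first; the top ones are outliers.
        ranked = sorted(range(len(points)),
                        key=lambda i: nearest[i][0], reverse=True)
        outliers = sorted(ranked[:n_outliers])
        # Recompute each center from its remaining (non-outlier) members.
        for k in range(len(centers)):
            members = [points[i] for i in ranked[n_outliers:]
                       if nearest[i][1] == k]
            if members:
                centers[k] = sum(members) / len(members)
    return centers, outliers

# Assumed toy data: two tight groups plus one far-away point.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 100.0]
centers, outliers = kmeans_minus_minus(points, centers=[0.0, 6.0], n_outliers=1)
print(centers, outliers)
```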

The multiview learning problem refers to the problem of learning a classifier from multiple-view data. In such a data set, each data point is represented by multiple different views. In this paper, we propose a novel method for this problem. This method is based on two assumptions. The first assumption is that each data point has an intact feature vector, and each view is obtained by a linear transformation from the intact vector. The second assumption is that the intact vectors are discriminative, and in the intact space, we have a linear classifier to separate the positive class from the negative class. We define an intact vector for each data point, and a view-conditional transformation matrix for each view, and propose to reconstruct the multiple-view feature vectors by the product of the corresponding intact vectors and transformation matrices. Moreover, we also propose a linear classifier in the intact space, and learn it jointly with the intact vectors. The learning problem is modeled as a minimization problem, and the objective function is composed of a Cauchy error estimator-based view-conditional reconstruction term over all data points and views, and a classification error term measured by hinge loss over all the intact vectors of all the data points. Some regularization terms are also imposed on different variables in the objective function. The minimization problem is solved by an iterative algorithm using an alternate optimization strategy and a gradient descent algorithm. The proposed algorithm shows its advantage in comparison to other multiview learning algorithms on benchmark data sets.
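
The two loss terms named above can be illustrated in isolation. This sketch assumes the common form of the Cauchy estimator, log(1 + r^2 / c^2), with a hypothetical scale parameter `c`; the abstract does not give the exact form, the weighting between terms, or the regularizers, so none of those appear here.

```python
import math

def cauchy_loss(residual_norm, c=1.0):
    """Cauchy error estimator (assumed form): grows logarithmically in the residual."""
    return math.log(1.0 + (residual_norm / c) ** 2)

def hinge_loss(label, score):
    """Hinge loss for a label in {-1, +1} and a real-valued classifier score."""
    return max(0.0, 1.0 - label * score)

# A large reconstruction residual is penalized far less than its square.
print(cauchy_loss(10.0), 10.0 ** 2)
# Correct side with margin: no loss. Wrong side: positive loss.
print(hinge_loss(+1, 2.0), hinge_loss(-1, 0.5))
```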

Stochastic gradient methods are dominant in nonconvex optimization, especially for deep models, but have slow asymptotic convergence due to the fixed smoothness. To address this problem, we propose a simple yet effective method for improving stochastic gradient methods named predictive local smoothness (PLS). First, we create a convergence condition to build a learning rate which varies adaptively with local smoothness. Second, the local smoothness can be predicted by the latest gradients. Third, we use the adaptive learning rate to update the stochastic gradients for exploring linear convergence rates. By applying the PLS method, we implement new variants of three popular algorithms: PLS-stochastic gradient descent (PLS-SGD), PLS-accelerated SGD (PLS-AccSGD), and PLS-AMSGrad. Moreover, we provide much simpler proofs to ensure their linear convergence. Empirical results show that the variants achieve better performance than the popular algorithms, such as faster convergence and alleviated gradient explosion and vanishing.
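
The general idea of tying the learning rate to a smoothness estimate built from recent gradients can be sketched as below. This is not the PLS update rule from the paper: it is a deterministic, one-dimensional illustration that assumes the estimate |g_t - g_{t-1}| / |x_t - x_{t-1}| and a step size equal to its inverse, and all names are hypothetical.

```python
def adaptive_gd(grad, x0, lr0=0.1, n_steps=20, eps=1e-12):
    """Gradient descent whose step size is 1 / (estimated local smoothness)."""
    x_prev, g_prev = x0, grad(x0)
    x = x_prev - lr0 * g_prev  # first step uses a fixed starting rate
    for _ in range(n_steps):
        g = grad(x)
        dx, dg = abs(x - x_prev), abs(g - g_prev)
        if dx < eps or dg < eps:
            break  # converged, or no curvature information left
        smoothness = dg / dx   # local estimate from the two latest gradients
        lr = 1.0 / smoothness  # assumed rule: step size inverse to smoothness
        x_prev, g_prev = x, g
        x = x - lr * g
    return x

# Quadratic f(x) = 2 * (x - 3)^2 with gradient 4 * (x - 3); minimizer at x = 3.
x_min = adaptive_gd(lambda x: 4.0 * (x - 3.0), x0=0.0)
print(x_min)
```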

* 14 pages, 7 figures
Recently, deep convolutional neural networks (CNNs) have obtained promising results in image processing tasks including super-resolution (SR). However, most CNN-based SR methods treat low-resolution (LR) inputs and features equally across channels, rarely notice the loss of information flow caused by the activation function and fail to leverage the representation ability of CNNs. In this letter, we propose a novel single-image super-resolution (SISR) algorithm named Wider Channel Attention Network (WCAN) for remote sensing images. Firstly, the channel attention mechanism is used to adaptively recalibrate the importance of each channel at the middle of the wider attention block (WAB). Secondly, we propose the Local Memory Connection (LMC) to enhance the information flow. Finally, the features within each WAB are fused to take advantage of the network's representation capability and further improve information and gradient flow. Analytic experiments on a public remote sensing data set (UC Merced) show that our WCAN achieves better accuracy and visual improvements against most state-of-the-art methods.

* This work is proposed for remote sensing images, but the idea of the whole paper does not focus on the characteristics of remote sensing images. The content of the article does not match the title. In this case, we want to do some experiments on natural images to verify the three tricks in our work
In this paper, we propose a novel method for highly efficient follicular segmentation of thyroid cytopathological whole-slide images (WSIs). Firstly, we propose a hybrid segmentation architecture, which integrates a classifier into Deeplab V3 by adding a branch. A large amount of the WSI segmentation time is saved by skipping the irrelevant areas using the classification branch. Secondly, we merge the low-scale fine features into the original atrous spatial pyramid pooling (ASPP) in Deeplab V3 to accurately represent the details in cytopathological images. Thirdly, our hybrid model is trained with a criterion-oriented adaptive loss function, which makes the model converge much faster. Experimental results on a collection of thyroid patches demonstrate that the proposed model reaches a segmentation accuracy of 80.9%. In addition, our proposed method reduces the WSI segmentation time by 93%, and the WSI-level accuracy reaches 53.4%.

Bronchoscopy inspection, as a follow-up procedure to radiological imaging, plays a key role in lung disease diagnosis and in determining treatment plans for patients. Doctors need to decide in a timely manner whether to biopsy a patient when performing bronchoscopy. However, doctors also need to be very selective with biopsies, as biopsies may cause uncontrollable bleeding of the lung tissue, which is life-threatening. To help doctors be more selective with biopsies and to provide a second opinion on diagnosis, in this work we propose a computer-aided diagnosis (CAD) system for lung diseases including cancers and tuberculosis (TB). The system is developed based on transfer learning. We propose a novel transfer learning method: sentential fine-tuning. Compared to traditional fine-tuning methods, our method achieves the best performance. We obtained an overall accuracy of 77.0% on a dataset of 81 normal cases, 76 tuberculosis cases and 277 lung cancer cases, while the other traditional transfer learning methods achieve accuracies of 73% and 68%. The detection accuracies of our method for cancers, TB and normal cases are 87%, 54% and 91%, respectively. This indicates that the CAD system has the potential to improve lung disease diagnosis accuracy in bronchoscopy, and it might also be used to be more selective with biopsies.

Facial landmark localization is a crucial step in numerous face-related applications, such as face recognition, facial pose estimation, and face image synthesis. However, previous competitions on facial landmark localization (i.e., the 300-W, 300-VW and Menpo challenges) aim to predict 68-point landmarks, which are insufficient to depict the structure of facial components. In order to overcome this problem, we construct a challenging dataset, named JD-landmark. Each image is manually annotated with 106-point landmarks. This dataset covers large variations in pose and expression, which makes it difficult to predict accurate landmarks. We hold a 106-point facial landmark localization competition on this dataset in conjunction with the IEEE International Conference on Multimedia and Expo (ICME) 2019. The purpose of this competition is to discover effective and robust facial landmark localization approaches.

* Accepted at ICME2019 Grand Challenge
Future works in scientific articles are valuable for researchers and they can guide researchers to new research directions or ideas. In this paper, we mine the future works in scientific articles in order to 1) provide an insight for future work analysis and 2) facilitate researchers to search and browse future works in a research area. First, we study the problem of future work extraction and propose a regular expression based method to address the problem. Second, we define four different categories for the future works by observing the data and investigate the multi-class future work classification problem. Third, we apply the extraction method and the classification model to a paper dataset in the computer science field and conduct a further analysis of the future works. Finally, we design a prototype system to search and demonstrate the future works mined from the scientific papers. Our evaluation results show that our extraction method can get high precision and recall values and our classification model can also get good results and it outperforms several baseline models. Further analysis of the future work sentences also indicates interesting results.
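
A regular expression based extractor of the kind described above might look like the following minimal sketch. The cue phrases and the naive sentence splitter are assumptions chosen for illustration; the abstract does not list the patterns actually used.

```python
import re

# Hypothetical cue phrases that often introduce future-work statements.
FUTURE_WORK = re.compile(
    r"\b(future work|in the future|we plan to|we intend to)\b",
    re.IGNORECASE,
)

def extract_future_work(text):
    """Split text into sentences and keep those containing a cue phrase."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if FUTURE_WORK.search(s)]

sample = ("We evaluate on two datasets. "
          "In future work, we will add more languages. "
          "We plan to release the code.")
found = extract_future_work(sample)
print(found)
```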

This paper presents a human-robot trust integrated task allocation and motion planning framework for multi-robot systems (MRS) in performing a set of tasks concurrently. A set of task specifications in parallel are conjuncted with MRS to synthesize a task allocation automaton. Each transition of the task allocation automaton is associated with the total trust value of human in corresponding robots. Here, the human-robot trust model is constructed with a dynamic Bayesian network (DBN) by considering individual robot performance, safety coefficient, human cognitive workload and overall evaluation of task allocation. Hence, a task allocation path with maximum encoded human-robot trust can be searched based on the current trust value of each robot in the task allocation automaton. Symbolic motion planning (SMP) is implemented for each robot after they obtain the sequence of actions. The task allocation path can be intermittently updated with this DBN based trust model. The overall strategy is demonstrated by a simulation with 5 robots and 3 parallel subtask automata.

Predicting the popularity of online videos is important for video streaming content providers. This is a challenging problem for two reasons. First, the problem is both "wide" and "deep". That is, it not only depends on a wide range of features, but is also highly non-linear and complex. Second, multiple competitors may be involved. In this paper, we propose a general prediction model using the multi-task learning (MTL) module and the relation network (RN) module, where MTL can reduce over-fitting and RN can model the relations of multiple competitors. Experimental results show that our proposed approach, with the RN and MTL modules, significantly increases the accuracy of predicting the total view counts of TV series.

This paper presents a generalized integrated framework for semi-automatic surgical template design. Several algorithms were implemented, including mesh segmentation, offset surface generation, collision detection, and ruled surface generation, and dedicated software named TemDesigner was developed. With a simple user interface, a customized template can be semi-automatically designed according to the preoperative plan. Firstly, mesh segmentation with a signed scalar per vertex is utilized to partition the inner surface from the input surface mesh based on the indicated point loop. Then, the offset surface of the inner surface is obtained by contouring the distance field of the inner surface, and segmented to generate the outer surface. A ruled surface is employed to connect the inner and outer surfaces. Finally, drilling tubes are generated according to the preoperative plan through collision detection and merging. The framework has been applied to template design for various kinds of surgeries, including oral implantology, cervical pedicle screw insertion, iliosacral screw insertion and osteotomy, demonstrating the efficiency, functionality and generality of our method.

* Scientific Reports 6, Article number: 20280, 2016
* 18 pages, 16 figures, 2 tables, 36 references
Grid R-CNN is a well-performing object detection framework. It transforms the traditional box offset regression problem into a grid point estimation problem. With the guidance of the grid points, it can obtain high-quality localization results. However, the speed of Grid R-CNN is not satisfactory. In this technical report we present Grid R-CNN Plus, a better and faster version of Grid R-CNN. We have made several updates that significantly speed up the framework and simultaneously improve the accuracy. On the COCO dataset, the Res50-FPN based Grid R-CNN Plus detector achieves an mAP of 40.4%, outperforming the baseline on the same model by 3.0 points with similar inference time. Code is available at https://github.com/STVIR/Grid-R-CNN .

In this paper, we propose a deep reinforcement learning (DRL) based mobility load balancing (MLB) algorithm along with a two-layer architecture to solve the large-scale load balancing problem for ultra-dense networks (UDNs). Our contribution is three-fold. First, this work proposes a two-layer architecture to solve the large-scale load balancing problem in a self-organized manner. The proposed architecture can alleviate the global traffic variations by dynamically grouping small cells into self-organized clusters according to their historical loads, and further adapt to local traffic variations through intra-cluster load balancing afterwards. Second, for the intra-cluster load balancing, this paper proposes an off-policy DRL-based MLB algorithm to autonomously learn the optimal MLB policy under an asynchronous parallel learning framework, without any prior knowledge assumed over the underlying UDN environments. Moreover, the algorithm enables joint exploration with multiple behavior policies, such that the traditional MLB methods can be used to guide the learning process thereby improving the learning efficiency and stability. Third, this work proposes an offline-evaluation based safeguard mechanism to ensure that the online system can always operate with the optimal and well-trained MLB policy, which not only stabilizes the online performance but also enables the exploration beyond current policies to make full use of machine learning in a safe way. Empirical results verify that the proposed framework outperforms the existing MLB methods in general UDN environments featured with irregular network topologies, coupled interferences, and random user movements, in terms of the load balancing performance.

The cloud radio access network (C-RAN) is a promising paradigm to meet the stringent requirements of the fifth generation (5G) wireless systems. Meanwhile, wireless traffic prediction is a key enabler for C-RANs to improve both the spectrum efficiency and energy efficiency through load-aware network managements. This paper proposes a scalable Gaussian process (GP) framework as a promising solution to achieve large-scale wireless traffic prediction in a cost-efficient manner. Our contribution is three-fold. First, to the best of our knowledge, this paper is the first to empower GP regression with the alternating direction method of multipliers (ADMM) for parallel hyper-parameter optimization in the training phase, where such a scalable training framework well balances the local estimation in baseband units (BBUs) and information consensus among BBUs in a principled way for large-scale executions. Second, in the prediction phase, we fuse local predictions obtained from the BBUs via a cross-validation based optimal strategy, which demonstrates itself to be reliable and robust for general regression tasks. Moreover, such a cross-validation based optimal fusion strategy is built upon a well acknowledged probabilistic model to retain the valuable closed-form GP inference properties. Third, we propose a C-RAN based scalable wireless prediction architecture, where the prediction accuracy and the time consumption can be balanced by tuning the number of the BBUs according to the real-time system demands. Experimental results show that our proposed scalable GP model can outperform the state-of-the-art approaches considerably, in terms of wireless traffic prediction performance.

This paper proposes a novel object detection framework named Grid R-CNN, which adopts a grid guided localization mechanism for accurate object detection. Different from the traditional regression based methods, the Grid R-CNN captures the spatial information explicitly and enjoys the position sensitive property of fully convolutional architecture. Instead of using only two independent points, we design a multi-point supervision formulation to encode more clues in order to reduce the impact of inaccurate prediction of specific points. To take the full advantage of the correlation of points in a grid, we propose a two-stage information fusion strategy to fuse feature maps of neighbor grid points. The grid guided localization approach is easy to be extended to different state-of-the-art detection frameworks. Grid R-CNN leads to high quality object localization, and experiments demonstrate that it achieves a 4.1% AP gain at IoU=0.8 and a 10.0% AP gain at IoU=0.9 on COCO benchmark compared to Faster R-CNN with Res50 backbone and FPN architecture.
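
The AP gains above are reported at IoU thresholds of 0.8 and 0.9, where a detection only counts as correct if its box overlaps the ground truth very tightly. As a reminder of what that threshold measures, here is a minimal sketch of intersection-over-union for axis-aligned boxes, assuming the (x1, y1, x2, y2) corner format; the function name and the sample boxes are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

# A 10x10 ground-truth box and a prediction shifted by one unit in x.
overlap = iou((0, 0, 10, 10), (1, 0, 11, 10))
print(overlap, overlap >= 0.8, overlap >= 0.9)
```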

In this paper, we present a hierarchical path planning framework called SG-RL (subgoal graphs-reinforcement learning), to plan rational paths for agents maneuvering in continuous and uncertain environments. By "rational", we mean (1) efficient path planning to eliminate first-move lags; (2) collision-free and smooth for agents with kinematic constraints satisfied. SG-RL works in a two-level manner. At the first level, SG-RL uses a geometric path-planning method, i.e., Simple Subgoal Graphs (SSG), to efficiently find optimal abstract paths, also called subgoal sequences. At the second level, SG-RL uses an RL method, i.e., Least-Squares Policy Iteration (LSPI), to learn near-optimal motion-planning policies which can generate kinematically feasible and collision-free trajectories between adjacent subgoals. The first advantage of the proposed method is that SSG can solve the limitations of sparse reward and local minima trap for RL agents; thus, LSPI can be used to generate paths in complex environments. The second advantage is that, when the environment changes slightly (i.e., unexpected obstacles appearing), SG-RL does not need to reconstruct subgoal graphs and replan subgoal sequences using SSG, since LSPI can deal with uncertainties by exploiting its generalization ability to handle changes in environments. Simulation experiments in representative scenarios demonstrate that, compared with existing methods, SG-RL can work well on large-scale maps with relatively low action-switching frequencies and shorter path lengths, and SG-RL can deal with small changes in environments. We further demonstrate that the design of reward functions and the types of training environments are important factors for learning feasible policies.

* 20 pages
While the volume of scholarly publications has increased at a frenetic pace, accessing and consuming the useful candidate papers, in very large digital libraries, is becoming an essential and challenging task for scholars. Unfortunately, because of language barrier, some scientists (especially the junior ones or graduate students who do not master other languages) cannot efficiently locate the publications hosted in a foreign language repository. In this study, we propose a novel solution, cross-language citation recommendation via Hierarchical Representation Learning on Heterogeneous Graph (HRLHG), to address this new problem. HRLHG can learn a representation function by mapping the publications, from multilingual repositories, to a low-dimensional joint embedding space from various kinds of vertexes and relations on a heterogeneous graph. By leveraging both global (task specific) plus local (task independent) information as well as a novel supervised hierarchical random walk algorithm, the proposed method can optimize the publication representations by maximizing the likelihood of locating the important cross-language neighborhoods on the graph. Experiment results show that the proposed method can not only outperform state-of-the-art baseline models, but also improve the interpretability of the representation model for cross-language citation recommendation task.

* The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR 2018), 635--644