Research papers and code for "Junbin Gao":
In retail management, the Newsvendor problem has attracted wide attention as one of the basic inventory models. The traditional approach to solving this problem relies on the probability distribution of the demand: in theory, if the distribution is known, the problem can be considered fully solved. In real-world scenarios, however, it is almost impossible to even approximate a good probability distribution for the demand. In recent years, researchers have started adopting machine learning approaches that learn a demand prediction model from other feature information. In this paper, we propose a supervised learning approach that optimizes the demand quantities for products based on feature information. We demonstrate that using the original Newsvendor loss function as the training objective outperforms the recently suggested quadratic loss function. The new algorithm has been assessed on both synthetic and real-world data, demonstrating better performance.
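
To make the training objective concrete, here is a minimal sketch of the newsvendor loss in NumPy; the cost parameters cp and ch and the function name are illustrative placeholders, not taken from the paper.

    import numpy as np

    def newsvendor_loss(d, q, cp=2.0, ch=1.0):
        # cp penalises underage (lost sales when demand d exceeds the
        # order quantity q); ch penalises overage (leftover stock when
        # q exceeds d). Both cost parameters are illustrative.
        return cp * np.maximum(d - q, 0.0) + ch * np.maximum(q - d, 0.0)

    # Training a model q = f(features) then amounts to minimising the
    # mean of newsvendor_loss(d, f(features)) over observed demands d.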

Matrix completion has a long history of use as the core technique of recommender systems. In particular, 1-bit matrix completion, which casts prediction as a ``Recommended'' or ``Not Recommended'' question, has proved its significance and validity in the field. However, while customers and products aggregate into interacting clusters, state-of-the-art model-based 1-bit recommender systems do not take grouping bias into consideration. To close this gap, this paper introduces Group-Specific 1-bit Matrix Completion (GS1MC), which for the first time consolidates group-specific effects into 1-bit recommender systems under the low-rank latent variable framework. Additionally, to empower GS1MC even when grouping information is unobtainable, Cluster Developing Matrix Completion (CDMC) is proposed by integrating the sparse subspace clustering technique into GS1MC. That is, CDMC clusters users/items and leverages their group effects in matrix completion at the same time. Experiments on synthetic and real-world data show that GS1MC outperforms current 1-bit matrix completion methods. Meanwhile, CDMC can successfully capture items' genre features based only on sparse binary user-item interaction data. Notably, GS1MC provides new insight into incorporating and evaluating the efficacy of clustering methods, while CDMC can serve as a new tool to explore unrevealed social behaviour or market phenomena.
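
For intuition, a minimal sketch of a 1-bit observation model with group-specific offsets follows. The parametrisation (low-rank factors U, V plus additive user-group and item-group effects a, b) is an assumption for illustration and may differ from the authors' exact formulation.

    import numpy as np

    def neg_log_likelihood(U, V, a, b, user_group, item_group, obs):
        # M combines a low-rank term with group-specific offsets
        # (hypothetical parametrisation, for illustration only).
        M = U @ V.T + a[user_group][:, None] + b[item_group][None, :]
        i, j, y = obs                      # observed entries, labels in {-1, +1}
        p = 1.0 / (1.0 + np.exp(-y * M[i, j]))
        return -np.sum(np.log(p + 1e-12)) # fit by minimising over U, V, a, b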

* 16 Pages
Traditional Recurrent Neural Networks assume vectorised data as inputs. However, much of the data from modern science and technology comes with structure, such as tensorial time series data. To apply recurrent neural networks to this type of data, a vectorisation process is necessary, yet vectorisation loses the precise information of the spatial or longitudinal dimensions. In addition, vectorised data is not an optimal representation for learning from longitudinal data. In this paper, we propose a new variant of tensorial neural networks which directly takes tensorial time series data as input. We call this new variant the Tensorial Recurrent Neural Network (TRNN). The proposed TRNN is based on the tensor Tucker decomposition.
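
A simplified sketch of a recurrent cell that keeps matrix-valued inputs and hidden states is shown below; the paper's TRNN factorises the weight tensor by Tucker decomposition, so the bilinear form here only conveys the idea of avoiding vectorisation.

    import numpy as np

    def matrix_rnn_step(X_t, H_prev, U1, V1, U2, V2, B):
        # X_t: (m, n) input matrix at time t; H_prev: (p, q) hidden state.
        # Bilinear maps U @ X @ V act on rows and columns separately,
        # preserving the two-dimensional structure of the data.
        return np.tanh(U1 @ X_t @ V1 + U2 @ H_prev @ V2 + B)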

In this paper we present Collaborative Low-Rank Subspace Clustering. Given multiple observations of a phenomenon, we learn a unified representation matrix. This unified matrix incorporates the features from all the observations, thus increasing the discriminative power compared with learning the representation matrix on each observation separately. Experimental evaluation shows that our method outperforms subspace clustering on separate observations and the state-of-the-art collaborative learning algorithm.
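
One natural way to write such a shared-representation objective, shown for intuition only (the paper's exact formulation may differ), couples all observations $X_1, \dots, X_m$ through a single coefficient matrix $Z$:

    $\min_Z \ \|Z\|_* + \lambda \sum_{i=1}^{m} \|X_i - X_i Z\|_F^2$

so that $Z$ must explain every observation simultaneously, which is what gives the unified matrix its extra discriminative power.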

Sparse Subspace Clustering (SSC) has been used extensively for subspace identification tasks due to its theoretical guarantees and relative ease of implementation. However, SSC has quadratic computation and memory requirements with respect to the number of input data points, which has prohibited its use for all but the smallest datasets. To overcome this we propose a new method, k-SSC, that screens out a large number of data points, reducing SSC to linear memory and computational requirements. We provide theoretical analysis of the bounds of success for k-SSC. Our experiments show that k-SSC exceeds theoretical expectations and outperforms existing SSC approximations while maintaining the classification performance of SSC. Furthermore, in the spirit of reproducible research, we have publicly released the source code for k-SSC.
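
To make the screening idea concrete, here is a minimal sketch using scikit-learn. The neighbourhood size k and penalty lam are illustrative parameters, and the paper's actual screening rule and solver may differ.

    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.neighbors import NearestNeighbors

    def k_ssc_affinity(X, k=50, lam=0.01):
        # Rows of X are data points. Each point is sparsely coded
        # against only its k nearest neighbours, so memory and
        # computation grow linearly in the number of points.
        N = X.shape[0]
        _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
        C = np.zeros((N, N))
        for i in range(N):
            nbr = idx[i, 1:]                  # exclude the point itself
            lasso = Lasso(alpha=lam, fit_intercept=False)
            lasso.fit(X[nbr].T, X[i])         # x_i ~ sparse combo of neighbours
            C[i, nbr] = lasso.coef_
        return np.abs(C) + np.abs(C).T        # affinity for spectral clustering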

Traditional neural networks assume vectorial inputs, as the network is arranged in layers of a single line of computing units called neurons. This special structure requires non-vectorial inputs, such as matrices, to be converted into vectors. This process can be problematic. Firstly, the spatial information among elements of the data may be lost during vectorisation. Secondly, the solution space becomes very large, which demands special treatment of the network parameters and high computational cost. To address these issues, we propose matrix neural networks (MatNet), which take matrices directly as inputs. Each neuron senses summarised information through bilinear mapping from lower-layer units in exactly the same way as the classic feed-forward neural networks. Under this structure, backpropagation combined with gradient descent can be utilised to obtain network parameters efficiently. Furthermore, MatNet can be conveniently extended to multimodal inputs. We apply MatNet to MNIST handwritten digit classification and image super-resolution tasks to show its effectiveness. Without much tweaking, MatNet achieves performance comparable to the state-of-the-art methods in both tasks with considerably reduced complexity.
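
The bilinear mapping each MatNet layer performs can be sketched in a few lines; the activation choice and shapes are illustrative.

    import numpy as np

    def matnet_layer(X, U, V, B):
        # X: (m, n) input matrix; U: (p, m), V: (q, n), B: (p, q).
        # U @ X @ V.T mixes rows and columns separately, so the layer
        # sees the matrix structure instead of a flattened vector.
        return np.tanh(U @ X @ V.T + B)

Because the output is again a matrix, layers compose exactly as in a classic feed-forward network and backpropagation applies unchanged.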

* 20 pages, 5 figures
We propose Ordered Subspace Clustering (OSC) to segment data drawn from a sequentially ordered union of subspaces. Similar to Sparse Subspace Clustering (SSC), we formulate the problem as one of finding a sparse representation, but include an additional penalty term to account for the sequential structure of the data. We test our method on infrared hyperspectral, video and motion capture data. Experiments show that our method, OSC, outperforms the state-of-the-art methods Spatial Subspace Clustering (SpatSC), Low-Rank Representation (LRR) and SSC.
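
The sequential penalty can be written with a first-order difference operator $R$; the precise norms below are our reading of the formulation, shown for intuition:

    $\min_Z \ \|X - XZ\|_F^2 + \lambda_1 \|Z\|_1 + \lambda_2 \|ZR\|_{1,2}, \quad ZR = [z_2 - z_1, \ z_3 - z_2, \ \dots, \ z_N - z_{N-1}]$

so neighbouring columns of $Z$ are pushed to be similar, encoding the assumption that consecutive samples tend to lie in the same subspace.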

Smartphones have become the ultimate 'personal' computer, yet despite this, general-purpose data-mining and knowledge discovery tools for mobile devices are surprisingly rare. DataLearner is a new data-mining application designed specifically for Android devices that imports the Weka data-mining engine and augments it with algorithms developed by Charles Sturt University, and it can be expanded with additional algorithms. Combined, DataLearner delivers 40 classification, clustering and association rule mining algorithms for model training and evaluation without the need for cloud computing resources or network connectivity. It provides the same classification accuracy as PCs and laptops, at acceptable processing speed and with negligible battery consumption. With its ability to provide easy-to-use data-mining on a phone-size screen, DataLearner is a new portable, self-contained data-mining tool for remote, personalised and learning applications alike. DataLearner comprises four elements: this paper, the app available on Google Play, the GPL3-licensed source code on GitHub, and a short video on YouTube.

* 15 pages, 5 figures, 3 tables
Recovering the intrinsic low-dimensional subspaces from data distributed on them is a key preprocessing step for many applications. In recent years, much work has modelled subspace recovery as a low-rank minimization problem. We find that some representative models, such as Robust Principal Component Analysis (R-PCA), Robust Low Rank Representation (R-LRR), and Robust Latent Low Rank Representation (R-LatLRR), are actually deeply connected. More specifically, we discover that once a solution to one of the models is obtained, the solutions to the other models can be obtained in closed form. Since R-PCA is the simplest, our discovery makes it the centre of low-rank subspace recovery models. Our work has two important implications. First, R-PCA has a solid theoretical foundation: under certain conditions, we can find better solutions to these low-rank models with overwhelming probability, although the models are non-convex. Second, we can obtain significantly faster algorithms for these models by solving R-PCA first. The computation cost can be further cut by applying low-complexity randomized algorithms, e.g., our novel $\ell_{2,1}$ filtering algorithm, to R-PCA. Experiments verify the advantages of our algorithms over other state-of-the-art ones that are based on the alternating direction method.
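
For reference, the R-PCA model that sits at the centre of these connections takes the standard form

    $\min_{A,E} \ \|A\|_* + \lambda \|E\|_1 \quad \text{s.t.} \quad X = A + E$

with a nuclear-norm low-rank term $A$ and a sparse corruption term $E$; the paper's closed-form correspondences let a solution of this problem generate solutions of the other models.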

* Submitted to Neural Computation
A general framework for the least squares support vector machine with low-rank kernels, referred to as LR-LSSVM, is introduced in this paper. The special structure of low-rank kernels with a controlled model size brings sparsity as well as computational efficiency to the proposed model. Meanwhile, a two-step optimization algorithm with three different criteria is proposed, and various experiments are carried out using the example of the so-called robust RBF kernel to validate the model. The experimental results show that the performance of the proposed algorithm is comparable or superior to several existing kernel machines.
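
The computational benefit of a low-rank kernel can be seen in a small sketch: with $K = \Phi\Phi^T$, the LS-SVM dual system can be solved through the Woodbury identity in O(N r^2) time. This is a generic identity with the bias term omitted, not the paper's two-step algorithm.

    import numpy as np

    def lssvm_lowrank_solve(Phi, y, gamma=1.0):
        # Solves (Phi @ Phi.T + I/gamma) alpha = y without forming
        # the N x N kernel matrix, using the Woodbury identity.
        N, r = Phi.shape
        c = 1.0 / gamma
        small = c * np.eye(r) + Phi.T @ Phi      # r x r instead of N x N
        alpha = (y - Phi @ np.linalg.solve(small, Phi.T @ y)) / c
        return alpha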

* 2019 IJCNN
Video-based person re-identification (re-id) is a central application in surveillance systems, where security is a significant concern. Matching persons across disjoint camera views in their video fragments is inherently challenging due to large visual variations and uncontrolled frame rates. Two steps are crucial to person re-id, namely discriminative feature learning and metric learning. However, existing approaches consider the two steps independently, and they do not make full use of the temporal and spatial information in videos. In this paper, we propose a Siamese attention architecture that jointly learns spatiotemporal video representations and their similarity metrics. The network extracts local convolutional features from regions of each frame, and enhances their discriminative capability by focusing on distinct regions when measuring the similarity with another pedestrian video. The attention mechanism is embedded into spatial gated recurrent units to selectively propagate relevant features and memorize their spatial dependencies through the network. The model essentially learns which parts (\emph{where}) from which frames (\emph{when}) are relevant and distinctive for matching persons, and attaches higher importance to them. The proposed Siamese model is end-to-end trainable to jointly learn comparable hidden representations for paired pedestrian videos and their similarity value. Extensive experiments on three benchmark datasets show the effectiveness of each component of the proposed deep network while outperforming state-of-the-art methods.
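
A stripped-down view of the attention weighting over frames or regions is sketched below. It is purely illustrative: in the paper the relevance scores come from spatial gated recurrent units conditioned on the paired video, not from a free-standing score vector.

    import numpy as np

    def attend(features, scores):
        # features: (T, d) region/frame features; scores: (T,) relevance
        # logits. Softmax weights let distinctive parts dominate the
        # pooled representation used for similarity measurement.
        w = np.exp(scores - scores.max())
        w /= w.sum()
        return w @ features   # (d,) attention-weighted summary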

* Appearing in IEEE Transactions on Multimedia. arXiv admin note: text overlap with arXiv:1606.01609
Person re-identification (re-ID) requires rapid, flexible yet discriminative representations to quickly generalize to unseen observations on-the-fly and recognize the same identity across disjoint camera views. Recent effective methods are developed in a pair-wise similarity learning system to detect a fixed set of features from distinct regions which are mapped to their vector embeddings for distance measuring. However, the most relevant and crucial parts of each image are detected independently, without referring to the dependency of one on another. Also, these region-based methods rely on spatial manipulation to position the local features for comparable similarity measuring. To combat these limitations, in this paper we introduce the Deep Co-attention based Comparators (DCCs), which fuse the co-dependent representations of the paired images so as to focus on the relevant parts of both images and produce their \textit{relative representations}. Given a pair of pedestrian images to be compared, the proposed model mimics the foveation of human eyes to detect the distinct regions concurrent on both images, namely co-dependent features, and alternately attends to relevant regions to fuse them into the similarity learning. Our comparator is capable of producing dynamic representations relative to a particular sample every time, and is thus well-suited to re-identifying pedestrians on-the-fly. We perform extensive experiments to provide insights into and demonstrate the effectiveness of the proposed DCCs in person re-ID. Moreover, our approach has achieved state-of-the-art performance on three benchmark data sets: DukeMTMC-reID \cite{DukeMTMC}, CUHK03 \cite{FPNN}, and Market-1501 \cite{Market1501}.

Matching pedestrians across disjoint camera views, known as person re-identification (re-id), is a challenging problem of importance to visual recognition and surveillance. Most existing methods exploit local regions via spatial manipulation to perform matching based on local correspondence. However, they essentially extract \emph{fixed} representations from pre-divided regions of each image and subsequently perform matching based on these representations. Models in this pipeline cannot capture the finer local patterns that are crucial to distinguishing positive pairs from negative ones, and thus underperform. In this paper, we propose a novel deep multiplicative integration gating function, which answers the question of \emph{what-and-where to match} for effective person re-id. To address \emph{what} to match, our deep network emphasizes common local patterns by learning joint representations in a multiplicative way. The network comprises two Convolutional Neural Networks (CNNs) to extract convolutional activations, and generates relevant descriptors for pedestrian matching, leading to flexible representations for pair-wise images. To address \emph{where} to match, we combat spatial misalignment by performing spatially recurrent pooling via a four-directional recurrent neural network, which imposes spatial dependency over all positions with respect to the entire image. The proposed network is end-to-end trainable to characterize local pairwise feature interactions in a spatially aligned manner. To demonstrate the superiority of our method, extensive experiments are conducted on three benchmark data sets: VIPeR, CUHK03 and Market-1501.
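
The multiplicative emphasis on common local patterns can be sketched as an elementwise gate between the two CNN activation maps; this is a toy rendering of the idea, not the paper's full gating function.

    import numpy as np

    def multiplicative_gate(f1, f2):
        # f1, f2: (C, H, W) activations from the two CNN branches.
        # The elementwise product keeps activations that fire in both
        # images and suppresses patterns present in only one of them.
        return np.tanh(f1 * f2)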

* Published at Pattern Recognition, Elsevier
Person re-identification (re-id) aims to match pedestrians observed by disjoint camera views. It attracts increasing attention in computer vision due to its importance to surveillance systems. To combat the major challenge of cross-view visual variations, deep embedding approaches have been proposed to learn a compact feature space from images such that Euclidean distances correspond to the cross-view similarity metric. However, a global Euclidean distance cannot faithfully characterize the ideal similarity in a complex visual feature space, because features of pedestrian images exhibit unknown distributions due to large variations in pose, illumination and occlusion. Moreover, intra-personal training samples within a local range are robust guides for deep embedding against uncontrolled variations, but they cannot be captured by a global Euclidean distance. In this paper, we study the problem of person re-id by proposing a novel sampling strategy that mines suitable \textit{positives} (i.e. intra-class samples) within a local range to improve deep embedding in the context of large intra-class variations. Our method is capable of learning a deep similarity metric adaptive to local sample structure by minimizing each sample's local distances while propagating through the relationships between samples to attain whole intra-class minimization. To this end, a novel objective function is proposed to jointly optimize similarity metric learning, local positive mining and robust deep embedding. This yields local discrimination by selecting local-ranged positive samples, and the learned features are robust to dramatic intra-class variations. Experiments on benchmarks show state-of-the-art results achieved by our method.
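
The local positive mining step can be illustrated as selecting, for each anchor, the intra-class samples nearest in the current embedding; the selection below is a simplified sketch, whereas the paper optimizes this jointly with the metric.

    import numpy as np

    def local_positives(emb, labels, anchor, k=5):
        # emb: (N, d) current embeddings; labels: (N,) identities.
        same = np.where(labels == labels[anchor])[0]
        same = same[same != anchor]                # other views of the same identity
        dist = np.linalg.norm(emb[same] - emb[anchor], axis=1)
        return same[np.argsort(dist)[:k]]          # k positives in the local range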

* Published on Pattern Recognition
In machine learning it is common to interpret each data point as a vector in Euclidean space. However, the data may actually be functional, i.e., each data point is a function of some variable such as time, and the function is discretely sampled. The naive treatment of functional data as traditional multivariate data can lead to poor performance, since the algorithms ignore the correlation in the curvature of each function. In this paper we propose a tractable method to cluster functional data, or curves, by adapting the Euclidean Low-Rank Representation (LRR) to the curve manifold. Experimental evaluation on synthetic and real data reveals that this method massively outperforms prior clustering methods in both speed and accuracy when clustering functional data.

* arXiv admin note: substantial text overlap with arXiv:1601.00732
The rich information in multi-view data, as well as the complementary information among different views, is usually beneficial to various tasks, e.g., clustering, classification and de-noising. Multi-view subspace clustering is based on the fact that multi-view data are generated from a latent subspace, and the success of sparse and/or low-rank subspace clustering at recovering the underlying subspace structure has been witnessed recently. However, although some state-of-the-art subspace clustering approaches can numerically handle multi-view data by simultaneously exploring all possible pairwise correlations within views, they often disregard the high-order statistics that can only be captured by simultaneously utilizing all views. As a consequence, the clustering performance on multi-view data is compromised. To address this issue, in this paper a novel multi-view clustering method is proposed using the \textit{t-product} in third-order tensor space. Based on the circular convolution operation, multi-view data can be effectively represented by a \textit{t-linear} combination with sparse and low-rank penalties using "self-expressiveness". Our extensive experimental results on facial, object, digit image and text data demonstrate that the proposed method outperforms the state-of-the-art methods on many criteria.
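
The t-product underlying the t-linear combination is easy to state concretely: it is a circular convolution along the third mode, computed slice-by-slice in the Fourier domain.

    import numpy as np

    def t_product(A, B):
        # A: (n1, n2, n3), B: (n2, n4, n3). Circular convolution along
        # the third mode equals facewise matrix products after an FFT.
        Af = np.fft.fft(A, axis=2)
        Bf = np.fft.fft(B, axis=2)
        Cf = np.einsum('ijk,jlk->ilk', Af, Bf)   # one matrix product per frontal slice
        return np.real(np.fft.ifft(Cf, axis=2))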

In machine learning it is common to interpret each data point as a vector in Euclidean space. However, the data may actually be functional, i.e., each data point is a function of some variable such as time, and the function is discretely sampled. The naive treatment of functional data as traditional multivariate data can lead to poor performance, since the algorithms ignore the correlation in the curvature of each function. In this paper we propose a method to analyse the subspace structure of functional data using the state-of-the-art Low-Rank Representation (LRR). Experimental evaluation on synthetic and real data reveals that this method massively outperforms conventional LRR in tasks concerning functional data.

This paper addresses the problem of optimizing a class of composite functions on Riemannian manifolds, and a new first-order optimization algorithm (FOA) with a fast convergence rate is proposed. Theoretical analysis of FOA proves that the algorithm has quadratic convergence. Experiments on the matrix completion task show that FOA performs better than other first-order optimization methods on Riemannian manifolds. A fast subspace pursuit method based on FOA, SP-RPRG(ALM), is also proposed to solve the low-rank representation model with the augmented Lagrange method on the low-rank matrix variety. Experimental results on synthetic and real data sets demonstrate that both FOA and SP-RPRG(ALM) achieve superior performance in terms of faster convergence and higher accuracy.
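
The basic first-order pattern on a Riemannian manifold (project the Euclidean gradient to the tangent space, take a step, retract back to the manifold) can be sketched on the unit sphere. FOA additionally handles composite terms and acceleration; this shows only the skeleton.

    import numpy as np

    def sphere_gradient_step(x, egrad, step=0.1):
        # Tangent-space projection of the Euclidean gradient at x.
        rgrad = egrad - np.dot(egrad, x) * x
        x_new = x - step * rgrad
        return x_new / np.linalg.norm(x_new)   # retraction (renormalisation)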

An l1-norm penalized orthogonal forward regression (l1-POFR) algorithm is proposed based on the concept of the leave-one-out mean square error (LOOMSE). Firstly, a new l1-norm penalized cost function is defined in the constructed orthogonal space, and each orthogonal basis is associated with an individually tunable regularization parameter. Secondly, due to the orthogonal computation, the LOOMSE can be computed analytically without actually splitting the data set; moreover, a closed form of the optimal regularization parameter in terms of minimal LOOMSE is derived. Thirdly, a lower bound for the regularization parameters is proposed, which can be used for robust LOOMSE estimation by adaptively detecting and moving regressors to an inactive set, so that the computational cost of the algorithm is significantly reduced. Illustrative examples are included to demonstrate the effectiveness of this new l1-POFR approach.
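
The fact that leave-one-out errors need no actual data splitting follows from the classical identity $e_i^{(-i)} = e_i / (1 - h_{ii})$; with orthogonal regressors the leverages $h_{ii}$ are cheap, as the sketch below shows. This is the generic unregularised identity, whereas the paper derives the regularised analogue per basis.

    import numpy as np

    def loo_residuals(P, y):
        # P: (N, r) regressor matrix with mutually orthogonal columns.
        col_norms = np.sum(P ** 2, axis=0)
        theta = (P.T @ y) / col_norms            # per-basis least squares
        e = y - P @ theta                        # ordinary residuals
        h = np.sum(P ** 2 / col_norms, axis=1)   # leverages, no matrix inverse
        return e / (1.0 - h)                     # exact leave-one-out residuals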

In this paper, we present a novel low-rank representation (LRR) algorithm for data lying on the manifold of square-root densities. Unlike traditional LRR methods, which rely on the assumption that the data points are vectors in Euclidean space, our new algorithm is designed to incorporate the intrinsic geometric structure and geodesic distance of the manifold. Experiments on several computer vision datasets showcase its noise robustness and superior performance in classification and subspace clustering compared to other state-of-the-art approaches.
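
The geometry used here is concrete: square roots of densities lie on the unit sphere in L2, so the geodesic distance between two densities has a closed form.

    import numpy as np

    def sqrt_density_distance(p, q, dx):
        # p, q: densities sampled on a common grid with spacing dx.
        inner = np.sum(np.sqrt(p * q)) * dx            # <sqrt(p), sqrt(q)> in L2
        return np.arccos(np.clip(inner, -1.0, 1.0))    # arc length on the sphere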
