In retailer management, the Newsvendor problem has widely attracted attention as one of basic inventory models. In the traditional approach to solving this problem, it relies on the probability distribution of the demand. In theory, if the probability distribution is known, the problem can be considered as fully solved. However, in any real world scenario, it is almost impossible to even approximate or estimate a better probability distribution for the demand. In recent years, researchers start adopting machine learning approach to learn a demand prediction model by using other feature information. In this paper, we propose a supervised learning that optimizes the demand quantities for products based on feature information. We demonstrate that the original Newsvendor loss function as the training objective outperforms the recently suggested quadratic loss function. The new algorithm has been assessed on both the synthetic data and real-world data, demonstrating better performance.

Click to Read Paper
Traditional Recurrent Neural Networks assume vectorized data as inputs. However many data from modern science and technology come in certain structures such as tensorial time series data. To apply the recurrent neural networks for this type of data, a vectorisation process is necessary, while such a vectorisation leads to the loss of the precise information of the spatial or longitudinal dimensions. In addition, such a vectorized data is not an optimum solution for learning the representation of the longitudinal data. In this paper, we propose a new variant of tensorial neural networks which directly take tensorial time series data as inputs. We call this new variant as Tensorial Recurrent Neural Network (TRNN). The proposed TRNN is based on tensor Tucker decomposition.

Click to Read Paper
In this paper we present Collaborative Low-Rank Subspace Clustering. Given multiple observations of a phenomenon we learn a unified representation matrix. This unified matrix incorporates the features from all the observations, thus increasing the discriminative power compared with learning the representation matrix on each observation separately. Experimental evaluation shows that our method outperforms subspace clustering on separate observations and the state of the art collaborative learning algorithm.

Click to Read Paper
Sparse Subspace Clustering (SSC) has been used extensively for subspace identification tasks due to its theoretical guarantees and relative ease of implementation. However SSC has quadratic computation and memory requirements with respect to the number of input data points. This burden has prohibited SSCs use for all but the smallest datasets. To overcome this we propose a new method, k-SSC, that screens out a large number of data points to both reduce SSC to linear memory and computational requirements. We provide theoretical analysis for the bounds of success for k-SSC. Our experiments show that k-SSC exceeds theoretical expectations and outperforms existing SSC approximations by maintaining the classification performance of SSC. Furthermore in the spirit of reproducible research we have publicly released the source code for k-SSC

Click to Read Paper
Traditional neural networks assume vectorial inputs as the network is arranged as layers of single line of computing units called neurons. This special structure requires the non-vectorial inputs such as matrices to be converted into vectors. This process can be problematic. Firstly, the spatial information among elements of the data may be lost during vectorisation. Secondly, the solution space becomes very large which demands very special treatments to the network parameters and high computational cost. To address these issues, we propose matrix neural networks (MatNet), which takes matrices directly as inputs. Each neuron senses summarised information through bilinear mapping from lower layer units in exactly the same way as the classic feed forward neural networks. Under this structure, back prorogation and gradient descent combination can be utilised to obtain network parameters efficiently. Furthermore, it can be conveniently extended for multimodal inputs. We apply MatNet to MNIST handwritten digits classification and image super resolution tasks to show its effectiveness. Without too much tweaking MatNet achieves comparable performance as the state-of-the-art methods in both tasks with considerably reduced complexity.

* 20 pages, 5 figures
Click to Read Paper
We propose Ordered Subspace Clustering (OSC) to segment data drawn from a sequentially ordered union of subspaces. Similar to Sparse Subspace Clustering (SSC) we formulate the problem as one of finding a sparse representation but include an additional penalty term to take care of sequential data. We test our method on data drawn from infrared hyper spectral, video and motion capture data. Experiments show that our method, OSC, outperforms the state of the art methods: Spatial Subspace Clustering (SpatSC), Low-Rank Representation (LRR) and SSC.

Click to Read Paper
Recovering intrinsic low dimensional subspaces from data distributed on them is a key preprocessing step to many applications. In recent years, there has been a lot of work that models subspace recovery as low rank minimization problems. We find that some representative models, such as Robust Principal Component Analysis (R-PCA), Robust Low Rank Representation (R-LRR), and Robust Latent Low Rank Representation (R-LatLRR), are actually deeply connected. More specifically, we discover that once a solution to one of the models is obtained, we can obtain the solutions to other models in closed-form formulations. Since R-PCA is the simplest, our discovery makes it the center of low rank subspace recovery models. Our work has two important implications. First, R-PCA has a solid theoretical foundation. Under certain conditions, we could find better solutions to these low rank models at overwhelming probabilities, although these models are non-convex. Second, we can obtain significantly faster algorithms for these models by solving R-PCA first. The computation cost can be further cut by applying low complexity randomized algorithms, e.g., our novel $\ell_{2,1}$ filtering algorithm, to R-PCA. Experiments verify the advantages of our algorithms over other state-of-the-art ones that are based on the alternating direction method.

* Submitted to Neural Computation
Click to Read Paper
A general framework of least squares support vector machine with low rank kernels, referred to as LR-LSSVM, is introduced in this paper. The special structure of low rank kernels with a controlled model size brings sparsity as well as computational efficiency to the proposed model. Meanwhile, a two-step optimization algorithm with three different criteria is proposed and various experiments are carried out using the example of the so-call robust RBF kernel to validate the model. The experiment results show that the performance of the proposed algorithm is comparable or superior to several existing kernel machines.

* 2019 IJCNN
Click to Read Paper
Video-based person re-identification (re-id) is a central application in surveillance systems with significant concern in security. Matching persons across disjoint camera views in their video fragments is inherently challenging due to the large visual variations and uncontrolled frame rates. There are two steps crucial to person re-id, namely discriminative feature learning and metric learning. However, existing approaches consider the two steps independently, and they do not make full use of the temporal and spatial information in videos. In this paper, we propose a Siamese attention architecture that jointly learns spatiotemporal video representations and their similarity metrics. The network extracts local convolutional features from regions of each frame, and enhance their discriminative capability by focusing on distinct regions when measuring the similarity with another pedestrian video. The attention mechanism is embedded into spatial gated recurrent units to selectively propagate relevant features and memorize their spatial dependencies through the network. The model essentially learns which parts (\emph{where}) from which frames (\emph{when}) are relevant and distinctive for matching persons and attaches higher importance therein. The proposed Siamese model is end-to-end trainable to jointly learn comparable hidden representations for paired pedestrian videos and their similarity value. Extensive experiments on three benchmark datasets show the effectiveness of each component of the proposed deep network while outperforming state-of-the-art methods.

* Appearing in IEEE Transactions on Multimedia. arXiv admin note: text overlap with arXiv:1606.01609
Click to Read Paper
Person re-identification (re-ID) requires rapid, flexible yet discriminant representations to quickly generalize to unseen observations on-the-fly and recognize the same identity across disjoint camera views. Recent effective methods are developed in a pair-wise similarity learning system to detect a fixed set of features from distinct regions which are mapped to their vector embeddings for the distance measuring. However, the most relevant and crucial parts of each image are detected independently without referring to the dependency conditioned on one and another. Also, these region based methods rely on spatial manipulation to position the local features in comparable similarity measuring. To combat these limitations, in this paper we introduce the Deep Co-attention based Comparators (DCCs) that fuse the co-dependent representations of the paired images so as to focus on the relevant parts of both images and produce their \textit{relative representations}. Given a pair of pedestrian images to be compared, the proposed model mimics the foveation of human eyes to detect distinct regions concurrent on both images, namely co-dependent features, and alternatively attend to relevant regions to fuse them into the similarity learning. Our comparator is capable of producing dynamic representations relative to a particular sample every time, and thus well-suited to the case of re-identifying pedestrians on-the-fly. We perform extensive experiments to provide the insights and demonstrate the effectiveness of the proposed DCCs in person re-ID. Moreover, our approach has achieved the state-of-the-art performance on three benchmark data sets: DukeMTMC-reID \cite{DukeMTMC}, CUHK03 \cite{FPNN}, and Market-1501 \cite{Market1501}.

Click to Read Paper
Matching pedestrians across disjoint camera views, known as person re-identification (re-id), is a challenging problem that is of importance to visual recognition and surveillance. Most existing methods exploit local regions within spatial manipulation to perform matching in local correspondence. However, they essentially extract \emph{fixed} representations from pre-divided regions for each image and perform matching based on the extracted representation subsequently. For models in this pipeline, local finer patterns that are crucial to distinguish positive pairs from negative ones cannot be captured, and thus making them underperformed. In this paper, we propose a novel deep multiplicative integration gating function, which answers the question of \emph{what-and-where to match} for effective person re-id. To address \emph{what} to match, our deep network emphasizes common local patterns by learning joint representations in a multiplicative way. The network comprises two Convolutional Neural Networks (CNNs) to extract convolutional activations, and generates relevant descriptors for pedestrian matching. This thus, leads to flexible representations for pair-wise images. To address \emph{where} to match, we combat the spatial misalignment by performing spatially recurrent pooling via a four-directional recurrent neural network to impose spatial dependency over all positions with respect to the entire image. The proposed network is designed to be end-to-end trainable to characterize local pairwise feature interactions in a spatially aligned manner. To demonstrate the superiority of our method, extensive experiments are conducted over three benchmark data sets: VIPeR, CUHK03 and Market-1501.

* Published at Pattern Recognition, Elsevier
Click to Read Paper
Person re-identification (re-id) aims to match pedestrians observed by disjoint camera views. It attracts increasing attention in computer vision due to its importance to surveillance system. To combat the major challenge of cross-view visual variations, deep embedding approaches are proposed by learning a compact feature space from images such that the Euclidean distances correspond to their cross-view similarity metric. However, the global Euclidean distance cannot faithfully characterize the ideal similarity in a complex visual feature space because features of pedestrian images exhibit unknown distributions due to large variations in poses, illumination and occlusion. Moreover, intra-personal training samples within a local range are robust to guide deep embedding against uncontrolled variations, which however, cannot be captured by a global Euclidean distance. In this paper, we study the problem of person re-id by proposing a novel sampling to mine suitable \textit{positives} (i.e. intra-class) within a local range to improve the deep embedding in the context of large intra-class variations. Our method is capable of learning a deep similarity metric adaptive to local sample structure by minimizing each sample's local distances while propagating through the relationship between samples to attain the whole intra-class minimization. To this end, a novel objective function is proposed to jointly optimize similarity metric learning, local positive mining and robust deep embedding. This yields local discriminations by selecting local-ranged positive samples, and the learned features are robust to dramatic intra-class variations. Experiments on benchmarks show state-of-the-art results achieved by our method.

* Published on Pattern Recognition
Click to Read Paper
In machine learning it is common to interpret each data point as a vector in Euclidean space. However the data may actually be functional i.e.\ each data point is a function of some variable such as time and the function is discretely sampled. The naive treatment of functional data as traditional multivariate data can lead to poor performance since the algorithms are ignoring the correlation in the curvature of each function. In this paper we propose a tractable method to cluster functional data or curves by adapting the Euclidean Low-Rank Representation (LRR) to the curve manifold. Experimental evaluation on synthetic and real data reveals that this method massively outperforms prior clustering methods in both speed and accuracy when clustering functional data.

* arXiv admin note: substantial text overlap with arXiv:1601.00732
Click to Read Paper
The plenty information from multiple views data as well as the complementary information among different views are usually beneficial to various tasks, e.g., clustering, classification, de-noising. Multi-view subspace clustering is based on the fact that the multi-view data are generated from a latent subspace. To recover the underlying subspace structure, the success of the sparse and/or low-rank subspace clustering has been witnessed recently. Despite some state-of-the-art subspace clustering approaches can numerically handle multi-view data, by simultaneously exploring all possible pairwise correlation within views, the high order statistics is often disregarded which can only be captured by simultaneously utilizing all views. As a consequence, the clustering performance for multi-view data is compromised. To address this issue, in this paper, a novel multi-view clustering method is proposed by using \textit{t-product} in third-order tensor space. Based on the circular convolution operation, multi-view data can be effectively represented by a \textit{t-linear} combination with sparse and low-rank penalty using "self-expressiveness". Our extensive experimental results on facial, object, digits image and text data demonstrate that the proposed method outperforms the state-of-the-art methods in terms of many criteria.

Click to Read Paper
In machine learning it is common to interpret each data point as a vector in Euclidean space. However the data may actually be functional i.e.\ each data point is a function of some variable such as time and the function is discretely sampled. The naive treatment of functional data as traditional multivariate data can lead to poor performance since the algorithms are ignoring the correlation in the curvature of each function. In this paper we propose a method to analyse subspace structure of the functional data by using the state of the art Low-Rank Representation (LRR). Experimental evaluation on synthetic and real data reveals that this method massively outperforms conventional LRR in tasks concerning functional data.

Click to Read Paper
The paper addresses the problem of optimizing a class of composite functions on Riemannian manifolds and a new first order optimization algorithm (FOA) with a fast convergence rate is proposed. Through the theoretical analysis for FOA, it has been proved that the algorithm has quadratic convergence. The experiments in the matrix completion task show that FOA has better performance than other first order optimization methods on Riemannian manifolds. A fast subspace pursuit method based on FOA is proposed to solve the low-rank representation model based on augmented Lagrange method on the low rank matrix variety. Experimental results on synthetic and real data sets are presented to demonstrate that both FOA and SP-RPRG(ALM) can achieve superior performance in terms of faster convergence and higher accuracy.

Click to Read Paper
A l1-norm penalized orthogonal forward regression (l1-POFR) algorithm is proposed based on the concept of leaveone- out mean square error (LOOMSE). Firstly, a new l1-norm penalized cost function is defined in the constructed orthogonal space, and each orthogonal basis is associated with an individually tunable regularization parameter. Secondly, due to orthogonal computation, the LOOMSE can be analytically computed without actually splitting the data set, and moreover a closed form of the optimal regularization parameter in terms of minimal LOOMSE is derived. Thirdly, a lower bound for regularization parameters is proposed, which can be used for robust LOOMSE estimation by adaptively detecting and removing regressors to an inactive set so that the computational cost of the algorithm is significantly reduced. Illustrative examples are included to demonstrate the effectiveness of this new l1-POFR approach.

Click to Read Paper
In this paper, we present a novel low rank representation (LRR) algorithm for data lying on the manifold of square root densities. Unlike traditional LRR methods which rely on the assumption that the data points are vectors in the Euclidean space, our new algorithm is designed to incorporate the intrinsic geometric structure and geodesic distance of the manifold. Experiments on several computer vision datasets showcase its noise robustness and superior performance on classification and subspace clustering compared to other state-of-the-art approaches.

Click to Read Paper
Sparse subspace clustering (SSC), as one of the most successful subspace clustering methods, has achieved notable clustering accuracy in computer vision tasks. However, SSC applies only to vector data in Euclidean space. As such, there is still no satisfactory approach to solve subspace clustering by ${\it self-expressive}$ principle for symmetric positive definite (SPD) matrices which is very useful in computer vision. In this paper, by embedding the SPD matrices into a Reproducing Kernel Hilbert Space (RKHS), a kernel subspace clustering method is constructed on the SPD manifold through an appropriate Log-Euclidean kernel, termed as kernel sparse subspace clustering on the SPD Riemannian manifold (KSSCR). By exploiting the intrinsic Riemannian geometry within data, KSSCR can effectively characterize the geodesic distance between SPD matrices to uncover the underlying subspace structure. Experimental results on two famous database demonstrate that the proposed method achieves better clustering results than the state-of-the-art approaches.

Click to Read Paper
Dimensionality reduction for high-order tensors is a challenging problem. In conventional approaches, higher order tensors are `vectorized` via Tucker decomposition to obtain lower order tensors. This will destroy the inherent high-order structures or resulting in undesired tensors, respectively. This paper introduces a probabilistic vectorial dimensionality reduction model for tensorial data. The model represents a tensor by employing a linear combination of same order basis tensors, thus it offers a mechanism to directly reduce a tensor to a vector. Under this expression, the projection base of the model is based on the tensor CandeComp/PARAFAC (CP) decomposition and the number of free parameters in the model only grows linearly with the number of modes rather than exponentially. A Bayesian inference has been established via the variational EM approach. A criterion to set the parameters (factor number of CP decomposition and the number of extracted features) is empirically given. The model outperforms several existing PCA-based methods and CP decomposition on several publicly available databases in terms of classification and clustering accuracy.

* Submiting to TNNLS
Click to Read Paper