Models, code, and papers for "Cheng Zhu":

Characteristic Regularisation for Super-Resolving Face Images

Dec 30, 2019
Zhiyi Cheng, Xiatian Zhu, Shaogang Gong

Existing facial image super-resolution (SR) methods focus mostly on improving artificially down-sampled low-resolution (LR) imagery. Such SR models, although strong at handling artificial LR images, often suffer from a significant performance drop on genuine LR test data. Previous unsupervised domain adaptation (UDA) methods address this issue by training a model using unpaired genuine LR and HR data together with a cycle-consistency loss formulation. However, this renders the model overstretched with two tasks: consistifying the visual characteristics and enhancing the image resolution. Importantly, this makes end-to-end model training ineffective due to the difficulty of back-propagating gradients through two concatenated CNNs. To solve this problem, we formulate a method that joins the advantages of conventional SR and UDA models. Specifically, we separate and control the optimisations for characteristics consistifying and image super-resolving by introducing Characteristic Regularisation (CR) between them. This task split makes the model training more effective and computationally tractable. Extensive evaluations demonstrate the performance superiority of our method over state-of-the-art SR and UDA models on both genuine and artificial LR facial imagery data.

* Accepted by WACV2020 

Super-resolution based generative adversarial network using visual perceptual loss function

Apr 24, 2019
Xuan Zhu, Yue Cheng, Rongzhi Wang

In recent years, perceptual-quality driven super-resolution methods have shown satisfactory results. However, super-resolved images have uncertain texture details and unpleasant artifacts. We build a novel perceptual loss function composed of a morphological components adversarial loss, a color adversarial loss, and a salient content loss to ameliorate these problems. The adversarial losses are applied to constrain the color and morphological component distributions of super-resolved images, and the salient content loss highlights the perceptual similarity of feature-rich regions. Experiments show that the proposed method achieves significant improvements in terms of perceptual index and visual quality compared with the state-of-the-art methods.

* 11 pages, 7 figures, 1 table 

Low-Resolution Face Recognition

Nov 21, 2018
Zhiyi Cheng, Xiatian Zhu, Shaogang Gong

Whilst recent face-recognition (FR) techniques have made significant progress on recognising constrained high-resolution web images, the same cannot be said of natively unconstrained low-resolution images at large scales. In this work, we systematically examine this under-studied FR problem, and introduce a novel Complement Super-Resolution and Identity (CSRI) joint deep learning method with a unified end-to-end network architecture. We further construct a new large-scale dataset, TinyFace, of native unconstrained low-resolution face images from selected public datasets, because no benchmark of this nature exists in the literature. With extensive experiments we show that there is a significant gap between the reported FR performances on popular benchmarks and the results on TinyFace, and we show the advantages of the proposed CSRI over a variety of state-of-the-art FR and super-resolution deep models in solving this largely ignored FR scenario. The TinyFace dataset is released publicly at:

Surveillance Face Recognition Challenge

Aug 29, 2018
Zhiyi Cheng, Xiatian Zhu, Shaogang Gong

Face recognition (FR) is one of the most extensively investigated problems in computer vision. Significant progress in FR has been made due to the recent introduction of larger-scale FR challenges, particularly with constrained social media web images, e.g. high-resolution photos of celebrity faces taken by professional photo-journalists. However, the more challenging FR in unconstrained and low-resolution surveillance images remains largely under-studied. To facilitate more studies on developing FR models that are effective and robust for low-resolution surveillance facial images, we introduce a new Surveillance Face Recognition Challenge, which we call the QMUL-SurvFace benchmark. To the best of our knowledge, this new benchmark is the largest and, more importantly, the only true surveillance FR benchmark, where low-resolution images are not synthesised by artificial down-sampling of native high-resolution images. This challenge contains 463,507 face images of 15,573 distinct identities captured in real-world uncooperative surveillance scenes over wide space and time. As a consequence, it presents an extremely challenging FR benchmark. We benchmark the FR performance on this challenge using five representative deep learning face recognition models, in comparison to existing benchmarks. We show that the current state of the art is still far from satisfactory for tackling the under-investigated surveillance FR problem in practical forensic scenarios. Face recognition is generally more difficult in an open-set setting, which is typical for surveillance scenarios, owing to a large number of non-target people (distractors) appearing in open-space scenes. This is evident in that, on the new Surveillance FR Challenge, the top-performing CentreFace deep learning FR model on the MegaFace benchmark can now only achieve a 13.2% success rate (at Rank-20) at a 10% false alarm rate.

* The QMUL-SurvFace challenge is publicly available at 

Self-Adaptive Network Pruning

Oct 20, 2019
Jinting Chen, Zhaocheng Zhu, Cheng Li, Yuming Zhao

Deep convolutional neural networks have been proved successful on a wide range of tasks, yet they are still hindered by their large computation cost in many industrial scenarios. In this paper, we propose to reduce such cost for CNNs through a self-adaptive network pruning method (SANP). Our method introduces a general Saliency-and-Pruning Module (SPM) for each convolutional layer, which learns to predict saliency scores and applies pruning for each channel. Given a total computation budget, SANP adaptively determines the pruning strategy with respect to each layer and each sample, such that the average computation cost meets the budget. This design allows SANP to be more efficient in computation, as well as more robust to datasets and backbones. Extensive experiments on 2 datasets and 3 backbones show that SANP surpasses state-of-the-art methods in both classification accuracy and pruning rate.

* Published as a conference paper at ICONIP 2019 
* 10 pages, 5 figures, conference 
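
The per-channel pruning idea in the SANP abstract above can be illustrated with a small sketch. This is not the authors' Saliency-and-Pruning Module: the function names `prune_channels` and `apply_mask`, the fixed `keep_ratio`, and the top-k rule are all assumptions made for illustration, whereas SANP learns its saliency scores and meets a budget on average across samples.

```python
def prune_channels(saliency, keep_ratio):
    """Return a 0/1 mask that keeps the highest-saliency channels.

    Hypothetical helper: a fixed keep_ratio stands in for the learned,
    budget-driven decision described in the abstract.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, round(keep_ratio * len(saliency)))
    # Channel indices sorted by descending saliency score.
    ranked = sorted(range(len(saliency)), key=lambda i: saliency[i], reverse=True)
    kept = set(ranked[:n_keep])
    return [1 if i in kept else 0 for i in range(len(saliency))]


def apply_mask(feature_maps, mask):
    """Zero out pruned channels; feature_maps is a list of per-channel lists."""
    return [[value * m for value in channel] for channel, m in zip(feature_maps, mask)]


# Toy saliency scores for one sample and a four-channel layer.
scores = [0.9, 0.1, 0.5, 0.3]
mask = prune_channels(scores, keep_ratio=0.5)
pruned = apply_mask([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]], mask)
```

Because the mask is computed from each sample's own scores, different inputs can keep different channels, which is the sample-adaptive behaviour the abstract refers to.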

High-dimensional Gaussian graphical model for network-linked data

Jul 04, 2019
Tianxi Li, Cheng Qian, Elizaveta Levina, Ji Zhu

Graphical models are commonly used to represent conditional dependence relationships between variables. There are multiple methods available for exploring them from high-dimensional data, but almost all of them rely on the assumption that the observations are independent and identically distributed. At the same time, observations connected by a network are becoming increasingly common, and tend to violate these assumptions. Here we develop a Gaussian graphical model for observations connected by a network with potentially different mean vectors, varying smoothly over the network. We propose an efficient estimation algorithm and demonstrate its effectiveness on both simulated and real data, obtaining meaningful interpretable results on a statistics coauthorship network. We also prove that our method estimates both the inverse covariance matrix and the corresponding graph structure correctly under the assumption of network "cohesion", which refers to the empirically observed phenomenon of network neighbors sharing similar traits.

Explaining Latent Factor Models for Recommendation with Influence Functions

Nov 20, 2018
Weiyu Cheng, Yanyan Shen, Yanmin Zhu, Linpeng Huang

Latent factor models (LFMs) such as matrix factorization achieve state-of-the-art performance among various Collaborative Filtering (CF) approaches for recommendation. Despite the high recommendation accuracy of LFMs, a critical issue to be resolved is the lack of explainability. Extensive efforts have been made in the literature to incorporate explainability into LFMs. However, they either rely on auxiliary information which may not be available in practice, or fail to provide easy-to-understand explanations. In this paper, we propose a fast influence analysis method named FIA, which successfully provides explicit neighbor-style explanations for LFMs with the technique of influence functions stemming from robust statistics. We first describe how to apply influence functions to LFMs to deliver neighbor-style explanations. Then we develop a novel influence computation algorithm for matrix factorization with high efficiency. We further extend it to the more general neural collaborative filtering and introduce an approximation algorithm to accelerate influence analysis over neural network models. Experimental results on real datasets demonstrate the correctness, efficiency and usefulness of our proposed method.

A convergence framework for inexact nonconvex and nonsmooth algorithms and its applications to several iterations

Aug 10, 2018
Tao Sun, Hao Jiang, Lizhi Cheng, Wei Zhu

In this paper, we consider the convergence of an abstract inexact nonconvex and nonsmooth algorithm. We impose a pseudo sufficient descent condition and a pseudo relative error condition on the algorithm, both of which are related to an auxiliary sequence, and we assume that a continuity condition holds. In fact, many classical inexact nonconvex and nonsmooth algorithms satisfy these three conditions. Under a special kind of summability assumption on the auxiliary sequence, we prove that the sequence generated by the general algorithm converges to a critical point of the objective function, provided the objective has the Kurdyka-Łojasiewicz property. The core of the proofs lies in building a new Lyapunov function, whose successive difference provides a bound for the successive difference of the points generated by the algorithm. We then apply our findings to several classical nonconvex iterative algorithms and derive the corresponding convergence results.

Understanding and Enhancing the Transferability of Adversarial Examples

Feb 27, 2018
Lei Wu, Zhanxing Zhu, Cheng Tai, Weinan E

State-of-the-art deep neural networks are known to be vulnerable to adversarial examples, formed by applying small but malicious perturbations to the original inputs. Moreover, the perturbations can transfer across models: adversarial examples generated for a specific model will often mislead other unseen models. Consequently the adversary can leverage this to attack deployed systems without any query, which severely hinders the application of deep learning, especially in areas where security is crucial. In this work, we systematically study two classes of factors that might influence the transferability of adversarial examples. One concerns model-specific factors, including network architecture, model capacity and test accuracy. The other is the local smoothness of the loss function used for constructing adversarial examples. Based on this understanding, a simple but effective strategy is proposed to enhance transferability. We call it the variance-reduced attack, since it utilizes the variance-reduced gradient to generate adversarial examples. The effectiveness is confirmed by a variety of experiments on both CIFAR-10 and ImageNet datasets.

* 15 pages 
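
As a rough illustration of the "variance-reduced gradient" mentioned in the abstract above, the sketch below averages the loss gradient over Gaussian-perturbed copies of the input before taking a single signed step. It assumes this averaging is what variance reduction means here. The toy quadratic loss, the function names, and the parameters `eps`, `sigma`, and `n_samples` are hypothetical stand-ins rather than the paper's actual setup.

```python
import random


def loss_grad(x, w):
    """Gradient of the toy loss L(x) = sum(w_i * x_i**2) with respect to x.

    Stand-in for the gradient of a network's loss; purely illustrative.
    """
    return [2.0 * wi * xi for wi, xi in zip(w, x)]


def variance_reduced_step(x, w, eps, sigma, n_samples, rng):
    """Take one signed step along the gradient averaged over noisy inputs."""
    avg = [0.0] * len(x)
    for _ in range(n_samples):
        # Perturb the input with Gaussian noise and accumulate the gradient.
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        grad = loss_grad(noisy, w)
        avg = [a + g / n_samples for a, g in zip(avg, grad)]
    # Move each coordinate by eps in the direction of the averaged gradient's sign.
    return [xi + eps * ((a > 0) - (a < 0)) for xi, a in zip(x, avg)]


rng = random.Random(0)
x_adv = variance_reduced_step(
    x=[1.0, -2.0, 0.5], w=[1.0, 1.0, -1.0],
    eps=0.1, sigma=0.01, n_samples=20, rng=rng,
)
```

Averaging over a neighbourhood smooths out sharp local variation in the gradient, which relates to the local smoothness of the loss that the abstract identifies as a factor in transferability.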

High-dimensional Mixed Graphical Models

Aug 19, 2016
Jie Cheng, Tianxi Li, Elizaveta Levina, Ji Zhu

While graphical models for continuous data (Gaussian graphical models) and discrete data (Ising models) have been extensively studied, there is little work on graphical models linking both continuous and discrete variables (mixed data), which are common in many scientific applications. We propose a novel graphical model for mixed data, which is simple enough to be suitable for high-dimensional data, yet flexible enough to represent all possible graph structures. We develop a computationally efficient regression-based algorithm for fitting the model by focusing on the conditional log-likelihood of each variable given the rest. The parameters have a natural group structure, and sparsity in the fitted graph is attained by incorporating a group lasso penalty, approximated by a weighted $\ell_1$ penalty for computational efficiency. We demonstrate the effectiveness of our method through an extensive simulation study and apply it to a music annotation data set (CAL500), obtaining a sparse and interpretable graphical model relating the continuous features of the audio signal to categorical variables such as genre, emotions, and usage associated with particular songs. While we focus on binary discrete variables, we also show that the proposed methodology can be easily extended to general discrete variables.

Accurate Urban Road Centerline Extraction from VHR Imagery via Multiscale Segmentation and Tensor Voting

Feb 25, 2016
Guangliang Cheng, Feiyun Zhu, Shiming Xiang, Chunhong Pan

It is very useful and increasingly popular to extract accurate road centerlines from very-high-resolution (VHR) remote sensing imagery for various applications, such as road map generation and updating. There are three shortcomings of current methods: (a) due to noise and occlusions (owing to vehicles and trees), most road extraction methods produce heterogeneous classification results; (b) the morphological thinning algorithm is widely used to extract road centerlines, but it produces small spurs around the centerlines; (c) many methods are ineffective at extracting centerlines around road intersections. To address the above three issues, we propose a novel method to extract smooth and complete road centerlines via three techniques: multiscale joint collaborative representation (MJCR) & graph cuts (GC), tensor voting (TV) & non-maximum suppression (NMS), and a fitting based connection algorithm. Specifically, an MJCR-GC based road area segmentation method is proposed by incorporating multiscale features and spatial information. In this way, a homogeneous road segmentation result is achieved. Then, to obtain a smooth and correct road centerline network, a TV-NMS based centerline extraction method is introduced. This method not only extracts smooth road centerlines, but also connects the discontinuous road centerlines. Finally, to overcome the ineffectiveness of current methods at road intersections, a fitting based road centerline connection algorithm is proposed. As a result, we can get a complete road centerline network. Extensive experiments on two datasets demonstrate that our method achieves higher quantitative results, as well as more satisfactory visual performance, compared with state-of-the-art methods.

* 25 pages, 11 figures 

Tripartite Graph Clustering for Dynamic Sentiment Analysis on Social Media

Jun 12, 2014
Linhong Zhu, Aram Galstyan, James Cheng, Kristina Lerman

The growing popularity of social media (e.g., Twitter) allows users to easily share information with each other and influence others by expressing their own sentiments on various subjects. In this work, we propose an unsupervised tri-clustering framework, which analyzes both user-level and tweet-level sentiments through co-clustering of a tripartite graph. A compelling feature of the proposed framework is that the quality of sentiment clustering of tweets, users, and features can be mutually improved by joint clustering. We further investigate the evolution of user-level sentiments and latent feature vectors in an online framework and devise an efficient online algorithm to sequentially update the clustering of tweets, users and features with newly arrived data. The online framework not only provides better quality of both dynamic user-level and tweet-level sentiment analysis, but also improves the computational and storage efficiency. We verified the effectiveness and efficiency of the proposed approaches on the November 2012 California ballot Twitter data.

* A short version is in the Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data 

Sparse Ising Models with Covariates

Sep 27, 2012
Jie Cheng, Elizaveta Levina, Pei Wang, Ji Zhu

There has been a lot of work fitting Ising models to multivariate binary data in order to understand the conditional dependency relationships between the variables. However, additional covariates are frequently recorded together with the binary data, and may influence the dependence relationships. Motivated by such a dataset on genomic instability collected from tumor samples of several types, we propose a sparse covariate dependent Ising model to study both the conditional dependency within the binary data and its relationship with the additional covariates. This results in subject-specific Ising models, where the subject's covariates influence the strength of association between the genes. As in all exploratory data analysis, interpretability of results is important, and we use L1 penalties to induce sparsity in the fitted graphs and in the number of selected covariates. Two algorithms to fit the model are proposed and compared on a set of simulated data, and asymptotic results are established. The results on the tumor dataset and their biological significance are discussed in detail.

* 32 pages (including 5 pages of appendix), 3 figures, 2 tables 

An Amendment of Fast Subspace Tracking Methods

Feb 24, 2012
Zhu Cheng, Zhan Wang, Haitao Liu, Majid Ahmadi

Tuning stepsize between convergence rate and steady state error level or stability is a problem in some subspace tracking schemes. Methods in DPM and OJA class may show sparks in their steady state error sometimes, even with a rather small stepsize. By a study on the schemes' updating formula, it is found that the update only happens in a specific plane but not all the subspace basis. Through an analysis on relationship between the vectors in that plane, an amendment as needed is made on the algorithm routine to fix the problem by constricting the stepsize at every update step. The simulation confirms elimination of the sparks.

* 4 pages, 3 figures 

VFlow: More Expressive Generative Flows with Variational Data Augmentation

Feb 22, 2020
Jianfei Chen, Cheng Lu, Biqi Chenli, Jun Zhu, Tian Tian

Generative flows are promising tractable models for density modeling that define probabilistic distributions with invertible transformations. However, tractability imposes architectural constraints on generative flows, making them less expressive than other types of generative models. In this work, we study a previously overlooked constraint: all the intermediate representations must have the same dimensionality as the original data due to invertibility, which limits the width of the network. We tackle this constraint by augmenting the data with some extra dimensions and jointly learning a generative flow for augmented data as well as the distribution of augmented dimensions under a variational inference framework. Our approach, VFlow, is a generalization of generative flows and therefore always performs better. Combined with existing generative flows, VFlow achieves a new state-of-the-art 2.98 bits per dimension on the CIFAR-10 dataset and is more compact than previous models to reach similar modeling quality.
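
The "bits per dimension" figure quoted above is a standard way to normalise a negative log-likelihood by the data dimensionality. The sketch below shows only that unit conversion. The helper names are hypothetical, and the 32 x 32 x 3 dimensionality is the usual CIFAR-10 image size rather than something stated in the abstract.

```python
import math


def nats_to_bits_per_dim(nll_nats, num_dims):
    """Convert a per-example negative log-likelihood in nats to bits per dimension."""
    return nll_nats / (num_dims * math.log(2))


def bits_per_dim_to_nats(bpd, num_dims):
    """Inverse conversion: bits per dimension back to nats per example."""
    return bpd * num_dims * math.log(2)


# Assumed CIFAR-10 dimensionality: 32 x 32 pixels with 3 colour channels.
CIFAR10_DIMS = 32 * 32 * 3

# Per-image negative log-likelihood (in nats) implied by 2.98 bits per dimension.
nll = bits_per_dim_to_nats(2.98, CIFAR10_DIMS)
```

Lower values are better: a smaller bits-per-dimension score means the model assigns higher likelihood to the data.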

Stack-VS: Stacked Visual-Semantic Attention for Image Caption Generation

Sep 05, 2019
Wei Wei, Ling Cheng, Xianling Mao, Guangyou Zhou, Feida Zhu

Recently, automatic image caption generation has been an important focus of work on the multimodal translation task. Existing approaches can be roughly categorized into two classes, i.e., top-down and bottom-up: the former transfers the image information (called the visual-level feature) directly into a caption, and the latter uses the extracted words (called the semantic-level attribute) to generate a description. However, previous methods are typically either based on a one-stage decoder or utilize only part of the visual-level or semantic-level information for image caption generation. In this paper, we address the problem and propose an innovative multi-stage architecture (called Stack-VS) for rich fine-grained image caption generation, combining bottom-up and top-down attention models to effectively handle both visual-level and semantic-level information of an input image. Specifically, we also propose a novel, well-designed stack decoder model, which consists of a sequence of decoder cells, each of which contains two LSTM layers that work interactively to re-optimize attention weights on both visual-level feature vectors and semantic-level attribute embeddings for generating a fine-grained image caption. Extensive experiments on the popular benchmark dataset MSCOCO show significant improvements on different evaluation metrics, i.e., the improvements on BLEU-4/CIDEr/SPICE scores are 0.372, 1.226 and 0.216, respectively, as compared to the state of the art.

* 12 pages, 7 figures 

Multivariate Regression with Grossly Corrupted Observations: A Robust Approach and its Applications

Jan 11, 2017
Xiaowei Zhang, Chi Xu, Yu Zhang, Tingshao Zhu, Li Cheng

This paper studies the problem of multivariate linear regression where a portion of the observations is grossly corrupted or missing, and the magnitudes and locations of such occurrences are unknown a priori. To deal with this problem, we propose a new approach that explicitly considers the error source as well as its sparse nature. An interesting property of our approach lies in its ability to allow individual regression output elements or tasks to possess their own noise levels. Moreover, despite working with a non-smooth optimization problem, our approach is still guaranteed to converge to its optimal solution. Experiments on synthetic data demonstrate the competitiveness of our approach compared with existing multivariate regression models. In addition, our approach has been validated empirically, with very promising results, on two exemplar real-world applications: the first concerns the prediction of Big-Five personality based on user behaviors at social network sites (SNSs), while the second is 3D human hand pose estimation from depth images. The implementation of our approach and comparison methods, as well as the datasets involved, are made publicly available in support of open-source and reproducible research initiatives.

Towards Arbitrary-View Face Alignment by Recommendation Trees

Nov 20, 2015
Shizhan Zhu, Cheng Li, Chen Change Loy, Xiaoou Tang

Learning to simultaneously handle face alignment of arbitrary views, e.g. frontal and profile views, appears to be more challenging than we thought. The difficulties lie in i) accommodating the complex appearance-shape relations exhibited in different views, and ii) encompassing the varying landmark point sets due to self-occlusion and different landmark protocols. Most existing studies approach this problem by training multiple viewpoint-specific models and conducting head pose estimation for model selection. This solution is intuitive, but the performance is highly susceptible to inaccurate head pose estimation. In this study, we address this shortcoming by learning an Ensemble of Model Recommendation Trees (EMRT), which is capable of selecting the optimal model configuration without prior head pose estimation. The unified framework seamlessly handles different viewpoints and landmark protocols, and it is trained by optimising directly on landmark locations, thus yielding superior results on arbitrary-view face alignment. This is the first study that performs face alignment on the full AFLW dataset with faces of different views including profile view. State-of-the-art performances are also reported on MultiPIE and AFW datasets containing both frontal- and profile-view faces.

* This is our original submission to ICCV 2015 

Transferring Landmark Annotations for Cross-Dataset Face Alignment

Sep 02, 2014
Shizhan Zhu, Cheng Li, Chen Change Loy, Xiaoou Tang

Dataset bias is a well-known problem in the object recognition domain. This issue, nonetheless, is rarely explored in face alignment research. In this study, we show that the dataset plays an integral part in face alignment performance. Specifically, owing to face alignment dataset bias, training on one database and testing on another or on an unseen domain would lead to poor performance. Creating an unbiased dataset through combining various existing databases, however, is non-trivial as one has to exhaustively re-label the landmarks for standardisation. In this work, we propose a simple yet effective method to bridge the disparate annotation spaces between databases, making dataset fusion possible. We show extensive results on combining various popular databases (LFW, AFLW, LFPW, HELEN) for improved cross-dataset and unseen data alignment.

* Shizhan Zhu and Cheng Li share equal contributions 

Strongly Convex Programming for Exact Matrix Completion and Robust Principal Component Analysis

Jan 05, 2012
Hui Zhang, Jian-Feng Cai, Lizhi Cheng, Jubo Zhu

The common task in matrix completion (MC) and robust principal component analysis (RPCA) is to recover a low-rank matrix from a given data matrix. These problems have recently gained great attention from various areas in the applied sciences, especially after the publication of the pioneering works of Candès et al. One fundamental result in MC and RPCA is that nuclear norm based convex optimizations lead to exact low-rank matrix recovery under suitable conditions. In this paper, we extend this result by showing that strongly convex optimizations can guarantee exact low-rank matrix recovery as well. The result in this paper not only provides sufficient conditions under which the strongly convex models lead to exact low-rank matrix recovery, but also guides us on how to choose suitable parameters in practical algorithms.

* 17 pages 
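
To make the contrast in the abstract above concrete, the block below writes out one common way of stating the nuclear norm model for matrix completion and a strongly convex variant of it. The abstract itself gives no formulas, so the notation here is an assumption: $M$ is the data matrix, $\Omega$ the set of observed entries, $\mathcal{P}_\Omega$ the projection onto those entries, and $\tau > 0$ a weighting parameter. The paper's own scaling may differ.

```latex
% Nuclear norm model (convex, but not strongly convex):
\min_{X} \; \|X\|_{*}
\quad \text{subject to} \quad \mathcal{P}_\Omega(X) = \mathcal{P}_\Omega(M)

% Strongly convex variant: add a squared Frobenius norm term
\min_{X} \; \tau \|X\|_{*} + \tfrac{1}{2} \|X\|_{F}^{2}
\quad \text{subject to} \quad \mathcal{P}_\Omega(X) = \mathcal{P}_\Omega(M)
```

Under this reading, the "suitable parameters" the abstract mentions would correspond to choosing $\tau$ so that the strongly convex problem still recovers the same low-rank matrix as the nuclear norm model.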
