Models, code, and papers for "Yue Wu":

Facial Landmark Detection: a Literature Survey

May 15, 2018
Yue Wu, Qiang Ji

The locations of the fiducial facial landmark points around facial components and the facial contour capture the rigid and non-rigid facial deformations due to head movements and facial expressions. They are hence important for various facial analysis tasks. Many facial landmark detection algorithms have been developed to automatically detect those key points over the years, and in this paper, we perform an extensive review of them. We classify the facial landmark detection algorithms into three major categories: holistic methods, Constrained Local Model (CLM) methods, and regression-based methods. They differ in how they utilize the facial appearance and shape information. Holistic methods explicitly build models to represent the global facial appearance and shape information. CLM methods explicitly leverage the global shape model but build local appearance models. Regression-based methods implicitly capture facial shape and appearance information. For algorithms within each category, we discuss their underlying theories as well as their differences. We also compare their performances on both controlled and in-the-wild benchmark datasets, under varying facial expressions, head poses, and occlusion. Based on the evaluations, we point out their respective strengths and weaknesses. A separate section reviews the latest deep learning-based algorithms. The survey also includes a listing of benchmark databases and existing software. Finally, we identify future research directions, including combining methods in different categories to leverage their respective strengths to solve landmark detection "in-the-wild".

* International Journal of Computer Vision, 2017 

Constrained Joint Cascade Regression Framework for Simultaneous Facial Action Unit Recognition and Facial Landmark Detection

Sep 23, 2017
Yue Wu, Qiang Ji

The cascade regression framework has been shown to be effective for facial landmark detection. It starts from an initial face shape and gradually predicts the face shape update from local appearance features to generate the facial landmark locations for the next iteration, until convergence. In this paper, we improve upon the cascade regression framework and propose the Constrained Joint Cascade Regression Framework (CJCRF) for simultaneous facial action unit recognition and facial landmark detection, which are two related face analysis tasks but are seldom exploited together. In particular, we first learn the relationships among facial action units and face shapes as a constraint. Then, in the proposed constrained joint cascade regression framework, with the help of the constraint, we iteratively update the facial landmark locations and the action unit activation probabilities until convergence. Experimental results demonstrate that the intertwined relationships of facial action units and face shapes boost the performance of both facial action unit recognition and facial landmark detection. The experimental results also demonstrate the effectiveness of the proposed method compared to state-of-the-art works.
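
A minimal sketch of the generic cascade regression loop that CJCRF builds on, assuming pre-trained per-stage linear regressors `R[t]` and a shape-indexed feature extractor `local_features` (both hypothetical placeholders, not the authors' released code):

```python
import numpy as np

def cascade_regression(image, mean_shape, R, local_features, n_stages=5):
    """Iteratively refine landmark locations from a mean-shape initialization."""
    shape = mean_shape.copy()               # (n_landmarks, 2) initial guess
    for t in range(n_stages):
        phi = local_features(image, shape)  # appearance features at current shape
        delta = R[t] @ phi                  # stage-t regressor predicts the update
        shape = shape + delta.reshape(shape.shape)
    return shape
```

CJCRF additionally interleaves a constrained action-unit probability update into each iteration, which this generic sketch omits.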

* IEEE Conference on Computer Vision and Pattern Recognition, 2016 

Constrained Deep Transfer Feature Learning and its Applications

Sep 23, 2017
Yue Wu, Qiang Ji

Feature learning with deep models has achieved impressive results for both data representation and classification in various vision tasks. Deep feature learning, however, typically requires a large amount of training data, which may not be feasible for some application domains. Transfer learning can alleviate this problem by transferring data from a data-rich source domain to a data-scarce target domain. Existing transfer learning methods typically perform one-shot transfer and often ignore the specific properties that the transferred data must satisfy. To address these issues, we introduce a constrained deep transfer feature learning method that performs transfer learning and feature learning simultaneously, carrying out transfer learning iteratively in a progressively improving feature space to better narrow the gap between the target and source domains for effective transfer of data from the source domain to the target domain. Furthermore, we propose to exploit target domain knowledge and incorporate such prior knowledge as a constraint during transfer learning, ensuring that the transferred data satisfies certain properties of the target domain. To demonstrate the effectiveness of the proposed constrained deep transfer feature learning method, we apply it to thermal feature learning for eye detection by transferring from the visible domain. We also apply the proposed method to cross-view facial expression recognition as a second application. The experimental results demonstrate the effectiveness of the proposed method for both applications.

* IEEE Conference on Computer Vision and Pattern Recognition, 2016 

Robust Facial Landmark Detection under Significant Head Poses and Occlusion

Sep 23, 2017
Yue Wu, Qiang Ji

There have been tremendous improvements in facial landmark detection on general "in-the-wild" images. However, it is still challenging to detect facial landmarks on images with severe occlusion or large head poses (e.g., profile faces). In fact, existing algorithms can usually handle only one of the two. In this work, we propose a unified robust cascade regression framework that handles both. Specifically, the method iteratively predicts the landmark occlusions and the landmark locations. For occlusion estimation, instead of directly predicting binary occlusion vectors, we introduce a supervised regression method that gradually updates the landmark visibility probabilities in each iteration to achieve robustness. In addition, we explicitly add the occlusion pattern as a constraint to improve the performance of occlusion prediction. For landmark detection, we combine the landmark visibility probabilities, the local appearances, and the local shapes to iteratively update the landmark positions. The experimental results show that the proposed method significantly outperforms state-of-the-art works on images with severe occlusion and images with large head poses, and is comparable to other methods on general "in-the-wild" images.

* International Conference on Computer Vision, 2015 

Enhancing Model Interpretability and Accuracy for Disease Progression Prediction via Phenotype-Based Patient Similarity Learning

Sep 26, 2019
Yue Wang, Tong Wu, Yunlong Wang, Gao Wang

Many models have been proposed to extract temporal patterns from longitudinal electronic health records (EHR) for clinical prediction. However, the relations patients commonly share (e.g., receiving the same medical treatments) are rarely considered. In this paper, we propose to learn patient similarity features as phenotypes from the aggregated patient-medical service matrix using non-negative matrix factorization. On real-world medical claim data, we show that the learned phenotypes are coherent within each group, as well as explanatory and indicative of targeted diseases. We conducted experiments to predict the diagnoses of Chronic Lymphocytic Leukemia (CLL) patients. Results show that the phenotype-based similarity features improve prediction over multiple baselines, including logistic regression, random forest, convolutional neural network, and more.
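
A minimal sketch of the phenotype-learning step as described: factorize an aggregated patient-by-medical-service count matrix with NMF, then use the patient loadings as similarity features. The toy matrix and the number of phenotypes are illustrative assumptions:

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
# Toy stand-in for the aggregated patient x medical-service count matrix.
X = rng.poisson(1.0, size=(500, 200)).astype(float)

nmf = NMF(n_components=20, init="nndsvda", max_iter=500, random_state=0)
W = nmf.fit_transform(X)  # (500, 20): per-patient phenotype loadings
H = nmf.components_       # (20, 200): phenotypes over medical services

# W can then be appended to the feature set of a downstream predictive model.
```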

* 12 pages, accepted by Pacific Symposium on Biocomputing (PSB) 2020 

Deep Learning Based Autoencoder for Interference Channel

Feb 18, 2019
Dehao Wu, Maziar Nekovee, Yue Wang

Deep learning (DL) based autoencoders have shown great potential to significantly enhance physical-layer performance. In this paper, we present a DL-based autoencoder for the interference channel. Based on a characterization of a $k$-user Gaussian interference channel, where interference is classified into levels from weak to very strong according to a coupling parameter $\alpha$, a DL neural network (NN) based autoencoder is designed to be trained on the data set and to decode the received signals. The performance of such a DL autoencoder is studied for different interference scenarios, with $\alpha$ known or partially known, where we assume that $\alpha$ is predictable but may vary by up to 10% at the training stage. The results demonstrate that the DL-based approach has a significant capability to mitigate the effects of a poor signal-to-noise ratio (SNR) and a high interference-to-noise ratio (INR). However, the enhancement depends on knowledge of $\alpha$ as well as the interference level. The proposed DL approach performs well with an $\alpha$ offset of up to 10% at the weak interference level. For the strong and very strong interference channels, the offset of $\alpha$ needs to be constrained to less than 5% and 2%, respectively, to maintain performance similar to when $\alpha$ is known.
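
A compact end-to-end channel-autoencoder sketch in PyTorch in the spirit of the setup described: the encoder maps a message to $n$ channel uses under a power constraint, Gaussian interference scaled by $\alpha$ plus noise is added, and the decoder classifies the received signal. Layer sizes, $\alpha$, the SNR, and the Gaussian interference proxy are illustrative assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

M, n = 16, 8              # message set size, channel uses per message
alpha, snr_db = 0.5, 7.0  # coupling parameter, training SNR (assumed)
sigma = (2 * 10 ** (snr_db / 10)) ** -0.5

encoder = nn.Sequential(nn.Linear(M, 32), nn.ReLU(), nn.Linear(32, n))
decoder = nn.Sequential(nn.Linear(n, 32), nn.ReLU(), nn.Linear(32, M))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

for step in range(2000):
    msgs = torch.randint(0, M, (256,))
    x = encoder(nn.functional.one_hot(msgs, M).float())
    x = x / x.norm(dim=1, keepdim=True)         # average power constraint
    interference = alpha * torch.randn_like(x)  # interfering user, Gaussian proxy
    y = x + interference + sigma * torch.randn_like(x)
    loss = nn.functional.cross_entropy(decoder(y), msgs)
    opt.zero_grad(); loss.backward(); opt.step()
```

Training with a mismatched $\alpha$ and evaluating at the true value reproduces the kind of offset study reported above.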

* 6 pages, 10 figures, submitted to IEEE WiOpt 2019. arXiv admin note: text overlap with arXiv:1710.05312, arXiv:1206.0197 by other authors 

Simultaneous Facial Landmark Detection, Pose and Deformation Estimation under Facial Occlusion

Sep 23, 2017
Yue Wu, Chao Gou, Qiang Ji

Facial landmark detection, head pose estimation, and facial deformation analysis are typical facial behavior analysis tasks in computer vision. Existing methods usually perform each task independently and sequentially, ignoring their interactions. To tackle this problem, we propose a unified framework for simultaneous facial landmark detection, head pose estimation, and facial deformation analysis, and the proposed model is robust to facial occlusion. Following a cascade procedure augmented with model-based head pose estimation, we iteratively update the facial landmark locations, facial occlusion, head pose, and facial deformation until convergence. The experimental results on benchmark databases demonstrate the effectiveness of the proposed method for simultaneous facial landmark detection, head pose and facial deformation estimation, even when the images contain facial occlusion.

* IEEE Conference on Computer Vision and Pattern Recognition, 2017 

A Hierarchical Probabilistic Model for Facial Feature Detection

Sep 18, 2017
Yue Wu, Ziheng Wang, Qiang Ji

Facial feature detection from facial images has attracted great attention in the field of computer vision. It is a nontrivial task, since the appearance and shape of the face tend to change under different conditions. In this paper, we propose a hierarchical probabilistic model that can infer the true locations of facial features given the image measurements, even if the face exhibits significant facial expression and pose variation. The hierarchical model implicitly captures the lower-level shape variations of facial components using a mixture model. Furthermore, at the higher level, it learns the joint relationship among facial components, the facial expression, and the pose information through automatic structure learning and parameter estimation of the probabilistic model. Experimental results on benchmark databases demonstrate the effectiveness of the proposed hierarchical probabilistic model.

* IEEE Conference on Computer Vision and Pattern Recognition, 2014 

Facial Feature Tracking under Varying Facial Expressions and Face Poses based on Restricted Boltzmann Machines

Sep 18, 2017
Yue Wu, Zuoguan Wang, Qiang Ji

Facial feature tracking is an active area in computer vision due to its relevance to many applications. It is a nontrivial task, since faces may have varying facial expressions, poses, or occlusions. In this paper, we address this problem by proposing a face shape prior model constructed from Restricted Boltzmann Machines (RBMs) and their variants. Specifically, we first construct a model based on Deep Belief Networks to capture the face shape variations due to varying facial expressions for the near-frontal view. To handle pose variations, the frontal face shape prior model is incorporated into a 3-way RBM model that captures the relationship between frontal and non-frontal face shapes. Finally, we introduce methods to systematically combine the face shape prior models with image measurements of facial feature points. Experiments on benchmark databases show that, with the proposed method, facial feature points can be tracked robustly and accurately even when faces have significant facial expressions and poses.

* IEEE Conference on Computer Vision and Pattern Recognition, 2013 

Deep Matching and Validation Network -- An End-to-End Solution to Constrained Image Splicing Localization and Detection

May 27, 2017
Yue Wu, Wael AbdAlmageed, Prem Natarajan

Image splicing is a very common image manipulation technique that is sometimes used for malicious purposes. A splicing detection and localization algorithm usually takes an input image and produces a binary decision indicating whether the input image has been manipulated, and also a segmentation mask that corresponds to the spliced region. Most existing splicing detection and localization pipelines suffer from two main shortcomings: 1) they use handcrafted features that are not robust against subsequent processing (e.g., compression), and 2) each stage of the pipeline is usually optimized independently. In this paper we extend the formulation of the underlying splicing problem to consider two input images, a query image and a potential donor image. Here the task is to estimate the probability that the donor image has been used to splice the query image, and obtain the splicing masks for both the query and donor images. We introduce a novel deep convolutional neural network architecture, called Deep Matching and Validation Network (DMVN), which simultaneously localizes and detects image splicing. The proposed approach does not depend on handcrafted features and uses raw input images to create deep learned representations. Furthermore, the DMVN is end-to-end optimized to produce the probability estimates and the segmentation masks. Our extensive experiments demonstrate that this approach outperforms state-of-the-art splicing detection methods by a large margin in terms of both AUC score and speed.

* 9 pages, 10 figures 

Machine Learning for Exam Triage

Apr 30, 2018
Xinyu Guan, Jessica Lee, Peter Wu, Yue Wu

In this project, we extend the state-of-the-art CheXNet (Rajpurkar et al., 2017) by making use of the additional non-image features in the dataset. Our model produces better AUROC scores than the original CheXNet.
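
A minimal sketch of one plausible way to realize the extension described: concatenate the non-image features with a DenseNet-121 (CheXNet-style) image embedding before the final classifier. The feature dimensions and module names are illustrative assumptions, not the authors' code:

```python
import torch
import torch.nn as nn
import torchvision

class CheXNetPlusTabular(nn.Module):
    def __init__(self, n_tabular, n_classes=14):
        super().__init__()
        backbone = torchvision.models.densenet121(weights=None)
        n_img = backbone.classifier.in_features  # 1024 for DenseNet-121
        backbone.classifier = nn.Identity()      # expose pooled image features
        self.backbone = backbone
        self.head = nn.Linear(n_img + n_tabular, n_classes)

    def forward(self, image, tabular):
        feats = torch.cat([self.backbone(image), tabular], dim=1)
        return self.head(feats)  # logits; pair with BCEWithLogitsLoss
```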


James-Stein Type Center Pixel Weights for Non-Local Means Image Denoising

Nov 07, 2012
Yue Wu, Brian Tracey, Joseph P. Noonan

Non-Local Means (NLM) and its variants have proven effective and robust in many image denoising tasks. In this letter, we study the parameter selection problem for the center pixel weight (CPW) in NLM. Our key contributions are: 1) we give a novel formulation of the CPW problem from the statistical shrinkage perspective; 2) we introduce James-Stein type CPWs for NLM; and 3) we propose a new adaptive CPW that is locally tuned for each image pixel. Our experimental results show that, compared to existing CPW solutions, the proposed CPWs are more robust and effective under various noise levels. In particular, NLM with the James-Stein type CPWs attains higher means with smaller variances in terms of peak signal-to-noise ratio (PSNR), implying that they improve the robustness of NLM and make it less sensitive to parameter selection.
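
A toy NLM sketch showing where the CPW enters the weighted average; the James-Stein estimator itself is abstracted into a plug-in `cpw` argument here (a simplification, not the paper's exact formula), and the pixel (i, j) is assumed to be far enough from the image border:

```python
import numpy as np

def nlm_pixel(img, i, j, f=3, t=10, h=10.0, cpw=1.0):
    """Denoise pixel (i, j); f = patch half-size, t = search half-window."""
    p0 = img[i - f:i + f + 1, j - f:j + f + 1].astype(float)
    num, den = cpw * img[i, j], cpw  # the center pixel contributes weight `cpw`
    for y in range(i - t, i + t + 1):
        for x in range(j - t, j + t + 1):
            if (y, x) == (i, j):
                continue
            p = img[y - f:y + f + 1, x - f:x + f + 1].astype(float)
            w = np.exp(-np.sum((p0 - p) ** 2) / (h * h))  # patch similarity
            num, den = num + w * img[y, x], den + w
    return num / den
```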


A New Randomness Evaluation Method with Applications to Image Shuffling and Encryption

Nov 07, 2012
Yue Wu, Sos Agaian, Joseph P. Noonan

This letter discusses the problem of testing the degree of randomness within an image, particularly for a shuffled or encrypted image. Its key contributions are: 1) a mathematical model of perfectly shuffled images; 2) the derivation of the theoretical distribution of pixel differences; 3) a new $Z$-test based approach to differentiate whether or not a test image is perfectly shuffled; and 4) a randomized algorithm to unbiasedly evaluate the degree of randomness within a given image. Simulation results show that the proposed method is robust and effective in evaluating the degree of randomness within an image, and may often be more suitable for image applications than commonly used testing schemes designed for binary data, such as NIST 800-22. The developed method may also be useful as a first step in determining whether or not a shuffling or encryption scheme is suitable for a particular cryptographic application.
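
A simplified instance of the idea: under a perfect shuffle, horizontally adjacent pixels are exchangeable, so positive and negative neighbor differences should be equally likely, giving an approximately standard-normal sign-balance statistic. This is a stand-in for the paper's exact difference-distribution test (an assumption for illustration):

```python
import numpy as np

def shuffle_z_score(img):
    d = np.diff(img.astype(int), axis=1).ravel()  # horizontal pixel differences
    d = d[d != 0]                                 # drop ties, test sign balance
    n_pos, n = (d > 0).sum(), d.size
    return (n_pos - n / 2) / np.sqrt(n / 4)       # ~ N(0, 1) if perfectly shuffled

rng = np.random.default_rng(0)
natural = np.tile(np.arange(256, dtype=np.uint8), (256, 1))  # smooth gradient
shuffled = rng.permutation(natural.ravel()).reshape(256, 256)
print(abs(shuffle_z_score(natural)), abs(shuffle_z_score(shuffled)))  # large vs. small
```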


Unified Adversarial Invariance

May 07, 2019
Ayush Jaiswal, Yue Wu, Wael AbdAlmageed, Premkumar Natarajan

We present a unified invariance framework for supervised neural networks that can induce independence to nuisance factors of data without using any nuisance annotations, but can additionally use labeled information about biasing factors to force their removal from the latent embedding for making fair predictions. Invariance to nuisance is achieved by learning a split representation of data through competitive training between the prediction task and a reconstruction task coupled with disentanglement, whereas that to biasing factors is brought about by penalizing the network if the latent embedding contains any information about them. We describe an adversarial instantiation of this framework and provide analysis of its working. Our model outperforms previous works at inducing invariance to nuisance factors without using any labeled information about such variables, and achieves state-of-the-art performance at learning independence to biasing factors in fairness settings.

* In Submission to T-PAMI. arXiv admin note: substantial text overlap with arXiv:1809.10083 

Spatially Constrained Generative Adversarial Networks for Conditional Image Generation

May 07, 2019
Songyao Jiang, Hongfu Liu, Yue Wu, Yun Fu

Image generation has attracted tremendous attention in both academia and industry, especially conditional and target-oriented image generation, such as criminal portraits and fashion design. Although current studies have achieved preliminary results along this direction, they typically use only class labels as the condition, so spatial content is generated at random from latent vectors. Edge details are usually blurred, since spatial information is difficult to preserve. In light of this, we propose a novel Spatially Constrained Generative Adversarial Network (SCGAN), which decouples the spatial constraints from the latent vector and makes these constraints feasible as additional controllable signals. To enhance the spatial controllability, a generator network is specially designed to take a semantic segmentation, a latent vector, and an attribute-level label as inputs step by step. Besides, a segmentor network is constructed to impose spatial constraints on the generator. Experimentally, we provide both visual and quantitative results on the CelebA and DeepFashion datasets, and demonstrate that the proposed SCGAN is very effective in controlling the spatial content as well as generating high-quality images.
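
A minimal sketch of a generator that consumes the three inputs described (segmentation, latent vector, attribute label); all layer shapes are illustrative assumptions rather than the SCGAN architecture:

```python
import torch
import torch.nn as nn

class SpatiallyConditionedGenerator(nn.Module):
    def __init__(self, z_dim=128, n_attrs=10, seg_ch=1):
        super().__init__()
        self.seg_enc = nn.Sequential(  # encode the spatial constraint
            nn.Conv2d(seg_ch, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU())
        self.fc = nn.Linear(z_dim + n_attrs, 64 * 16 * 16)
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh())

    def forward(self, seg, z, attrs):               # seg: (B, 1, 64, 64)
        s = self.seg_enc(seg)                       # (B, 64, 16, 16) spatial code
        za = self.fc(torch.cat([z, attrs], dim=1)).view(-1, 64, 16, 16)
        return self.dec(torch.cat([s, za], dim=1))  # (B, 3, 64, 64) image
```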


Bidirectional Conditional Generative Adversarial Networks

Nov 03, 2018
Ayush Jaiswal, Wael AbdAlmageed, Yue Wu, Premkumar Natarajan

Conditional Generative Adversarial Networks (cGANs) are generative models that can produce data samples ($x$) conditioned on both latent variables ($z$) and known auxiliary information ($c$). We propose the Bidirectional cGAN (BiCoGAN), which effectively disentangles $z$ and $c$ in the generation process and provides an encoder that learns inverse mappings from $x$ to both $z$ and $c$, trained jointly with the generator and the discriminator. We present crucial techniques for training BiCoGANs, which involve an extrinsic factor loss along with an associated dynamically-tuned importance weight. As compared to other encoder-based cGANs, BiCoGANs encode $c$ more accurately, and utilize $z$ and $c$ more effectively and in a more disentangled way to generate samples.

* To appear in Proceedings of ACCV 2018 

CapsuleGAN: Generative Adversarial Capsule Network

Oct 02, 2018
Ayush Jaiswal, Wael AbdAlmageed, Yue Wu, Premkumar Natarajan

We present Generative Adversarial Capsule Network (CapsuleGAN), a framework that uses capsule networks (CapsNets) instead of the standard convolutional neural networks (CNNs) as discriminators within the generative adversarial network (GAN) setting, while modeling image data. We provide guidelines for designing CapsNet discriminators and the updated GAN objective function, which incorporates the CapsNet margin loss, for training CapsuleGAN models. We show that CapsuleGAN outperforms convolutional-GAN at modeling image data distribution on MNIST and CIFAR-10 datasets, evaluated on the generative adversarial metric and at semi-supervised image classification.
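
For reference, the CapsNet margin loss that the updated GAN objective incorporates can be written in a few lines; the margins and down-weighting factor below are the common CapsNet defaults, an assumption rather than the paper's stated values:

```python
import torch

def capsule_margin_loss(v_norm, target, m_plus=0.9, m_minus=0.1, lam=0.5):
    """v_norm: (batch, n_classes) capsule lengths; target: one-hot labels."""
    pos = target * torch.clamp(m_plus - v_norm, min=0) ** 2
    neg = lam * (1 - target) * torch.clamp(v_norm - m_minus, min=0) ** 2
    return (pos + neg).sum(dim=1).mean()
```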

* To appear in Proceedings of ECCV Workshop on Brain Driven Computer Vision (BDCV) 2018 

Unsupervised Adversarial Invariance

Sep 26, 2018
Ayush Jaiswal, Yue Wu, Wael AbdAlmageed, Premkumar Natarajan

Data representations that contain all the information about target variables but are invariant to nuisance factors benefit supervised learning algorithms by preventing them from learning associations between these factors and the targets, thus reducing overfitting. We present a novel unsupervised invariance induction framework for neural networks that learns a split representation of data through competitive training between the prediction task and a reconstruction task coupled with disentanglement, without needing any labeled information about nuisance factors or domain knowledge. We describe an adversarial instantiation of this framework and provide analysis of its working. Our unsupervised model outperforms state-of-the-art methods, which are supervised, at inducing invariance to inherent nuisance factors, effectively using synthetic data augmentation to learn invariance, and domain adaptation. Our method can be applied to any prediction task, e.g., binary/multi-class classification or regression, without loss of generality.
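
A compact sketch of the split-representation idea: e1 feeds the predictor, the decoder must reconstruct x from a noisy version of e1 together with e2, and two disentanglers penalize mutual predictability between the partitions (trained adversarially in alternation, which is omitted here). All module sizes and loss weights are illustrative assumptions:

```python
import torch
import torch.nn as nn

d_in, d1, d2, n_classes = 784, 64, 64, 10
enc1, enc2 = nn.Linear(d_in, d1), nn.Linear(d_in, d2)
predictor = nn.Linear(d1, n_classes)
decoder = nn.Linear(d1 + d2, d_in)
dis1, dis2 = nn.Linear(d1, d2), nn.Linear(d2, d1)  # adversarial disentanglers
drop = nn.Dropout(0.5)                             # noisy channel on e1

def main_model_loss(x, y):
    e1, e2 = enc1(x), enc2(x)
    pred = nn.functional.cross_entropy(predictor(e1), y)
    recon = nn.functional.mse_loss(decoder(torch.cat([drop(e1), e2], dim=1)), x)
    # The main model is rewarded when the disentanglers fail to cross-predict.
    adv = -(nn.functional.mse_loss(dis1(e1), e2.detach())
            + nn.functional.mse_loss(dis2(e2), e1.detach()))
    return pred + recon + 0.1 * adv
```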

* To appear in Proceedings of NIPS 2018 

Deep Multimodal Image-Repurposing Detection

Aug 20, 2018
Ekraam Sabir, Wael AbdAlmageed, Yue Wu, Prem Natarajan

Nefarious actors on social media and other platforms often spread rumors and falsehoods through images whose metadata (e.g., captions) have been modified to provide visual substantiation of the rumor/falsehood. This type of modification is referred to as image repurposing, in which often an unmanipulated image is published along with incorrect or manipulated metadata to serve the actor's ulterior motives. We present the Multimodal Entity Image Repurposing (MEIR) dataset, a substantially challenging dataset over that which has been previously available to support research into image repurposing detection. The new dataset includes location, person, and organization manipulations on real-world data sourced from Flickr. We also present a novel, end-to-end, deep multimodal learning model for assessing the integrity of an image by combining information extracted from the image with related information from a knowledge base. The proposed method is compared against state-of-the-art techniques on existing datasets as well as MEIR, where it outperforms existing methods across the board, with AUC improvement up to 0.23.

* To be published at ACM Multimedia 2018 (orals) 

Seismic-Net: A Deep Densely Connected Neural Network to Detect Seismic Events

Jan 17, 2018
Yue Wu, Youzuo Lin, Zheng Zhou, Andrew Delorey

One of the risks of large-scale geologic carbon sequestration is the potential migration of fluids out of the storage formations. Accurate and fast detection of this fluid migration is not only important but also challenging, due to the large subsurface uncertainty and complex governing physics. Traditional leakage detection and monitoring techniques rely on geophysical observations, including seismic. However, the accuracy of these methods is limited because they provide only indirect information that requires expert interpretation, yielding inaccurate estimates of leakage rates and locations. In this work, we develop a novel machine-learning detection package, named "Seismic-Net", which is based on a deep densely connected neural network. To validate the performance of our proposed leakage detection method, we apply it to a natural analog site at Chimayó, New Mexico. The seismic events in the data sets are generated by the eruptions of geysers, which are driven by the leakage of $\mathrm{CO}_\mathrm{2}$. In particular, we demonstrate the efficacy of our Seismic-Net by formulating the detection problem as an event detection problem over time series data. A fixed-length window is slid across the time series, and we build a deep densely connected network to classify each window and determine whether it contains a geyser event. Through our numerical tests, we show that our model achieves precision/recall as high as 0.889/0.923. Therefore, our Seismic-Net has great potential for the detection of $\mathrm{CO}_\mathrm{2}$ leakage.
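
A minimal sketch of the sliding-window detection loop described above; `model` (any classifier with a scikit-learn-style `predict_proba`), the window length, the stride, and the threshold are all assumptions for illustration:

```python
import numpy as np

def detect_events(trace, model, win=1024, stride=256, threshold=0.5):
    """Return start indices of windows the classifier flags as geyser events."""
    hits = []
    for start in range(0, len(trace) - win + 1, stride):
        window = trace[start:start + win]
        window = (window - window.mean()) / (window.std() + 1e-8)  # normalize
        if model.predict_proba(window[None, :])[0, 1] > threshold:
            hits.append(start)
    return hits
```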

