Models, code, and papers for "Zhan Li":

Unsupervised Learning of Monocular Depth Estimation and Visual Odometry with Deep Feature Reconstruction

Apr 05, 2018
Huangying Zhan, Ravi Garg, Chamara Saroj Weerasekera, Kejie Li, Harsh Agarwal, Ian Reid

Despite learning based methods showing promising results in single view depth estimation and visual odometry, most existing approaches treat the tasks in a supervised manner. Recent approaches to single view depth estimation explore the possibility of learning without full supervision via minimizing photometric error. In this paper, we explore the use of stereo sequences for learning depth and visual odometry. The use of stereo sequences enables the use of both spatial (between left-right pairs) and temporal (forward backward) photometric warp error, and constrains the scene depth and camera motion to be in a common, real-world scale. At test time our framework is able to estimate single view depth and two-view odometry from a monocular sequence. We also show how we can improve on a standard photometric warp loss by considering a warp of deep features. We show through extensive experiments that: (i) jointly training for single view depth and visual odometry improves depth prediction because of the additional constraint imposed on depths and achieves competitive results for visual odometry; (ii) deep feature-based warping loss improves upon simple photometric warp loss for both single view depth estimation and visual odometry. Our method outperforms existing learning based methods on the KITTI driving dataset in both tasks. The source code is available at https://github.com/Huangying-Zhan/Depth-VO-Feat

* 8 pages, 6 figures, CVPR 2018 

  Click for Model/Code and Paper
Contour Detection in Cassini ISS images based on Hierarchical Extreme Learning Machine and Dense Conditional Random Field

Aug 22, 2019
Xiqi Yang, Qingfeng Zhang, Zhan Li

In Cassini ISS (Imaging Science Subsystem) images, contour detection is often performed on disk-resolved object to accurately locate their center. Thus, the contour detection is a key problem. Traditional edge detection methods, such as Canny and Roberts, often extract the contour with too much interior details and noise. Although the deep convolutional neural network has been applied successfully in many image tasks, such as classification and object detection, it needs more time and computer resources. In the paper, a contour detection algorithm based on H-ELM (Hierarchical Extreme Learning Machine) and DenseCRF (Dense Conditional Random Field) is proposed for Cassini ISS images. The experimental results show that this algorithm's performance is better than both traditional machine learning methods such as SVM, ELM and even deep convolutional neural network. And the extracted contour is closer to the actual contour. Moreover, it can be trained and tested quickly on the general configuration of PC, so can be applied to contour detection for Cassini ISS images.


  Click for Model/Code and Paper
Color Image Enhancement Method Based on Weighted Image Guided Filtering

Dec 24, 2018
Qi Mu, Yanyan Wei, Zhanli Li

A novel color image enhancement method is proposed based on Retinex to enhance color images under non-uniform illumination or poor visibility conditions. Different from the conventional Retinex algorithms, the Weighted Guided Image Filter is used as a surround function instead of the Gaussian filter to estimate the background illumination, which can overcome the drawbacks of local blur and halo artifact that may appear by Gaussian filter. To avoid color distortion, the image is converted to the HSI color model, and only the intensity channel is enhanced. Then a linear color restoration algorithm is adopted to convert the enhanced intensity image back to the RGB color model, which ensures the hue is constant and undistorted. Experimental results show that the proposed method is effective to enhance both color and gray images with low exposure and non-uniform illumination, resulting in better visual quality than traditional method. At the same time, the objective evaluation indicators are also superior to the conventional methods. In addition, the efficiency of the proposed method is also improved thanks to the linear color restoration algorithm.

* 15 pages 

  Click for Model/Code and Paper
Generic Vehicle Tracking Framework Capable of Handling Occlusions Based on Modified Mixture Particle Filter

Sep 20, 2018
Jiachen Li, Wei Zhan, Masayoshi Tomizuka

Accurate and robust tracking of surrounding road participants plays an important role in autonomous driving. However, there is usually no prior knowledge of the number of tracking targets due to object emergence, object disappearance and false alarms. To overcome this challenge, we propose a generic vehicle tracking framework based on modified mixture particle filter, which can make the number of tracking targets adaptive to real-time observations and track all the vehicles within sensor range simultaneously in a uniform architecture without explicit data association. Each object corresponds to a mixture component whose distribution is non-parametric and approximated by particle hypotheses. Most tracking approaches employ vehicle kinematic models as the prediction model. However, it is hard for these models to make proper predictions when sensor measurements are lost or become low quality due to partial or complete occlusions. Moreover, these models are incapable of forecasting sudden maneuvers. To address these problems, we propose to incorporate learning-based behavioral models instead of pure vehicle kinematic models to realize prediction in the prior update of recursive Bayesian state estimation. Two typical driving scenarios including lane keeping and lane change are demonstrated to verify the effectiveness and accuracy of the proposed framework as well as the advantages of employing learning-based models.

* Presented in 2018 IEEE Intelligent Vehicles Symposium (IV) 

  Click for Model/Code and Paper
An Evasion and Counter-Evasion Study in Malicious Websites Detection

Aug 08, 2014
Li Xu, Zhenxin Zhan, Shouhuai Xu, Keyin Ye

Malicious websites are a major cyber attack vector, and effective detection of them is an important cyber defense task. The main defense paradigm in this regard is that the defender uses some kind of machine learning algorithms to train a detection model, which is then used to classify websites in question. Unlike other settings, the following issue is inherent to the problem of malicious websites detection: the attacker essentially has access to the same data that the defender uses to train its detection models. This 'symmetry' can be exploited by the attacker, at least in principle, to evade the defender's detection models. In this paper, we present a framework for characterizing the evasion and counter-evasion interactions between the attacker and the defender, where the attacker attempts to evade the defender's detection models by taking advantage of this symmetry. Within this framework, we show that an adaptive attacker can make malicious websites evade powerful detection models, but proactive training can be an effective counter-evasion defense mechanism. The framework is geared toward the popular detection model of decision tree, but can be adapted to accommodate other classifiers.


  Click for Model/Code and Paper
Generic Tracking and Probabilistic Prediction Framework and Its Application in Autonomous Driving

Aug 23, 2019
Jiachen Li, Wei Zhan, Yeping Hu, Masayoshi Tomizuka

Accurately tracking and predicting behaviors of surrounding objects are key prerequisites for intelligent systems such as autonomous vehicles to achieve safe and high-quality decision making and motion planning. However, there still remain challenges for multi-target tracking due to object number fluctuation and occlusion. To overcome these challenges, we propose a constrained mixture sequential Monte Carlo (CMSMC) method in which a mixture representation is incorporated in the estimated posterior distribution to maintain multi-modality. Multiple targets can be tracked simultaneously within a unified framework without explicit data association between observations and tracking targets. The framework can incorporate an arbitrary prediction model as the implicit proposal distribution of the CMSMC method. An example in this paper is a learning-based model for hierarchical time-series prediction, which consists of a behavior recognition module and a state evolution module. Both modules in the proposed model are generic and flexible so as to be applied to a class of time-series prediction problems where behaviors can be separated into different levels. Finally, the proposed framework is applied to a numerical case study as well as a task of on-road vehicle tracking, behavior recognition, and prediction in highway scenarios. Instead of only focusing on forecasting trajectory of a single entity, we jointly predict continuous motions for interactive entities simultaneously. The proposed approaches are evaluated from multiple aspects, which demonstrate great potential for intelligent vehicular systems and traffic surveillance systems.

* IEEE Transactions on Intelligent Transportation Systems 

  Click for Model/Code and Paper
Coordination and Trajectory Prediction for Vehicle Interactions via Bayesian Generative Modeling

May 02, 2019
Jiachen Li, Hengbo Ma, Wei Zhan, Masayoshi Tomizuka

Coordination recognition and subtle pattern prediction of future trajectories play a significant role when modeling interactive behaviors of multiple agents. Due to the essential property of uncertainty in the future evolution, deterministic predictors are not sufficiently safe and robust. In order to tackle the task of probabilistic prediction for multiple, interactive entities, we propose a coordination and trajectory prediction system (CTPS), which has a hierarchical structure including a macro-level coordination recognition module and a micro-level subtle pattern prediction module which solves a probabilistic generation task. We illustrate two types of representation of the coordination variable: categorized and real-valued, and compare their effects and advantages based on empirical studies. We also bring the ideas of Bayesian deep learning into deep generative models to generate diversified prediction hypotheses. The proposed system is tested on multiple driving datasets in various traffic scenarios, which achieves better performance than baseline approaches in terms of a set of evaluation metrics. The results also show that using categorized coordination can better capture multi-modality and generate more diversified samples than the real-valued coordination, while the latter can generate prediction hypotheses with smaller errors with a sacrifice of sample diversity. Moreover, employing neural networks with weight uncertainty is able to generate samples with larger variance and diversity.

* Accepted by 2019 IEEE Intelligent Vehicles Symposium (IV) 

  Click for Model/Code and Paper
Generic Probabilistic Interactive Situation Recognition and Prediction: From Virtual to Real

Sep 09, 2018
Jiachen Li, Hengbo Ma, Wei Zhan, Masayoshi Tomizuka

Accurate and robust recognition and prediction of traffic situation plays an important role in autonomous driving, which is a prerequisite for risk assessment and effective decision making. Although there exist a lot of works dealing with modeling driver behavior of a single object, it remains a challenge to make predictions for multiple highly interactive agents that react to each other simultaneously. In this work, we propose a generic probabilistic hierarchical recognition and prediction framework which employs a two-layer Hidden Markov Model (TLHMM) to obtain the distribution of potential situations and a learning-based dynamic scene evolution model to sample a group of future trajectories. Instead of predicting motions of a single entity, we propose to get the joint distribution by modeling multiple interactive agents as a whole system. Moreover, due to the decoupling property of the layered structure, our model is suitable for knowledge transfer from simulation to real world applications as well as among different traffic scenarios, which can reduce the computational efforts of training and the demand for a large data amount. A case study of highway ramp merging scenario is demonstrated to verify the effectiveness and accuracy of the proposed framework.

* Accepted by The 21st IEEE International Conference on Intelligent Transportation Systems (2018 IEEE ITSC) 

  Click for Model/Code and Paper
Listen, Attend, Spell and Adapt: Speaker Adapted Sequence-to-Sequence ASR

Jul 08, 2019
Felix Weninger, Jesús Andrés-Ferrer, Xinwei Li, Puming Zhan

Sequence-to-sequence (seq2seq) based ASR systems have shown state-of-the-art performances while having clear advantages in terms of simplicity. However, comparisons are mostly done on speaker independent (SI) ASR systems, though speaker adapted conventional systems are commonly used in practice for improving robustness to speaker and environment variations. In this paper, we apply speaker adaptation to seq2seq models with the goal of matching the performance of conventional ASR adaptation. Specifically, we investigate Kullback-Leibler divergence (KLD) as well as Linear Hidden Network (LHN) based adaptation for seq2seq ASR, using different amounts (up to 20 hours) of adaptation data per speaker. Our SI models are trained on large amounts of dictation data and achieve state-of-the-art results. We obtained 25% relative word error rate (WER) improvement with KLD adaptation of the seq2seq model vs. 18.7% gain from acoustic model adaptation in the conventional system. We also show that the WER of the seq2seq model decreases log-linearly with the amount of adaptation data. Finally, we analyze adaptation based on the minimum WER criterion and adapting the language model (LM) for score fusion with the speaker adapted seq2seq model, which result in further improvements of the seq2seq system performance.

* To appear in INTERSPEECH 2019 

  Click for Model/Code and Paper
Identifying Emotion from Natural Walking

Sep 10, 2015
Liqing Cui, Shun Li, Wan Zhang, Zhan Zhang, Tingshao Zhu

Emotion identification from gait aims to automatically determine persons affective state, it has attracted a great deal of interests and offered immense potential value in action tendency, health care, psychological detection and human-computer(robot) interaction.In this paper, we propose a new method of identifying emotion from natural walking, and analyze the relevance between the traits of walking and affective states. After obtaining the pure acceleration data of wrist and ankle, we set a moving average filter window with different sizes w, then extract 114 features including time-domain, frequency-domain, power and distribution features from each data slice, and run principal component analysis (PCA) to reduce dimension. In experiments, we train SVM, Decision Tree, multilayerperception, Random Tree and Random Forest classification models, and compare the classification accuracy on data of wrist and ankle with respect to different w. The performance of emotion identification on acceleration data of ankle is better than wrist.Comparing different classification models' results, SVM has best accuracy of identifying anger and happy could achieve 90:31% and 89:76% respectively, and identification ratio of anger-happy is 87:10%.The anger-neutral-happy classification reaches 85%-78%-78%.The results show that it is capable of identifying personal emotional states through the gait of walking.


  Click for Model/Code and Paper
Unsupervised Multi-stream Highlight detection for the Game "Honor of Kings"

Oct 14, 2019
Li Wang, Zixun Sun, Wentao Yao, Hui Zhan, Chengwei Zhu

With the increasing popularity of E-sport live, Highlight Flashback has been a critical functionality of live platforms, which aggregates the overall exciting fighting scenes in a few seconds. In this paper, we introduce a novel training strategy without any additional annotation to automatically generate highlights for game video live. Considering that the existing manual edited clips contain more highlights than long game live videos, we perform pair-wise ranking constraints across clips from edited and long live videos. A multi-stream framework is also proposed to fuse spatial, temporal as well as audio features extracted from videos. To evaluate our method, we test on long game live videos with an average length of about 15 minutes. Extensive experimental results on videos demonstrate its satisfying performance on highlights generation and effectiveness by the fusion of three streams.


  Click for Model/Code and Paper
ACFM: A Dynamic Spatial-Temporal Network for Traffic Prediction

Sep 02, 2019
Lingbo Liu, Jiajie Zhen, Guanbin Li, Geng Zhan, Liang Lin

As a crucial component in intelligent transportation systems, crowd flow prediction has recently attracted widespread research interest in the field of artificial intelligence (AI) with the increasing availability of large-scale traffic mobility data. Its key challenge lies in how to integrate diverse factors (such as temporal laws and spatial dependencies) to infer the evolution trend of crowd flow. To address this problem, we propose a unified neural network called Attentive Crowd Flow Machine (ACFM), which can effectively learn the spatial-temporal feature representations of crowd flow with an attention mechanism. In particular, our ACFM is composed of two progressive ConvLSTM units connected with a convolutional layer. Specifically, the first LSTM unit takes normal crowd flow features as input and generates a hidden state at each time-step, which is further fed into the connected convolutional layer for spatial attention map inference. The second LSTM unit aims at learning the dynamic spatial-temporal representations from the attentionally weighted crowd flow features. Further, we develop two deep frameworks based on ACFM to predict citywide short-term/long-term crowd flow by adaptively incorporating the sequential and periodic data as well as other external influences. Extensive experiments on two standard benchmarks well demonstrate the superiority of the proposed method for crowd flow prediction. Moreover, to verify the generalization of our method, we also apply the customized framework to forecast the passenger pickup/dropoff demands and show its superior performance in this traffic prediction task.

* arXiv admin note: substantial text overlap with arXiv:1809.00101 

  Click for Model/Code and Paper
Towards a Fatality-Aware Benchmark of Probabilistic Reaction Prediction in Highly Interactive Driving Scenarios

Sep 10, 2018
Wei Zhan, Liting Sun, Yeping Hu, Jiachen Li, Masayoshi Tomizuka

Autonomous vehicles should be able to generate accurate probabilistic predictions for uncertain behavior of other road users. Moreover, reactive predictions are necessary in highly interactive driving scenarios to answer "what if I take this action in the future" for autonomous vehicles. There is no existing unified framework to homogenize the problem formulation, representation simplification, and evaluation metric for various prediction methods, such as probabilistic graphical models (PGM), neural networks (NN) and inverse reinforcement learning (IRL). In this paper, we formulate a probabilistic reaction prediction problem, and reveal the relationship between reaction and situation prediction problems. We employ prototype trajectories with designated motion patterns other than "intention" to homogenize the representation so that probabilities corresponding to each trajectory generated by different methods can be evaluated. We also discuss the reasons why "intention" is not suitable to serve as a motion indicator in highly interactive scenarios. We propose to use Brier score as the baseline metric for evaluation. In order to reveal the fatality of the consequences when the predictions are adopted by decision-making and planning, we propose a fatality-aware metric, which is a weighted Brier score based on the criticality of the trajectory pairs of the interacting entities. Conservatism and non-defensiveness are defined from the weighted Brier score to indicate the consequences caused by inaccurate predictions. Modified methods based on PGM, NN and IRL are provided to generate probabilistic reaction predictions in an exemplar scenario of nudging from a highway ramp. The results are evaluated by the baseline and proposed metrics to construct a mini benchmark. Analysis on the properties of each method is also provided by comparing the baseline and proposed metric scores.

* 2018 IEEE 21st International Conference on Intelligent Transportation Systems (ITSC) 

  Click for Model/Code and Paper
A Co-Prime Blur Scheme for Data Security in Video Surveillance

Mar 22, 2012
Christopher Thorpe, Feng Li, Zijia Li, Zhan Yu, David Saunders, Jingyi Yu

This paper presents a novel Coprime Blurred Pair (CBP) model for visual data-hiding for security in camera surveillance. While most previous approaches have focused on completely encrypting the video stream, we introduce a spatial encryption scheme by blurring the image/video contents to create a CBP. Our goal is to obscure detail in public video streams by blurring while allowing behavior to be recognized and to quickly deblur the stream so that details are available if behavior is recognized as suspicious. We create a CBP by blurring the same latent image with two unknown kernels. The two kernels are coprime when mapped to bivariate polynomials in the z domain. To deblur the CBP we first use the coprime constraint to approximate the kernels and sample the bivariate CBP polynomials in one dimension on the unit circle. At each sample point, we factor the 1D polynomial pair and compose the results into a 2D kernel matrix. Finally, we compute the inverse Fast Fourier Transform (FFT) of the kernel matrices to recover the coprime kernels and then the latent video stream. It is therefore only possible to deblur the video stream if a user has access to both streams. To improve the practicability of our algorithm, we implement our algorithm using a graphics processing unit (GPU) to decrypt the blurred video streams in real-time, and extensive experimental results demonstrate that our new scheme can effectively protect sensitive identity information in surveillance videos and faithfully reconstruct the unblurred video stream when two blurred sequences are available.


  Click for Model/Code and Paper
Directed-Weighting Group Lasso for Eltwise Blocked CNN Pruning

Oct 21, 2019
Ke Zhan, Shimiao Jiang, Yu Bai, Yi Li, Xu Liu, Zhuoran Xu

Eltwise layer is a commonly used structure in the multi-branch deep learning network. In a filter-wise pruning procedure, due to the specific operation of the eltwise layer, all its previous convolutional layers should vote for which filters by index should be pruned. Since only an intersection of the voted filters is pruned, the compression rate is limited. This work proposes a method called Directed-Weighting Group Lasso (DWGL), which enforces an index-wise incremental (directed) coefficient on the filterlevel group lasso items, so that the low index filters getting high activation tend to be kept while the high index ones tend to be pruned. When using DWGL, much fewer filters are retained during the voting process and the compression rate can be boosted. The paper test the proposed method on the ResNet series networks. On CIFAR-10, it achieved a 75.34% compression rate on ResNet-56 with a 0.94% error increment, and a 52.06% compression rate on ResNet-20 with a 0.72% error increment. On ImageNet, it achieved a 53% compression rate with ResNet-50 with a 0.6% error increment, speeding up the network by 2.23 times. Furthermore, it achieved a 75% compression rate on ResNet-50 with a 1.2% error increment, speeding up the network by 4 times.

* Proceedings of the British Machine Vision Conference (BMVC), 2019 

  Click for Model/Code and Paper
Extended Local Binary Patterns for Efficient and Robust Spontaneous Facial Micro-Expression Recognition

Jul 22, 2019
Chengyu Guo, Jingyun Liang, Geng Zhan, Zhong Liu, Matti Pietikäinen, Li Liu

Facial MicroExpressions (MEs) are spontaneous, involuntary facial movements when a person experiences an emotion but deliberately or unconsciously attempts to conceal his or her genuine emotions. Recently, ME recognition has attracted increasing attention due to its potential applications such as clinical diagnosis, business negotiation, interrogations and security. However, it is expensive to build large scale ME datasets, mainly due to the difficulty of naturally inducing spontaneous MEs. This limits the application of deep learning techniques which require lots of training data. In this paper, we propose a simple, efficient yet robust descriptor called Extended Local Binary Patterns on Three Orthogonal Planes (ELBPTOP) for ME recognition. ELBPTOP consists of three complementary binary descriptors: LBPTOP and two novel ones Radial Difference LBPTOP (RDLBPTOP) and Angular Difference LBPTOP (ADLBPTOP), which explore the local second order information along radial and angular directions contained in ME video sequences. ELBPTOP is a novel ME descriptor inspired by the unique and subtle facial movements. It is computationally efficient and only marginally increases the cost of computing LBPTOP, yet is extremely effective for ME recognition. In addition, by firstly introducing Whitened Principal Component Analysis (WPCA) to ME recognition, we can further obtain more compact and discriminative feature representations, and achieve significantly computational savings. Extensive experimental evaluation on three popular spontaneous ME datasets SMIC, CASMEII and SAMM show that our proposed ELBPTOP approach significantly outperforms previous state of the art on all three evaluated datasets. Our proposed ELBPTOP achieves 73.94% on CASMEII, which is 6.6% higher than state of the art on this dataset. More impressively, ELBPTOP increases recognition accuracy from 44.7% to 63.44% on the SAMM dataset.


  Click for Model/Code and Paper
Dual Residual Network for Accurate Human Activity Recognition

Mar 13, 2019
Jun Long, WuQing Sun, Zhan Yang, Osolo Ian Raymond, Bin Li

Human Activity Recognition (HAR) using deep neural network has become a hot topic in human-computer interaction. Machine can effectively identify human naturalistic activities by learning from a large collection of sensor data. Activity recognition is not only an interesting research problem, but also has many real-world practical applications. Based on the success of residual networks in achieving a high level of aesthetic representation of the automatic learning, we propose a novel \textbf{D}ual \textbf{R}esidual \textbf{N}etwork, named DRN. DRN is implemented using two identical path frameworks consisting of (1) a short time window, which is used to capture spatial features, and (2) a long time window, which is used to capture fine temporal features. The long time window path can be made very lightweight by reducing its channel capacity, yet still being able to learn useful temporal representations for activity recognition. In this paper, we mainly focus on proposing a new model to improve the accuracy of HAR. In order to demonstrate the effectiveness of DRN model, we carried out extensive experiments and compared with conventional recognition methods (HC, CBH, CBS) and learning-based methods (AE, MLP, CNN, LSTM, Hybrid, ResNet). The benchmark datasets (OPPORTUNITY, UniMiB-SHAR) were adopted by our experiments. Results from our experiments show that our model is effective in recognizing human activities via wearable datasets. We discuss the influence of networks parameters on performance to provide insights about its optimization.

* submitted to Information 

  Click for Model/Code and Paper
Efficient Online Hyperparameter Optimization for Kernel Ridge Regression with Applications to Traffic Time Series Prediction

Nov 01, 2018
Hongyuan Zhan, Gabriel Gomes, Xiaoye S. Li, Kamesh Madduri, Kesheng Wu

Computational efficiency is an important consideration for deploying machine learning models for time series prediction in an online setting. Machine learning algorithms adjust model parameters automatically based on the data, but often require users to set additional parameters, known as hyperparameters. Hyperparameters can significantly impact prediction accuracy. Traffic measurements, typically collected online by sensors, are serially correlated. Moreover, the data distribution may change gradually. A typical adaptation strategy is periodically re-tuning the model hyperparameters, at the cost of computational burden. In this work, we present an efficient and principled online hyperparameter optimization algorithm for Kernel Ridge regression applied to traffic prediction problems. In tests with real traffic measurement data, our approach requires as little as one-seventh of the computation time of other tuning methods, while achieving better or similar prediction accuracy.

* H. Zhan, G. Gomes, X. S. Li, K. Madduri, and K. Wu. Efficient Online Hyperparameter Learning for Traffic Flow Prediction. In 2018 IEEE 21th International Conference on Intelligent Transportation Systems (ITSC), pages 1-6. IEEE, 2018 
* An extended version of "Efficient Online Hyperparameter Learning for Traffic Flow Prediction" published in The 21st IEEE International Conference on Intelligent Transportation Systems (ITSC 2018) 

  Click for Model/Code and Paper
Sleep-deprived Fatigue Pattern Analysis using Large-Scale Selfies from Social Med

Feb 22, 2018
Xuefeng Peng, Jiebo Luo, Catherine Glenn, Li-Kai Chi, Jingyao Zhan

The complexities of fatigue have drawn much attention from researchers across various disciplines. Short-term fatigue may cause safety issue while driving; thus, dynamic systems were designed to track driver fatigue. Long-term fatigue could lead to chronic syndromes, and eventually affect individuals physical and psychological health. Traditional methodologies of evaluating fatigue not only require sophisticated equipment but also consume enormous time. In this paper, we attempt to develop a novel and efficient method to predict individual's fatigue rate by scrutinizing human facial cues. Our goal is to predict fatigue rate based on a selfie. To associate the fatigue rate with user behaviors, we have collected nearly 1-million timeline posts from 10,480 users on Instagram. We first detect all the faces and identify their demographics using automatic algorithms. Next, we investigate the fatigue distribution by weekday over different age, gender, and ethnic groups. This work represents a promising way to assess sleep-deprived fatigue, and our study provides a viable and efficient computational framework for user fatigue modeling in large-scale via social media.

* Special Session on Intelligent Data Mining, IEEE Big Data Conference, Boston, MA, December 2017 

  Click for Model/Code and Paper
LoadCNN: A Efficient Green Deep Learning Model for Day-ahead Individual Resident Load Forecasting

Aug 01, 2019
Yunyou Huang, Nana Wang, Tianshu Hao, Wanling Gao, Cheng Huang, Jianqing Li, Jianfeng Zhan

Accurate day-ahead individual resident load forecasting is very important to various applications of smart grid. As a powerful machine learning technology, deep learning has shown great advantages in load forecasting task. However, deep learning is a computationally-hungry method, requires a plenty of training time and results in considerable energy consumed and a plenty of CO2 emitted. This aggravates the energy crisis and incurs a substantial cost to the environment. As a result, the deep learning methods are difficult to be popularized and applied in the real smart grid environment. In this paper, to reduce training time, energy consumed and CO2 emitted, we propose a efficient green model based on convolutional neural network, namely LoadCNN, for next-day load forecasting of individual resident. The training time, energy consumption, and CO2 emissions of LoadCNN are only approximately 1/70 of the corresponding indicators of other state-of-the-art models. Meanwhile, it achieves state-of-the-art performance in terms of prediction accuracy. LoadCNN is the first load forecasting model which simultaneously considers prediction accuracy, training time, energy efficiency and environment costs. It is a efficient green model that is able to be quickly, cost-effectively and environmental-friendly deployed in a realistic smart grid environment.


  Click for Model/Code and Paper