Models, code, and papers for "Mohammad H":

##### GumBolt: Extending Gumbel trick to Boltzmann priors

May 18, 2018
Amir H. Khoshaman, Mohammad H. Amin

Boltzmann machines (BMs) are appealing candidates for powerful priors in variational autoencoders (VAEs), as they are capable of capturing nontrivial and multi-modal distributions over discrete variables. However, the non-differentiability of the discrete units prohibits using the reparameterization trick, which is essential for low-noise backpropagation. The Gumbel trick resolves this problem in a consistent way by relaxing the variables and distributions, but it is incompatible with BM priors. Here, we propose the GumBolt, a model that extends the Gumbel trick to BM priors in VAEs. GumBolt is significantly simpler than the recently proposed methods with BM prior and outperforms them by a considerable margin. It achieves state-of-the-art performance on permutation invariant MNIST and OMNIGLOT datasets in the scope of models with only discrete latent variables. Moreover, the performance can be further improved by allowing multi-sampled (importance-weighted) estimation of log-likelihood in training, which was not possible with previous models.

* 12 pages, 2 Figures, 2 Tables
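
As a rough illustration of the relaxation idea mentioned in the abstract, the sketch below draws a continuous surrogate for a single binary unit using the generic binary Gumbel (logistic-noise) relaxation. It is not the GumBolt model itself and does not involve a Boltzmann prior; the function names `sigmoid` and `relaxed_bernoulli_sample` and the `temperature` parameter are assumptions made for this example.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def relaxed_bernoulli_sample(logit, temperature, rng):
    """Draw one relaxed sample in [0, 1] for a binary unit (hypothetical helper).

    The discrete sample would be 1 when logit + noise > 0; replacing that hard
    threshold with a tempered sigmoid makes the sample differentiable in `logit`.
    """
    # Clamp u away from 0 and 1 so both logarithms are finite.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    # Logistic noise, i.e. the difference of two independent Gumbel variables.
    logistic_noise = math.log(u) - math.log(1.0 - u)
    return sigmoid((logit + logistic_noise) / temperature)


rng = random.Random(0)
samples = [relaxed_bernoulli_sample(0.0, 0.1, rng) for _ in range(5)]
```

Lower temperatures push the samples toward 0 or 1, at the cost of noisier gradients; higher temperatures give smoother but more biased surrogates.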
##### Facial Affect Estimation in the Wild Using Deep Residual and Convolutional Networks

May 22, 2017

Automated affective computing in the wild is a challenging task in the field of computer vision. This paper presents three neural network-based methods proposed for the task of facial affect estimation submitted to the First Affect-in-the-Wild challenge. These methods are based on Inception-ResNet modules redesigned specifically for the task of facial affect estimation. These methods are: Shallow Inception-ResNet, Deep Inception-ResNet, and Inception-ResNet with LSTMs. These networks extract facial features in different scales and simultaneously estimate both the valence and arousal in each frame. Root Mean Square Error (RMSE) rates of 0.4 and 0.3 are achieved for the valence and arousal respectively with corresponding Concordance Correlation Coefficient (CCC) rates of 0.04 and 0.29 using Deep Inception-ResNet method.

* To appear in 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
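
The two metrics reported in the abstract above can be computed as in the following minimal sketch. It uses the standard definitions of RMSE and of the Concordance Correlation Coefficient with population (biased) variance and covariance; the function names are chosen for this example, and the paper's exact evaluation protocol is not shown here.

```python
import math


def mean(values):
    return sum(values) / len(values)


def rmse(predictions, targets):
    """Root mean square error between two equal-length sequences."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(mean(squared))


def ccc(predictions, targets):
    """Concordance Correlation Coefficient.

    CCC = 2 * cov / (var_p + var_t + (mean_p - mean_t) ** 2).
    Undefined (division by zero) when both sequences are constant and equal.
    """
    mp, mt = mean(predictions), mean(targets)
    var_p = mean([(p - mp) ** 2 for p in predictions])
    var_t = mean([(t - mt) ** 2 for t in targets])
    cov = mean([(p - mp) * (t - mt) for p, t in zip(predictions, targets)])
    return 2.0 * cov / (var_p + var_t + (mp - mt) ** 2)
```

Unlike plain correlation, CCC penalizes a constant offset between predictions and targets, which is why a model can have a moderate RMSE and still a low CCC.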
##### Facial Expression Recognition Using Enhanced Deep 3D Convolutional Neural Networks

May 22, 2017

Deep Neural Networks (DNNs) have been shown to outperform traditional methods in various visual recognition tasks including Facial Expression Recognition (FER). In spite of efforts made to improve the accuracy of FER systems using DNNs, existing methods are still not generalizable enough for practical applications. This paper proposes a 3D Convolutional Neural Network method for FER in videos. This new network architecture consists of 3D Inception-ResNet layers followed by an LSTM unit that together extract the spatial relations within facial images as well as the temporal relations between different frames in the video. Facial landmark points are also used as inputs to our network, which emphasizes the importance of facial components rather than the facial regions that may not contribute significantly to generating facial expressions. Our proposed method is evaluated using four publicly available databases in subject-independent and cross-database tasks and outperforms state-of-the-art methods.

* To appear in 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
##### Spatio-Temporal Facial Expression Recognition Using Convolutional Neural Networks and Conditional Random Fields

Apr 24, 2017

Automated Facial Expression Recognition (FER) has been a challenging task for decades. Many of the existing works use hand-crafted features such as LBP, HOG, LPQ, and Histogram of Optical Flow (HOF) combined with classifiers such as Support Vector Machines for expression recognition. These methods often require rigorous hyperparameter tuning to achieve good results. Recently, Deep Neural Networks (DNNs) have been shown to outperform traditional methods in visual object recognition. In this paper, we propose a two-part network consisting of a DNN-based architecture followed by a Conditional Random Field (CRF) module for facial expression recognition in videos. The first part captures the spatial relation within facial images using convolutional layers followed by three Inception-ResNet modules and two fully-connected layers. To capture the temporal relation between the image frames, we use a linear-chain CRF in the second part of our network. We evaluate our proposed network on three publicly available databases, viz. CK+, MMI, and FERA. Experiments are performed in subject-independent and cross-database manners. Our experimental results show that cascading the deep network architecture with the CRF module considerably increases the recognition of facial expressions in videos; in particular, it outperforms the state-of-the-art methods in the cross-database experiments and yields comparable results in the subject-independent experiments.

* To appear in 12th IEEE Conference on Automatic Face and Gesture Recognition Workshop
##### Bidirectional Warping of Active Appearance Model

Nov 20, 2015

Active Appearance Model (AAM) is a commonly used method for facial image analysis with applications in face identification and facial expression recognition. This paper proposes a new approach based on image alignment for AAM fitting called bidirectional warping. Previous approaches warp either the input image or the appearance template. We propose to warp both the input image, using an incremental update by an affine transformation, and the appearance template, using an inverse compositional approach. Our experimental results on the Multi-PIE face database show that the bidirectional approach outperforms state-of-the-art inverse compositional fitting approaches in extracting landmark points of faces with shape and pose variations.

* 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
##### Binary Sine Cosine Algorithms for Feature Selection from Medical Data

Nov 15, 2019

A well-constructed classification model highly depends on input feature subsets from a dataset, which may contain redundant, irrelevant, or noisy features. This challenge can be worse when dealing with medical datasets. The main aim of feature selection as a pre-processing task is to eliminate these features and select the most effective ones. In the literature, metaheuristic algorithms have been successful at finding optimal feature subsets. In this paper, two binary metaheuristic algorithms named S-shaped binary Sine Cosine Algorithm (SBSCA) and V-shaped binary Sine Cosine Algorithm (VBSCA) are proposed for feature selection from medical data. In these algorithms, the search space remains continuous, while a binary position vector is generated for each solution by one of two transfer functions, S-shaped or V-shaped. The proposed algorithms are compared with four recent binary optimization algorithms over five medical datasets from the UCI repository. The experimental results confirm that both bSCA variants enhance the accuracy of classification on these medical datasets compared to the four other algorithms.
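
To make the idea of transfer functions concrete, here is a minimal sketch of how a continuous position vector can be mapped to a binary feature mask. It assumes the commonly used sigmoid (S-shaped) and absolute-tanh (V-shaped) forms and the usual update rules (set the bit for S-shaped, flip the bit for V-shaped); the abstract does not state which exact functions the paper uses, so treat all names and formulas here as illustrative.

```python
import math
import random


def s_shaped(x):
    """Sigmoid transfer function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def v_shaped(x):
    """Absolute-tanh transfer function: 0 at x = 0, approaching 1 as |x| grows."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """Set each bit to 1 with probability S(x_d), independently per dimension."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position, current_bits, rng):
    """Flip each current bit with probability V(x_d), otherwise keep it."""
    return [1 - b if rng.random() < v_shaped(x) else b
            for x, b in zip(position, current_bits)]


rng = random.Random(0)
mask = binarize_s([2.0, -2.0, 0.0], rng)  # 1 means "feature selected"
```

The practical difference is that the S-shaped rule ignores the previous bit, while the V-shaped rule keeps it unless the continuous position has moved far from zero.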

##### Deep-learning PDEs with unlabeled data and hardwiring physics laws

Apr 13, 2019
S. Mohammad H. Hashemi, Demetri Psaltis

Providing fast and accurate solutions to partial differential equations is a problem of continuous interest to the fields of applied mathematics and physics. With the recent advances in machine learning, the adoption of learning techniques in this domain is being eagerly pursued. We build upon earlier works on linear and homogeneous PDEs, and develop convolutional deep neural networks that can accurately solve nonlinear and non-homogeneous equations without the need for labeled data. The architecture of these networks is readily accessible to scientific disciplines that deal with PDEs and know the basics of deep learning.

##### AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild

Oct 09, 2017

Automated affective computing in the wild setting is a challenging problem in computer vision. Existing annotated databases of facial expressions in the wild are small and mostly cover discrete emotions (aka the categorical model). There are very limited annotated facial databases for affective computing in the continuous dimensional model (e.g., valence and arousal). To meet this need, we collected, annotated, and prepared for public distribution a new database of facial emotions in the wild (called AffectNet). AffectNet contains more than 1,000,000 facial images collected from the Internet by querying three major search engines using 1250 emotion-related keywords in six different languages. About half of the retrieved images were manually annotated for the presence of seven discrete facial expressions and the intensity of valence and arousal. AffectNet is by far the largest database of facial expression, valence, and arousal in the wild, enabling research in automated facial expression recognition in two different emotion models. Two baseline deep neural networks are used to classify images in the categorical model and predict the intensity of valence and arousal. Various evaluation metrics show that our deep neural network baselines can perform better than conventional machine learning methods and off-the-shelf facial expression recognition systems.

* IEEE Transactions on Affective Computing, 2017
##### Bottleneck Conditional Density Estimation

Jun 30, 2017

We introduce a new framework for training deep generative models for high-dimensional conditional density estimation. The Bottleneck Conditional Density Estimator (BCDE) is a variant of the conditional variational autoencoder (CVAE) that employs layer(s) of stochastic variables as the bottleneck between the input $x$ and target $y$, where both are high-dimensional. Crucially, we propose a new hybrid training method that blends the conditional generative model with a joint generative model. Hybrid blending is the key to effective training of the BCDE, which avoids overfitting and provides a novel mechanism for leveraging unlabeled data. We show that our hybrid training procedure enables models to achieve competitive results in the MNIST quadrant prediction task in the fully-supervised setting, and sets new benchmarks in the semi-supervised regime for MNIST, SVHN, and CelebA.

##### NoiseOut: A Simple Way to Prune Neural Networks

Nov 18, 2016

Neural networks are usually over-parameterized, with significant redundancy in the number of required neurons, which results in unnecessary computation and memory usage at inference time. One common approach to address this issue is to prune these big networks by removing extra neurons and parameters while maintaining the accuracy. In this paper, we propose NoiseOut, a fully automated pruning algorithm based on the correlation between activations of neurons in the hidden layers. We prove that adding additional output neurons with entirely random targets results in a higher correlation between neurons, which makes pruning by NoiseOut even more efficient. Finally, we test our method on various networks and datasets. These experiments exhibit high pruning rates while maintaining the accuracy of the original network.
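
The correlation criterion described above can be illustrated with a short sketch that finds the pair of hidden neurons whose activations are most strongly correlated over a batch; such a pair is a natural pruning candidate. This is only an assumption-laden illustration, not the NoiseOut algorithm: the use of absolute Pearson correlation, the data layout, and the function names are all chosen for this example, and the step that removes a neuron and adjusts the remaining weights is omitted.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length activation sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a constant neuron is treated as uncorrelated here
    return cov / math.sqrt(var_a * var_b)


def most_correlated_pair(activations):
    """Return ((i, j), score) for the most correlated pair of neurons.

    activations[i] holds neuron i's outputs over the same batch of inputs.
    """
    best_pair, best_score = None, -1.0
    for i in range(len(activations)):
        for j in range(i + 1, len(activations)):
            score = abs(pearson(activations[i], activations[j]))
            if score > best_score:
                best_pair, best_score = (i, j), score
    return best_pair, best_score


acts = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.5], [4.0, 1.0, 3.0, 2.0]]
pair, score = most_correlated_pair(acts)
```

The pairwise scan is quadratic in the number of neurons per layer, which is acceptable for a sketch but would usually be replaced by a vectorized correlation matrix in practice.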

##### Going Deeper in Facial Expression Recognition using Deep Neural Networks

Nov 12, 2015
Ali Mollahosseini, David Chan, Mohammad H. Mahoor

Automated Facial Expression Recognition (FER) has remained a challenging and interesting problem. Despite efforts made in developing various methods for FER, existing approaches traditionally lack generalizability when applied to unseen images or those that are captured in the wild setting. Most of the existing approaches are based on engineered features (e.g. HOG, LBPH, and Gabor) where the classifier's hyperparameters are tuned to give best recognition accuracies across a single database, or a small collection of similar databases. Nevertheless, the results are not significant when they are applied to novel data. This paper proposes a deep neural network architecture to address the FER problem across multiple well-known standard face datasets. Specifically, our network consists of two convolutional layers each followed by max pooling and then four Inception layers. The network is a single component architecture that takes registered facial images as the input and classifies them into one of the six basic expressions or the neutral expression. We conducted comprehensive experiments on seven publicly available facial expression databases, viz. MultiPIE, MMI, CK+, DISFA, FERA, SFEW, and FER2013. The results of the proposed architecture are comparable to or better than the state-of-the-art methods and better than traditional convolutional neural networks in both accuracy and training time.

* IEEE Winter Conference on Applications of Computer Vision (WACV), 2016
* To appear in IEEE Winter Conference on Applications of Computer Vision (WACV), 2016 (accepted in first-round submission)
##### Minimax Optimal Sparse Signal Recovery with Poisson Statistics

We are motivated by problems that arise in a number of applications such as online marketing and explosives detection, where the observations are usually modeled using Poisson statistics. We model each observation as a Poisson random variable whose mean is a sparse linear superposition of known patterns. Unlike in many conventional problems, observations here are not identically distributed since they are associated with different sensing modalities. We analyze the performance of a Maximum Likelihood (ML) decoder, which for our Poisson setting involves a non-linear optimization yet is computationally tractable. We derive fundamental sample complexity bounds for sparse recovery when the measurements are contaminated with Poisson noise. In contrast to the least-squares linear regression setting with Gaussian noise, we observe that in addition to sparsity, the scale of the parameters also fundamentally impacts $\ell_2$ error in the Poisson setting. We show tightness of our upper bounds both theoretically and experimentally. In particular, we derive a minimax matching lower bound on the mean-squared error and show that our constrained ML decoder is minimax optimal for this regime.

* Submitted to IEEE Trans. on Signal Processing. arXiv admin note: substantial text overlap with arXiv:1307.4666
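
For readers unfamiliar with the setting, the observation model and ML decoder described in the abstract can be written generically as below. The symbols $a_{ij}$ (known patterns), $x_j$ (sparse coefficients), and the constraint set $\mathcal{C}$ are notation assumed for this sketch; the paper's own notation and its specific constraint are not given in the abstract.

```latex
% Observation model: each y_i is Poisson with a mean that is linear in x.
y_i \sim \mathrm{Poisson}(\lambda_i(x)), \qquad
\lambda_i(x) = \sum_{j} a_{ij}\, x_j, \qquad i = 1, \dots, n.

% Constrained ML decoder: minimize the Poisson negative log-likelihood
% (the term \log y_i! does not depend on x and is dropped).
\hat{x} = \arg\min_{x \in \mathcal{C}} \;
\sum_{i=1}^{n} \bigl[ \lambda_i(x) - y_i \log \lambda_i(x) \bigr].
```

Because the objective depends on $x$ through $\log \lambda_i(x)$, the problem is non-linear, unlike least squares under Gaussian noise.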
##### Automated Classification of L/R Hand Movement EEG Signals using Advanced Feature Extraction and Machine Learning

Dec 10, 2013
Mohammad H. Alomari, Aya Samaha, Khaled AlKamha

In this paper, we propose an automated computer platform for the purpose of classifying Electroencephalography (EEG) signals associated with left and right hand movements using a hybrid system that uses advanced feature extraction techniques and machine learning algorithms. It is known that EEG represents the brain activity by the electrical voltage fluctuations along the scalp, and a Brain-Computer Interface (BCI) is a device that enables the use of the brain's neural activity to communicate with others or to control machines, artificial limbs, or robots without direct physical movements. In our research work, we aspired to find the best feature extraction method that enables the differentiation between left and right executed fist movements through various classification algorithms. The EEG dataset used in this research was created and contributed to PhysioNet by the developers of the BCI2000 instrumentation system. Data was preprocessed using the EEGLAB MATLAB toolbox and artifact removal was done using AAR. Data was epoched on the basis of Event-Related (De)Synchronization (ERD/ERS) and movement-related cortical potentials (MRCP) features. Mu/beta rhythms were isolated for the ERD/ERS analysis and delta rhythms were isolated for the MRCP analysis. The Independent Component Analysis (ICA) spatial filter was applied on related channels for noise reduction and isolation of both artifactually and neurally generated EEG sources. The final feature vector included the ERD, ERS, and MRCP features in addition to the mean, power and energy of the activations of the resulting independent components of the epoched feature datasets. The datasets were inputted into two machine-learning algorithms: Neural Networks (NNs) and Support Vector Machines (SVMs). Intensive experiments were carried out and optimum classification performances of 89.8 and 97.1 were obtained using NN and SVM, respectively.

* International Journal of Advanced Computer Science and Applications (ijacsa) 07/2013; 4(6):207-212
* 6 pages, 4 figures
##### Multibiometric: Feature Level Fusion Using FKP Multi-Instance biometric

Oct 02, 2012

This paper proposes the use of multi-instance feature level fusion as a means to improve the performance of Finger Knuckle Print (FKP) verification. A log-Gabor filter has been used to extract the image local orientation information and represent the FKP features. Experiments are performed using the FKP database, which consists of 7,920 images. Results indicate that the multi-instance verification approach achieves higher performance than using any single instance. The influence on biometric performance of feature level fusion under different fusion rules is also demonstrated in this paper.

* IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 4, No 3, July 2012
* 8 pages paper
##### Recurrent Neural Network-based Model for Accelerated Trajectory Analysis in AIMD Simulations

The presented work demonstrates the training of recurrent neural networks (RNNs) from distributions of atom coordinates in solid state structures that were obtained using ab initio molecular dynamics (AIMD) simulations. AIMD simulations on solid state structures are treated as a multi-variate time-series problem. By relating interactions between atoms over the simulation time to temporal correlations among them, RNNs find patterns in the multi-variate time-dependent data, which enable forecasting trajectory paths and potential energy profiles. Two types of RNNs, namely gated recurrent unit and long short-term memory networks, are considered. The model is described and compared against a baseline AIMD simulation on an iridium oxide slab. Findings demonstrate that both networks can potentially be harnessed for accelerated statistical sampling in computational materials research.

* 10 pages, 6 figures, 1 table
##### Parametic Classification of Handvein Patterns Based on Texture Features

Mar 21, 2019
Harbi AlMahafzah, Mohammad Imran, Supreetha Gowda H. D.

In this paper, we develop a biometric recognition system based on the hand vein (Handvein) modality, which has a unique pattern for each individual and, being an internal feature, is impossible to counterfeit or fabricate. For feature extraction we use LBP (a visual descriptor), LPQ (a blur-insensitive texture operator), and Log-Gabor (a texture descriptor). For classification we use the well-known KNN and SVM classifiers. We report the recognition rate of each single algorithm for Handvein under different distance measures and kernel options. Feature level fusion is then carried out, which increases the performance level.

* 8 pages. AIP Proceedings of the International Conference on Electrical, Electronics, Materials and Applied Science (ICEEMAS), 22nd and 23rd December 2017
##### Bounded Residual Gradient Networks (BReG-Net) for Facial Affect Computing

Mar 05, 2019

Residual-based neural networks have shown remarkable results in various visual recognition tasks including Facial Expression Recognition (FER). Despite the tremendous efforts that have been made to improve the performance of FER systems using DNNs, existing methods are not generalizable enough for practical applications. This paper introduces Bounded Residual Gradient Networks (BReG-Net) for facial expression recognition, in which the shortcut connection between the input and the output of the ResNet module is replaced with a differentiable function with a bounded gradient. This configuration prevents the network from facing the vanishing or exploding gradient problem. We show that utilizing such non-linear units will result in shallower networks with better performance. Further, by using a weighted loss function which gives a higher priority to less represented categories, we can achieve an overall better recognition rate. The results of our experiments show that BReG-Nets outperform state-of-the-art methods on three publicly available facial databases in the wild, on both the categorical and dimensional models of affect.

* To appear in 14th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2019)
##### Meta Learning Deep Visual Words for Fast Video Object Segmentation

Dec 04, 2018
Harkirat Singh Behl, Mohammad Najafi, Philip H. S. Torr

Meta learning has attracted a lot of attention recently. In this paper, we propose a fast and novel meta learning based method for video object segmentation that quickly adapts to new domains without any fine-tuning. The proposed model performs segmentation by matching pixels to object parts. The model represents object parts using deep visual words, and meta learns them with the objective of minimizing the object segmentation loss. This is, however, not straightforward, as no ground-truth information is available for the object parts. We tackle this problem by iteratively performing unsupervised learning of the deep visual words, followed by supervised learning of the segmentation problem, given the visual words. Our experiments show that the proposed method performs on par with state-of-the-art methods, while being computationally much more efficient.

* The first two authors have contributed equally and assert joint first authorship
##### Necessary and Sufficient Conditions for Novel Word Detection in Separable Topic Models

The simplicial condition and other stronger conditions that imply it have recently played a central role in developing polynomial time algorithms with provable asymptotic consistency and sample complexity guarantees for topic estimation in separable topic models. Of these algorithms, those that rely solely on the simplicial condition are impractical while the practical ones need stronger conditions. In this paper, we demonstrate, for the first time, that the simplicial condition is a fundamental, algorithm-independent, information-theoretic necessary condition for consistent separable topic estimation. Furthermore, under solely the simplicial condition, we present a practical quadratic-complexity algorithm based on random projections which consistently detects all novel words of all topics using only up to second-order empirical word moments. This algorithm is amenable to distributed implementation making it attractive for 'big-data' scenarios involving a network of large distributed databases.