Models, code, and papers for "Omar M":
We explore the family of methods "PAC-Bayes with Backprop" (PBB) to train probabilistic neural networks by minimizing PAC-Bayes bounds. We present two training objectives, one derived from a previously known PAC-Bayes bound, and a second one derived from a novel PAC-Bayes bound. Both training objectives are evaluated on MNIST and on various UCI data sets. Our experiments show two striking observations: we obtain competitive test set error estimates (~1.4% on MNIST) and at the same time we compute non-vacuous bounds with much tighter values (~2.3% on MNIST) than previous results. These observations suggest that neural nets trained by PBB may lead to self-bounding learning, where the available data can be used to simultaneously learn a predictor and certify its risk, with no need to follow a data-splitting protocol.
Recognizing a piece of writing as a poem or prose is usually easy for the majority of people; however, only specialists can determine which meter a poem belongs to. In this paper, we build Recurrent Neural Network (RNN) models that can classify poems according to their meters from plain text. The input text is encoded at the character level and directly fed to the models without feature handcrafting. This is a step forward for machine understanding and synthesis of languages in general, and Arabic language in particular. Among the 16 poem meters of Arabic and the 4 meters of English the networks were able to correctly classify poem with an overall accuracy of 96.38\% and 82.31\% respectively. The poem datasets used to conduct this research were massive, over 1.5 million of verses, and were crawled from different nontechnical sources, almost Arabic and English literature sites, and in different heterogeneous and unstructured formats. These datasets are now made publicly available in clean, structured, and documented format for other future research. To the best of the authors' knowledge, this research is the first to address classifying poem meters in a machine learning approach, in general, and in RNN featureless based approach, in particular. In addition, the dataset is the first publicly available dataset ready for the purpose of future computational research.
Text classification plays a vital role today especially with the intensive use of social networking media. Recently, different architectures of convolutional neural networks have been used for text classification in which one-hot vector, and word embedding methods are commonly used. This paper presents a new language independent word encoding method for text classification. The proposed model converts raw text data to low-level feature dimension with minimal or no preprocessing steps by using a new approach called binary unique number of word "BUNOW". BUNOW allows each unique word to have an integer ID in a dictionary that is represented as a k-dimensional vector of its binary equivalent. The output vector of this encoding is fed into a convolutional neural network (CNN) model for classification. Moreover, the proposed model reduces the neural network parameters, allows faster computation with few network layers, where a word is atomic representation the document as in word level, and decrease memory consumption for character level representation. The provided CNN model is able to work with other languages or multi-lingual text without the need for any changes in the encoding method. The model outperforms the character level and very deep character level CNNs models in terms of accuracy, network parameters, and memory consumption; the results show total classification accuracy 91.99% and error 8.01% using AG's News dataset compared to the state of art methods that have total classification accuracy 91.45% and error 8.55%, in addition to the reduction in input feature vector and neural network parameters by 62% and 34%, respectively.
Mobile ground robots operating on unstructured terrain must predict which areas of the environment they are able to pass in order to plan feasible paths. We address traversability estimation as a heightmap classification problem: we build a convolutional neural network that, given an image representing the heightmap of a terrain patch, predicts whether the robot will be able to traverse such patch from left to right. The classifier is trained for a specific robot model (wheeled, tracked, legged, snake-like) using simulation data on procedurally generated training terrains; the trained classifier can be applied to unseen large heightmaps to yield oriented traversability maps, and then plan traversable paths. We extensively evaluate the approach in simulation on six real-world elevation datasets, and run a real-robot validation in one indoor and one outdoor environment.
We introduce a general self-supervised approach to predict the future outputs of a short-range sensor (such as a proximity sensor) given the current outputs of a long-range sensor (such as a camera); we assume that the former is directly related to some piece of information to be perceived (such as the presence of an obstacle in a given position), whereas the latter is information-rich but hard to interpret directly. We instantiate and implement the approach on a small mobile robot to detect obstacles at various distances using the video stream of the robot's forward-pointing camera, by training a convolutional neural network on automatically-acquired datasets. We quantitatively evaluate the quality of the predictions on unseen scenarios, qualitatively evaluate robustness to different operating conditions, and demonstrate usage as the sole input of an obstacle-avoidance controller. We additionally instantiate the approach on a different simulated scenario with complementary characteristics, to exemplify the generality of our contribution.
Currently, the most common motion representation for action recognition is optical flow. Optical flow is based on particle tracking which adheres to a Lagrangian perspective on dynamics. In contrast to the Lagrangian perspective, the Eulerian model of dynamics does not track, but describes local changes. For video, an Eulerian phase-based motion representation, using complex steerable filters, has been successfully employed recently for motion magnification and video frame interpolation. Inspired by these previous works, here, we proposes learning Eulerian motion representations in a deep architecture for action recognition. We learn filters in the complex domain in an end-to-end manner. We design these complex filters to resemble complex Gabor filters, typically employed for phase-information extraction. We propose a phase-information extraction module, based on these complex filters, that can be used in any network architecture for extracting Eulerian representations. We experimentally analyze the added value of Eulerian motion representations, as extracted by our proposed phase extraction module, and compare with existing motion representations based on optical flow, on the UCF101 dataset.
Breast cancer is the most common cancer and is the leading cause of cancer death among women worldwide. Detection of breast cancer, while it is still small and confined to the breast, provides the best chance of effective treatment. Computer Aided Detection (CAD) systems that detect cancer from mammograms will help in reducing the human errors that lead to missing breast carcinoma. Literature is rich of scientific papers for methods of CAD design, yet with no complete system architecture to deploy those methods. On the other hand, commercial CADs are developed and deployed only to vendors' mammography machines with no availability to public access. This paper presents a complete CAD; it is complete since it combines, on a hand, the rigor of algorithm design and assessment (method), and, on the other hand, the implementation and deployment of a system architecture for public accessibility (system). (1) We develop a novel algorithm for image enhancement so that mammograms acquired from any digital mammography machine look qualitatively of the same clarity to radiologists' inspection; and is quantitatively standardized for the detection algorithms. (2) We develop novel algorithms for masses and microcalcifications detection with accuracy superior to both literature results and the majority of approved commercial systems. (3) We design, implement, and deploy a system architecture that is computationally effective to allow for deploying these algorithms to cloud for public access.
This manuscript presents control of a high-DOF fully actuated lower-limb exoskeleton for paraplegic individuals. The key novelty is the ability for the user to walk without the use of crutches or other external means of stabilization. We harness the power of modern optimization techniques and supervised machine learning to develop a smooth feedback control policy that provides robust velocity regulation and perturbation rejection. Preliminary evaluation of the stability and robustness of the proposed approach is demonstrated through the Gazebo simulation environment. In addition, preliminary experimental results with (complete) paraplegic individuals are included for the previous version of the controller.
This report describes eighteen projects that explored how commercial cloud computing services can be utilized for scientific computation at national laboratories. These demonstrations ranged from deploying proprietary software in a cloud environment to leveraging established cloud-based analytics workflows for processing scientific datasets. By and large, the projects were successful and collectively they suggest that cloud computing can be a valuable computational resource for scientific computation at national laboratories.
To date, pavement management software products and studies on optimizing the prioritization of pavement maintenance and rehabilitation (M&R) have been mainly focused on three parameters; the pre-treatment pavement condition, the rehabilitation cost, and the available budget. Yet, the role of the candidate projects' spatial characteristics in the decision-making process has not been deeply considered. Such a limitation, predominately, allows the recommended M&R projects' schedule to involve simultaneously running but spatially scattered construction sites, which are very challenging to monitor and manage. This study introduces a novel approach to integrate pavement segments' spatial coordinates into the M&R prioritization analysis. The introduced approach aims at combining the pavement segments with converged spatial coordinates to be repaired in the same timeframe without compromising the allocated budget levels or the overall target Pavement Condition Index (PCI). Such a combination would result in minimizing the routing of crews, materials and other equipment among the construction sites and would provide better collaborations and communications between the pavement maintenance teams. Proposed herein is a novel spatial clustering algorithm that automatically finds the projects within a certain budget and spatial constrains. The developed algorithm was successfully validated using 1,800 pavement maintenance projects from two real-life examples of the City of Milton, GA and the City of Tyler, TX.
In the panoply of pattern classification techniques, few enjoy the intuitive appeal and simplicity of the nearest neighbor rule: given a set of samples in some metric domain space whose value under some function is known, we estimate the function anywhere in the domain by giving the value of the nearest sample per the metric. More generally, one may use the modal value of the m nearest samples, where m is a fixed positive integer (although m=1 is known to be admissible in the sense that no larger value is asymptotically superior in terms of prediction error). The nearest neighbor rule is nonparametric and extremely general, requiring in principle only that the domain be a metric space. The classic paper on the technique, proving convergence under independent, identically-distributed (iid) sampling, is due to Cover and Hart (1967). Because taking samples is costly, there has been much research in recent years on selective sampling, in which each sample is selected from a pool of candidates ranked by a heuristic; the heuristic tries to guess which candidate would be the most "informative" sample. Lindenbaum et al. (2004) apply selective sampling to the nearest neighbor rule, but their approach sacrifices the austere generality of Cover and Hart; furthermore, their heuristic algorithm is complex and computationally expensive. Here we report recent results that enable selective sampling in the original Cover-Hart setting. Our results pose three selection heuristics and prove that their nearest neighbor rule predictions converge to the true pattern. Two of the algorithms are computationally cheap, with complexity growing linearly in the number of samples. We believe that these results constitute an important advance in the art.
Quantitative characterization of disease progression using longitudinal data can provide long-term predictions for the pathological stages of individuals. This work studies robust modeling of Alzheimer's disease progression using parametric methods. The proposed method linearly maps individual's chronological age to a disease progression score (DPS) and robustly fits a constrained generalized logistic function to the longitudinal dynamic of a biomarker as a function of the DPS using M-estimation. Robustness of the estimates is quantified using bootstrapping via Monte Carlo resampling, and the inflection points are used to temporally order the modeled biomarkers in the disease course. Moreover, kernel density estimation is applied to the obtained DPSs for clinical status prediction using a Bayesian classifier. Different M-estimators and logistic functions, including a new generalized type proposed in this study called modified Stannard, are evaluated on the ADNI database for robust modeling of volumetric MRI and PET biomarkers, as well as neuropsychological tests. The results show that the modified Stannard function fitted using the modified Huber loss achieves the best modeling performance with a mean of median absolute errors (MMAE) of 0.059 across all biomarkers and bootstraps. In addition, applied to the ADNI test set, this model achieves a multi-class area under the ROC curve (MAUC) of 0.87 in clinical status prediction, and it significantly outperforms an analogous state-of-the-art method with a biomarker modeling MMAE of 0.059 vs. 0.061 (p < 0.001). Finally, the experiments show that the proposed model, trained using abundant ADNI data, generalizes well to data from the independent NACC database, where both modeling and diagnostic performance are significantly improved (p < 0.001) compared with using a model trained using relatively sparse NACC data.
Quanta image sensor (QIS) is to be the next generation image sensor after CCD and CMOS. To enable such technology, significant progress was made over the past five years to advance both the device and image reconstruction algorithms. In this paper, we discuss color imaging using QIS, in particular how to design color filter arrays. Designing color filter arrays for QIS is challenging because at the pixel pitch of 1.1$\mu$m, maximizing the light efficiency while suppressing aliasing and crosstalk are conflicting tasks. We present an optimization-based framework to design color filter arrays for very small pixels. The new framework unifies several mainstream color filter array design frameworks by offering generality and flexibility. Compared to the existing frameworks which can only handle one or two design criteria, the new framework can simultaneously handle luminance gain, chrominance gain, cross-talk, anti-aliasing, manufacturability and orthogonality. Extensive experimental comparisons demonstrate the effectiveness and generality of the framework.
This paper explores sequential modeling of polyphonic music with deep neural networks. While recent breakthroughs have focussed on network architecture, we demonstrate that the representation of the sequence can make an equally significant contribution to the performance of the model as measured by validation set loss. By extracting salient features inherent to the dataset, the model can either be conditioned on these features or trained to predict said features as extra components of the sequences being modeled. We show that training a neural network to predict a seemingly more complex sequence, with extra features included in the series being modeled, can improve overall model performance significantly. We first introduce TonicNet, a GRU-based model trained to initially predict the chord at a given time-step before then predicting the notes of each voice at that time-step, in contrast with the typical approach of predicting only the notes. We then evaluate TonicNet on the canonical JSB Chorales dataset and obtain state-of-the-art results.
For more than half a century, moments have attracted lot ot interest in the pattern recognition community.The moments of a distribution (an object) provide several of its characteristics as center of gravity, orientation, disparity, volume. Moments can be used to define invariant characteristics to some transformations that an object can undergo, commonly called moment invariants. This work provides a simple and systematic formalism to compute geometric moment invariants in n-dimensional space.
There are concerns that neural language models may preserve some of the stereotypes of the underlying societies that generate the large corpora needed to train these models. For example, gender bias is a significant problem when generating text, and its unintended memorization could impact the user experience of many applications (e.g., the smart-compose feature in Gmail). In this paper, we introduce a novel architecture that decouples the representation learning of a neural model from its memory management role. This architecture allows us to update a memory module with an equal ratio across gender types addressing biased correlations directly in the latent space. We experimentally show that our approach can mitigate the gender bias amplification in the automatic generation of articles news while providing similar perplexity values when extending the Sequence2Sequence architecture.
Unlike other languages, the Arabic language has a morphological complexity which makes the Arabic sentiment analysis is a challenging task. Moreover, the presence of the dialects in the Arabic texts have made the sentiment analysis task is more challenging, due to the absence of specific rules that govern the writing or speaking system. Generally, one of the problems of sentiment analysis is the high dimensionality of the feature vector. To resolve this problem, many feature selection methods have been proposed. In contrast to the dialectal Arabic language, these selection methods have been investigated widely for the English language. This work investigated the effect of feature selection methods and their combinations on dialectal Arabic sentiment classification. The feature selection methods are Information Gain (IG), Correlation, Support Vector Machine (SVM), Gini Index (GI), and Chi-Square. A number of experiments were carried out on dialectical Jordanian reviews with using an SVM classifier. Furthermore, the effect of different term weighting schemes, stemmers, stop words removal, and feature models on the performance were investigated. The experimental results showed that the best performance of the SVM classifier was obtained after the SVM and correlation feature selection methods had been combined with the uni-gram model.
One of the main difficulties in sentiment analysis of the Arabic language is the presence of the colloquialism. In this paper, we examine the effect of using objective words in conjunction with sentimental words on sentiment classification for the colloquial Arabic reviews, specifically Jordanian colloquial reviews. The reviews often include both sentimental and objective words, however, the most existing sentiment analysis models ignore the objective words as they are considered useless. In this work, we created two lexicons: the first includes the colloquial sentimental words and compound phrases, while the other contains the objective words associated with values of sentiment tendency based on a particular estimation method. We used these lexicons to extract sentiment features that would be training input to the Support Vector Machines (SVM) to classify the sentiment polarity of the reviews. The reviews dataset have been collected manually from JEERAN website. The results of the experiments show that the proposed approach improves the polarity classification in comparison to two baseline models, with accuracy 95.6%.