Models, code, and papers for "Daniel A":
Simulating images representative of neurodegenerative diseases is important for predicting patient outcomes and for validation of computational models of disease progression. This capability is valuable for secondary prevention clinical trials where outcomes and screening criteria involve neuroimaging. Traditional computational methods are limited by imposing a parametric model for atrophy and are extremely resource-demanding. Recent advances in deep learning have yielded data-driven models for longitudinal studies (e.g., face ageing) that are capable of generating synthetic images in real-time. Similar solutions can be used to model trajectories of atrophy in the brain, although new challenges need to be addressed to ensure accurate disease progression modelling. Here we propose Degenerative Adversarial NeuroImage Net (DaniNet) --- a new deep learning approach that learns to emulate the effect of neurodegeneration on MRI. DaniNet uses an underlying set of Support Vector Regressors (SVRs) trained to capture the patterns of regional intensity changes that accompany disease progression. DaniNet produces whole output images, consisting of 2D-MRI slices that are constrained to match regional predictions from the SVRs. DaniNet is also able to condition the progression on non-imaging characteristics (age, diagnosis, etc.) while it maintains the unique brain morphology of individuals. Adversarial training ensures realistic brain images and smooth temporal progression. We train our model using 9652 T1-weighted (longitudinal) MRI extracted from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. We perform quantitative and qualitative evaluations on a separate test set of 1283 images (also from ADNI) demonstrating the ability of DaniNet to produce accurate and convincing synthetic images that emulate disease progression.
The recent success of deep learning together with the availability of large medical imaging datasets have enabled researchers to improve our understanding of complex chronic medical conditions such as neurodegenerative diseases. The possibility of predicting realistic and accurate images would be a breakthrough for many clinical healthcare applications. However, current image simulators designed to model neurodegenerative disease progression present limitations that preclude their utility in clinical practice. These limitations include personalization of disease progression and the ability to synthesize spatiotemporal images in high resolution. In particular, memory limitations prohibit full 3D image models, necessitating various techniques to discard spatiotemporal information, such as patch-based approaches. In this work, we introduce a novel technique to address this challenge, called Profile Weight Functions (PWF). We demonstrate its effectiveness integrated within our new deep learning framework, showing that it enables the extension to 3D of a recent state-of-the-art 2D approach. To our knowledge, we are the first to implement a personalized disease progression simulator able to predict accurate, personalised, high-resolution, 3D MRI. In particular, we trained a model of ageing and Alzheimer's disease progression using 9652 T1-weighted (longitudinal) MRI from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset and validated on a separate test set of 1283 MRI (also from ADNI, random partition). We validated our model by analyzing its capability to synthesize MRI that produce accurate volumes of specific brain regions associated with neurodegeneration. Our experiments demonstrate the effectiveness of our solution to provide a 3D simulation that produces accurate and convincing synthetic MRI that emulate ageing and disease progression.
Explainability in Artificial Intelligence has been revived as a topic of active research by the need of conveying safety and trust to users in the `how' and `why' of automated decision-making. Whilst a plethora of approaches have been developed for post-hoc explainability, only a few focus on how to use domain knowledge, and how this influences the understandability of an explanation from the users' perspective. In this paper we show how ontologies help the understandability of interpretable machine learning models, such as decision trees. In particular, we build on Trepan, an algorithm that explains artificial neural networks by means of decision trees, and we extend it to include ontologies modeling domain knowledge in the process of generating explanations. We present the results of a user study that measures the understandability of decision trees in domains where explanations are critical, namely, in finance and medicine. Our study shows that decision trees taking into account domain knowledge during generation are more understandable than those generated without the use of ontologies.
Light scattered from multiple surfaces can be used to retrieve information of hidden environments. However, full three-dimensional retrieval of an object hidden from view by a wall has only been achieved with scanning systems and requires intensive computational processing of the retrieved data. Here we use a non-scanning, single-photon single-pixel detector in combination with an artificial neural network: this allows us to locate the position and to also simultaneously provide the actual identity of a hidden person, chosen from a database of people (N=3). Artificial neural networks applied to specific computational imaging problems can therefore enable novel imaging capabilities with hugely simplified hardware and processing times
Data science requires time-consuming iterative manual activities. In particular, activities such as data selection, preprocessing, transformation, and mining, highly depend on iterative trial-and-error processes that could be sped up significantly by providing quick feedback on the impact of changes. The idea of progressive data science is to compute the results of changes in a progressive manner, returning a first approximation of results quickly and allow iterative refinements until converging to a final result. Enabling the user to interact with the intermediate results allows an early detection of erroneous or suboptimal choices, the guided definition of modifications to the pipeline and their quick assessment. In this paper, we discuss the progressiveness challenges arising in different steps of the data science pipeline. We describe how changes in each step of the pipeline impact the subsequent steps and outline why progressive data science will help to make the process more effective. Computing progressive approximations of outcomes resulting from changes creates numerous research challenges, especially if the changes are made in the early steps of the pipeline. We discuss these challenges and outline first steps towards progressiveness, which, we argue, will ultimately help to significantly speed-up the overall data science process.
We propose to optimize neural networks with a uniformly-distributed random learning rate. The associated stochastic gradient descent algorithm can be approximated by continuous stochastic equations and analyzed with the Fokker-Planck formalism. In the small learning rate approximation, the training process is characterized by an effective temperature which depends on the average learning rate, the mini-batch size and the momentum of the optimization algorithm. By comparing the random learning rate protocol with cyclic and constant protocols, we suggest that the random choice is generically the best strategy in the small learning rate regime, yielding better regularization without extra computational cost. We provide supporting evidence through experiments on both shallow, fully-connected and deep, convolutional neural networks for image classification on the MNIST and CIFAR10 datasets.
We develop a normative framework for hierarchical model-based policy optimization based on applying second-order methods in the space of all possible state-action paths. The resulting natural path gradient performs policy updates in a manner which is sensitive to the long-range correlational structure of the induced stationary state-action densities. We demonstrate that the natural path gradient can be computed exactly given an environment dynamics model and depends on expressions akin to higher-order successor representations. In simulation, we show that the priorization of local policy updates in the resulting policy flow indeed reflects the intuitive state-space hierarchy in several toy problems.
We seek to align agent policy with human expert behavior in a reinforcement learning (RL) setting, without any prior knowledge about dynamics, reward function, and unsafe states. There is a human expert knowing the rewards and unsafe states based on his preference and objective, but querying that human expert is expensive. To address this challenge, we propose a new framework for imitation learning (IL) algorithm that actively and interactively learns a model of the user's reward function with efficient queries. We build an adversarial generative model of states and a successor feature (SR) model trained over transition experience collected by learning policy. Our method uses these models to select state-action pairs, asking the user to comment on the optimality or safety, and trains a adversarial neural network to predict the rewards. Different from previous papers, which are almost all based on uncertainty sampling, the key idea is to actively and efficiently select state-action pairs from both on-policy and off-policy experience, by discriminating the queried (expert) and unqueried (generated) data and maximizing the efficiency of value function learning. We call this method adversarial reward query with successor representation. We evaluate the proposed method with simulated human on a state-based 2D navigation task, robotic control tasks and the image-based video games, which have high-dimensional observation and complex state dynamics. The results show that the proposed method significantly outperforms uncertainty-based methods on learning reward models, achieving better query efficiency, where the adversarial discriminator can make the agent learn human behavior more efficiently and the SR can select states which have stronger impact on value function. Moreover, the proposed method can also learn to avoid unsafe states when training the reward model.
We present Smooth Grad-CAM++, a technique which combines two recent techniques: SMOOTHGRAD and Grad-CAM++. Smooth Grad-CAM++ has the capability of either visualizing a layer, subset of feature maps, or subset of neurons within a feature map at each instance. We experimented with few images, and we discovered that Smooth Grad-CAM++ produced more visually sharp maps with larger number of salient pixels highlighted in the given input images when compared with other methods. Smooth Grad-CAM++ will give insight into what our deep CNN models (including models trained on medical scan or imagery) learn. Hence informing decisions on creating a representative training set.
Localization of unknown faults in industrial systems is a difficult task for data-driven diagnosis methods. The classification performance of many machine learning methods relies on the quality of training data. Unknown faults, for example faults not represented in training data, can be detected using, for example, anomaly classifiers. However, mapping these unknown faults to an actual location in the real system is a non-trivial problem. In model-based diagnosis, physical-based models are used to create residuals that isolate faults by mapping model equations to faulty system components. Developing sufficiently accurate physical-based models can be a time-consuming process. Hybrid modeling methods combining physical-based methods and machine learning is one solution to design data-driven residuals for fault isolation. In this work, a set of neural network-based residuals are designed by incorporating physical insights about the system behavior in the residual model structure. The residuals are trained using only fault-free data and a simulation case study shows that they can be used to perform fault isolation and localization of unknown faults in the system.
Agent-based models are a powerful tool for studying the behaviour of complex systems that can be described in terms of multiple, interacting ``agents''. However, because of their inherently discrete and often highly non-linear nature, it is very difficult to reason about the relationship between the state of the model, on the one hand, and our observations of the real world on the other. In this paper we consider agents that have a discrete set of states that, at any instant, act with a probability that may depend on the environment or the state of other agents. Given this, we show how the mathematical apparatus of quantum field theory can be used to reason probabilistically about the state and dynamics the model, and describe an algorithm to update our belief in the state of the model in the light of new, real-world observations. Using a simple predator-prey model on a 2-dimensional spatial grid as an example, we demonstrate the assimilation of incomplete, noisy observations and show that this leads to an increase in the mutual information between the actual state of the observed system and the posterior distribution given the observations, when compared to a null model.
Natural language understanding (NLU) of text is a fundamental challenge in AI, and it has received significant attention throughout the history of NLP research. This primary goal has been studied under different tasks, such as Question Answering (QA) and Textual Entailment (TE). In this thesis, we investigate the NLU problem through the QA task and focus on the aspects that make it a challenge for the current state-of-the-art technology. This thesis is organized into three main parts: In the first part, we explore multiple formalisms to improve existing machine comprehension systems. We propose a formulation for abductive reasoning in natural language and show its effectiveness, especially in domains with limited training data. Additionally, to help reasoning systems cope with irrelevant or redundant information, we create a supervised approach to learn and detect the essential terms in questions. In the second part, we propose two new challenge datasets. In particular, we create two datasets of natural language questions where (i) the first one requires reasoning over multiple sentences; (ii) the second one requires temporal common sense reasoning. We hope that the two proposed datasets will motivate the field to address more complex problems. In the final part, we present the first formal framework for multi-step reasoning algorithms, in the presence of a few important properties of language use, such as incompleteness, ambiguity, etc. We apply this framework to prove fundamental limitations for reasoning algorithms. These theoretical results provide extra intuition into the existing empirical evidence in the field.
Urbanization is a common phenomenon in developing countries and it poses serious challenges when not managed effectively. Lack of proper planning and management may cause the encroachment of urban fabrics into reserved or special regions which in turn can lead to an unsustainable increase in population. Ineffective management and planning generally leads to depreciated standard of living, where physical hazards like traffic accidents and disease vector breeding become prevalent. In order to support urban planners and policy makers in effective planning and accurate decision making, we investigate urban land-use in sub-Saharan Africa. Land-use dynamics serves as a crucial parameter in current strategies and policies for natural resource management and monitoring. Focusing on Nairobi, we use an efficient deep learning approach with patch-based prediction to classify regions based on land-use from 2004 to 2018 on a quarterly basis. We estimate changes in land-use within this period, and using the Autoregressive Integrated Moving Average (ARIMA) model, our results forecast land-use for a given future date. Furthermore, we provide labelled land-use maps which will be helpful to urban planners.
This article studies the financial time series data processing for machine learning. It introduces the most frequent scaling methods, then compares the resulting stationarity and preservation of useful information for trend forecasting. It proposes an empirical test based on the capability to learn simple data relationship with simple models. It also speaks about the data split method specific to time series, avoiding unwanted overfitting and proposes various labelling for classification and regression.
We examine a reductions approach to fair optimization and learning where a black-box optimizer is used to learn a fair model for classification or regression [Alabi et al., 2018, Agarwal et al., 2018] and explore the creation of such fair models that adhere to data privacy guarantees (specifically differential privacy). For this approach, we consider two suites of use cases: the first is for optimizing convex performance measures of the confusion matrix (such as $G$-mean and $H$-mean); the second is for satisfying statistical definitions of algorithmic fairness (such as equalized odds, demographic parity, and the gini index of inequality). The reductions approach to fair optimization can be abstracted as the constrained group-objective optimization problem where we aim to optimize an objective that is a function of losses of individual groups, subject to some constraints. We present two differentially private algorithms: an $(\epsilon, 0)$ exponential sampling algorithm and an $(\epsilon, \delta)$ algorithm that uses a linear optimizer to incrementally move toward the best decision. We analyze the privacy and utility guarantees of these empirical risk minimization algorithms. Compared to a previous method for ensuring differential privacy subject to a relaxed form of the equalized odds fairness constraint, the $(\epsilon, \delta)$ differentially private algorithm we present provides asymptotically better sample complexity guarantees. The technique of using an approximate linear optimizer oracle to achieve privacy might be applicable to other problems not considered in this paper. Finally, we show an algorithm-agnostic lower bound on the accuracy of any solution to the problem of $(\epsilon, 0)$ or $(\epsilon, \delta)$ private constrained group-objective optimization.
This paper studies a recent proposal to use randomized value functions to drive exploration in reinforcement learning. These randomized value functions are generated by injecting random noise into the training data, making the approach compatible with many popular methods for estimating parameterized value functions. By providing a worst-case regret bound for tabular finite-horizon Markov decision processes, we show that planning with respect to these randomized value functions can induce provably efficient exploration.
In this paper, we consider the problem of learning a (first-order) theorem prover where we use a representation of beliefs in mathematical claims instead of a proof system to search for proofs. The inspiration for doing so comes from the practices of human mathematicians where a proof system is typically used after the fact to justify a sequence of intuitive steps obtained by "plausible reasoning" rather than to discover them. Towards this end, we introduce a probabilistic representation of beliefs in first-order statements based on first-order distributive normal forms (dnfs) devised by the philosopher Jaakko Hintikka. Notably, the representation supports Bayesian update and does not enforce that logically equivalent statements are assigned the same probability---otherwise, we would end up in a circular situation where we require a prover in order to assign beliefs. We then examine (1) conjecturing as (statistical) model selection and (2) an alternating-turn proving game amenable (in principle) to self-play training to learn a prover that is both complete in the limit and sound provided that players maintain "reasonable" beliefs. Dnfs have super-exponential space requirements so the ideas in this paper should be taken as conducting a thought experiment on "learning to prove". As a step towards making the ideas practical, we will comment on how abstractions can be used to control the space requirements at the cost of completeness.