Models, code, and papers for "Timothy M":

Measuring the Transferability of Adversarial Examples

Jul 14, 2019
Deyan Petrov, Timothy M. Hospedales

Adversarial examples are of wide concern due to their impact on the reliability of contemporary machine learning systems. Effective adversarial examples are mostly found via white-box attacks. However, in some cases they can be transferred across models, thus enabling them to attack black-box models. In this work we evaluate the transferability of three adversarial attacks - the Fast Gradient Sign Method, the Basic Iterative Method, and the Carlini & Wagner method, across two classes of models - the VGG class(using VGG16, VGG19 and an ensemble of VGG16 and VGG19), and the Inception class(Inception V3, Xception, Inception Resnet V2, and an ensemble of the three). We also outline the problems with the assessment of transferability in the current body of research and attempt to amend them by picking specific "strong" parameters for the attacks, and by using a L-Infinity clipping technique and the SSIM metric for the final evaluation of the attack transferability.

  Click for Model/Code and Paper
Trace Norm Regularised Deep Multi-Task Learning

Feb 17, 2017
Yongxin Yang, Timothy M. Hospedales

We propose a framework for training multiple neural networks simultaneously. The parameters from all models are regularised by the tensor trace norm, so that each neural network is encouraged to reuse others' parameters if possible -- this is the main motivation behind multi-task learning. In contrast to many deep multi-task learning models, we do not predefine a parameter sharing strategy by specifying which layers have tied parameters. Instead, our framework considers sharing for all shareable layers, and the sharing strategy is learned in a data-driven way.

* Submission to Workshop track - ICLR 2017 

  Click for Model/Code and Paper
Unifying Multi-Domain Multi-Task Learning: Tensor and Neural Network Perspectives

Nov 28, 2016
Yongxin Yang, Timothy M. Hospedales

Multi-domain learning aims to benefit from simultaneously learning across several different but related domains. In this chapter, we propose a single framework that unifies multi-domain learning (MDL) and the related but better studied area of multi-task learning (MTL). By exploiting the concept of a \emph{semantic descriptor} we show how our framework encompasses various classic and recent MDL/MTL algorithms as special cases with different semantic descriptor encodings. As a second contribution, we present a higher order generalisation of this framework, capable of simultaneous multi-task-multi-domain learning. This generalisation has two mathematically equivalent views in multi-linear algebra and gated neural networks respectively. Moreover, by exploiting the semantic descriptor, it provides neural networks the capability of zero-shot learning (ZSL), where a classifier is generated for an unseen class without any training data; as well as zero-shot domain adaptation (ZSDA), where a model is generated for an unseen domain without any training data. In practice, this framework provides a powerful yet easy to implement method that can be flexibly applied to MTL, MDL, ZSL and ZSDA.

* Invited book chapter 

  Click for Model/Code and Paper
A Unified Perspective on Multi-Domain and Multi-Task Learning

Mar 26, 2015
Yongxin Yang, Timothy M. Hospedales

In this paper, we provide a new neural-network based perspective on multi-task learning (MTL) and multi-domain learning (MDL). By introducing the concept of a semantic descriptor, this framework unifies MDL and MTL as well as encompassing various classic and recent MTL/MDL algorithms by interpreting them as different ways of constructing semantic descriptors. Our interpretation provides an alternative pipeline for zero-shot learning (ZSL), where a model for a novel class can be constructed without training data. Moreover, it leads to a new and practically relevant problem setting of zero-shot domain adaptation (ZSDA), which is the analogous to ZSL but for novel domains: A model for an unseen domain can be generated by its semantic descriptor. Experiments across this range of problems demonstrate that our framework outperforms a variety of alternatives.

* 9 pages, Accepted to ICLR 2015 Conference Track 

  Click for Model/Code and Paper
Thermodynamic-RAM Technology Stack

Mar 29, 2017
M. Alexander Nugent, Timothy W. Molter

We introduce a technology stack or specification describing the multiple levels of abstraction and specialization needed to implement a neuromorphic processor (NPU) based on the previously-described concept of AHaH Computing and integrate it into today's digital computing systems. The general purpose NPU implementation described here is called Thermodynamic-RAM (kT-RAM) and is just one of many possible architectures, each with varying advantages and trade offs. Bringing us closer to brain-like neural computation, kT-RAM will provide a general-purpose adaptive hardware resource to existing computing platforms enabling fast and low-power machine learning capabilities that are currently hampered by the separation of memory and processing, a.k.a the von Neumann bottleneck. Because understanding such a processor based on non-traditional principles can be difficult, by presenting the various levels of the stack from the bottom up, layer by layer, explaining kT-RAM becomes a much easier task. The levels of the Thermodynamic-RAM technology stack include the memristor, synapse, AHaH node, kT-RAM, instruction set, sparse spike encoding, kT-RAM emulator, and SENSE server.

  Click for Model/Code and Paper
Machine Learning with Memristors via Thermodynamic RAM

Aug 14, 2016
Timothy W. Molter, M. Alexander Nugent

Thermodynamic RAM (kT-RAM) is a neuromemristive co-processor design based on the theory of AHaH Computing and implemented via CMOS and memristors. The co-processor is a 2-D array of differential memristor pairs (synapses) that can be selectively coupled together (neurons) via the digital bit addressing of the underlying CMOS RAM circuitry. The chip is designed to plug into existing digital computers and be interacted with via a simple instruction set. Anti-Hebbian and Hebbian (AHaH) computing forms the theoretical framework from which a nature-inspired type of computing architecture is built where, unlike von Neumann architectures, memory and processor are physically combined for synaptic operations. Through exploitation of AHaH attractor states, memristor-based circuits converge to attractor basins that represents machine learning solutions such as unsupervised feature learning, supervised classification and anomaly detection. Because kT-RAM eliminates the need to shuttle bits back and forth between memory and processor and can operate at very low voltage levels, it can significantly surpass CPU, GPU, and FPGA performance for synaptic integration and learning operations. Here, we present a memristor technology developed for use in kT-RAM, in particular bi-directional incremental adaptation of conductance via short low-voltage 1.0 V, 1.0 microsecond pulses.

* CNNA 2016, 15th International Workshop on Cellular Nanoscale Networks and their Applications, Dresden, Germany, 2016, pp. 1-2 

  Click for Model/Code and Paper
Cortical Processing with Thermodynamic-RAM

Aug 14, 2014
M. Alexander Nugent, Timothy W. Molter

AHaH computing forms a theoretical framework from which a biologically-inspired type of computing architecture can be built where, unlike von Neumann systems, memory and processor are physically combined. In this paper we report on an incremental step beyond the theoretical framework of AHaH computing toward the development of a memristor-based physical neural processing unit (NPU), which we call Thermodynamic-RAM (kT-RAM). While the power consumption and speed dominance of such an NPU over von Neumann architectures for machine learning applications is well appreciated, Thermodynamic-RAM offers several advantages over other hardware approaches to adaptation and learning. Benefits include general-purpose use, a simple yet flexible instruction set and easy integration into existing digital platforms. We present a high level design of kT-RAM and a formal definition of its instruction set. We report the completion of a kT-RAM emulator and the successful port of all previous machine learning benchmark applications including unsupervised clustering, supervised and unsupervised classification, complex signal prediction, unsupervised robotic actuation and combinatorial optimization. Lastly, we extend a previous MNIST hand written digits benchmark application, to show that an extra step of reading the synaptic states of AHaH nodes during the train phase (healing) alone results in plasticity that improves the classifier's performance, bumping our best F1 score up to 99.5%.

  Click for Model/Code and Paper
Distance-Based Regularisation of Deep Networks for Fine-Tuning

Feb 19, 2020
Henry Gouk, Timothy M. Hospedales, Massimiliano Pontil

We investigate approaches to regularisation during fine-tuning of deep neural networks. First we provide a neural network generalisation bound based on Rademacher complexity that uses the distance the weights have moved from their initial values. This bound has no direct dependence on the number of weights and compares favourably to other bounds when applied to convolutional networks. Our bound is highly relevant for fine-tuning, because providing a network with a good initialisation based on transfer learning means that learning can modify the weights less, and hence achieve tighter generalisation. Inspired by this, we develop a simple yet effective fine-tuning algorithm that constrains the hypothesis class to a small sphere centred on the initial pre-trained weights, thus obtaining provably better generalisation performance than conventional transfer learning. Empirical evaluation shows that our algorithm works well, corroborating our theoretical results. It outperforms both state of the art fine-tuning competitors, and penalty-based alternatives that we show do not directly constrain the radius of the search space.

  Click for Model/Code and Paper
On Understanding Knowledge Graph Representation

Sep 25, 2019
Carl Allen, Ivana Balazevic, Timothy M. Hospedales

Many methods have been developed to represent knowledge graph data, which implicitly exploit low-rank latent structure in the data to encode known information and enable unknown facts to be inferred. To predict whether a relationship holds between entities, their embeddings are typically compared in the latent space following a relation-specific mapping. Whilst link prediction has steadily improved, the latent structure, and hence why such models capture semantic information, remains unexplained. We build on recent theoretical interpretation of word embeddings as a basis to consider an explicit structure for representations of relations between entities. For identifiable relation types, we are able to predict properties and justify the relative performance of leading knowledge graph representation methods, including their often overlooked ability to make independent predictions.

  Click for Model/Code and Paper
Frustratingly Easy Person Re-Identification: Generalizing Person Re-ID in Practice

May 09, 2019
Jieru Jia, Qiuqi Ruan, Timothy M. Hospedales

Contemporary person re-identification (Re-ID) methods usually require access to data from the deployment camera network during training in order to perform well. This is because contemporary Re-ID models trained on one dataset do not generalise to other camera networks due to the domain-shift between datasets. This requirement is often the bottleneck for deploying Re-ID systems in practical security or commercial applications as it may be impossible to collect this data in advance or prohibitively costly to annotate it. This paper alleviates this issue by proposing a simple baseline for domain generalizable~(DG) person re-identification. That is, to learn a Re-ID model from a set of source domains that is suitable for application to unseen datasets out-of-the-box, without any model updating. Specifically, we observe that the domain discrepancy in Re-ID is due to style and content variance across datasets and demonstrate appropriate Instance and Feature Normalization alleviates much of the resulting domain-shift in Deep Re-ID models. Instance Normalization~(IN) in early layers filters out style statistic variations and Feature Normalization~(FN) in deep layers is able to further eliminate disparity in content statistics. Compared to contemporary alternatives, this approach is extremely simple to implement, while being faster to train and test, thus making it an extremely valuable baseline for implementing Re-ID in practice. With a few lines of code, it increases the rank 1 Re-ID accuracy by 11.7\%, 28.9\%, 10.1\% and 6.3\% on the VIPeR, PRID, GRID, and i-LIDS benchmarks respectively. Source code will be made available.

* 9 pages,2 figures 

  Click for Model/Code and Paper
TuckER: Tensor Factorization for Knowledge Graph Completion

Jan 28, 2019
Ivana Balažević, Carl Allen, Timothy M. Hospedales

Knowledge graphs are structured representations of real world facts. However, they typically contain only a small subset of all possible facts. Link prediction is a task of inferring missing facts based on existing ones. We propose TuckER, a relatively simple but powerful linear model based on Tucker decomposition of the binary tensor representation of knowledge graph triples. TuckER outperforms all previous state-of-the-art models across standard link prediction datasets. We prove that TuckER is a fully expressive model, deriving the bound on its entity and relation embedding dimensionality for full expressiveness which is several orders of magnitude smaller than the bound of previous state-of-the-art models ComplEx and SimplE. We further show that several previously introduced linear models can be viewed as special cases of TuckER.

  Click for Model/Code and Paper
Hypernetwork Knowledge Graph Embeddings

Oct 18, 2018
Ivana Balazevic, Carl Allen, Timothy M. Hospedales

Knowledge graphs are large graph-structured databases of facts, which typically suffer from incompleteness. Link prediction is the task of inferring missing relations (links) between entities (nodes) in a knowledge graph. We approach this task using a hypernetwork architecture to generate convolutional layer filters specific to each relation and apply those filters to the subject entity embeddings. This architecture enables a trade-off between non-linear expressiveness and the number of parameters to learn. Our model simplifies the entity and relation embedding interactions introduced by the predecessor convolutional model, while outperforming all previous approaches to link prediction across all standard link prediction datasets.

  Click for Model/Code and Paper
Multi-Level Factorisation Net for Person Re-Identification

Apr 17, 2018
Xiaobin Chang, Timothy M. Hospedales, Tao Xiang

Key to effective person re-identification (Re-ID) is modelling discriminative and view-invariant factors of person appearance at both high and low semantic levels. Recently developed deep Re-ID models either learn a holistic single semantic level feature representation and/or require laborious human annotation of these factors as attributes. We propose Multi-Level Factorisation Net (MLFN), a novel network architecture that factorises the visual appearance of a person into latent discriminative factors at multiple semantic levels without manual annotation. MLFN is composed of multiple stacked blocks. Each block contains multiple factor modules to model latent factors at a specific level, and factor selection modules that dynamically select the factor modules to interpret the content of each input image. The outputs of the factor selection modules also provide a compact latent factor descriptor that is complementary to the conventional deeply learned features. MLFN achieves state-of-the-art results on three Re-ID datasets, as well as compelling results on the general object categorisation CIFAR-100 dataset.

* To Appear at CVPR2018 

  Click for Model/Code and Paper
Scalable and Effective Deep CCA via Soft Decorrelation

Mar 24, 2018
Xiaobin Chang, Tao Xiang, Timothy M. Hospedales

Recently the widely used multi-view learning model, Canonical Correlation Analysis (CCA) has been generalised to the non-linear setting via deep neural networks. Existing deep CCA models typically first decorrelate the feature dimensions of each view before the different views are maximally correlated in a common latent space. This feature decorrelation is achieved by enforcing an exact decorrelation constraint; these models are thus computationally expensive due to the matrix inversion or SVD operations required for exact decorrelation at each training iteration. Furthermore, the decorrelation step is often separated from the gradient descent based optimisation, resulting in sub-optimal solutions. We propose a novel deep CCA model Soft CCA to overcome these problems. Specifically, exact decorrelation is replaced by soft decorrelation via a mini-batch based Stochastic Decorrelation Loss (SDL) to be optimised jointly with the other training objectives. Extensive experiments show that the proposed soft CCA is more effective and efficient than existing deep CCA models. In addition, our SDL loss can be applied to other deep models beyond multi-view learning, and obtains superior performance compared to existing decorrelation losses.

* To Appear at CVPR2018 

  Click for Model/Code and Paper
Deep Matching Autoencoders

Nov 16, 2017
Tanmoy Mukherjee, Makoto Yamada, Timothy M. Hospedales

Increasingly many real world tasks involve data in multiple modalities or views. This has motivated the development of many effective algorithms for learning a common latent space to relate multiple domains. However, most existing cross-view learning algorithms assume access to paired data for training. Their applicability is thus limited as the paired data assumption is often violated in practice: many tasks have only a small subset of data available with pairing annotation, or even no paired data at all. In this paper we introduce Deep Matching Autoencoders (DMAE), which learn a common latent space and pairing from unpaired multi-modal data. Specifically we formulate this as a cross-domain representation learning and object matching problem. We simultaneously optimise parameters of representation learning auto-encoders and the pairing of unpaired multi-modal data. This framework elegantly spans the full regime from fully supervised, semi-supervised, and unsupervised (no paired data) multi-modal learning. We show promising results in image captioning, and on a new task that is uniquely enabled by our methodology: unsupervised classifier learning.

* 10 pages 

  Click for Model/Code and Paper
Bayesian Joint Modelling for Object Localisation in Weakly Labelled Images

Jun 19, 2017
Zhiyuan Shi, Timothy M. Hospedales, Tao Xiang

We address the problem of localisation of objects as bounding boxes in images and videos with weak labels. This weakly supervised object localisation problem has been tackled in the past using discriminative models where each object class is localised independently from other classes. In this paper, a novel framework based on Bayesian joint topic modelling is proposed, which differs significantly from the existing ones in that: (1) All foreground object classes are modelled jointly in a single generative model that encodes multiple object co-existence so that "explaining away" inference can resolve ambiguity and lead to better learning and localisation. (2) Image backgrounds are shared across classes to better learn varying surroundings and "push out" objects of interest. (3) Our model can be learned with a mixture of weakly labelled and unlabelled data, allowing the large volume of unlabelled images on the Internet to be exploited for learning. Moreover, the Bayesian formulation enables the exploitation of various types of prior knowledge to compensate for the limited supervision offered by weakly labelled data, as well as Bayesian domain adaptation for transfer learning. Extensive experiments on the PASCAL VOC, ImageNet and YouTube-Object videos datasets demonstrate the effectiveness of our Bayesian joint model for weakly supervised object localisation.

* Accepted in IEEE Transaction on Pattern Analysis and Machine Intelligence 

  Click for Model/Code and Paper
Transferring a Semantic Representation for Person Re-Identification and Search

Jun 12, 2017
Zhiyuan Shi, Timothy M. Hospedales, Tao Xiang

Learning semantic attributes for person re-identification and description-based person search has gained increasing interest due to attributes' great potential as a pose and view-invariant representation. However, existing attribute-centric approaches have thus far underperformed state-of-the-art conventional approaches. This is due to their non-scalable need for extensive domain (camera) specific annotation. In this paper we present a new semantic attribute learning approach for person re-identification and search. Our model is trained on existing fashion photography datasets -- either weakly or strongly labelled. It can then be transferred and adapted to provide a powerful semantic description of surveillance person detections, without requiring any surveillance domain supervision. The resulting representation is useful for both unsupervised and supervised person re-identification, achieving state-of-the-art and near state-of-the-art performance respectively. Furthermore, as a semantic representation it allows description-based person search to be integrated within the same framework.

* cvpr 2015 

  Click for Model/Code and Paper
Bayesian Joint Topic Modelling for Weakly Supervised Object Localisation

May 09, 2017
Zhiyuan Shi, Timothy M. Hospedales, Tao Xiang

We address the problem of localisation of objects as bounding boxes in images with weak labels. This weakly supervised object localisation problem has been tackled in the past using discriminative models where each object class is localised independently from other classes. We propose a novel framework based on Bayesian joint topic modelling. Our framework has three distinctive advantages over previous works: (1) All object classes and image backgrounds are modelled jointly together in a single generative model so that "explaining away" inference can resolve ambiguity and lead to better learning and localisation. (2) The Bayesian formulation of the model enables easy integration of prior knowledge about object appearance to compensate for limited supervision. (3) Our model can be learned with a mixture of weakly labelled and unlabelled data, allowing the large volume of unlabelled images on the Internet to be exploited for learning. Extensive experiments on the challenging VOC dataset demonstrate that our approach outperforms the state-of-the-art competitors.

* iccv 2013 

  Click for Model/Code and Paper
Multi-Task Zero-Shot Action Recognition with Prioritised Data Augmentation

Nov 26, 2016
Xun Xu, Timothy M. Hospedales, Shaogang Gong

Zero-Shot Learning (ZSL) promises to scale visual recognition by bypassing the conventional model training requirement of annotated examples for every category. This is achieved by establishing a mapping connecting low-level features and a semantic description of the label space, referred as visual-semantic mapping, on auxiliary data. Reusing the learned mapping to project target videos into an embedding space thus allows novel-classes to be recognised by nearest neighbour inference. However, existing ZSL methods suffer from auxiliary-target domain shift intrinsically induced by assuming the same mapping for the disjoint auxiliary and target classes. This compromises the generalisation accuracy of ZSL recognition on the target data. In this work, we improve the ability of ZSL to generalise across this domain shift in both model- and data-centric ways by formulating a visual-semantic mapping with better generalisation properties and a dynamic data re-weighting method to prioritise auxiliary data that are relevant to the target classes. Specifically: (1) We introduce a multi-task visual-semantic mapping to improve generalisation by constraining the semantic mapping parameters to lie on a low-dimensional manifold, (2) We explore prioritised data augmentation by expanding the pool of auxiliary data with additional instances weighted by relevance to the target domain. The proposed new model is applied to the challenging zero-shot action recognition problem to demonstrate its advantages over existing ZSL models.

* Published in ECCV 2016 

  Click for Model/Code and Paper
Gated Neural Networks for Option Pricing: Rationality by Design

Nov 19, 2016
Yongxin Yang, Yu Zheng, Timothy M. Hospedales

We propose a neural network approach to price EU call options that significantly outperforms some existing pricing models and comes with guarantees that its predictions are economically reasonable. To achieve this, we introduce a class of gated neural networks that automatically learn to divide-and-conquer the problem space for robust and accurate pricing. We then derive instantiations of these networks that are 'rational by design' in terms of naturally encoding a valid call option surface that enforces no arbitrage principles. This integration of human insight within data-driven learning provides significantly better generalisation in pricing performance due to the encoded inductive bias in the learning, guarantees sanity in the model's predictions, and provides econometrically useful byproduct such as risk neutral density.

* Accepted to AAAI 2017. 7 pages, 5 figures 

  Click for Model/Code and Paper