The superior performance of object detectors is often established under the condition that the test samples are in the same distribution as the training data. However, in many practical applications, out-of-distribution (OOD) instances are inevitable and usually lead to uncertainty in the results. In this paper, we propose a novel, intuitive, and scalable probabilistic object detection method for OOD detection. Unlike other uncertainty-modeling methods that either require huge computational costs to infer the weight distributions or rely on model training through synthetic outlier data, our method is able to distinguish between in-distribution (ID) data and OOD data via weight parameter sampling from proposed Gaussian distributions based on pre-trained networks. We demonstrate that our Bayesian object detector can achieve satisfactory OOD identification performance by reducing the FPR95 score by up to 8.19% and increasing the AUROC score by up to 13.94% when trained on BDD100k and VOC datasets as the ID datasets and evaluated on COCO2017 dataset as the OOD dataset.
Learning in uncertain, noisy, or adversarial environments is a challenging task for deep neural networks (DNNs). We propose a new theoretically grounded and efficient approach for robust learning that builds upon Bayesian estimation and Variational Inference. We formulate the problem of density propagation through layers of a DNN and solve it using an Ensemble Density Propagation (EnDP) scheme. The EnDP approach allows us to propagate moments of the variational probability distribution across the layers of a Bayesian DNN, enabling the estimation of the mean and covariance of the predictive distribution at the output of the model. Our experiments using MNIST and CIFAR-10 datasets show a significant improvement in the robustness of the trained models to random noise and adversarial attacks.
This paper presents a scalable deep learning approach for short-term traffic prediction based on historical traffic data in a vehicular road network. Capturing the spatio-temporal relationship of the big data often requires a significant amount of computational burden or an ad-hoc design aiming for a specific type of road network. To tackle the problem, we combine a road network graph with recurrent neural networks (RNNs) to construct a structural RNN (SRNN). The SRNN employs a spatio-temporal graph to infer the interaction between adjacent road segments as well as the temporal dynamics of the time series data. The model is scalable thanks to two key aspects. First, the proposed SRNN architecture is built by using the semantic similarity of the spatio-temporal dynamic interactions of all segments. Second, we design the architecture to deal with fixed-length tensors regardless of the graph topology. With the real traffic speed data measured in the city of Santander, we demonstrate the proposed SRNN outperforms the image-based approaches using the capsule network (CapsNet) by 14.1% and the convolutional neural network (CNN) by 5.87%, respectively, in terms of root mean squared error (RMSE). Moreover, we show that the proposed model is scalable. The SRNN model trained with data of a road network is able to predict traffic speed of different road networks, with the fixed number of parameters to train.
Estimating hidden processes from non-linear noisy observations is particularly difficult when the parameters of these processes are not known. This paper adopts a machine learning approach to devise variational Bayesian inference for such scenarios. In particular, a random process generated by the autoregressive moving average (ARMA) linear model is inferred from non-linearity noise observations. The posterior distribution of hidden states are approximated by a set of weighted particles generated by the sequential Monte carlo (SMC) algorithm involving sampling with importance sampling resampling (SISR). Numerical efficiency and estimation accuracy of the proposed inference method are evaluated by computer simulations. Furthermore, the proposed inference method is demonstrated on a practical problem of estimating the missing values in the gene expression time series assuming vector autoregressive (VAR) data model.
Deep neural networks have recently demonstrated the traffic prediction capability with the time series data obtained by sensors mounted on road segments. However, capturing spatio-temporal features of the traffic data often requires a significant number of parameters to train, increasing computational burden. In this work we demonstrate that embedding topological information of the road network improves the process of learning traffic features. We use a graph of a vehicular road network with recurrent neural networks (RNNs) to infer the interaction between adjacent road segments as well as the temporal dynamics. The topology of the road network is converted into a spatio-temporal graph to form a structural RNN (SRNN). The proposed approach is validated over traffic speed data from the road network of the city of Santander in Spain. The experiment shows that the graph-based method outperforms the state-of-the-art methods based on spatio-temporal images, requiring much fewer parameters to train.
A novel method to propagate uncertainty through the soft-thresholding nonlinearity is proposed in this paper. At every layer the current distribution of the target vector is represented as a spike and slab distribution, which represents the probabilities of each variable being zero, or Gaussian-distributed. Using the proposed method of uncertainty propagation, the gradients of the logarithms of normalisation constants are derived, that can be used to update a weight distribution. A novel Bayesian neural network for sparse coding is designed utilising both the proposed method of uncertainty propagation and Bayesian inference algorithm.
This paper proposes a deep learning approach for traffic flow prediction in complex road networks. Traffic flow data from induction loop sensors are essentially a time series, which is also spatially related to traffic in different road segments. The spatio-temporal traffic data can be converted into an image where the traffic data are expressed in a 3D space with respect to space and time axes. Although convolutional neural networks (CNNs) have been showing surprising performance in understanding images, they have a major drawback. In the max pooling operation, CNNs are losing important information by locally taking the highest activation values. The inter-relationship in traffic data measured by sparsely located sensors in different time intervals should not be neglected in order to obtain accurate predictions. Thus, we propose a neural network with capsules that replaces max pooling by dynamic routing. This is the first approach that employs the capsule network on a time series forecasting problem, to our best knowledge. Moreover, an experiment on real traffic speed data measured in the Santander city of Spain demonstrates the proposed method outperforms the state-of-the-art method based on a CNN by 13.1% in terms of root mean squared error.
This paper introduces a new sparse spatio-temporal structured Gaussian process regression framework for online and offline Bayesian inference. This is the first framework that gives a time-evolving representation of the interdependencies between the components of the sparse signal of interest. A hierarchical Gaussian process describes such structure and the interdependencies are represented via the covariance matrices of the prior distributions. The inference is based on the expectation propagation method and the theoretical derivation of the posterior distribution is provided in the paper. The inference framework is thoroughly evaluated over synthetic, real video and electroencephalography (EEG) data where the spatio-temporal evolving patterns need to be reconstructed with high accuracy. It is shown that it achieves 15% improvement of the F-measure compared with the alternating direction method of multipliers, spatio-temporal sparse Bayesian learning method and one-level Gaussian process model. Additionally, the required memory for the proposed algorithm is less than in the one-level Gaussian process model. This structured sparse regression framework is of broad applicability to source localisation and object detection problems with sparse signals.
Gaussian process regression is a machine learning approach which has been shown its power for estimation of unknown functions. However, Gaussian processes suffer from high computational complexity, as in a basic form they scale cubically with the number of observations. Several approaches based on inducing points were proposed to handle this problem in a static context. These methods though face challenges with real-time tasks and when the data is received sequentially over time. In this paper, a novel online algorithm for training sparse Gaussian process models is presented. It treats the mean and hyperparameters of the Gaussian process as the state and parameters of the ensemble Kalman filter, respectively. The online evaluation of the parameters and the state is performed on new upcoming samples of data. This procedure iteratively improves the accuracy of parameter estimates. The ensemble Kalman filter reduces the computational complexity required to obtain predictions with Gaussian processes preserving the accuracy level of these predictions. The performance of the proposed method is demonstrated on the synthetic dataset and real large dataset of UK house prices.
Semi-supervised and unsupervised systems provide operators with invaluable support and can tremendously reduce the operators load. In the light of the necessity to process large volumes of video data and provide autonomous decisions, this work proposes new learning algorithms for activity analysis in video. The activities and behaviours are described by a dynamic topic model. Two novel learning algorithms based on the expectation maximisation approach and variational Bayes inference are proposed. Theoretical derivations of the posterior of model parameters are given. The designed learning algorithms are compared with the Gibbs sampling inference scheme introduced earlier in the literature. A detailed comparison of the learning algorithms is presented on real video data. We also propose an anomaly localisation procedure, elegantly embedded in the topic modeling framework. The proposed framework can be applied to a number of areas, including transportation systems, security and surveillance.