Deriving a good model for multitalker babble noise can facilitate different speech processing algorithms, e.g. noise reduction, to reduce the so-called cocktail party difficulty. In the available systems, the fact that the babble waveform is generated as a sum of N different speech waveforms is not exploited explicitly. In this paper, first we develop a gamma hidden Markov model for power spectra of the speech signal, and then formulate it as a sparse nonnegative matrix factorization (NMF). Second, the sparse NMF is extended by relaxing the sparsity constraint, and a novel model for babble noise (gamma nonnegative HMM) is proposed in which the babble basis matrix is the same as the speech basis matrix, and only the activation factors (weights) of the basis vectors are different for the two signals over time. Finally, a noise reduction algorithm is proposed using the derived speech and babble models. All of the stationary model parameters are estimated using the expectation-maximization (EM) algorithm, whereas the time-varying parameters, i.e. the gain parameters of speech and babble signals, are estimated using a recursive EM algorithm. The objective and subjective listening evaluations show that the proposed babble model and the final noise reduction algorithm significantly outperform the conventional methods.
This paper presents two single channel speech dereverberation methods to enhance the quality of speech signals that have been recorded in an enclosed space. For both methods, the room acoustics are modeled using a nonnegative approximation of the convolutive transfer function (NCTF), and to additionally exploit the spectral properties of the speech signal, such as the low rank nature of the speech spectrogram, the speech spectrogram is modeled using nonnegative matrix factorization (NMF). Two methods are described to combine the NCTF and NMF models. In the first method, referred to as the integrated method, a cost function is constructed by directly integrating the speech NMF model into the NCTF model, while in the second method, referred to as the weighted method, the NCTF and NMF based cost functions are weighted and summed. Efficient update rules are derived to solve both optimization problems. In addition, an extension of the integrated method is presented, which exploits the temporal dependencies of the speech signal. Several experiments are performed on reverberant speech signals with and without background noise, where the integrated method yields a considerably higher speech quality than the baseline NCTF method and a state of the art spectral enhancement method. Moreover, the experimental results indicate that the weighted method can even lead to a better performance in terms of instrumental quality measures, but that the optimal weighting parameter depends on the room acoustics and the utilized NMF model. Modeling the temporal dependencies in the integrated method was found to be useful only for highly reverberant conditions.
In this paper, the problem of road friction prediction from a fleet of connected vehicles is investigated. A framework is proposed to predict the road friction level using both historical friction data from the connected cars and data from weather stations, and comparative results from different methods are presented. The problem is formulated as a classification task where the available data is used to train three machine learning models including logistic regression, support vector machine, and neural networks to predict the friction class (slippery or non-slippery) in the future for specific road segments. In addition to the friction values, which are measured by moving vehicles, additional parameters such as humidity, temperature, and rainfall are used to obtain a set of descriptive feature vectors as input to the classification methods. The proposed prediction models are evaluated for different prediction horizons (0 to 120 minutes in the future) where the evaluation shows that the neural networks method leads to more stable results in different conditions.
Reducing the interference noise in a monaural noisy speech signal has been a challenging task for many years. Compared to traditional unsupervised speech enhancement methods, e.g., Wiener filtering, supervised approaches, such as algorithms based on hidden Markov models (HMM), lead to higher-quality enhanced speech signals. However, the main practical difficulty of these approaches is that for each noise type a model is required to be trained a priori. In this paper, we investigate a new class of supervised speech denoising algorithms using nonnegative matrix factorization (NMF). We propose a novel speech enhancement method that is based on a Bayesian formulation of NMF (BNMF). To circumvent the mismatch problem between the training and testing stages, we propose two solutions. First, we use an HMM in combination with BNMF (BNMF-HMM) to derive a minimum mean square error (MMSE) estimator for the speech signal with no information about the underlying noise type. Second, we suggest a scheme to learn the required noise BNMF model online, which is then used to develop an unsupervised speech enhancement system. Extensive experiments are carried out to investigate the performance of the proposed methods under different conditions. Moreover, we compare the performance of the developed algorithms with state-of-the-art speech enhancement schemes using various objective measures. Our simulations show that the proposed BNMF-based methods outperform the competing algorithms substantially.
This paper aims to investigate direct imitation learning from human drivers for the task of lane keeping assistance in highway and country roads using grayscale images from a single front view camera. The employed method utilizes convolutional neural networks (CNN) to act as a policy that is driving a vehicle. The policy is successfully learned via imitation learning using real-world data collected from human drivers and is evaluated in closed-loop simulated environments, demonstrating good driving behaviour and a robustness for domain changes. Evaluation is based on two proposed performance metrics measuring how well the vehicle is positioned in a lane and the smoothness of the driven trajectory.