Nasser Mohammadiha

Nonnegative HMM for Babble Noise Derived from Speech HMM: Application to Speech Enhancement

Sep 16, 2017
Nasser Mohammadiha, Arne Leijon

Deriving a good model for multitalker babble noise can facilitate different speech processing algorithms, e.g. noise reduction, to reduce the so-called cocktail party difficulty. In the available systems, the fact that the babble waveform is generated as a sum of N different speech waveforms is not exploited explicitly. In this paper, first we develop a gamma hidden Markov model for power spectra of the speech signal, and then formulate it as a sparse nonnegative matrix factorization (NMF). Second, the sparse NMF is extended by relaxing the sparsity constraint, and a novel model for babble noise (gamma nonnegative HMM) is proposed in which the babble basis matrix is the same as the speech basis matrix, and only the activation factors (weights) of the basis vectors are different for the two signals over time. Finally, a noise reduction algorithm is proposed using the derived speech and babble models. All of the stationary model parameters are estimated using the expectation-maximization (EM) algorithm, whereas the time-varying parameters, i.e. the gain parameters of speech and babble signals, are estimated using a recursive EM algorithm. The objective and subjective listening evaluations show that the proposed babble model and the final noise reduction algorithm significantly outperform the conventional methods.

* IEEE Trans. Audio, Speech and Language Process., vol. 21, no. 5, pp. 998-1011, May 2013
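
The shared-basis idea in the abstract above can be illustrated with a small numerical sketch. It is not the paper's gamma HMM or its EM estimation: the basis `W`, the sizes, the random state sequences, and the gains are all made-up placeholders, and power spectra are treated as additive in expectation, which is an assumption made here for illustration. Each talker uses one active basis vector per frame (a sparse factorization), and the babble activations come out dense while the basis stays the same.

```python
import numpy as np

def one_hot_activations(states, n_states, gains):
    """Sparse activation matrix: exactly one active basis vector (state) per frame."""
    H = np.zeros((n_states, len(states)))
    H[states, np.arange(len(states))] = gains
    return H

rng = np.random.default_rng(0)
n_freq, n_states, n_frames, n_talkers = 8, 5, 20, 4  # hypothetical sizes

# Hypothetical speech basis: column k stands for the mean power spectrum of state k.
W = rng.gamma(shape=2.0, scale=1.0, size=(n_freq, n_states))

# Each talker follows its own state sequence with its own per-frame gain.
talker_H = [
    one_hot_activations(
        rng.integers(0, n_states, size=n_frames),
        n_states,
        rng.gamma(2.0, 1.0, size=n_frames),
    )
    for _ in range(n_talkers)
]

# Babble as a sum of talkers: the basis W is shared, only the activations differ.
H_babble = sum(talker_H)
V_babble = W @ H_babble
V_sum = sum(W @ H for H in talker_H)

print("active basis vectors per frame (single talker):", (talker_H[0] > 0).sum(axis=0))
print("active basis vectors per frame (babble):       ", (H_babble > 0).sum(axis=0))
```

Relaxing the one-active-state constraint is what turns the sparse speech factorization into the denser babble factorization in this toy setting.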

Speech Dereverberation Using Nonnegative Convolutive Transfer Function and Spectro-temporal Modeling

Sep 16, 2017
Nasser Mohammadiha, Simon Doclo

This paper presents two single-channel speech dereverberation methods to enhance the quality of speech signals that have been recorded in an enclosed space. For both methods, the room acoustics are modeled using a nonnegative approximation of the convolutive transfer function (NCTF). To additionally exploit the spectral properties of the speech signal, such as the low-rank nature of the speech spectrogram, the speech spectrogram is modeled using nonnegative matrix factorization (NMF). Two methods are described to combine the NCTF and NMF models. In the first method, referred to as the integrated method, a cost function is constructed by directly integrating the speech NMF model into the NCTF model. In the second method, referred to as the weighted method, the NCTF-based and NMF-based cost functions are weighted and summed. Efficient update rules are derived to solve both optimization problems. In addition, an extension of the integrated method is presented, which exploits the temporal dependencies of the speech signal. Several experiments are performed on reverberant speech signals with and without background noise, where the integrated method yields considerably higher speech quality than the baseline NCTF method and a state-of-the-art spectral enhancement method. Moreover, the experimental results indicate that the weighted method can lead to even better performance in terms of instrumental quality measures, but that the optimal weighting parameter depends on the room acoustics and the NMF model used. Modeling the temporal dependencies in the integrated method was found to be useful only for highly reverberant conditions.

* IEEE Trans. Audio, Speech and Language Process., vol. 24, no. 2, Feb. 2016
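
The weighted combination described in the abstract above can be sketched as a single cost function. This is a minimal illustration under stated assumptions, not the paper's formulation: the generalized Kullback-Leibler divergence, the function names, and the weight name `lam` are choices made here, and the paper's update rules are not reproduced. The NCTF part predicts each reverberant frame as a nonnegative per-frequency convolution of the clean spectrogram `S` with filter taps `H`, and the NMF part asks `S` to be close to `W @ A`.

```python
import numpy as np

def nctf_predict(H, S):
    """X_hat[k, t] = sum_l H[k, l] * S[k, t - l], a nonnegative convolution per frequency."""
    n_taps = H.shape[1]
    n_frames = S.shape[1]
    X_hat = np.zeros_like(S, dtype=float)
    for l in range(min(n_taps, n_frames)):
        X_hat[:, l:] += H[:, [l]] * S[:, : n_frames - l]
    return X_hat

def gen_kl(P, Q, eps=1e-12):
    """Generalized KL divergence between nonnegative arrays (assumed cost, see lead-in)."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    return float(np.sum(P * np.log((P + eps) / (Q + eps)) - P + Q))

def weighted_cost(X, H, S, W, A, lam):
    """NCTF cost plus lam times the NMF cost (lam is a hypothetical weight name)."""
    return gen_kl(X, nctf_predict(H, S)) + lam * gen_kl(S, W @ A)

# Toy check with made-up sizes: the cost vanishes when both models fit exactly.
rng = np.random.default_rng(0)
W = rng.random((6, 3)) + 0.1
A = rng.random((3, 10)) + 0.1
H = rng.random((6, 4)) * np.exp(-np.arange(4))  # decaying taps, illustrative only
S = W @ A
X = nctf_predict(H, S)

cost_exact = weighted_cost(X, H, S, W, A, lam=0.5)
cost_perturbed = weighted_cost(X, H, 1.5 * S, W, A, lam=0.5)
print(cost_exact, cost_perturbed)
```

Setting `lam` to zero recovers a pure NCTF cost in this sketch, which is one way to see why the best weight would depend on how well each model matches the data.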

Road Friction Estimation for Connected Vehicles using Supervised Machine Learning

Sep 15, 2017
Ghazaleh Panahandeh, Erik Ek, Nasser Mohammadiha

In this paper, the problem of road friction prediction from a fleet of connected vehicles is investigated. A framework is proposed to predict the road friction level using both historical friction data from the connected cars and data from weather stations, and comparative results from different methods are presented. The problem is formulated as a classification task in which the available data is used to train three machine learning models (logistic regression, support vector machine, and neural networks) to predict the future friction class (slippery or non-slippery) for specific road segments. In addition to the friction values, which are measured by moving vehicles, parameters such as humidity, temperature, and rainfall are used to obtain a set of descriptive feature vectors as input to the classification methods. The proposed prediction models are evaluated for different prediction horizons (0 to 120 minutes into the future), and the evaluation shows that the neural network method leads to more stable results across different conditions.

* Published at IV 2017
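
The classification setup described in the abstract above can be sketched with the simplest of the three model families, logistic regression. Everything concrete here is an assumption for illustration: the three standardized features (stand-ins for temperature, humidity, and the latest measured friction), the synthetic samples, and the toy labeling rule are invented and are not the paper's data or results.

```python
import numpy as np

def train_logistic_regression(X, y, lr=0.5, n_iter=500):
    """Batch gradient descent on the mean logistic loss; returns (weights, bias)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(slippery)
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def predict_slippery(X, w, b):
    """Boolean class decision: True means 'slippery'."""
    return (X @ w + b) > 0.0

# Synthetic, standardized feature vectors (hypothetical columns:
# temperature, humidity, latest measured friction). Not real measurements.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))

# Toy labeling rule, invented for this sketch only.
toy_rule = np.array([-1.0, 1.0, -2.0])
y = (X @ toy_rule > 0).astype(float)

w, b = train_logistic_regression(X, y)
accuracy = float(np.mean(predict_slippery(X, w, b) == (y == 1.0)))
print("training accuracy on the toy data:", accuracy)
```

In a setup like the one in the abstract, one such classifier would be trained per prediction horizon, with labels taken from the friction class observed that far ahead.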

Supervised and Unsupervised Speech Enhancement Using Nonnegative Matrix Factorization

Sep 15, 2017
Nasser Mohammadiha, Paris Smaragdis, Arne Leijon

Reducing the interference noise in a monaural noisy speech signal has been a challenging task for many years. Compared to traditional unsupervised speech enhancement methods, e.g., Wiener filtering, supervised approaches, such as algorithms based on hidden Markov models (HMM), lead to higher-quality enhanced speech signals. However, the main practical difficulty of these approaches is that for each noise type a model is required to be trained a priori. In this paper, we investigate a new class of supervised speech denoising algorithms using nonnegative matrix factorization (NMF). We propose a novel speech enhancement method that is based on a Bayesian formulation of NMF (BNMF). To circumvent the mismatch problem between the training and testing stages, we propose two solutions. First, we use an HMM in combination with BNMF (BNMF-HMM) to derive a minimum mean square error (MMSE) estimator for the speech signal with no information about the underlying noise type. Second, we suggest a scheme to learn the required noise BNMF model online, which is then used to develop an unsupervised speech enhancement system. Extensive experiments are carried out to investigate the performance of the proposed methods under different conditions. Moreover, we compare the performance of the developed algorithms with state-of-the-art speech enhancement schemes using various objective measures. Our simulations show that the proposed BNMF-based methods outperform the competing algorithms substantially.

* IEEE Trans. Audio, Speech and Language Process., vol. 21, no. 10, Oct. 2013
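
The supervised NMF approach mentioned in the abstract above can be illustrated with a basic, non-Bayesian sketch. It swaps in plain NMF with the standard multiplicative updates for the generalized KL divergence in place of the paper's BNMF and MMSE estimator, and it assumes that speech and noise bases have already been trained. The function names, sizes, and the Wiener-like mask are choices made here for illustration.

```python
import numpy as np

EPS = 1e-12

def kl_div(V, V_hat):
    """Generalized KL divergence between a spectrogram and its approximation."""
    return float(np.sum(V * np.log((V + EPS) / (V_hat + EPS)) - V + V_hat))

def fit_activations(V, W, n_iter=100, seed=0):
    """Multiplicative KL-NMF updates for H with the basis W held fixed."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + 0.1
    for _ in range(n_iter):
        H *= (W.T @ (V / (W @ H + EPS))) / (W.sum(axis=0)[:, None] + EPS)
    return H

def enhance(V_noisy, W_speech, W_noise, n_iter=100):
    """Fit activations on the stacked basis, then apply a Wiener-like mask."""
    W = np.hstack([W_speech, W_noise])
    H = fit_activations(V_noisy, W, n_iter)
    k = W_speech.shape[1]
    S_hat = W_speech @ H[:k]   # speech part of the approximation
    N_hat = W_noise @ H[k:]    # noise part of the approximation
    mask = S_hat / (S_hat + N_hat + EPS)
    return mask * V_noisy, mask

# Toy data with made-up bases standing in for pre-trained speech and noise models.
rng = np.random.default_rng(1)
W_speech = rng.random((16, 4)) + 0.05
W_noise = rng.random((16, 3)) + 0.05
V_noisy = W_speech @ rng.random((4, 30)) + W_noise @ rng.random((3, 30))

W_all = np.hstack([W_speech, W_noise])
cost_start = kl_div(V_noisy, W_all @ fit_activations(V_noisy, W_all, n_iter=0))
cost_end = kl_div(V_noisy, W_all @ fit_activations(V_noisy, W_all, n_iter=100))
S_enhanced, mask = enhance(V_noisy, W_speech, W_noise)
print(cost_start, cost_end)
```

The need for a pre-trained `W_noise` in this sketch is the practical difficulty the abstract refers to: without knowing the noise type in advance, the noise basis has to be selected or learned online.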

Imitation Learning for Vision-based Lane Keeping Assistance

Sep 12, 2017
Christopher Innocenti, Henrik Lindén, Ghazaleh Panahandeh, Lennart Svensson, Nasser Mohammadiha

This paper investigates direct imitation learning from human drivers for the task of lane keeping assistance on highways and country roads using grayscale images from a single front-view camera. The method uses a convolutional neural network (CNN) as the policy that drives the vehicle. The policy is successfully learned via imitation learning using real-world data collected from human drivers and is evaluated in closed-loop simulated environments, demonstrating good driving behaviour and robustness to domain changes. Evaluation is based on two proposed performance metrics that measure how well the vehicle is positioned in the lane and how smooth the driven trajectory is.

* International Conference on Intelligent Transportation Systems (ITSC)
