Models, code, and papers for "Gautam Krishna":

Advancing Speech Recognition With No Speech Or With Noisy Speech

Jul 27, 2019
Gautam Krishna, Co Tran, Mason Carnahan, Ahmed H Tewfik

In this paper we demonstrate end-to-end continuous speech recognition (CSR) using electroencephalography (EEG) signals with no speech signal as input. Both an attention-model-based automatic speech recognition (ASR) system and a connectionist temporal classification (CTC) based ASR system were implemented for performing recognition. We further demonstrate CSR for noisy speech by fusing with EEG features.

* Accepted for publication at IEEE EUSIPCO 2019. Camera-ready version. arXiv admin note: text overlap with arXiv:1906.08045 

Speech Recognition with no speech or with noisy speech

Mar 02, 2019
Gautam Krishna, Co Tran, Jianguo Yu, Ahmed H Tewfik

The performance of automatic speech recognition (ASR) systems degrades in the presence of noisy speech. This paper demonstrates that using electroencephalography (EEG) can help ASR systems overcome performance loss in the presence of noise. The paper also shows that distillation training of ASR systems using EEG features will increase their performance. Finally, we demonstrate the ability to recognize words from EEG with no speech signal on a limited English vocabulary with high accuracy.

* Accepted for ICASSP 2019 

Spoken Speech Enhancement using EEG

Sep 13, 2019
Gautam Krishna, Yan Han, Co Tran, Mason Carnahan, Ahmed H Tewfik

In this paper we demonstrate spoken speech enhancement from electroencephalography (EEG) signals using a generative adversarial network (GAN) based model and a long short-term memory (LSTM) regression based model. Our results demonstrate that EEG features can be used to clean speech recorded in the presence of background noise.

* To be submitted to ICASSP 2020. arXiv admin note: text overlap with arXiv:1906.08044, arXiv:1906.08871, arXiv:1906.08045 

Speech Recognition With No Speech Or With Noisy Speech Beyond English

Jul 14, 2019
Gautam Krishna, Co Tran, Yan Han, Mason Carnahan, Ahmed H Tewfik

In this paper we demonstrate continuous noisy speech recognition using a connectionist temporal classification (CTC) model on a limited Chinese vocabulary, using electroencephalography (EEG) features with no speech signal as input. We further demonstrate continuous noisy speech recognition with a single CTC model on a limited joint English and Chinese vocabulary, again using EEG features with no speech signal as input.

* In preparation for submission to ICASSP 2020. arXiv admin note: text overlap with arXiv:1906.08044

Robust End to End Speaker Verification Using EEG

Jun 17, 2019
Yan Han, Gautam Krishna, Co Tran, Mason Carnahan, Ahmed H Tewfik

In this paper we demonstrate that the performance of a speaker verification system can be improved by concatenating electroencephalography (EEG) signal features with speech signal features. We use a state-of-the-art end-to-end deep learning model for performing speaker verification and we demonstrate our results for noisy speech. Our results indicate that EEG signals can improve the robustness of speaker verification systems.

Hide-and-Seek: A Data Augmentation Technique for Weakly-Supervised Localization and Beyond

Nov 06, 2018
Krishna Kumar Singh, Hao Yu, Aron Sarmasi, Gautam Pradeep, Yong Jae Lee

We propose 'Hide-and-Seek', a general-purpose data augmentation technique, which is complementary to existing data augmentation techniques and is beneficial for various visual recognition tasks. The key idea is to hide patches in a training image randomly, in order to force the network to seek other relevant content when the most discriminative content is hidden. Our approach only needs to modify the input image and can work with any network to improve its performance. During testing, it does not need to hide any patches. The main advantage of Hide-and-Seek over existing data augmentation techniques is its ability to improve object localization accuracy in the weakly-supervised setting, and we therefore use this task to motivate the approach. However, Hide-and-Seek is not tied only to the image localization task, and can generalize to other forms of visual input like videos, as well as other recognition tasks like image classification, temporal action localization, semantic segmentation, emotion recognition, age/gender estimation, and person re-identification. We perform extensive experiments to showcase the advantage of Hide-and-Seek on these various visual recognition problems.

* TPAMI submission. This is a journal extension of our ICCV 2017 paper arXiv:1704.04232 
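
The patch-hiding idea described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `hide_patches`, the grid size, the hiding probability, and the choice to fill hidden cells with the per-image mean are all assumptions made for the example.

```python
import numpy as np

def hide_patches(image, grid_size=4, hide_prob=0.5, fill_value=None, rng=None):
    """Return a copy of `image` (H, W) or (H, W, C) with random grid cells hidden.

    All parameter names and defaults are hypothetical, chosen for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = image.copy()  # leave the caller's image untouched
    h, w = image.shape[:2]
    if fill_value is None:
        # Assumption: fill hidden cells with this image's per-channel mean.
        fill_value = image.mean(axis=(0, 1))
    # Cell boundaries of a grid_size x grid_size grid over the image.
    ys = np.linspace(0, h, grid_size + 1, dtype=int)
    xs = np.linspace(0, w, grid_size + 1, dtype=int)
    for i in range(grid_size):
        for j in range(grid_size):
            # Hide each cell independently with probability hide_prob.
            if rng.random() < hide_prob:
                out[ys[i]:ys[i + 1], xs[j]:xs[j + 1]] = fill_value
    return out
```

In this sketch the function would be applied only to training images; at test time the image is passed through unchanged, consistent with the abstract's statement that no patches are hidden during testing.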

LSTM-based Encoder-Decoder for Multi-sensor Anomaly Detection

Jul 11, 2016
Pankaj Malhotra, Anusha Ramakrishnan, Gaurangi Anand, Lovekesh Vig, Puneet Agarwal, Gautam Shroff

Mechanical devices such as engines, vehicles, aircraft, etc., are typically instrumented with numerous sensors to capture the behavior and health of the machine. However, there are often external factors or variables which are not captured by sensors, leading to time-series which are inherently unpredictable. For instance, manual controls and/or unmonitored environmental conditions or load may lead to inherently unpredictable time-series. Detecting anomalies in such scenarios becomes challenging using standard approaches based on mathematical models that rely on stationarity, or prediction models that utilize prediction errors to detect anomalies. We propose a Long Short Term Memory Networks based Encoder-Decoder scheme for Anomaly Detection (EncDec-AD) that learns to reconstruct 'normal' time-series behavior, and thereafter uses reconstruction error to detect anomalies. We experiment with three publicly available quasi-predictable time-series datasets: power demand, space shuttle, and ECG, and two real-world engine datasets with both predictive and unpredictable behavior. We show that EncDec-AD is robust and can detect anomalies from predictable, unpredictable, periodic, aperiodic, and quasi-periodic time-series. Further, we show that EncDec-AD is able to detect anomalies from short time-series (length as small as 30) as well as long time-series (length as large as 500).

* Accepted at ICML 2016 Anomaly Detection Workshop, New York, NY, USA, 2016. Reference update in this version (v2) 
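
The step of using reconstruction error to detect anomalies, mentioned in the abstract above, might look something like the sketch below. It assumes a reconstruction `x_hat` has already been produced by some trained encoder-decoder (not shown here). The Euclidean error, the quantile-based threshold, and all function names are illustrative assumptions, not the scoring rule used in the paper.

```python
import numpy as np

def reconstruction_errors(x, x_hat):
    """Per-time-step Euclidean error between a series (T, D) and its reconstruction."""
    return np.linalg.norm(np.asarray(x) - np.asarray(x_hat), axis=-1)

def fit_threshold(normal_errors, quantile=0.99):
    """Pick a threshold from errors measured on 'normal' data (assumed rule)."""
    return float(np.quantile(normal_errors, quantile))

def flag_anomalies(x, x_hat, threshold):
    """Boolean mask of time steps whose reconstruction error exceeds the threshold."""
    return reconstruction_errors(x, x_hat) > threshold
```

The intuition is that a model trained only on normal behavior should reconstruct normal segments well, so unusually large errors point to segments the model has not learned to reproduce.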

Multi-Sensor Prognostics using an Unsupervised Health Index based on LSTM Encoder-Decoder

Aug 22, 2016
Pankaj Malhotra, Vishnu TV, Anusha Ramakrishnan, Gaurangi Anand, Lovekesh Vig, Puneet Agarwal, Gautam Shroff

Many approaches for estimation of Remaining Useful Life (RUL) of a machine, using its operational sensor data, make assumptions about how a system degrades or a fault evolves, e.g., exponential degradation. However, in many domains degradation may not follow a pattern. We propose a Long Short Term Memory based Encoder-Decoder (LSTM-ED) scheme to obtain an unsupervised health index (HI) for a system using multi-sensor time-series data. LSTM-ED is trained to reconstruct the time-series corresponding to the healthy state of a system. The reconstruction error is used to compute the HI, which is then used for RUL estimation. We evaluate our approach on publicly available Turbofan Engine and Milling Machine datasets. We also present results on a real-world industry dataset from a pulverizer mill where we find significant correlation between LSTM-ED based HI and maintenance costs.

* Presented at 1st ACM SIGKDD Workshop on Machine Learning for Prognostics and Health Management, San Francisco, CA, USA, 2016. 10 pages 
