Jagadeesh Balam

Stateful Conformer with Cache-based Inference for Streaming Automatic Speech Recognition

Jan 11, 2024
Vahid Noroozi, Somshubra Majumdar, Ankur Kumar, Jagadeesh Balam, Boris Ginsburg

The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System

Oct 18, 2023
Tae Jin Park, He Huang, Ante Jukic, Kunal Dhawan, Krishna C. Puvvada, Nithin Koluguri, Nikolay Karpov, Aleksandr Laptev, Jagadeesh Balam, Boris Ginsburg

Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation

Oct 18, 2023
Tae Jin Park, He Huang, Coleman Hooper, Nithin Koluguri, Kunal Dhawan, Ante Jukic, Jagadeesh Balam, Boris Ginsburg

SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation

Oct 13, 2023
Zhehuai Chen, He Huang, Andrei Andrusenko, Oleksii Hrinchuk, Krishna C. Puvvada, Jason Li, Subhankar Ghosh, Jagadeesh Balam, Boris Ginsburg

Investigating End-to-End ASR Architectures for Long Form Audio Transcription

Sep 20, 2023
Nithin Rao Koluguri, Samuel Kriman, Georgy Zelenfroind, Somshubra Majumdar, Dima Rekesh, Vahid Noroozi, Jagadeesh Balam, Boris Ginsburg

Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition

Sep 19, 2023
Krishna C. Puvvada, Nithin Rao Koluguri, Kunal Dhawan, Jagadeesh Balam, Boris Ginsburg

Enhancing Speaker Diarization with Large Language Models: A Contextual Beam Search Approach

Sep 14, 2023
Tae Jin Park, Kunal Dhawan, Nithin Koluguri, Jagadeesh Balam

Leveraging Pretrained ASR Encoders for Effective and Efficient End-to-End Speech Intent Classification and Slot Filling

Jul 13, 2023
He Huang, Jagadeesh Balam, Boris Ginsburg

AmberNet: A Compact End-to-End Model for Spoken Language Identification

Oct 27, 2022
Fei Jia, Nithin Rao Koluguri, Jagadeesh Balam, Boris Ginsburg

Multi-scale Speaker Diarization with Dynamic Scale Weighting

Mar 30, 2022
Tae Jin Park, Nithin Rao Koluguri, Jagadeesh Balam, Boris Ginsburg
