Alert button
Picture for Gary Wang

Gary Wang

Alert button

Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data

Add code
Bookmark button
Alert button
Feb 29, 2024
Takaaki Saeki, Gary Wang, Nobuyuki Morioka, Isaac Elias, Kyle Kastner, Andrew Rosenberg, Bhuvana Ramabhadran, Heiga Zen, Françoise Beaufays, Hadar Shemtov

Viaarxiv icon

High-precision Voice Search Query Correction via Retrievable Speech-text Embedings

Add code
Bookmark button
Alert button
Jan 08, 2024
Christopher Li, Gary Wang, Kyle Kastner, Heng Su, Allen Chen, Andrew Rosenberg, Zhehuai Chen, Zelin Wu, Leonid Velikovich, Pat Rondon, Diamantino Caseiro, Petar Aleksic

Viaarxiv icon

Using Text Injection to Improve Recognition of Personal Identifiers in Speech

Add code
Bookmark button
Alert button
Aug 14, 2023
Yochai Blau, Rohan Agrawal, Lior Madmony, Gary Wang, Andrew Rosenberg, Zhehuai Chen, Zorik Gekhman, Genady Beryozkin, Parisa Haghani, Bhuvana Ramabhadran

Viaarxiv icon

Understanding Shared Speech-Text Representations

Add code
Bookmark button
Alert button
Apr 27, 2023
Gary Wang, Kyle Kastner, Ankur Bapna, Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang

Figure 1 for Understanding Shared Speech-Text Representations
Figure 2 for Understanding Shared Speech-Text Representations
Figure 3 for Understanding Shared Speech-Text Representations
Figure 4 for Understanding Shared Speech-Text Representations
Viaarxiv icon

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

Add code
Bookmark button
Alert button
Mar 03, 2023
Yu Zhang, Wei Han, James Qin, Yongqiang Wang, Ankur Bapna, Zhehuai Chen, Nanxin Chen, Bo Li, Vera Axelrod, Gary Wang, Zhong Meng, Ke Hu, Andrew Rosenberg, Rohit Prabhavalkar, Daniel S. Park, Parisa Haghani, Jason Riesa, Ginger Perng, Hagen Soltau, Trevor Strohman, Bhuvana Ramabhadran, Tara Sainath, Pedro Moreno, Chung-Cheng Chiu, Johan Schalkwyk, Françoise Beaufays, Yonghui Wu

Figure 1 for Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Figure 2 for Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Figure 3 for Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Figure 4 for Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Viaarxiv icon

Modular Hybrid Autoregressive Transducer

Add code
Bookmark button
Alert button
Oct 31, 2022
Zhong Meng, Tongzhou Chen, Rohit Prabhavalkar, Yu Zhang, Gary Wang, Kartik Audhkhasi, Jesse Emond, Trevor Strohman, Bhuvana Ramabhadran, W. Ronny Huang, Ehsan Variani, Yinghui Huang, Pedro J. Moreno

Figure 1 for Modular Hybrid Autoregressive Transducer
Figure 2 for Modular Hybrid Autoregressive Transducer
Figure 3 for Modular Hybrid Autoregressive Transducer
Figure 4 for Modular Hybrid Autoregressive Transducer
Viaarxiv icon

Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech

Add code
Bookmark button
Alert button
Oct 27, 2022
Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran

Figure 1 for Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech
Figure 2 for Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech
Figure 3 for Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech
Figure 4 for Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech
Viaarxiv icon

G-Augment: Searching For The Meta-Structure Of Data Augmentation Policies For ASR

Add code
Bookmark button
Alert button
Oct 19, 2022
Gary Wang, Ekin D. Cubuk, Andrew Rosenberg, Shuyang Cheng, Ron J. Weiss, Bhuvana Ramabhadran, Pedro J. Moreno, Quoc V. Le, Daniel S. Park

Figure 1 for G-Augment: Searching For The Meta-Structure Of Data Augmentation Policies For ASR
Figure 2 for G-Augment: Searching For The Meta-Structure Of Data Augmentation Policies For ASR
Figure 3 for G-Augment: Searching For The Meta-Structure Of Data Augmentation Policies For ASR
Figure 4 for G-Augment: Searching For The Meta-Structure Of Data Augmentation Policies For ASR
Viaarxiv icon

Non-Parallel Voice Conversion for ASR Augmentation

Add code
Bookmark button
Alert button
Sep 15, 2022
Gary Wang, Andrew Rosenberg, Bhuvana Ramabhadran, Fadi Biadsy, Yinghui Huang, Jesse Emond, Pedro Moreno Mengibar

Figure 1 for Non-Parallel Voice Conversion for ASR Augmentation
Figure 2 for Non-Parallel Voice Conversion for ASR Augmentation
Figure 3 for Non-Parallel Voice Conversion for ASR Augmentation
Figure 4 for Non-Parallel Voice Conversion for ASR Augmentation
Viaarxiv icon

Autonomous Ground Navigation in Highly Constrained Spaces: Lessons learned from The BARN Challenge at ICRA 2022

Add code
Bookmark button
Alert button
Aug 22, 2022
Xuesu Xiao, Zifan Xu, Zizhao Wang, Yunlong Song, Garrett Warnell, Peter Stone, Tingnan Zhang, Shravan Ravi, Gary Wang, Haresh Karnan, Joydeep Biswas, Nicholas Mohammad, Lauren Bramblett, Rahul Peddi, Nicola Bezzo, Zhanteng Xie, Philip Dames

Figure 1 for Autonomous Ground Navigation in Highly Constrained Spaces: Lessons learned from The BARN Challenge at ICRA 2022
Figure 2 for Autonomous Ground Navigation in Highly Constrained Spaces: Lessons learned from The BARN Challenge at ICRA 2022
Figure 3 for Autonomous Ground Navigation in Highly Constrained Spaces: Lessons learned from The BARN Challenge at ICRA 2022
Figure 4 for Autonomous Ground Navigation in Highly Constrained Spaces: Lessons learned from The BARN Challenge at ICRA 2022
Viaarxiv icon