Athanasios Katsamanis

Weakly-supervised forced alignment of disfluent speech using phoneme-level modeling

May 30, 2023

Efficient Audio Captioning Transformer with Patchout and Text Guidance

Apr 06, 2023

Designing and Evaluating Speech Emotion Recognition Systems: A reality check case study with IEMOCAP

Apr 03, 2023

Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems A case study for Modern Greek

Dec 31, 2022

Visual Speech-Aware Perceptual 3D Facial Expression Reconstruction from Videos

Jul 22, 2022

Regotron: Regularizing the Tacotron2 architecture via monotonic alignment loss

Apr 28, 2022

Zero-Shot Cross-lingual Aphasia Detection using Automatic Speech Recognition

Apr 01, 2022

EmpBot: A T5-based Empathetic Chatbot focusing on Sentiments

Oct 30, 2021

AudioVisual Speech Synthesis: A brief literature review

Feb 18, 2021