Sejik Park

Diverse Feature Learning by Self-distillation and Reset

Mar 29, 2024
Sejik Park

Our paper addresses the problem of models struggling to learn diverse features, due to either forgetting previously learned features or failing to learn new ones. To overcome this problem, we introduce Diverse Feature Learning (DFL), a method that combines an important-feature preservation algorithm with a new-feature learning algorithm. Specifically, to preserve important features, we apply self-distillation to ensemble models by selecting the meaningful model weights observed during training. To learn new features, we employ reset, which periodically re-initializes part of the model. In experiments with various models on image classification, we identify the potential for synergistic effects between self-distillation and reset.
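
As a rough illustration of the two ingredients named in the abstract, the sketch below pairs a teacher ensemble built from selected training snapshots with a partial reset of the student. It is a toy under stated assumptions, not the paper's implementation: the linear model, the selection by validation score, the squared-error distillation loss, the Gaussian re-initialization, and every function name are hypothetical.

```python
import random

def predict(weights, x):
    """Toy linear model standing in for a network (assumption)."""
    return sum(w * xi for w, xi in zip(weights, x))

def select_teachers(snapshots, scores, k):
    """Keep the k snapshots with the highest score.

    Ranking by a validation score is an assumed selection rule; the
    abstract only says that "meaningful" weights are selected.
    """
    ranked = sorted(zip(scores, range(len(snapshots))), reverse=True)
    return [snapshots[i] for _, i in ranked[:k]]

def ensemble_predict(teachers, x):
    """Average the predictions of the selected snapshots."""
    return sum(predict(t, x) for t in teachers) / len(teachers)

def distill_loss(student, teachers, x):
    """Squared error between student and teacher-ensemble outputs (assumed loss)."""
    return (predict(student, x) - ensemble_predict(teachers, x)) ** 2

def reset_tail(weights, n_reset, rng):
    """Re-initialize the last n_reset weights and keep the rest unchanged."""
    keep = weights[:len(weights) - n_reset]
    return keep + [rng.gauss(0.0, 0.1) for _ in range(n_reset)]

rng = random.Random(0)
snapshots = [[1.0, 2.0, 3.0], [1.5, 2.5, 3.5], [0.0, 0.0, 0.0]]
scores = [0.8, 0.9, 0.1]  # hypothetical validation scores
teachers = select_teachers(snapshots, scores, k=2)
student = reset_tail([1.0, 2.0, 3.0], n_reset=1, rng=rng)
loss = distill_loss(student, teachers, [1.0, 1.0, 1.0])
```

In this toy setup the reset discards only the tail of the student's weights, while the distillation term pulls the student's output back toward what the retained snapshots predict.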

* 15 pages, 6 Figures

Learning to Discover Skills through Guidance

Nov 01, 2023
Hyunseung Kim, Byungkun Lee, Hojoon Lee, Dongyoon Hwang, Sejik Park, Kyushik Min, Jaegul Choo

In the field of unsupervised skill discovery (USD), a major challenge is limited exploration, primarily due to substantial penalties when skills deviate from their initial trajectories. To enhance exploration, recent methodologies employ auxiliary rewards to maximize the epistemic uncertainty or entropy of states. However, we have identified that the effectiveness of these rewards declines as environmental complexity rises. Therefore, we present a novel USD algorithm, skill discovery with guidance (DISCO-DANCE), which (1) selects the guide skill that possesses the highest potential to reach unexplored states, (2) guides other skills to follow the guide skill, and then (3) disperses the guided skills to maximize their discriminability in unexplored states. Empirical evaluation demonstrates that DISCO-DANCE outperforms other USD baselines in challenging environments, including two navigation benchmarks and a continuous control benchmark. Qualitative visualizations and code of DISCO-DANCE are available at https://mynsng.github.io/discodance.

* 29 pages, 18 figures, published at NeurIPS 2023

Emotional Voice Conversion using Multitask Learning with Text-to-speech

Nov 27, 2019
Tae-Ho Kim, Sungjae Cho, Shinkook Choi, Sejik Park, Soo-Young Lee

Voice conversion (VC) is the task of transforming a person's voice into a different style while preserving the linguistic content. The previous state of the art in VC is based on sequence-to-sequence (seq2seq) models, which can misrepresent linguistic information. An earlier attempt to overcome this used textual supervision, but it requires explicit alignment, which loses the benefit of using a seq2seq model. In this paper, a voice converter using multitask learning with text-to-speech (TTS) is presented. The embedding space of seq2seq-based TTS carries abundant information about the text. The role of the TTS decoder is to convert the embedding space to speech, which is the same as in VC. In the proposed model, the whole network is trained to minimize the losses of both VC and TTS. Through multitask learning, VC is expected to capture more linguistic information and to maintain training stability. VC experiments were performed on a male Korean emotional text-speech dataset, and they show that multitask learning helps preserve linguistic content in VC.
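
The joint objective described above can be sketched as a sum of two losses computed through one shared decoder. This is only an illustration under assumptions: the scalar "decoder", the L1 loss, the `tts_weight` factor, and all names are hypothetical and are not taken from the paper.

```python
def shared_decoder(embedding, scale):
    """Toy decoder used by both tasks; a single scale stands in for its parameters."""
    return [scale * e for e in embedding]

def l1_loss(pred, target):
    """Mean absolute error between predicted and target frames (assumed loss)."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def multitask_loss(speech_emb, text_emb, target, scale, tts_weight=1.0):
    """VC loss plus weighted TTS loss, both decoded by the same decoder."""
    vc_out = shared_decoder(speech_emb, scale)   # VC path: speech embedding -> speech
    tts_out = shared_decoder(text_emb, scale)    # TTS path: text embedding -> speech
    return l1_loss(vc_out, target) + tts_weight * l1_loss(tts_out, target)

# Hypothetical embeddings and target frames, chosen only to exercise the function.
loss = multitask_loss([1.0, 2.0], [1.0, 3.0], [2.0, 4.0], scale=2.0)
```

Because both paths pass through the same decoder, a gradient step on this combined loss would update the decoder using text-derived and speech-derived embeddings alike, which is the sharing the abstract relies on.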

* 4 pages, 3 figures, submitted to ICASSP 2020
