Models, code, and papers for "Jong-Hyeok Lee":

Phoneme-level speech and natural language integration for agglutinative languages

Nov 05, 1994
Geunbae Lee Jong-Hyeok Lee Kyunghee Kim

A new tightly coupled speech and natural language integration model is presented for a TDNN-based large vocabulary continuous speech recognition system. Unlike the popular n-best techniques developed for integrating mainly HMM-based speech and natural language systems at the word level, which is clearly inadequate for morphologically complex agglutinative languages, our model constructs a spoken language system based on phoneme-level integration. The TDNN-CYK spoken language architecture is designed and implemented using a TDNN-based diphone recognition module integrated with table-driven phonological/morphological co-analysis. Our integration model provides seamless integration of speech and natural language for connectionist speech recognition systems, especially for morphologically complex languages such as Korean. Our experimental results show that speaker-dependent continuous Eojeol (word) recognition can be integrated with morphological analysis, achieving an over 80% morphological analysis success rate directly from speech input for middle-level vocabularies.

* 12 pages, LaTeX/PostScript, compressed, uuencoded, to be presented at TWLT-8 

Integrated speech and morphological processing in a connectionist continuous speech understanding for Korean

Mar 18, 1996
Geunbae Lee, Jong-Hyeok Lee

A new tightly coupled speech and natural language integration model is presented for a TDNN-based continuous, possibly large-vocabulary speech recognition system for Korean. Unlike popular n-best techniques developed for integrating mainly HMM-based speech recognition and natural language processing at the word level, which is clearly inadequate for morphologically complex agglutinative languages, our model constructs a spoken language system based on morpheme-level speech and language integration. With this integration scheme, the spoken Korean processing engine (SKOPE) is designed and implemented using a TDNN-based diphone recognition module integrated with Viterbi-based lexical decoding and symbolic phonological/morphological co-analysis. Our experimental results show that speaker-dependent continuous eojeol (Korean word) recognition and integrated morphological analysis can be achieved with an over 80.6% success rate directly from speech input for middle-level vocabularies.
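The Viterbi-based lexical decoding step can be sketched, in heavily simplified form, as a first-order Viterbi search over per-frame phone log-likelihoods. This is a generic decoder, not SKOPE's exact one; the phone set and probabilities below are toy values:

```python
import math

def viterbi(frames, phones, init, trans):
    """Best phone sequence through per-frame log-likelihoods `frames`
    (list of {phone: log p(obs|phone)}), with log initial probs `init`
    and log transition probs `trans[(prev, cur)]`."""
    best = {p: init[p] + frames[0][p] for p in phones}
    backptrs = []
    for obs in frames[1:]:
        bp, new = {}, {}
        for q in phones:
            prev = max(phones, key=lambda p: best[p] + trans[(p, q)])
            bp[q] = prev
            new[q] = best[prev] + trans[(prev, q)] + obs[q]
        backptrs.append(bp)
        best = new
    last = max(best, key=best.get)      # backtrace from the best final state
    path = [last]
    for bp in reversed(backptrs):
        path.append(bp[path[-1]])
    return path[::-1]
```

With uniform initial and transition probabilities, the decoder simply follows the per-frame argmax; non-uniform transitions are what let lexical constraints reshape the path.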

* LaTeX source with a4 style, 15 pages, to be published in the Computer Processing of Oriental Languages journal 

SKOPE: A connectionist/symbolic architecture of spoken Korean processing

Apr 25, 1995
Geunbae Lee, Jong-Hyeok Lee

Spoken language processing requires speech and natural language integration. Moreover, spoken Korean calls for a unique processing methodology due to its linguistic characteristics. This paper presents SKOPE, a connectionist/symbolic spoken Korean processing engine, which emphasizes that: 1) connectionist and symbolic techniques must be selectively applied according to their relative strengths and weaknesses, and 2) the linguistic characteristics of Korean must be fully considered in phoneme recognition, speech and language integration, and morphological/syntactic processing. The design and implementation of SKOPE demonstrate how connectionist/symbolic hybrid architectures can be constructed for spoken agglutinative language processing. SKOPE also introduces many novel ideas for speech and language processing. Phoneme recognition, morphological analysis, and syntactic analysis experiments show that SKOPE is a viable approach to spoken Korean processing.

* 8 pages, latex, use aaai.sty & aaai.bst, bibfile: nlpsp.bib, to be presented at IJCAI95 workshops on new approaches to learning for natural language processing 

Modeling Inter-Speaker Relationship in XLNet for Contextual Spoken Language Understanding

Oct 28, 2019
Jonggu Kim, Jong-Hyeok Lee

We propose two methods to capture relevant history information in a multi-turn dialogue by modeling inter-speaker relationships for spoken language understanding (SLU). Our methods are tailored for, and therefore compatible with, XLNet, a state-of-the-art pretrained model, so we verify models built on top of XLNet. In our experiments, all models achieved higher accuracy than state-of-the-art contextual SLU models on two benchmark datasets. Analysis of the results demonstrates that the proposed methods effectively improve the SLU accuracy of XLNet. These methods for identifying important dialogue history should help alleviate ambiguity in SLU of the current utterance.
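As a minimal sketch of one way to make an encoder speaker-aware, assume (hypothetically) a per-speaker indicator vector added to each token embedding before encoding; the paper's actual mechanisms inside XLNet may differ:

```python
def add_speaker_indicator(token_embs, speaker_ids, speaker_embs):
    """Add a per-speaker vector to each token embedding, so the encoder can
    distinguish which dialogue participant produced each token.
    token_embs: list of vectors; speaker_ids: per-token speaker index;
    speaker_embs: {speaker_id: vector}."""
    return [[t + s for t, s in zip(tok, speaker_embs[sid])]
            for tok, sid in zip(token_embs, speaker_ids)]
```

In a real model the speaker vectors would be learned parameters rather than fixed constants.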

* submitted to ICASSP 2020 

Decay-Function-Free Time-Aware Attention to Context and Speaker Indicator for Spoken Language Understanding

Mar 29, 2019
Jonggu Kim, Jong-Hyeok Lee

To capture salient contextual information for spoken language understanding (SLU) of a dialogue, we propose time-aware models that automatically learn the latent time-decay function of the history, without a manually specified decay function. We also propose a method to identify and label the current speaker to improve SLU accuracy. In experiments on the benchmark dataset used in Dialog State Tracking Challenge 4, the proposed models achieved significantly higher F1 scores than the state-of-the-art contextual models. Finally, we analyze the effectiveness of the introduced models in detail; the analysis demonstrates that each of the proposed methods individually improves SLU accuracy.
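A decay-function-free time-aware weighting can be sketched as a learned per-distance bias added to content scores before a softmax. The parameter names here are illustrative, not the paper's:

```python
import math

def time_aware_weights(content_scores, distances, decay_bias):
    """Attention over history utterances: content score plus a *learned*
    per-distance bias (decay_bias), instead of a hand-designed decay
    function such as 1/t or exp(-t)."""
    logits = [s + decay_bias[d] for s, d in zip(content_scores, distances)]
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]
```

Because `decay_bias` would be a trainable table (or small network) in the real model, the shape of the decay is learned from data rather than imposed.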

* Accepted as a long paper at NAACL 2019 

Multiple Range-Restricted Bidirectional Gated Recurrent Units with Attention for Relation Classification

Nov 01, 2017
Jonggu Kim, Jong-Hyeok Lee

Most neural approaches to relation classification have focused on finding short patterns that represent the semantic relation using Convolutional Neural Networks (CNNs), and those approaches have generally achieved better performance than approaches using Recurrent Neural Networks (RNNs). Following an intuition similar to that of the CNN models, we propose a novel RNN-based model that focuses strongly on only the important parts of a sentence, using multiple range-restricted bidirectional layers and attention for relation classification. Experimental results on the SemEval-2010 relation classification task show that our model is comparable to state-of-the-art CNN-based and RNN-based models that use additional linguistic information.
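The range-restriction idea — keeping only tokens near the marked entities, at several window sizes — might be sketched as a preprocessing step like the following (a toy illustration; the actual model applies restricted bidirectional GRU layers with attention):

```python
def range_restricted_views(tokens, e1, e2, ranges=(1, 2)):
    """For each range r, keep only tokens within r positions of either
    entity index (e1, e2); each restricted view would feed its own
    bidirectional recurrent layer."""
    views = []
    for r in ranges:
        views.append([t for i, t in enumerate(tokens)
                      if abs(i - e1) <= r or abs(i - e2) <= r])
    return views
```

Multiple ranges give the model both a tight view of the entity contexts and a wider view of the connecting span.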

* 6 pages, 1 figure 

Phonological modeling for continuous speech recognition in Korean

Jul 18, 1996
WonIl Lee, Geunbae Lee, Jong-Hyeok Lee

A new scheme to represent phonological changes during continuous speech recognition is suggested. A phonological tag coupled with its morphological tag is designed to represent the conditions of Korean phonological changes. A pairwise language model of these morphological and phonological tags is implemented in a Korean speech recognition system. The performance of the model is verified through TDNN-based speech recognition experiments.
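A pairwise tag language model of this kind could be estimated, in its simplest MLE bigram form, roughly as follows (toy tags; the paper's tag inventory and estimation details are not reproduced here):

```python
from collections import Counter

def pair_bigram_lm(tagged_sents):
    """MLE bigram model over (morphological, phonological) tag pairs:
    returns a function computing P(pair_i | pair_{i-1})."""
    bigrams, unigrams = Counter(), Counter()
    for sent in tagged_sents:
        for prev, cur in zip(sent, sent[1:]):
            bigrams[(prev, cur)] += 1
            unigrams[prev] += 1
    def prob(prev, cur):
        return bigrams[(prev, cur)] / unigrams[prev] if unigrams[prev] else 0.0
    return prob
```

A real system would add smoothing so unseen tag-pair transitions do not get zero probability.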

* 5 pages, ACL96 sigphon workshop 

Chart-driven Connectionist Categorial Parsing of Spoken Korean

Nov 29, 1995
WonIl Lee, Geunbae Lee, Jong-Hyeok Lee

Most speech and natural language systems developed for English and other Indo-European languages neglect morphological processing and integrate speech and natural language at the word level. For agglutinative languages such as Korean and Japanese, however, morphological processing plays a major role in language processing, since these languages have very complex morphological phenomena and relatively simple syntactic functionality. Degenerate morphological processing limits the usable vocabulary size of the system, and a word-level dictionary results in an exponential explosion in the number of dictionary entries. Agglutinative languages therefore need sub-word-level integration, which leaves room for general morphological processing. In this paper, we develop a phoneme-level integration model of speech and linguistic processing through general morphological analysis for agglutinative languages, and an efficient parsing scheme for that integration. Korean is modeled lexically based on the categorial grammar formalism with unordered-argument and suppressed-category extensions, and a chart-driven connectionist parsing method is introduced.
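The chart mechanics underlying chart-driven parsing can be illustrated with a bare-bones symbolic CYK recognizer (the paper's parser is connectionist and categorial-grammar-based; this sketch shows only the chart, with a toy CFG-style lexicon and rules):

```python
def cyk(words, lexicon, binary_rules):
    """Minimal CYK chart recognizer. chart[i][k] holds the categories
    spanning words[i:k]; binary_rules are (parent, left, right) triples."""
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(lexicon[w])
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            k = i + span
            for j in range(i + 1, k):           # all split points
                for parent, left, right in binary_rules:
                    if left in chart[i][j] and right in chart[j][k]:
                        chart[i][k].add(parent)
    return chart[0][n]
```

In a categorial-grammar setting the binary rules would be replaced by function application over lexical categories, but the chart traversal is the same.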

* 6 pages, Postscript file, Proceedings of ICCPOL'95 

Integrating HMM-Based Speech Recognition With Direct Manipulation In A Multimodal Korean Natural Language Interface

Nov 18, 1996
Geunbae Lee, Jong-Hyeok Lee, Sangeok Kim

This paper presents an HMM-based speech recognition engine and its integration into direct manipulation interfaces for a Korean document editor. Speech recognition can reduce the tedious and repetitive actions that are inevitable in standard GUIs (graphical user interfaces). Our system consists of a general speech recognition engine called ABrain (Auditory Brain) and a speech-commandable document editor called SHE (Simple Hearing Editor). ABrain is a phoneme-based speech recognition engine that achieves a discrete command recognition rate of up to 97%. SHE is a EuroBridge widget-based document editor that supports speech commands as well as direct manipulation interfaces.

* 6 pages, ps file, presented at ICMI96 (Beijing) 

Multi-level post-processing for Korean character recognition using morphological analysis and linguistic evaluation

Apr 24, 1996
Geunbae Lee, Jong-Hyeok Lee, JinHee Yoo

Most post-processing methods for character recognition rely on contextual information at the character and word-fragment levels. However, due to the linguistic characteristics of Korean, such low-level information alone is not sufficient for high-quality character-recognition applications, and much higher-level contextual information is needed to improve the recognition results. This paper presents a domain-independent post-processing technique that utilizes multi-level morphological, syntactic, and semantic information as well as character-level information. The proposed post-processing system performs three levels of processing: candidate character-set selection, candidate eojeol (Korean word) generation through morphological analysis, and final single eojeol-sequence selection by linguistic evaluation. All the required linguistic information and probabilities are automatically acquired from a statistical corpus analysis. Experimental results demonstrate the effectiveness of our method, yielding an error correction rate of 80.46% and improving the single-best recognition rate from 71.2% before post-processing to 95.53%.
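The final selection step can be caricatured as ranking candidate strings with a simple character-bigram model (a toy stand-in for the morphological analysis and linguistic evaluation described above):

```python
import math

def select_best(candidates, char_bigram):
    """Rank candidate recognition strings by character-bigram
    log-probability; unseen bigrams get a small floor probability."""
    def score(s):
        return sum(math.log(char_bigram.get(pair, 1e-6))
                   for pair in zip(s, s[1:]))
    return max(candidates, key=score)
```

The real system evaluates whole eojeol sequences with morphological and syntactic/semantic information, not just character statistics.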

* latex with a4, epsfig style, 21 pages, 11 postscript figures, accepted in pattern recognition journal 

TAKTAG: Two-phase learning method for hybrid statistical/rule-based part-of-speech disambiguation

May 28, 1995
Geunbae Lee, Jong-Hyeok Lee, Sanghyun Shin

Both statistical and rule-based approaches to part-of-speech (POS) disambiguation have their own advantages and limitations. Especially for Korean, the narrow windows provided by hidden Markov models (HMMs) cannot cover the lexical and long-distance dependencies necessary for POS disambiguation. On the other hand, rule-based approaches lack accuracy and flexibility with respect to new tag-sets and languages. In this regard, a statistical/rule-based hybrid method that can take advantage of both approaches is called for to achieve robust and flexible POS disambiguation. We present one such method: a two-phase learning architecture for hybrid statistical/rule-based POS disambiguation, especially for Korean. In this method, the statistical learning of morphological tagging is error-corrected by rule-based learning in the style of the Brill [1992] tagger. We also design a hierarchical and flexible Korean tag-set to cope with multiple tagging applications, each of which requires a different tag-set. Our experiments show that the two-phase learning method can overcome the undesirable features of solely HMM-based or solely rule-based tagging, especially for morphologically complex Korean.
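The second, rule-based phase can be sketched as Brill-style transformation rules applied to the first-phase statistical tags. The rule format here is a simplified assumption (change `from_tag` to `to_tag` when the previous tag matches):

```python
def error_correct(tags, rules):
    """Phase two: apply transformation rules (from_tag, to_tag, prev_tag)
    in learned order to the first-phase statistical tags."""
    out = list(tags)
    for frm, to, prev in rules:          # rules fire in the order learned
        for i in range(1, len(out)):
            if out[i] == frm and out[i - 1] == prev:
                out[i] = to
    return out
```

Real Brill-style learning also covers richer contexts (following tags, words, wider windows) and learns the rule order greedily from tagging errors.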

* 10 pages, latex, named.sty & named.bst, use psfig figures, submitted 

Bi-directional memory-based dialog translation: The KEMDT approach

Feb 23, 1995
Geunbae Lee, Hanmin Jung, Jong-Hyeok Lee

A bi-directional Korean/English dialog translation system is designed and implemented using the memory-based translation technique. The system, KEMDT (Korean/English Memory-based Dialog Translation system), can perform Korean-to-English and English-to-Korean translation using a unified memory network and an extended marker passing algorithm. We resolve the word-order variation and frequent word omission problems in Korean by classifying concept sequence elements into four different types and extending the marker-passing-based translation algorithm. Unlike previous memory-based translation systems, the KEMDT system develops a bilingual memory network and a unified bi-directional marker passing translation algorithm. For efficient language-specific processing, we separate the morphological processors from the memory-based translator. The KEMDT technology provides a hierarchical memory network and an efficient marker-based control for the recent example-based MT paradigm.
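Marker passing over a memory network can be illustrated with a minimal collision search over an is-a graph (toy nodes; the KEMDT network and its extended algorithm are far richer):

```python
def marker_collisions(isa, src_nodes, tgt_nodes):
    """Pass markers upward from both languages' lexical nodes through an
    is-a memory network; nodes marked from both sides (collisions) are
    translation candidates."""
    def reach(starts):
        seen, frontier = set(starts), list(starts)
        while frontier:
            node = frontier.pop()
            for parent in isa.get(node, []):
                if parent not in seen:
                    seen.add(parent)
                    frontier.append(parent)
        return seen
    return reach(src_nodes) & reach(tgt_nodes)
```

In a bilingual memory network, the lowest collision node ties a Korean lexical entry to its English counterpart through their shared concept.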

* latex postscript with psfig, 7 pages, to be presented at the Pacific Association for Computational Linguistics conference (PACLING95) 

Self-Attention-Based Message-Relevant Response Generation for Neural Conversation Model

May 23, 2018
Jonggu Kim, Doyeon Kong, Jong-Hyeok Lee

Using a sequence-to-sequence framework, many neural conversation models for chit-chat succeed in producing natural responses. Nevertheless, neural conversation models tend to give generic responses that are not specific to the given messages, and this remains a challenge. To alleviate this tendency, we propose a method that promotes message-relevant and diverse responses in a neural conversation model by using self-attention, which is time-efficient as well as effective. Furthermore, we investigate why and how self-attention is effective, in comparison with standard dialogue generation. The experimental results show that the proposed method improves on standard dialogue generation in various evaluation metrics.
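Self-attention, the building block the method relies on, reduces to scaled dot-product attention. A dependency-free sketch with Q = K = V and no learned projections (illustration only, not the paper's model):

```python
import math

def self_attention(X):
    """Scaled dot-product self-attention over a list of vectors X.
    Each output row is a softmax-weighted mixture of all input rows."""
    d = len(X[0])
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    out = []
    for q in X:
        logits = [dot(q, k) / math.sqrt(d) for k in X]
        m = max(logits)                      # stabilize the softmax
        w = [math.exp(v - m) for v in logits]
        z = sum(w)
        w = [v / z for v in w]
        out.append([sum(wi * row[j] for wi, row in zip(w, X)) for j in range(d)])
    return out
```

Applied to the message, such weights indicate which message tokens the response should stay relevant to.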

* 8 pages 

Unlimited Vocabulary Grapheme to Phoneme Conversion for Korean TTS

Jun 10, 1998
Byeongchang Kim, WonIl Lee, Geunbae Lee, Jong-Hyeok Lee

This paper describes a grapheme-to-phoneme conversion method using phoneme connectivity and CCV conversion rules. The method consists of four main modules: morpheme normalization, phrase-break detection, morpheme-to-phoneme conversion, and phoneme connectivity checking. Morpheme normalization replaces non-Korean symbols with standard Korean graphemes. The phrase-break detector assigns phrase breaks using part-of-speech (POS) information. In the morpheme-to-phoneme conversion module, each morpheme in the phrase is converted into phonetic patterns by looking up the morpheme phonetic pattern dictionary, which contains candidate phonological changes at morpheme boundaries. Graphemes within a morpheme are grouped into CCV patterns and converted into phonemes by the CCV conversion rules. The phoneme connectivity table supports grammaticality checking of two adjacent phonetic morphemes. In experiments with a corpus of 4,973 sentences, we achieved 99.9% grapheme-to-phoneme conversion accuracy and 97.5% sentence conversion accuracy. The full Korean TTS system is now being implemented using this conversion method.
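The CCV conversion step can be sketched as a greedy longest-match rule lookup (toy rule table with Romanized symbols; real CCV rules operate on Korean jamo):

```python
def g2p(graphemes, rules):
    """Greedy longest-match conversion: group graphemes into CCV-like
    chunks and map each chunk through a rule table keyed by grapheme
    tuples of length 3, 2, or 1."""
    phones, i = [], 0
    while i < len(graphemes):
        for width in (3, 2, 1):
            chunk = tuple(graphemes[i:i + width])
            if len(chunk) == width and chunk in rules:
                phones.extend(rules[chunk])
                i += width
                break
        else:
            phones.append(graphemes[i])   # pass unknown graphemes through
            i += 1
    return phones
```

The paper's method additionally validates the result against the phoneme connectivity table across morpheme boundaries.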

* 5 pages, uses colacl.sty and acl.bst, uses epsfig. To appear in the Proceedings of the Joint 17th International Conference on Computational Linguistics 36th Annual Meeting of the Association for Computational Linguistics (COLING-ACL'98) 

Improving Term Frequency Normalization for Multi-topical Documents, and Application to Language Modeling Approaches

Feb 08, 2015
Seung-Hoon Na, In-Su Kang, Jong-Hyeok Lee

Term frequency normalization is a serious issue because document lengths vary. Generally, documents become long for two different reasons: verbosity and multi-topicality. Verbosity means that the same topic is repeatedly mentioned by terms related to it, so term frequency is higher than in a well-summarized document. Multi-topicality means that a document discusses multiple topics broadly, rather than a single topic. Although these document characteristics should be handled differently, all previous term frequency normalization methods have ignored the distinction and used a simplified length-driven approach that decreases term frequency based only on document length, causing unreasonable penalization. To attack this problem, we propose a novel TF normalization method, a type of partially-axiomatic approach. We first formulate two formal constraints that a retrieval model should satisfy for documents with verbose and multi-topical characteristics, respectively. Then, we modify language modeling approaches to better satisfy these two constraints and derive novel smoothing methods. Experimental results show that the proposed method significantly increases precision for keyword queries and substantially improves MAP (Mean Average Precision) for verbose queries.
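For reference, the unmodified Dirichlet-smoothed query likelihood that such language modeling approaches start from looks like this; the paper's contribution is to change how document length enters this formula, which is not reproduced here:

```python
import math

def dirichlet_score(query, tf, doc_len, p_coll, mu=2000.0):
    """Dirichlet-smoothed query log-likelihood:
    sum over query terms of log((tf + mu * P(t|collection)) / (|D| + mu)).
    The raw doc_len here is exactly what a verbosity/multi-topicality-aware
    normalization would replace."""
    return sum(math.log((tf.get(t, 0) + mu * p_coll[t]) / (doc_len + mu))
               for t in query)
```

Note how a long document is penalized purely through `doc_len`, regardless of whether its length comes from verbosity or from covering multiple topics.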

* Advances in Information Retrieval, Lecture Notes in Computer Science, Volume 4956, 2008, pp. 382-393 
* 8 pages, conference paper, published in ECIR '08 

Transformer-based Automatic Post-Editing with a Context-Aware Encoding Approach for Multi-Source Inputs

Aug 15, 2019
WonKee Lee, Junsu Park, Byung-Hyun Go, Jong-Hyeok Lee

Recent approaches to Automatic Post-Editing (APE) research have shown that better results are obtained by multi-source models, which jointly encode both the source (src) and the machine translation output (mt) to produce the post-edited sentence (pe). Following this trend, we present a new multi-source APE model based on the Transformer. To construct effective joint representations, our model internally learns to incorporate src context into the mt representation. With this approach, we achieve a significant improvement over baseline systems, as well as over the state-of-the-art multi-source APE model. Moreover, to demonstrate our model's capability to incorporate src context, we show that the word alignment of the unknown MT system is successfully captured in our encoding results.
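One common way to feed a single encoder both sources — concatenation with a separator and segment ids — can be sketched as follows. This is an assumption for illustration; the paper's approach instead incorporates src context internally into the mt representation:

```python
def joint_ape_input(src_tokens, mt_tokens):
    """Build one multi-source encoder input: src and mt concatenated with
    a separator token, plus segment ids distinguishing the two sources."""
    tokens = src_tokens + ['<sep>'] + mt_tokens
    segments = [0] * (len(src_tokens) + 1) + [1] * len(mt_tokens)
    return tokens, segments
```

The segment ids let the encoder's attention learn src-mt interactions, which is where cross-lingual word alignment can emerge.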

* 6 pages, 3 figures 
