Research papers and code for "Minho Ha":
Deep convolutional neural networks have proven to be well suited for image classification applications. However, if there is distortion in the image, the classification accuracy can be significantly degraded, even with state-of-the-art neural networks. The accuracy cannot be significantly improved by simply training with distorted images. Instead, this paper proposes a multiple neural network topology referred to as a selective deep convolutional neural network. By modifying existing state-of-the-art neural networks in the proposed manner, it is shown that a similar level of classification accuracy can be achieved, but at a significantly lower cost. The cost reduction is obtained primarily through the use of fewer weight parameters. Using fewer weights reduces the number of multiply-accumulate operations and also reduces the energy required for data accesses. Finally, it is shown that the effectiveness of the proposed selective deep convolutional neural network can be further improved by combining it with previously proposed network cost reduction methods.

* 10 pages, 4 figuers
Click to Read Paper and Get Code
We show Correspondence Analysis (CA) is equivalent to defining Gini-index with appropriate scaled one-hot encoding. Using this relation, we introduce non-linear kernel extension of CA. The extended CA gives well-known analysis for categorical data (CD) and natural language processing by specializing kernels. For example, our formulation can give G-test, skip-gram with negative-sampling (SGNS), and GloVe as a special case. We introduce two kernels for natural language processing based on our formulation. First is a stop word(SW) kernel. Second is word similarity(WS) kernel. The SW kernel is the system introducing appropriate weights for SW. The WS kernel enables to use WS test data as training data for vector space representations of words. We show these kernels enhances accuracy when training data is not sufficiently large.

Click to Read Paper and Get Code
Deep learning is a potential paradigm changer for the design of wireless communications systems (WCS), from conventional handcrafted schemes based on sophisticated mathematical models with assumptions to autonomous schemes based on the end-to-end deep learning using a large number of data. In this article, we present a basic concept of the deep learning and its application to WCS by investigating the resource allocation (RA) scheme based on a deep neural network (DNN) where multiple goals with various constraints can be satisfied through the end-to-end deep learning. Especially, the optimality and feasibility of the DNN based RA are verified through simulation. Then, we discuss the technical challenges regarding the application of deep learning in WCS.

* This work has been submitted to the IEEE for possible publication
Click to Read Paper and Get Code
In this work, we introduce temporal hierarchies to the sequence to sequence (seq2seq) model to tackle the problem of abstractive summarization of scientific articles. The proposed Multiple Timescale model of the Gated Recurrent Unit (MTGRU) is implemented in the encoder-decoder setting to better deal with the presence of multiple compositionalities in larger texts. The proposed model is compared to the conventional RNN encoder-decoder, and the results demonstrate that our model trains faster and shows significant performance gains. The results also show that the temporal hierarchies help improve the ability of seq2seq models to capture compositionalities better without the presence of highly complex architectural hierarchies.

* To appear in RepL4NLP at ACL 2016
Click to Read Paper and Get Code
Realistic image synthesis is to generate an image that is perceptually indistinguishable from an actual image. Generating realistic looking images with large variations (e.g., large spatial deformations and large pose change), however, is very challenging. Handing large variations as well as preserving appearance needs to be taken into account in the realistic looking image generation. In this paper, we propose a novel realistic looking image synthesis method, especially in large change demands. To do that, we devise generative guiding blocks. The proposed generative guiding block includes realistic appearance preserving discriminator and naturalistic variation transforming discriminator. By taking the proposed generative guiding blocks into generative model, the latent features at the layer of generative model are enhanced to synthesize both realistic looking- and target variation- image. With qualitative and quantitative evaluation in experiments, we demonstrated the effectiveness of the proposed generative guiding blocks, compared to the state-of-the-arts.

* This work is accepted in ICIP 2019
Click to Read Paper and Get Code
Inspired by the recent advances in generative models, we introduce a human action generation model in order to generate a consecutive sequence of human motions to formulate novel actions. We propose a framework of an autoencoder and a generative adversarial network (GAN) to produce multiple and consecutive human actions conditioned on the initial state and the given class label. The proposed model is trained in an end-to-end fashion, where the autoencoder is jointly trained with the GAN. The model is trained on the NTU RGB+D dataset and we show that the proposed model can generate different styles of actions. Moreover, the model can successfully generate a sequence of novel actions given different action labels as conditions. The conventional human action prediction and generation models lack those features, which are essential for practical applications.

Click to Read Paper and Get Code