Models, code, and papers for "Hao Fu":

Efficient meta reinforcement learning via meta goal generation

Nov 10, 2019
Haotian Fu, Hongyao Tang, Jianye Hao

Meta reinforcement learning (meta-RL) is able to accelerate the acquisition of new tasks by learning from past experience. Current meta-RL methods usually learn to adapt to new tasks by directly optimizing the parameters of policies over primitive actions. However, for complex tasks which requires sophisticated control strategies, it would be quite inefficient to to directly learn such a meta-policy. Moreover, this problem can become more severe and even fail in spare reward settings, which is quite common in practice. To this end, we propose a new meta-RL algorithm called meta goal-generation for hierarchical RL (MGHRL) by leveraging hierarchical actor-critic framework. Instead of directly generate policies over primitive actions for new tasks, MGHRL learns to generate high-level meta strategies over subgoals given past experience and leaves the rest of how to achieve subgoals as independent RL subtasks. Our empirical results on several challenging simulated robotics environments show that our method enables more efficient and effective meta-learning from past experience and outperforms state-of-the-art meta-RL and Hierarchical-RL methods in sparse reward settings.

Non-imaging single-pixel sensing with optimized binary modulation

Sep 27, 2019
Hao Fu, Liheng Bian, Jun Zhang

The conventional high-level sensing techniques require high-fidelity images as input to extract target features, which are produced by either complex imaging hardware or high-complexity reconstruction algorithms. In this letter, we propose single-pixel sensing (SPS) that performs high-level sensing directly from coupled measurements of a single-pixel detector, without the conventional image acquisition and reconstruction process. The technique consists of three steps including binary light modulation that can be physically implemented at $\sim$22kHz, single-pixel coupled detection owning wide working spectrum and high signal-to-noise ratio, and end-to-end deep-learning based sensing that reduces both hardware and software complexity. Besides, the binary modulation is trained and optimized together with the sensing network, which ensures least required measurements and optimal sensing accuracy. The effectiveness of SPS is demonstrated on the classification task of handwritten MNIST dataset, and 96.68% classification accuracy at $\sim$1kHz is achieved. The reported single-pixel sensing technique is a novel framework for highly efficient machine intelligence.

Lidar-based Object Classification with Explicit Occlusion Modeling

Jul 10, 2019
Xiaoxiang Zhang, Hao Fu, Bin Dai

LIDAR is one of the most important sensors for Unmanned Ground Vehicles (UGV). Object detection and classification based on lidar point cloud is a key technology for UGV. In object detection and classification, the mutual occlusion between neighboring objects is an important factor affecting the accuracy. In this paper, we consider occlusion as an intrinsic property of the point cloud data. We propose a novel approach that explicitly model the occlusion. The occlusion property is then taken into account in the subsequent classification step. We perform experiments on the KITTI dataset. Experimental results indicate that by utilizing the occlusion property that we modeled, the classifier obtains much better performance.

A Three-Phase Search Approach for the Quadratic Minimum Spanning Tree Problem

Feb 06, 2014
Zhang-Hua Fu, Jin-Kao Hao

Given an undirected graph with costs associated with each edge as well as each pair of edges, the quadratic minimum spanning tree problem (QMSTP) consists of determining a spanning tree of minimum total cost. This problem can be used to model many real-life network design applications, in which both routing and interference costs should be considered. For this problem, we propose a three-phase search approach named TPS, which integrates 1) a descent-based neighborhood search phase using two different move operators to reach a local optimum from a given starting solution, 2) a local optima exploring phase to discover nearby local optima within a given regional search area, and 3) a perturbation-based diversification phase to jump out of the current regional search area. Additionally, we introduce dedicated techniques to reduce the neighborhood to explore and streamline the neighborhood evaluations. Computational experiments based on hundreds of representative benchmarks show that TPS produces highly competitive results with respect to the best performing approaches in the literature by improving the best known results for 31 instances and matching the best known results for the remaining instances only except two cases. Critical elements of the proposed algorithms are analyzed.

* 29 pages
Rethinking Text Attribute Transfer: A Lexical Analysis

Sep 26, 2019
Yao Fu, Hao Zhou, Jiaze Chen, Lei Li

Text attribute transfer is modifying certain linguistic attributes (e.g. sentiment, style, authorship, etc.) of a sentence and transforming them from one type to another. In this paper, we aim to analyze and interpret what is changed during the transfer process. We start from the observation that in many existing models and datasets, certain words within a sentence play important roles in determining the sentence attribute class. These words are referred to as \textit{the Pivot Words}. Based on these pivot words, we propose a lexical analysis framework, \textit{the Pivot Analysis}, to quantitatively analyze the effects of these words in text attribute classification and transfer. We apply this framework to existing datasets and models and show that: (1) the pivot words are strong features for the classification of sentence attributes; (2) to change the attribute of a sentence, many datasets only requires to change certain pivot words; (3) consequently, many transfer models only perform the lexical-level modification, while leaving higher-level sentence structures unchanged. Our work provides an in-depth understanding of linguistic attribute transfer and further identifies the future requirements and challenges of this task\footnote{Our code can be found at https://github.com/FranxYao/pivot_analysis}.

* INLG 2019
Densely Dilated Spatial Pooling Convolutional Network using benign loss functions for imbalanced volumetric prostate segmentation

Feb 01, 2018
Qiuhua Liu, Min Fu, Xinqi Gong, Hao Jiang

The high incidence rate of prostate disease poses a requirement in early detection for diagnosis. As one of the main imaging methods used for prostate cancer detection, Magnetic Resonance Imaging (MRI) has wide range of appearance and imbalance problems, making automated prostate segmentation fundamental but challenging. Here we propose a novel Densely Dilated Spatial Pooling Convolutional Network (DDSP ConNet) in encoder-decoder structure. It employs dense structure to combine dilated convolution and global pooling, thus supplies coarse segmentation results from encoder and decoder subnet and preserves more contextual information. To obtain richer hierarchical feature maps, residual long connection is furtherly adopted to fuse contexture features. Meanwhile, we adopt DSC loss and Jaccard loss functions to train our DDSP ConNet. We surprisingly found and proved that, in contrast to re-weighted cross entropy, DSC loss and Jaccard loss have a lot of benign properties in theory, including symmetry, continuity and differentiability about the parameters of network. Extensive experiments on the MICCAI PROMISE12 challenge dataset have been done to corroborate the effectiveness of our DDSP ConNet with DSC loss and Jaccard loss. Totally, our method achieves a score of 85.78 in the test dataset, outperforming most of other competitors.

* 14pages, 5 figures, anonymous review in IJACAI2018
"Love is as Complex as Math": Metaphor Generation System for Social Chatbot

Jan 03, 2020
Danning Zheng, Ruihua Song, Tianran Hu, Hao Fu, Jin Zhou

As the wide adoption of intelligent chatbot in human daily life, user demands for such systems evolve from basic task-solving conversations to more casual and friend-like communication. To meet the user needs and build emotional bond with users, it is essential for social chatbots to incorporate more human-like and advanced linguistic features. In this paper, we investigate the usage of a commonly used rhetorical device by human -- metaphor for social chatbot. Our work first designs a metaphor generation framework, which generates topic-aware and novel figurative sentences. By embedding the framework into a chatbot system, we then enables the chatbot to communicate with users using figurative language. Human annotators validate the novelty and properness of the generated metaphors. More importantly, we evaluate the effects of employing metaphors in human-chatbot conversations. Experiments indicate that our system effectively arouses user interests in communicating with our chatbot, resulting in significantly longer human-chatbot conversations.

Edge-Aware Deep Image Deblurring

Jul 04, 2019
Zhichao Fu, Yingbin Zheng, Hao Ye, Yu Kong, Jing Yang, Liang He

Image deblurring is a fundamental and challenging low-level vision problem. Previous vision research indicates that edge structure in natural scenes is one of the most important factors to estimate the abilities of human visual perception. In this paper, we resort to human visual demands of sharp edges and propose a two-phase edge-aware deep network to improve deep image deblurring. An edge detection convolutional subnet is designed in the first phase and a residual fully convolutional deblur subnet is then used for generating deblur results. The introduction of the edge-aware network enables our model with the specific capacity of enhancing images with sharp edges. We successfully apply our framework on standard benchmarks and promising results are achieved by our proposed deblur model.

G2R Bound: A Generalization Bound for Supervised Learning from GAN-Synthetic Data

May 29, 2019
Fu-Chieh Chang, Hao-Jen Wang, Chun-Nan Chou, Edward Y. Chang

Performing supervised learning from the data synthesized by using Generative Adversarial Networks (GANs), dubbed GAN-synthetic data, has two important applications. First, GANs may generate more labeled training data, which may help improve classification accuracy. Second, in scenarios where real data cannot be released outside certain premises for privacy and/or security reasons, using GAN- synthetic data to conduct training is a plausible alternative. This paper proposes a generalization bound to guarantee the generalization capability of a classifier learning from GAN-synthetic data. This generalization bound helps developers gauge the generalization gap between learning from synthetic data and testing on real data, and can therefore provide the clues to improve the generalization capability.

Position Detection and Direction Prediction for Arbitrary-Oriented Ships via Multiscale Rotation Region Convolutional Neural Network

Jun 13, 2018
Xue Yang, Hao Sun, Xian Sun, Menglong Yan, Zhi Guo, Kun Fu

Ship detection is of great importance and full of challenges in the field of remote sensing. The complexity of application scenarios, the redundancy of detection region, and the difficulty of dense ship detection are all the main obstacles that limit the successful operation of traditional methods in ship detection. In this paper, we propose a brand new detection model based on multiscale rotational region convolutional neural network to solve the problems above. This model is mainly consist of five consecutive parts: Dense Feature Pyramid Network (DFPN), adaptive region of interest (ROI) Align, rotational bounding box regression, prow direction prediction and rotational nonmaximum suppression (R-NMS). First of all, the low-level location information and high-level semantic information are fully utilized through multiscale feature networks. Then, we design adaptive ROI Align to obtain high quality proposals which remain complete spatial and semantic information. Unlike most previous approaches, the prediction obtained by our method is the minimum bounding rectangle of the object with less redundant regions. Therefore, rotational region detection framework is more suitable to detect the dense object than traditional detection model. Additionally, we can find the berthing and sailing direction of ship through prediction. A detailed evaluation based on SRSS and DOTA dataset for rotation detection shows that our detection method has a competitive performance.

Variable Population Memetic Search: A Case Study on the Critical Node Problem

Sep 12, 2019
Yangming Zhou, Jin-Kao Hao, Zhang-Hua Fu, Zhe Wang, Xiangjing Lai

Population-based memetic algorithms have been successfully applied to solve many difficult combinatorial problems. Often, a population of fixed size was used in such algorithms to record some best solutions sampled during the search. However, given the particular features of the problem instance under consideration, a population of variable size would be more suitable to ensure the best search performance possible. In this work, we propose variable population memetic search (VPMS), where a strategic population sizing mechanism is used to dynamically adjust the population size during the memetic search process. Our VPMS approach starts its search from a small population of only two solutions to focus on exploitation, and then adapts the population size according to the search status to continuously influence the balancing between exploitation and exploration. We illustrate an application of the VPMS approach to solve the challenging critical node problem (CNP). We show that the VPMS algorithm integrating a variable population, an effective local optimization procedure (called diversified late acceptance search) and a backbone-based crossover operator performs very well compared to state-of-the-art CNP algorithms. The algorithm is able to discover new upper bounds for 13 instances out of the 42 popular benchmark instances, while matching 23 previous best-known upper bounds.

Linked Dynamic Graph CNN: Learning on Point Cloud via Linking Hierarchical Features

Apr 22, 2019
Kuangen Zhang, Ming Hao, Jing Wang, Clarence W. de Silva, Chenglong Fu

Learning on point cloud is eagerly in demand because the point cloud is a common type of geometric data and can aid robots to understand environments robustly. However, the point cloud is sparse, unstructured, and unordered, which cannot be recognized accurately by a traditional convolutional neural network (CNN) nor a recurrent neural network (RNN). Fortunately, a graph convolutional neural network (Graph CNN) can process sparse and unordered data. Hence, we propose a linked dynamic graph CNN (LDGCNN) to classify and segment point cloud directly in this paper. We remove the transformation network, link hierarchical features from dynamic graphs, freeze feature extractor, and retrain the classifier to increase the performance of LDGCNN. We explain our network using theoretical analysis and visualization. Through experiments, we show that the proposed LDGCNN achieves state-of-art performance on two standard datasets: ModelNet40 and ShapeNet.

Comparison Network for One-Shot Conditional Object Detection

Apr 04, 2019
Tengfei Zhang, Yue Zhang, Xian Sun, Hao Sun, Menglong Yan, Xue Yang, Kun Fu

The current advances in object detection depend on large-scale datasets to get good performance. However, there may not always be sufficient samples in many scenarios, which leads to the research on few-shot detection as well as its extreme variation one-shot detection. In this paper, the one-shot detection has been formulated as a conditional probability problem. With this insight, a novel one-shot conditional object detection (OSCD) framework, referred as Comparison Network (ComparisonNet), has been proposed. Specifically, query and target image features are extracted through a Siamese network as mapped metrics of marginal probabilities. A two-stage detector for OSCD is introduced to compare the extracted query and target features with the learnable metric to approach the optimized non-linear conditional probability. Once trained, ComparisonNet can detect objects of both seen and unseen classes without further training, which also has the advantages including class-agnostic, training-free for unseen classes, and without catastrophic forgetting. Experiments show that the proposed approach achieves state-of-the-art performance on the proposed datasets of Fashion-MNIST and PASCAL VOC.

* 10 pages
Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing

Variational autoencoders (VAEs) with an auto-regressive decoder have been applied for many natural language processing (NLP) tasks. The VAE objective consists of two terms, (i) reconstruction and (ii) KL regularization, balanced by a weighting hyper-parameter \beta. One notorious training difficulty is that the KL term tends to vanish. In this paper we study scheduling schemes for \beta, and show that KL vanishing is caused by the lack of good latent codes in training the decoder at the beginning of optimization. To remedy this, we propose a cyclical annealing schedule, which repeats the process of increasing \beta multiple times. This new procedure allows the progressive learning of more meaningful latent codes, by leveraging the informative representations of previous cycles as warm re-starts. The effectiveness of cyclical annealing is validated on a broad range of NLP tasks, including language modeling, dialog response generation and unsupervised language pre-training.

* Published in NAACL 2019; The first two authors contribute equally; Code: https://github.com/haofuml/cyclical_annealing
Deep Multi-Agent Reinforcement Learning with Discrete-Continuous Hybrid Action Spaces

Mar 12, 2019
Haotian Fu, Hongyao Tang, Jianye Hao, Zihan Lei, Yingfeng Chen, Changjie Fan

Deep Reinforcement Learning (DRL) has been applied to address a variety of cooperative multi-agent problems with either discrete action spaces or continuous action spaces. However, to the best of our knowledge, no previous work has ever succeeded in applying DRL to multi-agent problems with discrete-continuous hybrid (or parameterized) action spaces which is very common in practice. Our work fills this gap by proposing two novel algorithms: Deep Multi-Agent Parameterized Q-Networks (Deep MAPQN) and Deep Multi-Agent Hierarchical Hybrid Q-Networks (Deep MAHHQN). We follow the centralized training but decentralized execution paradigm: different levels of communication between different agents are used to facilitate the training process, while each agent executes its policy independently based on local observations during execution. Our empirical results on several challenging tasks (simulated RoboCup Soccer and game Ghost Story) show that both Deep MAPQN and Deep MAHHQN are effective and significantly outperform existing independent deep parameterized Q-learning method.

Transformation Consistent Self-ensembling Model for Semi-supervised Medical Image Segmentation

Mar 04, 2019
Xiaomeng Li, Lequan Yu, Hao Chen, Chi-Wing Fu, Pheng-Ann Heng

Deep convolutional neural networks have achieved remarkable progress on a variety of medical image computing tasks. A common problem when applying supervised deep learning methods to medical images is the lack of labeled data, which is very expensive and time-consuming to be collected. In this paper, we present a novel semi-supervised method for medical image segmentation, where the network is optimized by the weighted combination of a common supervised loss for labeled inputs only and a regularization loss for both labeled and unlabeled data. To utilize the unlabeled data, our method encourages the consistent predictions of the network-in-training for the same input under different regularizations. Aiming for the semi-supervised segmentation problem, we enhance the effect of regularization for pixel-level predictions by introducing a transformation, including rotation and flipping, consistent scheme in our self-ensembling model. With the aim of semi-supervised segmentation tasks, we introduce a transformation consistent strategy in our self-ensembling model to enhance the regularization effect for pixel-level predictions. We have extensively validated the proposed semi-supervised method on three typical yet challenging medical image segmentation tasks: (i) skin lesion segmentation from dermoscopy images on International Skin Imaging Collaboration (ISIC) 2017 dataset, (ii) optic disc segmentation from fundus images on Retinal Fundus Glaucoma Challenge (REFUGE) dataset, and (iii) liver segmentation from volumetric CT scans on Liver Tumor Segmentation Challenge (LiTS) dataset. Compared to the state-of-the-arts, our proposed method shows superior segmentation performance on challenging 2D/3D medical images, demonstrating the effectiveness of our semi-supervised method for medical image segmentation.

* A preliminary version of this work is presented in arXiv:1808.03887

Nov 30, 2018
Li Huang, Yifeng Yin, Zeng Fu, Shifa Zhang, Hao Deng, Dianbo Liu

Medical data are valuable for improvement of health care, policy making and many other purposes. Vast amount of medical data are stored in different locations ,on many different devices and in different data silos. Sharing medical data among different sources is a big challenge due to regulatory , operational and security reasons. One potential solution is federated machine learning ,which a method that sends machine learning algorithms simultaneously to all data sources ,train models in each source and aggregates the learned models. This strategy allows utilization of valuable data without moving them. In this article, we proposed an adaptive boosting method that increases the efficiency of federated machine learning. Using intensive care unit data from hospital, we showed that LoAdaBoost federated learning outperformed baseline method and increased communication efficiency at negligible additional cost.

Semi-supervised Skin Lesion Segmentation via Transformation Consistent Self-ensembling Model

Aug 12, 2018
Xiaomeng Li, Lequan Yu, Hao Chen, Chi-Wing Fu, Pheng-Ann Heng

Automatic skin lesion segmentation on dermoscopic images is an essential component in computer-aided diagnosis of melanoma. Recently, many fully supervised deep learning based methods have been proposed for automatic skin lesion segmentation. However, these approaches require massive pixel-wise annotation from experienced dermatologists, which is very costly and time-consuming. In this paper, we present a novel semi-supervised method for skin lesion segmentation by leveraging both labeled and unlabeled data. The network is optimized by the weighted combination of a common supervised loss for labeled inputs only and a regularization loss for both labeled and unlabeled data. In this paper, we present a novel semi-supervised method for skin lesion segmentation, where the network is optimized by the weighted combination of a common supervised loss for labeled inputs only and a regularization loss for both labeled and unlabeled data. Our method encourages a consistent prediction for unlabeled images using the outputs of the network-in-training under different regularizations, so that it can utilize the unlabeled data. To utilize the unlabeled data, our method encourages the consistent predictions of the network-in-training for the same input under different regularizations. Aiming for the semi-supervised segmentation problem, we enhance the effect of regularization for pixel-level predictions by introducing a transformation, including rotation and flipping, consistent scheme in our self-ensembling model. With only 300 labeled training samples, our method sets a new record on the benchmark of the International Skin Imaging Collaboration (ISIC) 2017 skin lesion segmentation challenge. Such a result clearly surpasses fully-supervised state-of-the-arts that are trained with 2000 labeled data.

* BMVC 2018