Models, code, and papers for "Dan Lin":

##### Knowledge Graph Embedding with Entity Neighbors and Deep Memory Network

Aug 11, 2018
Kai Wang, Yu Liu, Xiujuan Xu, Dan Lin

Knowledge Graph Embedding (KGE) aims to represent entities and relations of a knowledge graph in a low-dimensional continuous vector space. Recent works focus on incorporating structural knowledge with additional information, such as entity descriptions and relation paths. However, commonly used additional information usually contains plenty of noise, which makes it hard to learn valuable representations. In this paper, we propose a new kind of additional information, called entity neighbors, which contain both semantic and topological features of a given entity. We then develop a deep memory network model to encode information from neighbors. Employing a gating mechanism, representations of structure and neighbors are integrated into a joint representation. The experimental results show that our model outperforms existing KGE methods utilizing entity descriptions and achieves state-of-the-art metrics on 4 datasets.
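
The gating step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the element-wise blend `g * structure + (1 - g) * neighbors` below is an assumption, and the names `gated_joint` and `gate_logits` are hypothetical. In a trained model the gate logits would be learned; here they are passed in directly.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_joint(structure, neighbors, gate_logits):
    """Blend two embeddings element-wise with a sigmoid gate.

    Assumed form (not taken from the paper): g * structure + (1 - g) * neighbors.
    """
    if not (len(structure) == len(neighbors) == len(gate_logits)):
        raise ValueError("all vectors must have the same length")
    gates = [sigmoid(z) for z in gate_logits]
    return [g * s + (1.0 - g) * n for g, s, n in zip(gates, structure, neighbors)]


# Toy vectors; zero logits give a gate of 0.5 in every dimension.
structure_emb = [1.0, 0.0, 2.0]
neighbor_emb = [0.0, 1.0, -2.0]
joint = gated_joint(structure_emb, neighbor_emb, [0.0, 0.0, 0.0])
```

A large positive logit pushes a dimension toward the structure embedding, and a large negative one toward the neighbor embedding.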

* 9 pages, 4 figures
##### PRSNet: Part Relation and Selection Network for Bone Age Assessment

Sep 05, 2019
Yuanfeng Ji, Hao Chen, Dan Lin, Xiaohua Wu, Di Lin

Bone age is one of the most important indicators for assessing bone maturity, which can help to interpret a person's growth and development level and potential progress. In clinical practice, bone age assessment (BAA) of X-ray images requires the joint consideration of the appearance and location information of hand bones. These kinds of information can be effectively captured by the relation of different anatomical parts of the hand bone. Recently developed methods differ mostly in how they model the part relation and choose useful parts for BAA. However, these methods neglect the mining of relationships among different parts, which can help to improve the assessment accuracy. In this paper, we propose a novel part relation module, which accurately discovers the underlying concurrency of parts by using multi-scale context information of deep learning feature representation. Furthermore, based on the part relation, we explore a new part selection module, which comprehensively measures the importance of parts and selects the top-ranking parts for assisting BAA. We jointly train our part relation and selection modules in an end-to-end way, achieving state-of-the-art performance on the public RSNA 2017 Pediatric Bone Age benchmark dataset and outperforming other competitive methods by a significant margin.

##### Neural Networks Models for Entity Discovery and Linking

Nov 11, 2016
Dan Liu, Wei Lin, Shiliang Zhang, Si Wei, Hui Jiang

This paper describes the USTC_NELSLIP systems submitted to the Trilingual Entity Detection and Linking (EDL) track in the 2016 TAC Knowledge Base Population (KBP) contests. We have built two systems for entity discovery and mention detection (MD): one uses the conditional RNNLM and the other uses the attention-based encoder-decoder framework. The entity linking (EL) system consists of two modules: a rule-based candidate generation module and a neural network probability ranking model. Moreover, some simple string matching rules are used for NIL clustering. In the end, our best system achieved an F1 score of 0.624 in the end-to-end typed mention ceaf plus metric.

* 9 pages, 5 figures
##### Neural Module Networks for Reasoning over Text

Dec 10, 2019
Nitish Gupta, Kevin Lin, Dan Roth, Sameer Singh, Matt Gardner

Answering compositional questions that require multiple steps of reasoning against text is challenging, especially when they involve discrete, symbolic operations. Neural module networks (NMNs) learn to parse such questions as executable programs composed of learnable modules, performing well on synthetic visual QA domains. However, we find that it is challenging to learn these models for non-synthetic questions on open-domain text, where a model needs to deal with the diversity of natural language and perform a broader range of reasoning. We extend NMNs by: (a) introducing modules that reason over a paragraph of text, performing symbolic reasoning (such as arithmetic, sorting, counting) over numbers and dates in a probabilistic and differentiable manner; and (b) proposing an unsupervised auxiliary loss to help extract arguments associated with the events in text. Additionally, we show that a limited amount of heuristically-obtained question program and intermediate module output supervision provides sufficient inductive bias for accurate learning. Our proposed model significantly outperforms state-of-the-art models on a subset of the DROP dataset that poses a variety of reasoning challenges that are covered by our modules.

##### Efficient piecewise training of deep structured models for semantic segmentation

Jun 06, 2016
Guosheng Lin, Chunhua Shen, Anton van den Hengel, Ian Reid

Recent advances in semantic image segmentation have mostly been achieved by training deep convolutional neural networks (CNNs). We show how to improve semantic segmentation through the use of contextual information; specifically, we explore 'patch-patch' context between image regions, and 'patch-background' context. For learning from the patch-patch context, we formulate Conditional Random Fields (CRFs) with CNN-based pairwise potential functions to capture semantic correlations between neighboring patches. Efficient piecewise training of the proposed deep structured model is then applied to avoid repeated expensive CRF inference for back propagation. For capturing the patch-background context, we show that a network design with traditional multi-scale image input and sliding pyramid pooling is effective for improving performance. Our experimental results set new state-of-the-art performance on a number of popular semantic segmentation datasets, including NYUDv2, PASCAL VOC 2012, PASCAL-Context, and SIFT-flow. In particular, we achieve an intersection-over-union score of 78.0 on the challenging PASCAL VOC 2012 dataset.

* Appearing in IEEE Conf. Computer Vision and Pattern Recognition (CVPR) 2016
##### Interactive Learning for Identifying Relevant Tweets to Support Real-time Situational Awareness

Various domain users are increasingly leveraging real-time social media data to gain rapid situational awareness. However, due to the high noise in the deluge of data, effectively determining semantically relevant information can be difficult, further complicated by the changing definition of relevancy by each end user for different events. The majority of existing methods for short text relevance classification fail to incorporate users' knowledge into the classification process. Existing methods that incorporate interactive user feedback focus on historical datasets. Therefore, classifiers cannot be interactively retrained for specific events or user-dependent needs in real-time. This limits real-time situational awareness, as streaming data that is incorrectly classified cannot be corrected immediately, permitting the possibility for important incoming data to be incorrectly classified as well. We present a novel interactive learning framework to improve the classification process in which the user iteratively corrects the relevancy of tweets in real-time to train the classification model on-the-fly for immediate predictive improvements. We computationally evaluate our classification model adapted to learn at interactive rates. Our results show that our approach outperforms state-of-the-art machine learning models. In addition, we integrate our framework with the extended Social Media Analytics and Reporting Toolkit (SMART) 2.0 system, allowing the use of our interactive learning framework within a visual analytics system tailored for real-time situational awareness. To demonstrate our framework's effectiveness, we provide domain expert feedback from first responders who used the extended SMART 2.0 system.

* 12 pages, 8 figures, 3 tables, IEEE VAST, TVCG
##### Integrating Text and Image: Determining Multimodal Document Intent in Instagram Posts

Apr 19, 2019
Julia Kruk, Jonah Lubin, Karan Sikka, Xiao Lin, Dan Jurafsky, Ajay Divakaran

Computing author intent from multimodal data like Instagram posts requires modeling a complex relationship between text and image. For example, a caption might reflect ironically on the image, so neither the caption nor the image is a mere transcript of the other. Instead they combine -- via what has been called meaning multiplication -- to create a new meaning that has a more complex relation to the literal meanings of text and image. Here we introduce a multimodal dataset of 1299 Instagram posts labeled for three orthogonal taxonomies: the authorial intent behind the image-caption pair, the contextual relationship between the literal meanings of the image and caption, and the semiotic relationship between the signified meanings of the image and caption. We build a baseline deep multimodal classifier to validate the taxonomy, showing that employing both text and image improves intent detection by 8% compared to using only the image modality, demonstrating the commonality of non-intersective meaning multiplication. Our dataset offers an important resource for the study of the rich meanings that result from pairing text and image.

##### Quantum-enhanced least-square support vector machine: simplified quantum algorithm and sparse solutions

Aug 05, 2019
Jie Lin, Dan-Bo Zhang, Shuo Zhang, Xiang Wang, Tan Li, Wan-su Bao

Quantum algorithms can enhance machine learning in different aspects. Here, we study the quantum-enhanced least-square support vector machine (LS-SVM). Firstly, a novel quantum algorithm that uses continuous variables to assist matrix inversion is introduced to simplify the algorithm for quantum LS-SVM, while retaining exponential speed-up. Secondly, we propose a hybrid quantum-classical version for sparse solutions of LS-SVM. By encoding a large dataset into a quantum state, a much smaller transformed dataset can be extracted using a quantum matrix toolbox, which is further processed in a classical SVM. We also incorporate kernel methods into the above quantum algorithms, which use both the exponentially growing Hilbert space of qubits and the infinite dimensionality of continuous variables for quantum feature maps. The quantum LS-SVM exploits quantum properties to explore important themes for SVM such as sparsity and kernel methods, and stresses its quantum advantages ranging from speed-up to the potential capacity to solve classically difficult machine learning tasks.

* 9 pages and 0 figures
##### ATCSpeech: a multilingual pilot-controller speech corpus from real Air Traffic Control environment

Nov 26, 2019
Bo Yang, Xianlong Tan, Zhengmao Chen, Bing Wang, Dan Li, Zhongping Yang, Xiping Wu, Yi Lin

Automatic Speech Recognition (ASR) has developed greatly in recent years, which has expedited many applications in other fields. For ASR research, a speech corpus is always an essential foundation, especially for vertical industries such as Air Traffic Control (ATC). There are some speech corpora for common applications, public or paid. However, for ATC, it is difficult to collect raw speech from real systems due to safety issues. More importantly, for a supervised learning task like ASR, annotating the transcriptions is laborious work, which hugely restricts the prospects of ASR applications. In this paper, a multilingual speech corpus (ATCSpeech) from real ATC systems, including accented Mandarin Chinese and English, is built and released to encourage non-commercial ASR research in the ATC domain. The corpus is introduced in detail from the perspective of data amount, speaker gender and role, speech quality, and other attributes. In addition, the performance of our baseline ASR models is also reported. A community edition of our speech database can be applied for and used under a special contract. To the best of our knowledge, this is the first work that aims at building a real and multilingual ASR corpus for air-traffic-related research.

##### Forward Modeling for Partial Observation Strategy Games - A StarCraft Defogger

We formulate the problem of defogging as state estimation and future state prediction from previous, partial observations in the context of real-time strategy games. We propose to employ encoder-decoder neural networks for this task, and introduce proxy tasks and baselines for evaluation to assess their ability to capture basic game rules and high-level dynamics. By combining convolutional neural networks and recurrent networks, we exploit spatial and sequential correlations and train well-performing models on a large dataset of human games of StarCraft: Brood War. Finally, we demonstrate the relevance of our models to downstream tasks by applying them to enemy unit prediction in a state-of-the-art, rule-based StarCraft bot. We observe improvements in win rates against several strong community bots.

* Advances in Neural Information Processing Systems 31 (2018) 10759-10770
##### Fully Automated Organ Segmentation in Male Pelvic CT Images

Accurate segmentation of prostate and surrounding organs at risk is important for prostate cancer radiotherapy treatment planning. We present a fully automated workflow for male pelvic CT image segmentation using deep learning. The architecture consists of a 2D localization network followed by a 3D segmentation network for volumetric segmentation of prostate, bladder, rectum, and femoral heads. We used a multi-channel 2D U-Net followed by a 3D U-Net with encoding arm modified with aggregated residual networks, known as ResNeXt. The models were trained and tested on a pelvic CT image dataset comprising 136 patients. Test results show that 3D U-Net based segmentation achieves mean (SD) Dice coefficient values of 90 (2.0)%, 96 (3.0)%, 95 (1.3)%, 95 (1.5)%, and 84 (3.7)% for prostate, left femoral head, right femoral head, bladder, and rectum, respectively, using the proposed fully automated segmentation method.
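
The Dice coefficient reported in this abstract measures the overlap between a predicted mask and a reference mask, using the standard definition 2|A ∩ B| / (|A| + |B|). The sketch below works on flattened binary masks with toy data. The function name and the convention of returning 1.0 for two empty masks are assumptions, not details from the paper.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two flattened binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy masks: 2 overlapping voxels, 3 foreground voxels in each mask.
pred_mask = [1, 1, 0, 0, 1, 0]
true_mask = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(pred_mask, true_mask)
```

For a 3D volume the same function applies after flattening the voxel grid into one sequence per organ label.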

* 21 pages; 11 figures; 4 tables
##### Three-Dimensional Radiotherapy Dose Prediction on Head and Neck Cancer Patients with a Hierarchically Densely Connected U-net Deep Learning Architecture

May 25, 2018
Dan Nguyen, Xun Jia, David Sher, Mu-Han Lin, Zohaib Iqbal, Hui Liu, Steve Jiang

The treatment planning process for patients with head and neck (H&N) cancer is regarded as one of the most complicated due to the large target volume, multiple prescription dose levels, and many radiation-sensitive critical structures near the target. Treatment planning for this site requires a high level of human expertise and a tremendous amount of effort to produce personalized high-quality plans, taking as long as a week, which deteriorates the chances of tumor control and patient survival. To solve this problem, we propose to investigate a deep learning-based dose prediction model, Hierarchically Densely Connected U-net, based on two highly popular network architectures: U-net and DenseNet. We find that this new architecture is able to accurately and efficiently predict the dose distribution, outperforming the other two models, the Standard U-net and DenseNet, in homogeneity, dose conformity, and dose coverage on the test data. On average, our proposed model is capable of predicting the OAR max dose within 6.3% and mean dose within 5.1% of the prescription dose on the test data. The other models, the Standard U-net and DenseNet, performed worse, having an OAR max dose prediction error of 8.2% and 9.3%, respectively, and mean dose prediction error of 6.4% and 6.8%, respectively. In addition, our proposed model used 12 times fewer trainable parameters than the Standard U-net, and predicted the patient dose 4 times faster than DenseNet.

##### Personalized Context-aware Re-ranking for E-commerce Recommender Systems

Ranking is a core task in E-commerce recommender systems, which aims at providing an ordered list of items to users. Typically, a ranking function is learned from the labeled dataset to optimize the global performance, which produces a ranking score for each individual item. However, it may be sub-optimal because the scoring function applies to each item individually and does not explicitly consider the mutual influence between items, as well as the differences of users' preferences or intents. Therefore, we propose a personalized context-aware re-ranking model for E-commerce recommender systems. The proposed re-ranking model can be easily deployed as a follow-up module after ranking by directly using the existing feature vectors of ranking. It directly optimizes the whole recommendation list by employing a Transformer structure to efficiently encode the information of all items in the list. Specifically, the Transformer applies a self-attention mechanism that directly models the global relationships between any pair of items in the whole list. Besides, we introduce the personalized embedding to model the differences between feature distributions for different users. Experimental results on both offline benchmarks and real-world online E-commerce systems demonstrate the significant improvements of the proposed re-ranking model.
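
The self-attention step over a ranked list can be sketched in a few lines. This is a simplified illustration, not the model from the paper: it uses scaled dot-product attention with the item feature vectors serving directly as queries, keys and values, and it omits the learned projections and the personalized embedding. The function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(items):
    """Scaled dot-product self-attention over a list of item feature vectors.

    Simplification: queries, keys and values are the raw item vectors.
    Returns (outputs, weights), where weights[i][j] is how much item i
    attends to item j.
    """
    dim = len(items[0])
    scale = math.sqrt(dim)
    weights = []
    for query in items:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in items]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * value[j] for w, value in zip(row, items)) for j in range(dim)]
        for row in weights
    ]
    return outputs, weights


# Three toy items from an initial ranking, each with two features.
ranked_items = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(ranked_items)
```

Each output row mixes information from every item in the list, which is what lets a re-ranker score an item in the context of its neighbors.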

* 11 pages
##### Three-Dimensional Dose Prediction for Lung IMRT Patients with Deep Neural Networks: Robust Learning from Heterogeneous Beam Configurations

The use of neural networks to directly predict three-dimensional dose distributions for automatic planning is becoming popular. However, the existing methods only use patient anatomy as input and assume consistent beam configuration for all patients in the training database. The purpose of this work is to develop a more general model that, in addition to patient anatomy, also considers variable beam configurations, to achieve a more comprehensive automatic planning with a potentially easier clinical implementation, without the need of training specific models for different beam settings.

##### PIRM Challenge on Perceptual Image Enhancement on Smartphones: Report

This paper reviews the first challenge on efficient perceptual image enhancement with the focus on deploying deep learning models on smartphones. The challenge consisted of two tracks. In the first one, participants were solving the classical image super-resolution problem with a bicubic downscaling factor of 4. The second track was aimed at real-world photo enhancement, and the goal was to map low-quality photos from the iPhone 3GS device to the same photos captured with a DSLR camera. The target metric used in this challenge combined the runtime, PSNR scores and solutions' perceptual results measured in the user study. To ensure the efficiency of the submitted models, we additionally measured their runtime and memory requirements on Android smartphones. The proposed solutions significantly improved baseline results defining the state-of-the-art for image enhancement on smartphones.

##### Team NCTU: Toward AI-Driving for Autonomous Surface Vehicles -- From Duckietown to RobotX

Robotic software and hardware systems of autonomous surface vehicles have been developed in transportation, military, and ocean research for decades. Previous efforts in the RobotX Challenges 2014 and 2016 facilitated the development of important tasks such as obstacle avoidance and docking. Team NCTU is motivated by the AI Driving Olympics (AI-DO) developed by the Duckietown community, and adopts its principles for the RobotX challenge. With containerization (Docker) and a uniform AI agent (with observations and actions), we could better 1) integrate solutions developed in different middlewares (ROS and MOOS), 2) develop essential functionalities from simulation (Gazebo) to real robots (either miniaturized or full-sized WAM-V), and 3) compare different approaches, either classic model-based or learning-based. Finally, we set up an outdoor on-surface platform with localization services for evaluation. Some of the preliminary results will be presented for Team NCTU's participation in the RobotX competition in Hawaii in 2018.

##### Linear Convergence of Frank-Wolfe for Rank-One Matrix Recovery Without Strong Convexity

Dec 03, 2019
Dan Garber

We consider convex optimization problems which are widely used as convex relaxations for low-rank matrix recovery problems. In particular, in several important problems, such as phase retrieval and robust PCA, the underlying assumption in many cases is that the optimal solution is rank-one. In this paper we consider a simple and natural sufficient condition on the objective so that the optimal solution to these relaxations is indeed unique and rank-one. Mainly, we show that under this condition, the standard Frank-Wolfe method with line-search (i.e., without any tuning of parameters whatsoever), which only requires a single rank-one SVD computation per iteration, finds an $\epsilon$-approximated solution in only $O(\log{1/\epsilon})$ iterations (as opposed to the previous best known bound of $O(1/\epsilon)$), despite the fact that the objective is not strongly convex. We consider several variants of the basic method with improved complexities, as well as an extension motivated by robust PCA, and finally, an extension to nonsmooth problems.

Sep 27, 2018
Dan Garber

##### Fast Rates for Online Gradient Descent Without Strong Convexity via Hoffman's Bound

Feb 13, 2018
Dan Garber

Hoffman's classical result gives a bound on the distance of a point from a convex and compact polytope in terms of the magnitude of violation of the constraints. Recently, several results showed that Hoffman's bound can be used to derive strongly-convex-like rates for first-order methods for convex optimization of curved, though not strongly convex, functions, over polyhedral sets. In this work, we use this classical result for the first time to obtain faster rates for *online convex optimization* over polyhedral sets with curved convex, though not strongly convex, loss functions. Mainly, we show that under several reasonable assumptions on the data, the standard *Online Gradient Descent* (OGD) algorithm guarantees logarithmic regret. To the best of our knowledge, the only previous algorithm to achieve logarithmic regret in the considered settings is the *Online Newton Step* algorithm which requires quadratic (in the dimension) memory and to solve a linear system on each iteration, which greatly limits its applicability to large-scale problems. We also show that in the corresponding stochastic convex optimization setting, Stochastic Gradient Descent achieves convergence rate of $1/t$, matching the strongly-convex case.
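
To make the setting concrete, here is a minimal sketch of projected Online Gradient Descent over a box, which is a simple polyhedral set. The squared losses $f_t(x) = \frac{1}{2}(a_t \cdot x - b_t)^2$ are curved but not strongly convex in $x$. The $1/t$ step size, the box constraint and the toy data stream are assumptions chosen for illustration and are not taken from the paper.

```python
def project_box(x, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^d (coordinate-wise clipping)."""
    return [min(max(v, lo), hi) for v in x]


def online_gradient_descent(stream, x0, step=lambda t: 1.0 / t):
    """Projected OGD on the losses f_t(x) = 0.5 * (a_t . x - b_t)^2.

    `stream` yields (a_t, b_t) pairs one round at a time. The default
    step size 1/t is an illustrative assumption.
    """
    x = list(x0)
    iterates = [list(x)]
    for t, (a, b) in enumerate(stream, start=1):
        residual = sum(ai * xi for ai, xi in zip(a, x)) - b
        grad = [residual * ai for ai in a]  # gradient of f_t at the current x
        x = project_box([xi - step(t) * gi for xi, gi in zip(x, grad)])
        iterates.append(list(x))
    return iterates


# Toy stream: each round observes one coordinate of a hidden point in the box.
hidden_point = [0.25, 0.75]
features = [[1.0, 0.0], [0.0, 1.0]]
stream = []
for round_index in range(2000):
    a = features[round_index % 2]
    stream.append((a, sum(ai * hi for ai, hi in zip(a, hidden_point))))

iterates = online_gradient_descent(stream, x0=[1.0, 0.0])
final_point = iterates[-1]
```

Each round costs time and memory linear in the dimension, in contrast to the quadratic memory and per-round linear system the abstract attributes to Online Newton Step.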

##### Efficient Online Linear Optimization with Approximation Algorithms

Sep 10, 2017
Dan Garber

We revisit the problem of *online linear optimization* in case the set of feasible actions is accessible through an approximated linear optimization oracle with a factor $\alpha$ multiplicative approximation guarantee. This setting is in particular interesting since it captures natural online extensions of well-studied *offline* linear optimization problems which are NP-hard, yet admit efficient approximation algorithms. The goal here is to minimize the *$\alpha$-regret* which is the natural extension of the standard *regret* in *online learning* to this setting. We present new algorithms with significantly improved oracle complexity for both the full information and bandit variants of the problem. Mainly, for both variants, we present $\alpha$-regret bounds of $O(T^{-1/3})$, where $T$ is the number of prediction rounds, using only $O(\log{T})$ calls to the approximation oracle per iteration, on average. These are the first results to obtain both average oracle complexity of $O(\log{T})$ (or even poly-logarithmic in $T$) and $\alpha$-regret bound $O(T^{-c})$ for a constant $c>0$, for both variants.
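
For orientation, one common way to formalize the $\alpha$-regret for loss minimization with $\alpha \ge 1$ is shown below. The symbols $\mathcal{K}$ (feasible set), $x_t$ (action played in round $t$) and $f_t$ (loss vector in round $t$) are notation assumed here rather than taken from the abstract, and the $1/T$ normalization is an assumption chosen to match bounds that decay as $O(T^{-1/3})$.

$$
\alpha\text{-regret}_T \;=\; \frac{1}{T}\sum_{t=1}^{T} x_t \cdot f_t \;-\; \alpha \cdot \min_{x \in \mathcal{K}} \frac{1}{T}\sum_{t=1}^{T} x \cdot f_t
$$

Setting $\alpha = 1$ recovers the standard average regret against the best fixed action in hindsight.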

* Accepted to Conference on Neural Information Processing System (NIPS) 2017