Models, code, and papers for "Kun Sun":

Meta-modeling game for deriving theoretical-consistent, micro-structural-based traction-separation laws via deep reinforcement learning

Oct 24, 2018
Kun Wang, WaiChing Sun

This paper presents a new meta-modeling framework that employs deep reinforcement learning (DRL) to generate mechanical constitutive models for interfaces. The constitutive models are conceptualized as information flow in directed graphs. The process of writing constitutive models is simplified as a sequence of forming graph edges with the goal of maximizing the model score (a function of accuracy, robustness and forward prediction quality). Thus meta-modeling can be formulated as a Markov decision process with well-defined states, actions, rules, objective functions, and rewards. By using neural networks to estimate policies and state values, the computer agent is able to efficiently self-improve the constitutive model it generates through self-play, in the same way AlphaGo Zero (the algorithm that outplayed the world champion in the game of Go) improves its gameplay. Our numerical examples show that this automated meta-modeling framework not only produces models that outperform existing cohesive models on benchmark traction-separation data but is also capable of detecting hidden mechanisms among micro-structural features and incorporating them in constitutive models to improve the forward prediction accuracy, tasks that are difficult to do manually.

Trilaminar Multiway Reconstruction Tree for Efficient Large Scale Structure from Motion

Dec 21, 2016
Kun Sun, Wenbing Tao

Accuracy and efficiency are two key problems in large scale incremental Structure from Motion (SfM). In this paper, we propose a unified framework to divide the image set into clusters suitable for reconstruction as well as to find multiple reliable and stable starting points. Image partitioning is performed in two steps. First, some small image groups are selected at places with high image density, and then all the images are clustered according to their optimal reconstruction paths to these image groups. This ensures that the scene is always reconstructed from dense places to sparse areas, which can reduce error accumulation when images have weak overlap. To increase speed, images outside the selected group in each cluster are further divided to achieve a greater degree of parallelism. Experiments show that our method achieves significant speedup, higher accuracy and better completeness.

* This manuscript has been submitted to CVPR 2017

GPU Accelerated Cascade Hashing Image Matching for Large Scale 3D Reconstruction

May 23, 2018
Tao Xu, Kun Sun, Wenbing Tao

Image feature point matching is a key step in Structure from Motion (SfM). However, it is becoming more and more time consuming as the number of images grows. In this paper, we propose a GPU accelerated image matching method with improved Cascade Hashing. First, we propose a Disk-Memory-GPU data exchange strategy and optimize the load order of data, so that the proposed method can deal with big data. Next, we parallelize the Cascade Hashing method on GPU. An improved parallel reduction and an improved parallel hashing ranking are proposed to fulfill this task. Finally, extensive experiments show that our image matching is about 20 times faster than SiftGPU on the same graphics card, nearly 100 times faster than the CPU CasHash method and hundreds of times faster than the CPU Kd-Tree based matching method. Furthermore, we introduce the epipolar constraint to the proposed method, and use the epipolar geometry to guide the feature matching procedure, which further reduces the matching cost.
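
The hashing idea behind this kind of matching can be illustrated with a minimal CPU-only sketch. It is not the authors' GPU implementation, nor the Cascade Hashing algorithm itself: it assumes plain random-hyperplane hashing followed by a Hamming-distance ranking, and the names `make_hyperplanes`, `hash_code`, `hamming` and `match` are hypothetical helpers introduced only for this illustration.

```python
import random


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplanes (assumed hashing scheme, not the paper's)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(descriptor, hyperplanes):
    """Map a real-valued descriptor to an integer whose bits are projection signs."""
    code = 0
    for plane in hyperplanes:
        dot = sum(p * x for p, x in zip(plane, descriptor))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer hash codes."""
    return bin(a ^ b).count("1")


def match(query, candidates, hyperplanes):
    """Return the index of the candidate whose hash code is closest to the query's."""
    q = hash_code(query, hyperplanes)
    codes = [hash_code(c, hyperplanes) for c in candidates]
    return min(range(len(candidates)), key=lambda i: hamming(q, codes[i]))


planes = make_hyperplanes(n_bits=32, dim=3)
best = match([1.0, 2.0, 3.0], [[-1.0, -2.0, -3.0], [1.0, 2.0, 3.0]], planes)
```

Comparing short binary codes with XOR and a bit count is much cheaper than comparing full floating-point descriptors, which is one reason hashing-based ranking lends itself to parallel hardware.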

A cooperative game for automated learning of elasto-plasticity knowledge graphs and models with AI-guided experimentation

Mar 08, 2019
Kun Wang, WaiChing Sun, Qiang Du

We introduce a multi-agent meta-modeling game to generate data, knowledge, and models that make predictions on constitutive responses of elasto-plastic materials. We introduce a new concept from graph theory where a modeler agent is tasked with evaluating all the modeling options recast as a directed multigraph and finding the optimal path, as measured by an objective function, that links the source of the directed graph (e.g. strain history) to the target (e.g. stress). Meanwhile, the data agent, which is tasked with generating data from real or virtual experiments (e.g. molecular dynamics, discrete element simulations), interacts with the modeling agent sequentially and uses reinforcement learning to design new experiments to optimize the prediction capacity. Consequently, this treatment enables us to emulate an idealized scientific collaboration as selections of the optimal choices in a decision tree search done automatically via deep reinforcement learning.
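
The notion of an optimal source-to-target path in a directed multigraph can be made concrete with a small sketch. This is only an illustration under stated assumptions: it swaps the paper's deep reinforcement learning search for a plain Dijkstra search, it assumes each edge carries a non-negative cost to be minimized, and the node names, edge labels and costs below are hypothetical.

```python
import heapq


def best_path(edges, source, target):
    """Cheapest path in a directed multigraph given as (src, dst, label, cost) tuples.

    Parallel edges between the same pair of nodes are allowed; costs are
    assumed non-negative. Returns (total_cost, [edge labels]) or None.
    """
    adjacency = {}
    for src, dst, label, cost in edges:
        adjacency.setdefault(src, []).append((dst, label, cost))

    heap = [(0.0, source, [])]
    settled = set()
    while heap:
        total, node, path = heapq.heappop(heap)
        if node == target:
            return total, path
        if node in settled:
            continue
        settled.add(node)
        for dst, label, cost in adjacency.get(node, []):
            if dst not in settled:
                heapq.heappush(heap, (total + cost, dst, path + [label]))
    return None


# Hypothetical modeling options; every name and cost here is made up.
edges = [
    ("strain_history", "stress", "direct_fit", 5.0),
    ("strain_history", "plastic_strain", "flow_rule_a", 1.0),
    ("strain_history", "plastic_strain", "flow_rule_b", 2.0),
    ("plastic_strain", "stress", "elastic_law", 1.5),
]
result = best_path(edges, "strain_history", "stress")
```

With these made-up costs the search prefers the two-step route through the intermediate variable over the single direct edge, which mirrors the idea of choosing among competing modeling options by score.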

4D Cardiac Ultrasound Standard Plane Location by Spatial-Temporal Correlation

Jul 20, 2016
Yun Gu, Guang-Zhong Yang, Jie Yang, Kun Sun

Echocardiography plays an important part in aiding the diagnosis of cardiac diseases. A critical step in echocardiography-aided diagnosis is to extract the standard planes, since they tend to provide views of different structures that are beneficial to diagnosis. To this end, this paper proposes a spatial-temporal embedding framework to extract the standard view planes from 4D STIC (spatial-temporal image correlation) volumes. The proposed method comprises three stages: frame smoothing, spatial-temporal embedding and final classification. In the first stage, an L0 smoothing filter is used to preprocess the frames, removing the noise while preserving the boundaries. Then a compact representation is learned by embedding spatial and temporal features into a latent space in a supervised scheme that considers both standard plane information and diagnosis result. In the last stage, the learned features are fed into a support vector machine to identify the standard plane. We evaluate the proposed method on a 4D STIC volume dataset with 92 normal cases and 93 abnormal cases in three standard planes. The results demonstrate that our method outperforms the baselines in both classification accuracy and computational efficiency.

* Submitted to MICCAI 2016

Learning from Web Data: the Benefit of Unsupervised Object Localization

Dec 21, 2018
Xiaoxiao Sun, Liang Zheng, Yu-Kun Lai, Jufeng Yang

Annotating a large number of training images is very time-consuming. Against this background, this paper focuses on learning from easy-to-acquire web data and utilizes the learned model for fine-grained image classification in labeled datasets. Currently, the performance gain from training with web data is incremental; as the common saying goes, "better than nothing, but not by much". Conventionally, the community looks to correcting the noisy web labels to select informative samples. In this work, we first systematically study the built-in gap between the web and standard datasets, i.e. different data distributions between the two kinds of data. Then, in addition to using web labels, we present an unsupervised object localization method, which provides critical insights into the object density and scale in web images. Specifically, we design two constraints on web data to substantially reduce the difference of data distributions for the web and standard datasets. First, we present a method to control the scale, localization and number of objects in the detected region. Second, we propose to select the regions containing objects that are consistent with the web tag. Based on the two constraints, we are able to process web images to reduce the gap, and the processed web data is used to better assist the standard dataset to train CNNs. Experiments on several fine-grained image classification datasets confirm that our method performs favorably against the state-of-the-art methods.

* 13 pages, 9 figures, 6 tables 

MARTA GANs: Unsupervised Representation Learning for Remote Sensing Image Classification

Nov 21, 2017
Daoyu Lin, Kun Fu, Yang Wang, Guangluan Xu, Xian Sun

With the development of deep learning, supervised learning has frequently been adopted to classify remotely sensed images using convolutional neural networks (CNNs). However, due to the limited amount of labeled data available, supervised learning is often difficult to carry out. Therefore, we propose an unsupervised model called multiple-layer feature-matching generative adversarial networks (MARTA GANs) to learn a representation using only unlabeled data. MARTA GANs consist of both a generative model $G$ and a discriminative model $D$. We treat $D$ as a feature extractor. To fit the complex properties of remote sensing data, we use a fusion layer to merge the mid-level and global features. $G$ can produce numerous images that are similar to the training data; therefore, $D$ can learn better representations of remotely sensed images using the training data provided by $G$. The classification results on two widely used remote sensing image databases show that the proposed method significantly improves the classification performance compared with other state-of-the-art methods.

* IEEE Geoscience and Remote Sensing Letters ( Volume: 14, Issue: 11, Nov. 2017 ) 

A Training-free, One-shot Detection Framework For Geospatial Objects In Remote Sensing Images

Apr 04, 2019
Tengfei Zhang, Yue Zhang, Xian Sun, Menglong Yan, Yaoling Wang, Kun Fu

Deep learning based object detection has achieved great success. However, these supervised learning methods are data-hungry and time-consuming. This restriction makes them unsuitable for tasks that are urgent or have limited data, especially in remote sensing applications. Inspired by the ability of humans to quickly learn new visual concepts from very few examples, we propose a training-free, one-shot geospatial object detection framework for remote sensing images. It consists of (1) a feature extractor with remote sensing domain knowledge, (2) a multi-level feature fusion method, (3) a novel similarity metric method, and (4) a 2-stage object detection pipeline. Experiments on sewage treatment plant and airport detection show that the proposed method is effective to a certain degree. Our method can serve as a baseline for training-free, one-shot geospatial object detection.

* 5 pages, 4 figures

A Remote Sensing Image Dataset for Cloud Removal

Jan 03, 2019
Daoyu Lin, Guangluan Xu, Xiaoke Wang, Yang Wang, Xian Sun, Kun Fu

Cloud-based overlays are often present in optical remote sensing images, thus limiting the application of acquired data. Removing clouds is an indispensable pre-processing step in remote sensing image analysis. Deep learning has achieved great success in the field of remote sensing in recent years, including scene classification and change detection. However, deep learning is rarely applied to cloud removal in remote sensing images. The reason is the lack of datasets for training neural networks. To solve this problem, this paper proposes the Remote sensing Image Cloud rEmoving dataset (RICE). The proposed dataset consists of two parts: RICE1 contains 500 pairs of images, each pair consisting of a cloudy image and a cloud-free image of size 512×512; RICE2 contains 450 sets of images, each set containing three 512×512 images: the reference image without clouds, the image with clouds, and the mask of its clouds. The dataset is freely available at \url{}.

Position Detection and Direction Prediction for Arbitrary-Oriented Ships via Multiscale Rotation Region Convolutional Neural Network

Jun 13, 2018
Xue Yang, Hao Sun, Xian Sun, Menglong Yan, Zhi Guo, Kun Fu

Ship detection is of great importance and full of challenges in the field of remote sensing. The complexity of application scenarios, the redundancy of the detection region, and the difficulty of dense ship detection are the main obstacles that limit the successful operation of traditional methods in ship detection. In this paper, we propose a brand new detection model based on a multiscale rotational region convolutional neural network to solve the problems above. This model mainly consists of five consecutive parts: Dense Feature Pyramid Network (DFPN), adaptive region of interest (ROI) Align, rotational bounding box regression, prow direction prediction and rotational nonmaximum suppression (R-NMS). First of all, the low-level location information and high-level semantic information are fully utilized through multiscale feature networks. Then, we design adaptive ROI Align to obtain high quality proposals which retain complete spatial and semantic information. Unlike most previous approaches, the prediction obtained by our method is the minimum bounding rectangle of the object, with fewer redundant regions. Therefore, the rotational region detection framework is more suitable for detecting dense objects than traditional detection models. Additionally, we can find the berthing and sailing direction of a ship through prediction. A detailed evaluation based on the SRSS and DOTA datasets for rotation detection shows that our detection method has a competitive performance.
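
Why a rotated minimum bounding rectangle contains less redundant area than an axis-aligned box can be checked with a few lines of geometry. The sketch below is not taken from the paper: the parameterization `(cx, cy, w, h, theta_deg)` and the function names are assumptions chosen for illustration, and the "ship" dimensions are made up.

```python
import math


def rotated_box_corners(cx, cy, w, h, theta_deg):
    """Corners of a w-by-h rectangle centred at (cx, cy), rotated by theta_deg."""
    t = math.radians(theta_deg)
    cos_t, sin_t = math.cos(t), math.sin(t)
    half = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    return [(cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
            for dx, dy in half]


def enclosing_axis_aligned_area(corners):
    """Area of the smallest axis-aligned box that contains all corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


# A long, thin object (100 x 20) lying at 45 degrees.
corners = rotated_box_corners(0.0, 0.0, 100.0, 20.0, 45.0)
rotated_area = 100.0 * 20.0
axis_aligned_area = enclosing_axis_aligned_area(corners)
```

For this example the rotated box covers 2000 square units, while the axis-aligned box that encloses it covers about 7200, so most of the axis-aligned region is background. With densely berthed ships, such oversized boxes also overlap heavily with their neighbours.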

Comparison Network for One-Shot Conditional Object Detection

Apr 04, 2019
Tengfei Zhang, Yue Zhang, Xian Sun, Hao Sun, Menglong Yan, Xue Yang, Kun Fu

The current advances in object detection depend on large-scale datasets to get good performance. However, there may not always be sufficient samples in many scenarios, which has led to research on few-shot detection as well as its extreme variant, one-shot detection. In this paper, one-shot detection is formulated as a conditional probability problem. With this insight, a novel one-shot conditional object detection (OSCD) framework, referred to as Comparison Network (ComparisonNet), is proposed. Specifically, query and target image features are extracted through a Siamese network as mapped metrics of marginal probabilities. A two-stage detector for OSCD is introduced to compare the extracted query and target features with the learnable metric to approach the optimized non-linear conditional probability. Once trained, ComparisonNet can detect objects of both seen and unseen classes without further training. It is also class-agnostic, training-free for unseen classes, and free of catastrophic forgetting. Experiments show that the proposed approach achieves state-of-the-art performance on the proposed datasets of Fashion-MNIST and PASCAL VOC.

* 10 pages 

Automatic Ship Detection of Remote Sensing Images from Google Earth in Complex Scenes Based on Multi-Scale Rotation Dense Feature Pyramid Networks

Jun 12, 2018
Xue Yang, Hao Sun, Kun Fu, Jirui Yang, Xian Sun, Menglong Yan, Zhi Guo

Ship detection has been playing a significant role in the field of remote sensing for a long time but it is still full of challenges. The main limitations of traditional ship detection methods usually lie in the complexity of application scenarios, the difficulty of intensive object detection and the redundancy of the detection region. To solve these problems, we propose a framework called Rotation Dense Feature Pyramid Networks (R-DFPN) which can effectively detect ships in different scenes including ocean and port. Specifically, we put forward the Dense Feature Pyramid Network (DFPN), which is aimed at solving the problem resulting from the narrow width of the ship. Compared with previous multi-scale detectors such as Feature Pyramid Network (FPN), DFPN builds the high-level semantic feature-maps for all scales by means of dense connections, which enhances feature propagation and encourages feature reuse. Additionally, in the case of ship rotation and dense arrangement, we design a rotation anchor strategy to predict the minimum circumscribed rectangle of the object so as to reduce the redundant detection region and improve the recall. Furthermore, we also propose multi-scale ROI Align for the purpose of maintaining the completeness of semantic and spatial information. Experiments based on remote sensing images from Google Earth for ship detection show that our detection method based on R-DFPN representation has state-of-the-art performance.

* Remote Sens. 2018, 10, 132 
* 14 pages, 11 figures 

Cross-Domain Complementary Learning with Synthetic Data for Multi-Person Part Segmentation

Jul 11, 2019
Kevin Lin, Lijuan Wang, Kun Luo, Yinpeng Chen, Zicheng Liu, Ming-Ting Sun

The success of supervised deep learning depends on the training labels. However, data labeling at pixel-level is very expensive, and people have been exploring synthetic data as an alternative. Even though it is easy to generate labels for synthetic data, the quality gap makes it challenging to transfer knowledge from synthetic data to real data. In this paper, we propose a novel technique, called cross-domain complementary learning that takes advantage of the rich variations of real data and the easily obtainable labels of synthetic data to learn multi-person part segmentation on real images without any human-annotated segmentation labels. To make sure the synthetic data and real data are aligned in a common latent space, we use an auxiliary task of human pose estimation to bridge the two domains. Without any real part segmentation training data, our method performs comparably to several supervised state-of-the-art approaches which require real part segmentation training data on Pascal-Person-Parts and COCO-DensePose datasets. We further demonstrate the generalizability of our method on predicting novel keypoints in the wild where no real data labels are available for the novel keypoints.

R2CNN++: Multi-Dimensional Attention Based Rotation Invariant Detector with Robust Anchor Strategy

Nov 20, 2018
Xue Yang, Kun Fu, Hao Sun, Jirui Yang, Zhi Guo, Menglong Yan, Tengfei Zhang, Sun Xian

Object detection plays a vital role in natural and aerial scenes and is full of challenges. Although many advanced algorithms have succeeded in natural scenes, the progress in aerial scenes has been slow due to the complexity of aerial images and the large degree of freedom of remote sensing objects in scale, orientation, and density. In this paper, a novel multi-category rotation detector is proposed, which can efficiently detect small objects, arbitrary direction objects, and dense objects in complex remote sensing images. Specifically, the proposed model adopts a targeted feature fusion strategy called inception fusion network, which fully considers factors such as feature fusion, anchor sampling, and receptive field to improve the ability to handle small objects. Then we combine the pixel attention network and the channel attention network to weaken the noise information and highlight the object features. Finally, the rotational object detection algorithm is realized by redefining the rotating bounding box. Experiments on public datasets including DOTA and NWPU VHR-10 demonstrate that the proposed algorithm significantly outperforms state-of-the-art methods. The code and models will be available at

* 10 pages, 8 figures, 5 tables 

Efficient training and design of photonic neural network through neuroevolution

Aug 04, 2019
Tian Zhang, Jia Wang, Yihang Dan, Yuxiang Lanqiu, Jian Dai, Xu Han, Xiaojuan Sun, Kun Xu

Recently, optical neural networks (ONNs) integrated in photonic chips have received extensive attention because they are expected to implement the same pattern recognition tasks as electronic platforms with high efficiency and low power consumption. However, the current lack of learning algorithms to train the ONNs obstructs their further development. In this article, we propose a novel learning strategy based on neuroevolution to design and train the ONNs. Two typical neuroevolution algorithms are used to determine the hyper-parameters of the ONNs and to optimize the weights (phase shifters) in the connections. To demonstrate the effectiveness of the training algorithms, the trained ONNs are applied to classification tasks on the iris plants dataset and the wine recognition dataset, and to modulation format recognition. The calculated results show that the training algorithms based on neuroevolution are competitive with other traditional learning algorithms in both accuracy and stability. Compared with previous works, we introduce an efficient training method for the ONNs and demonstrate their broad application prospects in pattern recognition, reinforcement learning and so on.

* 11 pages, 4 figures 

Image Matters: Visually modeling user behaviors using Advanced Model Server

Sep 04, 2018
Tiezheng Ge, Liqin Zhao, Guorui Zhou, Keyu Chen, Shuying Liu, Huimin Yi, Zelin Hu, Bochao Liu, Peng Sun, Haoyu Liu, Pengtao Yi, Sui Huang, Zhiqiang Zhang, Xiaoqiang Zhu, Yu Zhang, Kun Gai

In Taobao, the largest e-commerce platform in China, billions of items are provided and typically displayed with their images. For better user experience and business effectiveness, Click Through Rate (CTR) prediction in the online advertising system exploits abundant user historical behaviors to identify whether a user is interested in a candidate ad. Enhancing behavior representations with user behavior images will help understand users' visual preferences and greatly improve the accuracy of CTR prediction. So we propose to model user preference jointly with user behavior ID features and behavior images. However, training with user behavior images brings tens to hundreds of images in one sample, giving rise to a great challenge in both communication and computation. To handle these challenges, we propose a novel and efficient distributed machine learning paradigm called Advanced Model Server (AMS). With the well known Parameter Server (PS) framework, each server node handles a separate part of parameters and updates them independently. AMS goes beyond this and is designed to be capable of learning a unified image descriptor model shared by all server nodes, which embeds large images into low dimensional high level features before transmitting images to worker nodes. AMS thus dramatically reduces the communication load and enables the arduous joint training process. Based on AMS, the methods of effectively combining the images and ID features are carefully studied, and then we propose a Deep Image CTR Model. Our approach is shown to achieve significant improvements in both online and offline evaluations, and has been deployed in the Taobao display advertising system serving the main traffic.

* CIKM 2018 

Scalable and accurate deep learning for electronic health records

May 11, 2018
Alvin Rajkomar, Eyal Oren, Kai Chen, Andrew M. Dai, Nissan Hajaj, Peter J. Liu, Xiaobing Liu, Mimi Sun, Patrik Sundberg, Hector Yee, Kun Zhang, Gavin E. Duggan, Gerardo Flores, Michaela Hardt, Jamie Irvine, Quoc Le, Kurt Litsch, Jake Marcus, Alexander Mossin, Justin Tansuwan, De Wang, James Wexler, Jimbo Wilson, Dana Ludwig, Samuel L. Volchenboum, Katherine Chou, Michael Pearson, Srinivasan Madabushi, Nigam H. Shah, Atul J. Butte, Michael Howell, Claire Cui, Greg Corrado, Jeff Dean

Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient's record. We propose a representation of patients' entire, raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format. We demonstrate that deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple centers without site-specific data harmonization. We validated our approach using de-identified EHR data from two U.S. academic medical centers with 216,221 adult patients hospitalized for at least 24 hours. In the sequential format we propose, this volume of EHR data unrolled into a total of 46,864,534,945 data points, including clinical notes. Deep learning models achieved high accuracy for tasks such as predicting in-hospital mortality (AUROC across sites 0.93-0.94), 30-day unplanned readmission (AUROC 0.75-0.76), prolonged length of stay (AUROC 0.85-0.86), and all of a patient's final discharge diagnoses (frequency-weighted AUROC 0.90). These models outperformed state-of-the-art traditional predictive models in all cases. We also present a case-study of a neural-network attribution system, which illustrates how clinicians can gain some transparency into the predictions. We believe that this approach can be used to create accurate and scalable predictions for a variety of clinical scenarios, complete with explanations that directly highlight evidence in the patient's chart.
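
For readers unfamiliar with the metric quoted above, AUROC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The following is a minimal sketch of that definition on made-up toy data; it is unrelated to the study's data or evaluation code, and the function name `auroc` is an assumption for illustration.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 ground-truth outcomes.
    scores: iterable of predicted scores, higher meaning more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy example with four hypothetical patients.
value = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In the toy example three of the four positive-negative pairs are ranked correctly, giving 0.75. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of cases, so large evaluations normally use a sort-based equivalent.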

* npj Digital Medicine 1:18 (2018) 
* Published version from 

Large-Scale 3D Shape Reconstruction and Segmentation from ShapeNet Core55

Oct 27, 2017
Li Yi, Lin Shao, Manolis Savva, Haibin Huang, Yang Zhou, Qirui Wang, Benjamin Graham, Martin Engelcke, Roman Klokov, Victor Lempitsky, Yuan Gan, Pengyu Wang, Kun Liu, Fenggen Yu, Panpan Shui, Bingyang Hu, Yan Zhang, Yangyan Li, Rui Bu, Mingchao Sun, Wei Wu, Minki Jeong, Jaehoon Choi, Changick Kim, Angom Geetchandra, Narasimha Murthy, Bhargava Ramu, Bharadwaj Manda, M Ramanathan, Gautam Kumar, P Preetham, Siddharth Srivastava, Swati Bhugra, Brejesh Lall, Christian Haene, Shubham Tulsiani, Jitendra Malik, Jared Lafer, Ramsey Jones, Siyuan Li, Jie Lu, Shi Jin, Jingyi Yu, Qixing Huang, Evangelos Kalogerakis, Silvio Savarese, Pat Hanrahan, Thomas Funkhouser, Hao Su, Leonidas Guibas

We introduce a large-scale 3D shape understanding benchmark using data and annotation from the ShapeNet 3D object database. The benchmark consists of two tasks: part-level segmentation of 3D shapes and 3D reconstruction from single view images. Ten teams have participated in the challenge and the best performing teams have outperformed state-of-the-art approaches on both tasks. A few novel deep learning architectures have been proposed on various 3D representations on both tasks. We report the techniques used by each team and the corresponding performances. In addition, we summarize the major discoveries from the reported results and possible trends for future work in the field.

Understanding Social Networks using Transfer Learning

Oct 16, 2019
Jun Sun, Steffen Staab, Jérôme Kunegis

A detailed understanding of users contributes to the understanding of the Web's evolution, and to the development of Web applications. Although for new Web platforms such a study is especially important, it is often jeopardized by the lack of knowledge about novel phenomena due to the sparsity of data. Akin to human transfer of experiences from one domain to the next, transfer learning as a subfield of machine learning adapts knowledge acquired in one domain to a new domain. We systematically investigate how the concept of transfer learning may be applied to the study of users on newly created (emerging) Web platforms, and propose our transfer learning-based approach, TraNet. We show two use cases where TraNet is applied to tasks involving the identification of user trust and roles on different Web platforms. We compare the performance of TraNet with other approaches and find that our approach can best transfer knowledge on users across platforms in the given tasks.

* IEEE Computer (Volume: 51, Issue: 6, June 2018) 
* 11 pages, 4 figures, IEEE Computer. arXiv admin note: text overlap with arXiv:1611.02941 

Predicting User Roles in Social Networks using Transfer Learning with Feature Transformation

Nov 09, 2016
Jun Sun, Jérôme Kunegis, Steffen Staab

How can we recognise social roles of people, given a completely unlabelled social network? We present a transfer learning approach to network role classification based on feature transformations from each network's local feature distribution to a global feature space. Experiments are carried out on real-world datasets. (See manuscript for the full abstract.)

* 8 pages, 5 figures, IEEE ICDMW 2016 
