Models, code, and papers for "Yuan Yu":

Experimentally detecting a quantum change point via Bayesian inference

Jan 23, 2018
Shang Yu, Chang-Jiang Huang, Jian-Shun Tang, Zhih-Ahn Jia, Yi-Tao Wang, Zhi-Jin Ke, Wei Liu, Xiao Liu, Zong-Quan Zhou, Ze-Di Cheng, Jin-Shi Xu, Yu-Chun Wu, Yuan-Yuan Zhao, Guo-Yong Xiang, Chuan-Feng Li, Guang-Can Guo, Gael Sentís, Ramon Muñoz-Tapia

Detecting a change point is a crucial task in statistics that has been recently extended to the quantum realm. A source state generator that emits a series of single photons in a default state suffers an alteration at some point and starts to emit photons in a mutated state. The problem consists in identifying the point where the change took place. In this work, we consider a learning agent that applies Bayesian inference on experimental data to solve this problem. This learning machine adjusts the measurement over each photon according to the past experimental results and finds the change position in an online fashion. Our results show that the local-detection success probability can be largely improved by using such a machine learning technique. This protocol provides a tool for improvement in many applications where a sequence of identical quantum states is required.
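
The Bayesian update behind online change-point detection can be illustrated with a purely classical analogue. The sketch below is not the experiment's protocol: it assumes binary outcomes with known probabilities before and after the change, a uniform prior over the change position, and no adaptive measurement. The names `change_point_posterior`, `online_estimates`, `p_default` and `p_mutated` are hypothetical.

```python
def change_point_posterior(outcomes, p_default, p_mutated):
    """Posterior over the change position k for a binary sequence.

    k is the index of the first item drawn from the mutated source;
    k == len(outcomes) means no change has happened yet.
    p_default / p_mutated are the (assumed known) probabilities of
    observing outcome 1 before and after the change.
    """
    n = len(outcomes)

    def likelihood(x, p_one):
        return p_one if x == 1 else 1.0 - p_one

    weights = []
    for k in range(n + 1):
        w = 1.0  # uniform prior over k (an assumption of this sketch)
        for i, x in enumerate(outcomes):
            w *= likelihood(x, p_default if i < k else p_mutated)
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def online_estimates(outcomes, p_default, p_mutated):
    """Most probable change position after each new outcome arrives."""
    estimates = []
    for t in range(1, len(outcomes) + 1):
        post = change_point_posterior(outcomes[:t], p_default, p_mutated)
        estimates.append(max(range(len(post)), key=post.__getitem__))
    return estimates
```

Calling `online_estimates([0, 0, 0, 1, 1, 1], 0.1, 0.9)` returns one estimate per observed outcome, which is the sense in which the inference runs online.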

* Phys. Rev. A 98, 040301 (2018) 

CoachAI: A Project for Microscopic Badminton Match Data Collection and Tactical Analysis

Jul 12, 2019
Tzu-Han Hsu, Ching-Hsuan Chen, Nyan Ping Ju, Tsì-Uí İk, Wen-Chih Peng, Chih-Chuan Wang, Yu-Shuen Wang, Yuan-Hsiang Lin, Yu-Chee Tseng, Jiun-Long Huang, Yu-Tai Ching

Computer vision based object tracking has been used to annotate and augment sports video. For sports learning and training, video replay is often used in post-match review and training review for tactical analysis and movement analysis. For automatic and systematic competition data collection and tactical analysis, a project called CoachAI has been supported by the Ministry of Science and Technology, Taiwan. The proposed project also includes research of data visualization, connected training auxiliary devices, and data warehouse. Deep learning techniques will be used to develop video-based real-time microscopic competition data collection based on broadcast competition video. Machine learning techniques will be used to develop a tactical analysis. To reveal data in more understandable forms and to help in pre-match training, AR/VR techniques will be used to visualize data, tactics, and so on. In addition, training auxiliary devices including smart badminton rackets and connected serving machines will be developed based on the IoT technology to further utilize competition data and tactical data and boost training efficiency. In particular, the connected serving machines will be developed to perform specified tactics and to interact with players in their training.


Mesh Variational Autoencoders with Edge Contraction Pooling

Aug 07, 2019
Yu-Jie Yuan, Yu-Kun Lai, Jie Yang, Hongbo Fu, Lin Gao

3D shape analysis is an important research topic in computer vision and graphics. While existing methods have generalized image-based deep learning to meshes using graph-based convolutions, the lack of an effective pooling operation restricts the learning capability of their networks. In this paper, we propose a novel pooling operation for mesh datasets with the same connectivity but different geometry, by building a mesh hierarchy using mesh simplification. For this purpose, we develop a modified mesh simplification method to avoid generating highly irregularly sized triangles. Our pooling operation effectively encodes the correspondence between coarser and finer meshes in the hierarchy. We then present a variational auto-encoder structure with the edge contraction pooling and graph-based convolutions, to explore probability latent spaces of 3D surfaces. Our network requires far fewer parameters than the original mesh VAE and thus can handle denser models thanks to our new pooling operation and convolutional kernels. Our evaluation also shows that our method has better generalization ability and is more reliable in various applications, including shape generation, shape interpolation and shape embedding.


Descriptor Ensemble: An Unsupervised Approach to Descriptor Fusion in the Homography Space

Dec 13, 2014
Yuan-Ting Hu, Yen-Yu Lin, Hsin-Yi Chen, Kuang-Jui Hsu, Bing-Yu Chen

With the aim of improving the performance of feature matching, we present an unsupervised approach to fuse various local descriptors in the space of homographies. Inspired by the observation that the homographies of correct feature correspondences vary smoothly along the spatial domain, our approach stands on the unsupervised nature of feature matching, and can select a good descriptor for matching each feature point. Specifically, the homography space serves as the common domain, in which a correspondence obtained by any descriptor is considered as a point, for integrating various heterogeneous descriptors. Both geometric coherence and spatial continuity among correspondences are considered via computing their geodesic distances in the space. In this way, mutual verification across different descriptors is allowed, and correct correspondences will be highlighted with a high degree of consistency (i.e., short geodesic distances here). It follows that one-class SVM can be applied to identify these correct correspondences and boost the performance of feature matching. The proposed approach is comprehensively compared with the state-of-the-art approaches, and evaluated on four benchmarks of image matching. The promising results manifest its effectiveness.


SDM-NET: Deep Generative Network for Structured Deformable Mesh

Sep 03, 2019
Lin Gao, Jie Yang, Tong Wu, Yu-Jie Yuan, Hongbo Fu, Yu-Kun Lai, Hao Zhang

We introduce SDM-NET, a deep generative neural network which produces structured deformable meshes. Specifically, the network is trained to generate a spatial arrangement of closed, deformable mesh parts, which respect the global part structure of a shape collection, e.g., chairs, airplanes, etc. Our key observation is that while the overall structure of a 3D shape can be complex, the shape can usually be decomposed into a set of parts, each homeomorphic to a box, and the finer-scale geometry of the part can be recovered by deforming the box. The architecture of SDM-NET is that of a two-level variational autoencoder (VAE). At the part level, a PartVAE learns a deformable model of part geometries. At the structural level, we train a Structured Parts VAE (SP-VAE), which jointly learns the part structure of a shape collection and the part geometries, ensuring a coherence between global shape structure and surface details. Through extensive experiments and comparisons with the state-of-the-art deep generative models of shapes, we demonstrate the superiority of SDM-NET in generating meshes with visual quality, flexible topology, and meaningful structures, which benefit shape interpolation and other subsequent modeling tasks.

* Conditionally Accepted to Siggraph Asia 2019 

Dixit: Interactive Visual Storytelling via Term Manipulation

Mar 11, 2019
Chao-Chun Hsu, Yu-Hua Chen, Zi-Yuan Chen, Hsin-Yu Lin, Ting-Hao 'Kenneth' Huang, Lun-Wei Ku

In this paper, we introduce Dixit, an interactive visual storytelling system that the user interacts with iteratively to compose a short story for a photo sequence. The user initiates the process by uploading a sequence of photos. Dixit first extracts text terms from each photo which describe the objects (e.g., boy, bike) or actions (e.g., sleep) in the photo, and then allows the user to add new terms or remove existing terms. Dixit then generates a short story based on these terms. Behind the scenes, Dixit uses an LSTM-based model trained on image caption data and FrameNet to distill terms from each image and utilizes a transformer decoder to compose a context-coherent story. Users change images or terms iteratively with Dixit to create the most ideal story. Dixit also allows users to manually edit and rate stories. The proposed procedure opens up possibilities for interpretable and controllable visual storytelling, allowing users to understand the story formation rationale and to intervene in the generation process.

* WWW'19 Demo, demo video: https://www.youtube.com/watch?v=CUu1MOwnveI 

Abstractive Dialog Summarization with Semantic Scaffolds

Oct 02, 2019
Lin Yuan, Zhou Yu

The demand for abstractive dialog summary is growing in real-world applications. For example, customer service centers or hospitals would like to summarize customer service interactions and doctor-patient interactions. However, few researchers explored abstractive summarization on dialogs due to the lack of suitable datasets. We propose an abstractive dialog summarization dataset based on MultiWOZ. If we directly apply previous state-of-the-art document summarization methods on dialogs, there are two significant drawbacks: the informative entities such as restaurant names are difficult to preserve, and the contents from different dialog domains are sometimes mismatched. To address these two drawbacks, we propose Scaffold Pointer Network (SPNet) to utilize the existing annotation on speaker role, semantic slot and dialog domain. SPNet incorporates these semantic scaffolds for dialog summarization. Since ROUGE cannot capture the two drawbacks mentioned, we also propose a new evaluation metric that considers critical informative entities in the text. On MultiWOZ, our proposed SPNet outperforms state-of-the-art abstractive summarization methods on all the automatic and human evaluation metrics.

* unpublished preprint 

Feature-Less End-to-End Nested Term Extraction

Aug 15, 2019
Yuze Gao, Yu Yuan

In this paper, we propose a deep learning-based end-to-end method for domain-specific automatic term extraction (ATE). It considers possible term spans within a fixed length in the sentence and predicts whether they can be conceptual terms. In comparison with current ATE methods, the model supports nested term extraction and does not crucially need extra (extracted) features. Results show that it can achieve high recall and comparable precision on the term extraction task with segmented raw text as input.

* NLPCC Workshop on Explainable Artificial Intelligence 2019 

Variance-Based Risk Estimations in Markov Processes via Transformation with State Lumping

Jul 09, 2019
Shuai Ma, Jia Yuan Yu

Variance plays a crucial role in risk-sensitive reinforcement learning, and most risk measures can be analyzed via variance. In this paper, we consider two law-invariant risks as examples: mean-variance risk and exponential utility risk. With the aid of the state-augmentation transformation (SAT), we show that the two risks can be estimated in Markov decision processes (MDPs) with a stochastic transition-based reward and a randomized policy. To relieve the enlarged state space, a novel definition of isotopic states is proposed for state lumping, considering the special structure of the transformed transition probability. In the numerical experiment, we illustrate state lumping in the SAT, errors from a naive reward simplification, and the validity of the SAT for the two risk estimations.

* 7 pages, 7 figures, SMC 2019 accepted. arXiv admin note: text overlap with arXiv:1907.04269 

Distribution Estimation in Discounted MDPs via a Transformation

Apr 16, 2018
Shuai Ma, Jia Yuan Yu

Although the general deterministic reward function in MDPs takes three arguments (current state, action, and next state), it is often simplified to a function of two arguments (current state and action). The former is called a transition-based reward function, whereas the latter is called a state-based reward function. When the objective is a function of the expected cumulative reward only, this simplification works perfectly. However, when the objective is risk-sensitive, e.g., depends on the reward distribution, this simplification leads to incorrect values of the objective. This paper studies the distribution estimation of the cumulative discounted reward in infinite-horizon MDPs with finite state and action spaces. First, by taking the Value-at-Risk (VaR) objective as an example, we illustrate and analyze the error from the above simplification on the reward distribution. Next, we propose a transformation for MDPs to preserve the reward distribution and convert transition-based reward functions to deterministic state-based reward functions. This transformation works whether the transition-based reward function is deterministic or stochastic. Lastly, we show how to estimate the reward distribution after applying the proposed transformation in different settings, provided that the distribution is approximately normal.
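
The effect of the reward simplification described above can be seen in a one-step toy example. The sketch below is an illustration under assumed numbers, not taken from the paper: a single state-action pair leads to one of two hypothetical next states with equal probability, and the state-based reward is obtained by averaging the transition-based reward over the next state. The mean is preserved while the variance, and hence any distribution-dependent objective, is not.

```python
from fractions import Fraction

# Hypothetical one-step example: from state "s" under action "a",
# the process moves to "s1" or "s2" with equal probability.
transitions = {"s1": Fraction(1, 2), "s2": Fraction(1, 2)}
# Transition-based reward r(s, a, s'): depends on the next state.
transition_reward = {"s1": Fraction(0), "s2": Fraction(2)}


def mean_and_variance(dist):
    """dist maps a reward value to its probability."""
    mean = sum(r * p for r, p in dist.items())
    var = sum((r - mean) ** 2 * p for r, p in dist.items())
    return mean, var


# Reward distribution under the original transition-based reward.
original = {}
for nxt, p in transitions.items():
    r = transition_reward[nxt]
    original[r] = original.get(r, Fraction(0)) + p

# Simplified state-based reward r(s, a) = E[r(s, a, s')].
simplified_reward = sum(transition_reward[n] * p for n, p in transitions.items())
simplified = {simplified_reward: Fraction(1)}

original_stats = mean_and_variance(original)      # same mean, nonzero variance
simplified_stats = mean_and_variance(simplified)  # same mean, zero variance
```

Exact fractions are used so that the two means can be compared for equality without floating-point tolerance.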


The Merits of Sharing a Ride

Dec 19, 2017
Pooyan Ehsani, Jia Yuan Yu

The culture of sharing instead of ownership is sharply increasing in individuals' behaviors. Particularly in transportation, concepts of sharing a ride in either carpooling or ridesharing have been recently adopted. An efficient optimization approach to match passengers in real-time is the core of any ridesharing system. In this paper, we model ridesharing as an online matching problem on general graphs such that passengers do not drive private cars and use shared taxis. We propose an optimization algorithm to solve it. The outlined algorithm calculates the optimal waiting time when a passenger arrives. This leads to a matching with minimal overall overheads while maximizing the number of partnerships. To evaluate the behavior of our algorithm, we used a real-life NYC taxi data set. Results show a substantial reduction in overall overheads.


Effect of Reward Function Choices in MDPs with Value-at-Risk

Feb 27, 2017
Shuai Ma, Jia Yuan Yu

This paper studies Value-at-Risk (VaR) problems in short- and long-horizon Markov decision processes (MDPs) with finite state space and two different reward functions. Firstly, we examine the effects of two reward functions under two criteria in a short-horizon MDP. We show that under the VaR criterion, when the original reward function depends on both current and next states, the reward simplification will change the VaR. Secondly, for long-horizon MDPs, we estimate the Pareto front of the total reward distribution set with the aid of spectral theory and the central limit theorem. Since the estimation is for a Markov process with the simplified reward function only, we present a transformation algorithm for the Markov process with the original reward function, in order to estimate the Pareto front with an intact total reward distribution.

* 23 pages, 5 figures 

Measuring Asymmetric Opinions on Online Social Interrelationship with Language and Network Features

Nov 15, 2016
Bo Wang, Yanshu Yu, Yuan Wang

Instead of studying the properties of social relationship from an objective view, in this paper, we focus on individuals' subjective and asymmetric opinions on their interrelationships. Inspired by the theories from sociolinguistics, we investigate two individuals' opinions on their interrelationship with their interactive language features. Eliminating the difference of personal language style, we clarify that the asymmetry of interactive language feature values can indicate individuals' asymmetric opinions on their interrelationship. We also discuss how the degree of opinions' asymmetry is related to the individuals' personality traits. Furthermore, to measure the individuals' asymmetric opinions on interrelationship concretely, we develop a novel model synthesizing interactive language and social network features. The experimental results with the Enron email dataset provide multiple pieces of evidence of the asymmetric opinions on interrelationship, and also verify the effectiveness of the proposed model in measuring the degree of opinions' asymmetry.


Adaptive and Optimal Online Linear Regression on L1-balls

Jan 23, 2012
Sébastien Gerchinovitz, Jia Yuan Yu

We consider the problem of online linear regression on individual sequences. The goal in this paper is for the forecaster to output sequential predictions which are, after T time rounds, almost as good as the ones output by the best linear predictor in a given L1-ball in R^d. We consider both the cases where the dimension d is small and large relative to the time horizon T. We first present regret bounds with optimal dependencies on the sizes U, X and Y of the L1-ball, the input data and the observations. The minimax regret is shown to exhibit a regime transition around the point d = sqrt(T) U X / (2 Y). Furthermore, we present efficient algorithms that are adaptive, i.e., they do not require the knowledge of U, X, and Y, but still achieve nearly optimal regret bounds.
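
The regime-transition point quoted above is a simple closed-form expression, so it can be evaluated directly for a given problem size. The helper names below (`regime_transition_dimension`, `is_above_transition`) are hypothetical, and the sketch only evaluates the formula d = sqrt(T) U X / (2 Y) from the abstract; it makes no claim about the regret rates on either side.

```python
import math


def regime_transition_dimension(T, U, X, Y):
    """Dimension d = sqrt(T) * U * X / (2 * Y) quoted in the abstract.

    T: number of rounds; U: radius of the L1-ball;
    X, Y: sizes of the input data and of the observations.
    """
    return math.sqrt(T) * U * X / (2.0 * Y)


def is_above_transition(d, T, U, X, Y):
    """True when the ambient dimension d exceeds the transition point."""
    return d > regime_transition_dimension(T, U, X, Y)
```

For instance, with T = 10000 and U = X = Y = 1, a problem with d = 100 lies above the transition point and one with d = 10 lies below it.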


PointIT: A Fast Tracking Framework Based on 3D Instance Segmentation

Feb 18, 2019
Yuan Wang, Yang Yu, Ming Liu

Most popular recent tracking frameworks focus on 2D image sequences. They seldom track 3D objects in point clouds. In this paper, we propose PointIT, a fast, simple tracking method based on 3D on-road instance segmentation. Firstly, we transform 3D LiDAR data into a spherical image with the size of 64 x 512 x 4 and feed it into an instance segmentation model to get the predicted instance mask for each class. Then we use MobileNet as our primary encoder instead of the original ResNet to reduce the computational complexity. Finally, we extend the Sort algorithm with this instance framework to realize tracking in the 3D LiDAR point cloud data. The model is trained on the spherical images dataset with the corresponding instance label masks which are provided by the KITTI 3D Object Track dataset. According to the experimental results, our network can achieve an Average Precision (AP) of 0.617, and the performance of the multi-tracking task has also been improved.
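
The first step, turning a LiDAR point cloud into a 64 x 512 x 4 spherical image, can be sketched as a projection onto azimuth and elevation bins. Everything beyond the image size is an assumption here: the abstract does not state the angular field of view or what the four channels hold, so this sketch uses a hypothetical 90-degree horizontal window, a hypothetical vertical window of -24.8 to 2.0 degrees, and channels (x, y, z, range). The function name `spherical_projection` is also hypothetical.

```python
import math


def spherical_projection(points, height=64, width=512,
                         az_range_deg=(-45.0, 45.0),
                         el_range_deg=(-24.8, 2.0)):
    """Project (x, y, z) LiDAR points onto a height x width x 4 grid.

    Each cell stores (x, y, z, range); empty cells stay at zeros.
    The angular ranges and channel layout are illustrative assumptions.
    """
    az_min, az_max = (math.radians(a) for a in az_range_deg)
    el_min, el_max = (math.radians(e) for e in el_range_deg)
    image = [[[0.0] * 4 for _ in range(width)] for _ in range(height)]
    for x, y, z in points:
        rng = math.sqrt(x * x + y * y + z * z)
        if rng == 0.0:
            continue  # the sensor origin has no defined direction
        azimuth = math.atan2(y, x)
        elevation = math.asin(z / rng)
        # floor (not int) so that out-of-window angles map to negative bins
        col = math.floor((azimuth - az_min) / (az_max - az_min) * width)
        row = math.floor((el_max - elevation) / (el_max - el_min) * height)
        if 0 <= row < height and 0 <= col < width:
            # later points overwrite earlier ones that fall in the same cell
            image[row][col] = [x, y, z, rng]
    return image
```

Points outside the assumed angular window are dropped, which is one simple way to keep the image at a fixed size.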


Safe Driving Capacity of Autonomous Vehicles

May 26, 2018
Yuan-Ying Wang, Hung-Yu Wei

An excellent self-driving car is expected to take its passengers safely and efficiently from one place to another. However, different ways of defining safety and efficiency may significantly affect the conclusion we make. In this paper, we give formal definitions to the safe state of a road and safe state of a vehicle using the syntax of linear temporal logic (LTL). We then propose the concept of safe driving throughput (SDT) and safe driving capacity (SDC) which measure the number of vehicles in the safe state on a road. We analyze how SDT is affected by different factors. We show the analytic difference of SDC between the road with perception-based vehicles (PBV) and the road with cooperative-based vehicles (CBV). We claim that through proper design, the SDC of the road filled with PBVs will be upper-bounded by the SDC of the road filled with CBVs.

* 5 pages, VTC 2018 

Face Attention Network: An Effective Face Detector for the Occluded Faces

Nov 22, 2017
Jianfeng Wang, Ye Yuan, Gang Yu

The performance of face detection has been largely improved with the development of convolutional neural networks. However, the occlusion issue due to masks and sunglasses is still a challenging problem. The improvement on the recall of these occluded cases usually brings the risk of high false positives. In this paper, we present a novel face detector called Face Attention Network (FAN), which can significantly improve the recall of the face detection problem in the occluded case without compromising the speed. More specifically, we propose a new anchor-level attention, which will highlight the features from the face region. Integrated with our anchor assign strategy and data augmentation techniques, we obtain state-of-the-art results on public face detection benchmarks like WiderFace and MAFA. The code will be released for reproduction.


Functional Bandits

May 10, 2014
Long Tran-Thanh, Jia Yuan Yu

We introduce the functional bandit problem, where the objective is to find an arm that optimises a known functional of the unknown arm-reward distributions. These problems arise in many settings such as maximum entropy methods in natural language processing, and risk-averse decision-making, but current best-arm identification techniques fail in these domains. We propose a new approach that combines functional estimation and arm elimination to tackle this problem. This method achieves provably efficient performance guarantees. In addition, we illustrate this method on a number of important functionals in risk management and information theory, and refine our generic theoretical results in those cases.


Risk-Averse Action Selection Using Extreme Value Theory Estimates of the CVaR

Dec 03, 2019
Dylan Troop, Frédéric Godin, Jia Yuan Yu

The Conditional Value-at-Risk (CVaR) is a useful risk measure in machine learning, finance, insurance, energy, etc. When the CVaR confidence parameter is very high, estimation by sample averaging exhibits high variance due to the limited number of samples above the corresponding threshold. To mitigate this problem, we present an estimation procedure for the CVaR that combines extreme value theory and a recently introduced method of automated threshold selection by Bader et al. (2018). Under appropriate conditions, we estimate the tail risk using a generalized Pareto distribution. We empirically compare this estimation procedure with the naive method of sample averaging, and show an improvement in accuracy for some specific cases. We also show how the estimation procedure can be used in reinforcement learning by applying our method to the multi-armed bandit problem where the goal is to avoid catastrophic risk.
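
The naive baseline mentioned above, sample averaging, can be sketched in a few lines, which also makes the small-tail problem visible. This is only the baseline, not the extreme-value procedure of the paper. It assumes that samples are losses (so the upper tail is the risky one) and that the tail size is the rounded value of (1 - alpha) times the sample count, with at least one sample kept; the name `empirical_cvar` is hypothetical.

```python
def empirical_cvar(samples, alpha):
    """Average of the largest (1 - alpha) fraction of the samples.

    Samples are treated as losses, so the upper tail is the risky one.
    Returns the estimate together with the number of tail samples used.
    """
    n = len(samples)
    # round() guards against floating-point noise in (1 - alpha) * n
    tail_count = max(1, round((1.0 - alpha) * n))
    tail = sorted(samples)[n - tail_count:]
    return sum(tail) / tail_count, tail_count
```

With 100 samples, a confidence level of 0.95 leaves five tail samples to average and 0.999 leaves a single one, which is the source of the high variance the abstract refers to.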

