Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Chao Yan

AD-Aligning: Emulating Human-like Generalization for Cognitive Domain Adaptation in Deep Learning

May 15, 2024
Zhuoying Li, Bohua Wan, Cong Mu, Ruzhang Zhao, Shushan Qiu, Chao Yan

Domain adaptation is pivotal for enabling deep learning models to generalize across diverse domains, a task complicated by variations in presentation and cognitive nuances. In this paper, we introduce AD-Aligning, a novel approach that combines adversarial training with source-target domain alignment to enhance generalization capabilities. By pretraining with Coral loss and standard loss, AD-Aligning aligns target domain statistics with those of the pretrained encoder, preserving robustness while accommodating domain shifts. Through extensive experiments on diverse datasets and domain shift scenarios, including noise-induced shifts and cognitive domain adaptation tasks, we demonstrate AD-Aligning's superior performance compared to existing methods such as Deep Coral and ADDA. Our findings highlight AD-Aligning's ability to emulate the nuanced cognitive processes inherent in human perception, making it a promising solution for real-world applications requiring adaptable and robust domain adaptation strategies.

Via

Access Paper or Ask Questions

Predicting NVIDIA's Next-Day Stock Price: A Comparative Analysis of LSTM, MLP, ARIMA, and ARIMA-GARCH Models

May 14, 2024
Yiluan Xing, Chao Yan, Cathy Chang Xie

Forecasting stock prices remains a considerable challenge in financial markets, bearing significant implications for investors, traders, and financial institutions. Amid the ongoing AI revolution, NVIDIA has emerged as a key player driving innovation across various sectors. Given its prominence, we chose NVIDIA as the subject of our study.

* 7 pages, 4 figures, 2 tables, conference paper

Via

Access Paper or Ask Questions

Optimizing Contrail Detection: A Deep Learning Approach with EfficientNet-b4 Encoding

Apr 20, 2024
Qunwei Lin, Qian Leng, Zhicheng Ding, Chao Yan, Xiaonan Xu

Figure 1 for Optimizing Contrail Detection: A Deep Learning Approach with EfficientNet-b4 Encoding

Figure 2 for Optimizing Contrail Detection: A Deep Learning Approach with EfficientNet-b4 Encoding

Figure 3 for Optimizing Contrail Detection: A Deep Learning Approach with EfficientNet-b4 Encoding

In the pursuit of environmental sustainability, the aviation industry faces the challenge of minimizing its ecological footprint. Among the key solutions is contrail avoidance, targeting the linear ice-crystal clouds produced by aircraft exhaust. These contrails exacerbate global warming by trapping atmospheric heat, necessitating precise segmentation and comprehensive analysis of contrail images to gauge their environmental impact. However, this segmentation task is complex due to the varying appearances of contrails under different atmospheric conditions and potential misalignment issues in predictive modeling. This paper presents an innovative deep-learning approach utilizing the efficient net-b4 encoder for feature extraction, seamlessly integrating misalignment correction, soft labeling, and pseudo-labeling techniques to enhance the accuracy and efficiency of contrail detection in satellite imagery. The proposed methodology aims to redefine contrail image analysis and contribute to the objectives of sustainable aviation by providing a robust framework for precise contrail detection and analysis in satellite imagery, thus aiding in the mitigation of aviation's environmental impact.

Via

Access Paper or Ask Questions

MP-MVS: Multi-Scale Windows PatchMatch and Planar Prior Multi-View Stereo

Sep 23, 2023
Rongxuan Tan, Qing Wang, Xueyan Wang, Chao Yan, Yang Sun, Youyang Feng

Figure 1 for MP-MVS: Multi-Scale Windows PatchMatch and Planar Prior Multi-View Stereo

Figure 2 for MP-MVS: Multi-Scale Windows PatchMatch and Planar Prior Multi-View Stereo

Figure 3 for MP-MVS: Multi-Scale Windows PatchMatch and Planar Prior Multi-View Stereo

Figure 4 for MP-MVS: Multi-Scale Windows PatchMatch and Planar Prior Multi-View Stereo

Significant strides have been made in enhancing the accuracy of Multi-View Stereo (MVS)-based 3D reconstruction. However, untextured areas with unstable photometric consistency often remain incompletely reconstructed. In this paper, we propose a resilient and effective multi-view stereo approach (MP-MVS). We design a multi-scale windows PatchMatch (mPM) to obtain reliable depth of untextured areas. In contrast with other multi-scale approaches, which is faster and can be easily extended to PatchMatch-based MVS approaches. Subsequently, we improve the existing checkerboard sampling schemes by limiting our sampling to distant regions, which can effectively improve the efficiency of spatial propagation while mitigating outlier generation. Finally, we introduce and improve planar prior assisted PatchMatch of ACMP. Instead of relying on photometric consistency, we utilize geometric consistency information between multi-views to select reliable triangulated vertices. This strategy can obtain a more accurate planar prior model to rectify photometric consistency measurements. Our approach has been tested on the ETH3D High-res multi-view benchmark with several state-of-the-art approaches. The results demonstrate that our approach can reach the state-of-the-art. The associated codes will be accessible at https://github.com/RongxuanTan/MP-MVS.

Via

Access Paper or Ask Questions

Split Learning for Distributed Collaborative Training of Deep Learning Models in Health Informatics

Aug 21, 2023
Zhuohang Li, Chao Yan, Xinmeng Zhang, Gharib Gharibi, Zhijun Yin, Xiaoqian Jiang, Bradley A. Malin

Figure 1 for Split Learning for Distributed Collaborative Training of Deep Learning Models in Health Informatics

Figure 2 for Split Learning for Distributed Collaborative Training of Deep Learning Models in Health Informatics

Figure 3 for Split Learning for Distributed Collaborative Training of Deep Learning Models in Health Informatics

Figure 4 for Split Learning for Distributed Collaborative Training of Deep Learning Models in Health Informatics

Deep learning continues to rapidly evolve and is now demonstrating remarkable potential for numerous medical prediction tasks. However, realizing deep learning models that generalize across healthcare organizations is challenging. This is due, in part, to the inherent siloed nature of these organizations and patient privacy requirements. To address this problem, we illustrate how split learning can enable collaborative training of deep learning models across disparate and privately maintained health datasets, while keeping the original records and model parameters private. We introduce a new privacy-preserving distributed learning framework that offers a higher level of privacy compared to conventional federated learning. We use several biomedical imaging and electronic health record (EHR) datasets to show that deep learning models trained via split learning can achieve highly similar performance to their centralized and federated counterparts while greatly improving computational efficiency and reducing privacy risks.

Via

Access Paper or Ask Questions

Private Everlasting Prediction

May 16, 2023
Moni Naor, Kobbi Nissim, Uri Stemmer, Chao Yan

Figure 1 for Private Everlasting Prediction

Figure 2 for Private Everlasting Prediction

Figure 3 for Private Everlasting Prediction

Figure 4 for Private Everlasting Prediction

A private learner is trained on a sample of labeled points and generates a hypothesis that can be used for predicting the labels of newly sampled points while protecting the privacy of the training set [Kasiviswannathan et al., FOCS 2008]. Research uncovered that private learners may need to exhibit significantly higher sample complexity than non-private learners as is the case with, e.g., learning of one-dimensional threshold functions [Bun et al., FOCS 2015, Alon et al., STOC 2019]. We explore prediction as an alternative to learning. Instead of putting forward a hypothesis, a predictor answers a stream of classification queries. Earlier work has considered a private prediction model with just a single classification query [Dwork and Feldman, COLT 2018]. We observe that when answering a stream of queries, a predictor must modify the hypothesis it uses over time, and, furthermore, that it must use the queries for this modification, hence introducing potential privacy risks with respect to the queries themselves. We introduce private everlasting prediction taking into account the privacy of both the training set and the (adaptively chosen) queries made to the predictor. We then present a generic construction of private everlasting predictors in the PAC model. The sample complexity of the initial training sample in our construction is quadratic (up to polylog factors) in the VC dimension of the concept class. Our construction allows prediction for all concept classes with finite VC dimension, and in particular threshold functions with constant size initial training sample, even when considered over infinite domains, whereas it is known that the sample complexity of privately learning threshold functions must grow as a function of the domain size and hence is impossible for infinite domains.

Via

Access Paper or Ask Questions

A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models

Aug 02, 2022
Chao Yan, Yao Yan, Zhiyu Wan, Ziqi Zhang, Larsson Omberg, Justin Guinney, Sean D. Mooney, Bradley A. Malin

Figure 1 for A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models

Figure 2 for A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models

Figure 3 for A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models

Figure 4 for A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models

Synthetic health data have the potential to mitigate privacy concerns when sharing data to support biomedical research and the development of innovative healthcare applications. Modern approaches for data generation based on machine learning, generative adversarial networks (GAN) methods in particular, continue to evolve and demonstrate remarkable potential. Yet there is a lack of a systematic assessment framework to benchmark methods as they emerge and determine which methods are most appropriate for which use cases. In this work, we introduce a generalizable benchmarking framework to appraise key characteristics of synthetic health data with respect to utility and privacy metrics. We apply the framework to evaluate synthetic data generation methods for electronic health records (EHRs) data from two large academic medical centers with respect to several use cases. The results illustrate that there is a utility-privacy tradeoff for sharing synthetic EHR data. The results further indicate that no method is unequivocally the best on all criteria in each use case, which makes it evident why synthetic data generation methods need to be assessed in context.

Via

Access Paper or Ask Questions

Distillation to Enhance the Portability of Risk Models Across Institutions with Large Patient Claims Database

Jul 06, 2022
Steve Nyemba, Chao Yan, Ziqi Zhang, Amol Rajmane, Pablo Meyer, Prithwish Chakraborty, Bradley Malin

Figure 1 for Distillation to Enhance the Portability of Risk Models Across Institutions with Large Patient Claims Database

Figure 2 for Distillation to Enhance the Portability of Risk Models Across Institutions with Large Patient Claims Database

Figure 3 for Distillation to Enhance the Portability of Risk Models Across Institutions with Large Patient Claims Database

Figure 4 for Distillation to Enhance the Portability of Risk Models Across Institutions with Large Patient Claims Database

Artificial intelligence, and particularly machine learning (ML), is increasingly developed and deployed to support healthcare in a variety of settings. However, clinical decision support (CDS) technologies based on ML need to be portable if they are to be adopted on a broad scale. In this respect, models developed at one institution should be reusable at another. Yet there are numerous examples of portability failure, particularly due to naive application of ML models. Portability failure can lead to suboptimal care and medical errors, which ultimately could prevent the adoption of ML-based CDS in practice. One specific healthcare challenge that could benefit from enhanced portability is the prediction of 30-day readmission risk. Research to date has shown that deep learning models can be effective at modeling such risk. In this work, we investigate the practicality of model portability through a cross-site evaluation of readmission prediction models. To do so, we apply a recurrent neural network, augmented with self-attention and blended with expert features, to build readmission prediction models for two independent large scale claims datasets. We further present a novel transfer learning technique that adapts the well-known method of born-again network (BAN) training. Our experiments show that direct application of ML models trained at one institution and tested at another institution perform worse than models trained and tested at the same institution. We further show that the transfer learning approach based on the BAN produces models that are better than those trained on just a single institution's data. Notably, this improvement is consistent across both sites and occurs after a single retraining, which illustrates the potential for a cheap and general model transfer mechanism of readmission risk prediction.

Via

Access Paper or Ask Questions

Seeking Patterns, Not just Memorizing Procedures: Contrastive Learning for Solving Math Word Problems

Oct 16, 2021
Zhongli Li, Wenxuan Zhang, Chao Yan, Qingyu Zhou, Chao Li, Hongzhi Liu, Yunbo Cao

Figure 1 for Seeking Patterns, Not just Memorizing Procedures: Contrastive Learning for Solving Math Word Problems

Figure 2 for Seeking Patterns, Not just Memorizing Procedures: Contrastive Learning for Solving Math Word Problems

Figure 3 for Seeking Patterns, Not just Memorizing Procedures: Contrastive Learning for Solving Math Word Problems

Figure 4 for Seeking Patterns, Not just Memorizing Procedures: Contrastive Learning for Solving Math Word Problems

Math Word Problem (MWP) solving needs to discover the quantitative relationships over natural language narratives. Recent work shows that existing models memorize procedures from context and rely on shallow heuristics to solve MWPs. In this paper, we look at this issue and argue that the cause is a lack of overall understanding of MWP patterns. We first investigate how a neural network understands patterns only from semantics, and observe that, if the prototype equations are the same, most problems get closer representations and those representations apart from them or close to other prototypes tend to produce wrong solutions. Inspired by it, we propose a contrastive learning approach, where the neural network perceives the divergence of patterns. We collect contrastive examples by converting the prototype equation into a tree and seeking similar tree structures. The solving model is trained with an auxiliary objective on the collected examples, resulting in the representations of problems with similar prototypes being pulled closer. We conduct experiments on the Chinese dataset Math23k and the English dataset MathQA. Our method greatly improves the performance in monolingual and multilingual settings.

Via

Access Paper or Ask Questions

Fast Wireless Sensor Anomaly Detection based on Data Stream in Edge Computing Enabled Smart Greenhouse

Jul 28, 2021
Yihong Yang, Sheng Ding, Yuwen Liu, Shunmei Meng, Xiaoxiao Chi, Rui Ma, Chao Yan

Figure 1 for Fast Wireless Sensor Anomaly Detection based on Data Stream in Edge Computing Enabled Smart Greenhouse

Figure 2 for Fast Wireless Sensor Anomaly Detection based on Data Stream in Edge Computing Enabled Smart Greenhouse

Figure 3 for Fast Wireless Sensor Anomaly Detection based on Data Stream in Edge Computing Enabled Smart Greenhouse

Figure 4 for Fast Wireless Sensor Anomaly Detection based on Data Stream in Edge Computing Enabled Smart Greenhouse

Edge computing enabled smart greenhouse is a representative application of Internet of Things technology, which can monitor the environmental information in real time and employ the information to contribute to intelligent decision-making. In the process, anomaly detection for wireless sensor data plays an important role. However, traditional anomaly detection algorithms originally designed for anomaly detection in static data have not properly considered the inherent characteristics of data stream produced by wireless sensor such as infiniteness, correlations and concept drift, which may pose a considerable challenge on anomaly detection based on data stream, and lead to low detection accuracy and efficiency. First, data stream usually generates quickly which means that it is infinite and enormous, so any traditional off-line anomaly detection algorithm that attempts to store the whole dataset or to scan the dataset multiple times for anomaly detection will run out of memory space. Second, there exist correlations among different data streams, which traditional algorithms hardly consider. Third, the underlying data generation process or data distribution may change over time. Thus, traditional anomaly detection algorithms with no model update will lose their effects. Considering these issues, a novel method (called DLSHiForest) on basis of Locality-Sensitive Hashing and time window technique in this paper is proposed to solve these problems while achieving accurate and efficient detection. Comprehensive experiments are executed using real-world agricultural greenhouse dataset to demonstrate the feasibility of our approach. Experimental results show that our proposal is practicable in addressing challenges of traditional anomaly detection while ensuring accuracy and efficiency.

* 12 pages, 8 figures

Via

Access Paper or Ask Questions