Models, code, and papers for "Mengnan Du":
Interpretable Machine Learning (IML) has become increasingly important in many real-world applications, such as autonomous cars and medical diagnosis, where explanations are significantly preferred to help people better understand how machine learning systems work and further enhance their trust towards systems. However, due to the diversified scenarios and subjective nature of explanations, we rarely have the ground truth for benchmark evaluation in IML on the quality of generated explanations. Having a sense of explanation quality not only matters for assessing system boundaries, but also helps to realize the true benefits to human users in practical settings. To benchmark the evaluation in IML, in this article, we rigorously define the problem of evaluating explanations, and systematically review the existing efforts from state-of-the-arts. Specifically, we summarize three general aspects of explanation (i.e., generalizability, fidelity and persuasibility) with formal definitions, and respectively review the representative methodologies for each of them under different tasks. Further, a unified evaluation framework is designed according to the hierarchical needs from developers and end-users, which could be easily adopted for different scenarios in practice. In the end, open problems are discussed, and several limitations of current evaluation techniques are raised for future explorations.
Interpretable machine learning tackles the important problem that humans cannot understand the behaviors of complex machine learning models and how these classifiers arrive at a particular decision. Although many approaches have been proposed, a comprehensive understanding of the achievements and challenges is still lacking. This paper provides a survey covering existing techniques and methods to increase the interpretability of machine learning models and also discusses the crucial issues to consider in future work such as interpretation design principles and evaluation metrics in order to push forward the area of interpretable machine learning.
Recently, more and more attention has been drawn into the internal mechanism of the convolutional neural network and on what basis does the network make a specific decision. In this paper, we develop a novel post-hoc visual explanation method called Score-CAM based on class activation mapping. Unlike previous class activation mapping based approaches, Score-CAM gets rid of the dependence on gradient by obtaining the weight of each activation map through its forward passing score on target class, the final result is obtained by a linear combination of weights and activation maps. We demonstrate that Score-CAM achieves better visual performance with less noise and has better stability than Grad-CAM and Grad-CAM++. In the experiment, we rethink issues of previous evaluation metrics and propose a representative evaluation approach Energy- Based Pointing Game to measure the quality of the generated saliency maps. Our approach outperforms previous methods on energy-based pointing game and recognition and shows more robustness under adversarial attack.
With advancements of deep learning techniques, it is now possible to generate super-realistic fake images and videos. These manipulated forgeries could reach mass audience and result in adverse impacts on our society. Although lots of efforts have been devoted to detect forgeries, their performance drops significantly on previously unseen but related manipulations and the detection generalization capability remains a problem. To bridge this gap, in this paper we propose Locality-aware AutoEncoder (LAE), which combines fine-grained representation learning and enforcing locality in a unified framework. In the training process, we use pixel-wise mask to regularize local interpretation of LAE to enforce the model to learn intrinsic representation from the forgery region, instead of capturing artifacts in the training set and learning spurious correlations to perform detection. We further propose an active learning framework to select the challenging candidates for labeling, to reduce the annotation efforts to regularize interpretations. Experimental results indicate that LAE indeed could focus on the forgery regions to make decisions. The results further show that LAE achieves superior generalization performance compared to state-of-the-arts on forgeries generated by alternative manipulation methods.
Deep learning is increasingly being used in high-stake decision making applications that affect individual lives. However, deep learning models might exhibit algorithmic discrimination behaviors with respect to protected groups, potentially posing negative impacts on individuals and society. Therefore, fairness in deep learning has attracted tremendous attention recently. We provide a comprehensive review covering existing techniques to tackle algorithmic fairness problems from the computational perspective. Specifically, we show that interpretability can serve as a useful ingredient, which could be augmented into the biases detection and mitigation pipelines. We also discuss open research problems and future research directions, aiming to push forward the area of fairness in deep learning and build genuinely fair, accountable, and transparent deep learning systems.
Recent explainability related studies have shown that state-of-the-art DNNs do not always adopt correct evidences to make decisions. It not only hampers their generalization but also makes them less likely to be trusted by end-users. In pursuit of developing more credible DNNs, in this paper we propose CREX, which encourages DNN models to focus more on evidences that actually matter for the task at hand, and to avoid overfitting to data-dependent bias and artifacts. Specifically, CREX regularizes the training process of DNNs with rationales, i.e., a subset of features highlighted by domain experts as justifications for predictions, to enforce DNNs to generate local explanations that conform with expert rationales. Even when rationales are not available, CREX still could be useful by requiring the generated explanations to be sparse. Experimental results on two text classification datasets demonstrate the increased credibility of DNNs trained with CREX. Comprehensive analysis further shows that while CREX does not always improve prediction accuracy on the held-out test set, it significantly increases DNN accuracy on new and previously unseen data beyond test set, highlighting the advantage of the increased credibility.
While deep neural networks (DNN) have become an effective computational tool, the prediction results are often criticized by the lack of interpretability, which is essential in many real-world applications such as health informatics. Existing attempts based on local interpretations aim to identify relevant features contributing the most to the prediction of DNN by monitoring the neighborhood of a given input. They usually simply ignore the intermediate layers of the DNN that might contain rich information for interpretation. To bridge the gap, in this paper, we propose to investigate a guided feature inversion framework for taking advantage of the deep architectures towards effective interpretation. The proposed framework not only determines the contribution of each feature in the input but also provides insights into the decision-making process of DNN models. By further interacting with the neuron of the target category at the output layer of the DNN, we enforce the interpretation result to be class-discriminative. We apply the proposed interpretation model to different CNN architectures to provide explanations for image data and conduct extensive experiments on ImageNet and PASCAL VOC07 datasets. The interpretation results demonstrate the effectiveness of our proposed framework in providing class-discriminative interpretation for DNN-based prediction.
Anomaly detection aims to distinguish observations that are rare and different from the majority. While most existing algorithms assume that instances are i.i.d., in many practical scenarios, links describing instance-to-instance dependencies and interactions are available. Such systems are called attributed networks. Anomaly detection in attributed networks has various applications such as monitoring suspicious accounts in social media and financial fraud in transaction networks. However, it remains a challenging task since the definition of anomaly becomes more complicated and topological structures are heterogeneous with nodal attributes. In this paper, we propose a spectral convolution and deconvolution based framework -- SpecAE, to project the attributed network into a tailored space to detect global and community anomalies. SpecAE leverages Laplacian sharpening to amplify the distances between representations of anomalies and the ones of the majority. The learned representations along with reconstruction errors are combined with a density estimation model to perform the detection. They are trained jointly as an end-to-end framework. Experiments on real-world datasets demonstrate the effectiveness of SpecAE.
Anomaly detection is a fundamental problem in data mining field with many real-world applications. A vast majority of existing anomaly detection methods predominately focused on data collected from a single source. In real-world applications, instances often have multiple types of features, such as images (ID photos, finger prints) and texts (bank transaction histories, user online social media posts), resulting in the so-called multi-modal data. In this paper, we focus on identifying anomalies whose patterns are disparate across different modalities, i.e., cross-modal anomalies. Some of the data instances within a multi-modal context are often not anomalous when they are viewed separately in each individual modality, but contains inconsistent patterns when multiple sources are jointly considered. The existence of multi-modal data in many real-world scenarios brings both opportunities and challenges to the canonical task of anomaly detection. On the one hand, in multi-modal data, information of different modalities may complement each other in improving the detection performance. On the other hand, complicated distributions across different modalities call for a principled framework to characterize their inherent and complex correlations, which is often difficult to capture with conventional linear models. To this end, we propose a novel deep structured anomaly detection framework to identify the cross-modal anomalies embedded in the data. Experiments on real-world datasets demonstrate the effectiveness of the proposed framework comparing with the state-of-the-art.
RNN models have achieved the state-of-the-art performance in a wide range of text mining tasks. However, these models are often regarded as black-boxes and are criticized due to the lack of interpretability. In this paper, we enhance the interpretability of RNNs by providing interpretable rationales for RNN predictions. Nevertheless, interpreting RNNs is a challenging problem. Firstly, unlike existing methods that rely on local approximation, we aim to provide rationales that are more faithful to the decision making process of RNN models. Secondly, a flexible interpretation method should be able to assign contribution scores to text segments of varying lengths, instead of only to individual words. To tackle these challenges, we propose a novel attribution method, called REAT, to provide interpretations to RNN predictions. REAT decomposes the final prediction of a RNN into additive contribution of each word in the input text. This additive decomposition enables REAT to further obtain phrase-level attribution scores. In addition, REAT is generally applicable to various RNN architectures, including GRU, LSTM and their bidirectional versions. Experimental results demonstrate the faithfulness and interpretability of the proposed attribution method. Comprehensive analysis shows that our attribution method could unveil the useful linguistic knowledge captured by RNNs. Some analysis further demonstrates our method could be utilized as a debugging tool to examine the vulnerability and failure reasons of RNNs, which may lead to several promising future directions to promote generalization ability of RNNs.
Neural architecture search (NAS) is gaining more and more attention in recent years due to its flexibility and the remarkable capability of reducing the burden of neural network design. To achieve better performance, however, the searching process usually costs massive computation, which might not be affordable to researchers and practitioners. While recent attempts have employed ensemble learning methods to mitigate the enormous computation, an essential characteristic of diversity in ensemble methods is missed out, causing more similar sub-architectures to be gathered and potential redundancy in the final ensemble architecture. To bridge this gap, we propose a pruning method for NAS ensembles, named as ''Sub-Architecture Ensemble Pruning in Neural Architecture Search (SAEP).'' It targets to utilize diversity and achieve sub-ensemble architectures in a smaller size with comparable performance to the unpruned ensemble architectures. Three possible solutions are proposed to decide which subarchitectures should be pruned during the searching process. Experimental results demonstrate the effectiveness of the proposed method in largely reducing the size of ensemble architectures while maintaining the final performance. Moreover, distinct deeper architectures could be discovered if the searched sub-architectures are not diverse enough.
In this demo paper, we present the XFake system, an explainable fake news detector that assists end-users to identify news credibility. To effectively detect and interpret the fakeness of news items, we jointly consider both attributes (e.g., speaker) and statements. Specifically, MIMIC, ATTN and PERT frameworks are designed, where MIMIC is built for attribute analysis, ATTN is for statement semantic analysis and PERT is for statement linguistic analysis. Beyond the explanations extracted from the designed frameworks, relevant supporting examples as well as visualization are further provided to facilitate the interpretation. Our implemented system is demonstrated on a real-world dataset crawled from PolitiFact, where thousands of verified political news have been collected.
In this paper, we explore the redundancy in convolutional neural network, which scales with the complexity of vision tasks. Considering that many front-end visual systems are interested in only a limited range of visual targets, the removing of task-specified network redundancy can promote a wide range of potential applications. We propose a task-specified knowledge distillation algorithm to derive a simplified model with pre-set computation cost and minimized accuracy loss, which suits the resource constraint front-end systems well. Experiments on the MNIST and CIFAR10 datasets demonstrate the feasibility of the proposed approach as well as the existence of task-specified redundancy.