Explainable Artificial Intelligence (XAI) has re-emerged in response to the development of modern AI and ML systems. These systems are complex and sometimes biased, but they nevertheless make decisions that impact our lives. XAI systems are frequently algorithm-focused: they start and end with an algorithm that implements a basic, untested idea about explainability. These systems are often not tested to determine whether the algorithm helps users accomplish any goals, and so their explainability remains unproven. We propose an alternative: to start with human-focused principles for the design, testing, and implementation of XAI systems, and to implement algorithms that serve that purpose. In this paper, we review some of the basic concepts that have been used in user-centered XAI systems over the past 40 years of research. Based on these, we describe the "Self-Explanation Scorecard", which can help developers understand how they can empower users by enabling self-explanation. Finally, we present a set of empirically grounded, user-centered design principles that may guide developers to create successful explainable systems.
Current discussions of "Explainable AI" (XAI) give little consideration to the role of abduction in explanatory reasoning (see Mueller et al., 2018). It might be worthwhile to pursue this, to develop intelligent systems that allow for the observation and analysis of abductive reasoning and the assessment of abductive reasoning as a learnable skill. Abductive inference has been defined in many ways; for example, it has been defined as the achievement of insight. Most often, abduction is taken as a single, punctuated act of syllogistic reasoning, like making a deductive or inductive inference from given premises. In contrast, the originator of the concept of abduction---the American scientist and philosopher Charles Sanders Peirce---regarded abduction as an exploratory activity. In this regard, Peirce's insights about reasoning align with conclusions from modern psychological research. Since abduction is often defined as "inference to the best explanation," the challenge of implementing abductive reasoning and the challenge of automating the explanation process are closely linked. We explore these linkages in this report. This analysis provides a theoretical framework for understanding what XAI researchers are already doing, it explains why some XAI projects are succeeding (or might succeed), and it leads to design advice.
This is an integrative review that addresses the question, "What makes for a good explanation?" with reference to AI systems. The pertinent literatures are vast; thus, this review is necessarily selective. That said, most of the key concepts and issues are expressed in this Report. The Report encapsulates the history of computer science efforts to create systems that explain and instruct (intelligent tutoring systems and expert systems). The Report describes the explainability issues and challenges in modern AI and presents capsule views of the leading psychological theories of explanation. Certain articles stand out by virtue of their particular relevance to XAI, and their methods, results, and key points are highlighted. It is recommended that AI/XAI researchers be encouraged to include in their research reports fuller details on their empirical or experimental methods, in the fashion of experimental psychology research reports: details on participants, instructions, procedures, tasks, dependent variables (operational definitions of the measures and metrics), independent variables (conditions), and control conditions.
The question addressed in this paper is: If we present to a user an AI system that explains how it works, how do we know whether the explanation works and whether the user has achieved a pragmatic understanding of the AI? In other words, how do we know that an explainable AI system (XAI) is any good? Our focus is on the key concepts of measurement. We discuss specific methods for evaluating: (1) the goodness of explanations, (2) whether users are satisfied by explanations, (3) how well users understand the AI systems, (4) how curiosity motivates the search for explanations, (5) whether the user's trust in and reliance on the AI are appropriate, and finally, (6) how well the human-XAI work system performs. The recommendations we present derive from our integration of extensive research literatures and from our own psychometric evaluations.