Abstract

Recent work on interpretability in machine learning and AI has focused on building simplified models that approximate the true criteria used to make decisions. These models are a useful pedagogical device for teaching trained professionals how to predict what decisions will be made by the complex system, and, most importantly, how the system might break. However, when considering any such model it is important to remember Box's maxim that "all models are wrong but some are useful." We focus on the distinction between these models and explanations in philosophy and sociology. These models can be understood as a "do it yourself kit" for explanations, allowing a practitioner to directly answer "what if" questions or generate contrastive explanations without external assistance. Although this is a valuable ability, giving these models as explanations appears more difficult than necessary, and other forms of explanation may not have the same trade-offs. We contrast the different schools of thought on what makes an explanation, and suggest that machine learning might benefit from viewing the problem more broadly.
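
To make the "do it yourself kit" idea concrete, the sketch below is a minimal illustration (not taken from the paper; the black box, surrogate, and toy data are assumptions): an interpretable surrogate model is fitted to the predictions of an opaque classifier and then queried directly to answer a "what if" question.

    # Illustrative sketch only: a simplified surrogate model approximating an
    # opaque classifier, used to answer a "what if" question. The black box,
    # surrogate, and toy data here are assumptions, not the paper's method.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 3))                 # toy feature matrix
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # toy labels

    # Stand-in for the complex decision-making system.
    black_box = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

    # Simplified model approximating the black box's decisions.
    surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(
        X, black_box.predict(X)
    )

    # "What if" question: does the predicted outcome change if feature 0 increases?
    x = np.array([[-0.2, 0.1, 0.4]])
    x_whatif = x.copy()
    x_whatif[0, 0] += 1.0
    print("surrogate prediction before:", surrogate.predict(x)[0])
    print("surrogate prediction after :", surrogate.predict(x_whatif)[0])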

Keywords

Interpretability, Computer science, Focus (optics), Contrast (vision), Artificial intelligence, Maxim, Cognitive science, Epistemology, Psychology

Publication Info

Year
2019
Type
conference paper
Pages
279-288
Citations
679
Access
Closed

Citation Metrics

679 citations (OpenAlex)

Cite This

Brent Mittelstadt, Chris Russell, Sandra Wachter (2019). Explaining Explanations in AI. Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT* '19), 279-288. https://doi.org/10.1145/3287560.3287574

Identifiers

DOI
10.1145/3287560.3287574