Understanding with Toy Surrogate Models in Machine Learning

Minds and Machines (2024) 34:45
https://doi.org/10.1007/s11023-024-09700-1

Andrés Páez
Department of Philosophy and Center for Research and Formation in Artificial Intelligence (CinfonIA), Universidad de los Andes, Carrera 1 No. 18A-12 (G-533), Bogotá, DC 111711, Colombia

Received: 8 June 2023 / Accepted: 6 October 2024
© The Author(s) 2024

Abstract
In the natural and social sciences, it is common to use toy models, extremely simple and highly idealized representations, to understand complex phenomena. Some of the simple surrogate models used to understand opaque machine learning (ML) models, such as rule lists and sparse decision trees, bear some resemblance to scientific toy models. They allow non-experts to understand how an opaque ML model works globally via a much simpler model that highlights the most relevant features of the input space and their effect on the output. The obvious difference is that the common target of a toy and a full-scale model in the sciences is some phenomenon in the world, while the target of a surrogate model is another model. This essential difference makes toy surrogate models (TSMs) a new object of study for theories of understanding, one that is not easily accommodated under current analyses. This paper provides an account of what it means to understand an opaque ML model globally with the aid of such simple models.

Keywords: Toy models · Surrogate models · Machine learning · Understanding · Idealization

1 Introduction

In the natural and social sciences, it is common to use extremely simple and highly idealized models to understand complex phenomena. Unlike regular models, these very simple models, often referred to as toy models, are not required to be linked to the real world through structural similarity or resemblance relations. They are not meant to be approximations of the target world system, and in some cases, they are not even required to be representational. In semantic terms, they do not accurately map onto their targets. Despite these limitations, they are still useful in understanding theoretical concepts and possible configurations of the target system. Paradigmatic examples of toy models include Boyle’s law and the Ising model in physics, the Lotka–Volterra model in population ecology, and the Schelling model in the social sciences (Weisberg, 2013).

In recent years, philosophers of science have become interested in toy models (Grüne-Yanoff, 2009; Luczak, 2017; Reutlinger et al., 2018; Frigg & Nguyen, 2017; Nguyen, 2020). The main purpose of this literature is to explore the nature of these models and examine how they perform their epistemic function. Despite lacking the regular descriptive and predictive features of full-scale scientific models, they often offer an elementary understanding of a phenomenon. Their definitions of “toy model” differ as well as their assessment of the importance of representation in modelling generally, but they all agree that toy models play an important epistemic role in scientific research, exploration, and pedagogy.

Prima facie, some of the proxy, interpretative, approximate, or surrogate models¹ used in explainable AI (XAI) to make sense of black box machine learning (ML) systems play an analogous role to toy models in the sciences.² In both cases, the models fulfill what Frigg and Nguyen (2020, p. 3), following Swoyer (1991), call the surrogative reasoning condition for representation: models represent in a way that allows scientists or users to make inferences about the models’ target systems; they can generate claims about target systems by investigating models that represent them.
Although many surrogate models used by developers in ML are black boxes,³ the simplest of them, e.g., rule lists and sparse decision trees, allow non-experts to understand how an opaque ML model works globally via a much simpler model that highlights the most relevant features of the input space and their effect on the output. Toy surrogate models (TSMs), as I will call them, only work when the system’s features can be interpreted semantically, that is, when they represent recognizable elements of the user’s environment. It is well-known that many ML systems use noninterpretable features that would impede the extraction of a TSM. The examples used in this paper therefore assume that the features are human-interpretable. The ultimate goal of TSMs is to provide the end users of an AI system with valuable understanding that will result in informed decisions and/or actionable changes. TSMs can be a valuable instrument to comply, for example, with Article 13 of the GDPR (Regulation EU 2016/679) which requires the data controller to provide the data subject with “meaningful information about the logic involved” whenever automated decision-making tools are used.

¹ I will refer to these models as “surrogate models,” but some papers use the other terms to refer to models that perform the same epistemic function.
² In this paper, I will assume that the reader is familiar with the problem of opacity in machine learning and with the literature on interpretability and XAI. For an introduction to the topic and some of the controversies involved, see Beisbart and Räz (2022), Humphreys (forthcoming), Krishnan (2020), and Lipton (2018).
³ For example, Xu et al. (2018) build a surrogate model by compressing an existing DNN model to a shallow DNN model, but the latter is still a black box.
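
The idea of a sparse decision tree acting as a global surrogate can be illustrated with a minimal sketch. It is not taken from the paper: the function `black_box`, the feature names `income` and `debt`, the sample sizes, and the depth limit are all hypothetical choices made only for illustration. The sketch queries the opaque model on sampled inputs, fits a depth-2 tree to the model's outputs (not to any real-world labels), and reports how often the tree agrees with the model, a quantity usually called fidelity in the XAI literature.

```python
import random


def black_box(income, debt):
    """Hypothetical stand-in for an opaque model: 1 = approve, 0 = reject."""
    score = 0.6 * income - 0.4 * debt + 0.002 * income * debt
    return 1 if score > 20.0 else 0


def majority(labels):
    """Most frequent label; ties go to 1."""
    return 1 if 2 * sum(labels) >= len(labels) else 0


def fit_tree(rows, labels, depth):
    """Greedily grow a small tree that mimics the given labels.

    A leaf is an int (0 or 1); an inner node is
    (feature_index, threshold, left_subtree, right_subtree).
    """
    if depth == 0 or len(set(labels)) == 1:
        return majority(labels)
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            # Disagreements that remain if both children become leaves.
            errors = (min(sum(left), len(left) - sum(left))
                      + min(sum(right), len(right) - sum(right)))
            if best is None or errors < best[0]:
                best = (errors, f, t)
    if best is None:
        return majority(labels)
    _, f, t = best
    lo = [i for i, r in enumerate(rows) if r[f] <= t]
    hi = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            fit_tree([rows[i] for i in lo], [labels[i] for i in lo], depth - 1),
            fit_tree([rows[i] for i in hi], [labels[i] for i in hi], depth - 1))


def predict(tree, row):
    """Follow the tree's tests until a leaf label is reached."""
    while not isinstance(tree, int):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def fidelity(tree, rows, labels):
    """Share of inputs on which the surrogate agrees with the opaque model."""
    return sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(rows)


def describe(tree, names, indent=""):
    """Render the tree as nested if/else rules a non-expert can read."""
    if isinstance(tree, int):
        return [indent + ("approve" if tree else "reject")]
    f, t, left, right = tree
    return ([f"{indent}if {names[f]} <= {t}:"]
            + describe(left, names, indent + "    ")
            + [f"{indent}else:"]
            + describe(right, names, indent + "    "))


rng = random.Random(0)


def sample(n):
    """Draw inputs and label them with the opaque model's own outputs."""
    rows = [(rng.randint(0, 100), rng.randint(0, 100)) for _ in range(n)]
    return rows, [black_box(*r) for r in rows]


train_rows, train_labels = sample(300)
test_rows, test_labels = sample(300)
surrogate = fit_tree(train_rows, train_labels, depth=2)

print("\n".join(describe(surrogate, ["income", "debt"])))
print("fidelity on fresh inputs:", fidelity(surrogate, test_rows, test_labels))
```

The target of the fitted tree is the opaque model rather than the world: the labels it learns from are the model's predictions, so a high fidelity score says only that the tree tracks the model, not that either tracks reality.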
Despite having similar epistemic roles, the relation between TSMs and opaque ML models is different from the relation between their counterparts in the sciences. Toy models and complex models in the sciences share a common target: some social or physical phenomenon that can be understood either in highly idealized and simple terms through the toy model, or in a more complex and detailed fashion, often involving causation and lawlike generalizations, via the main model. In contrast, the most common use of discriminative ML models is to perform a prediction or classification task that is not necessarily causally grounded in the world or reflective of lawlike relations between inputs and outputs. In other words, most ML models do not have the same representational and epistemic function as the models used in the natural and social sciences. They do not aim at uncovering complex real-world causal or lawlike structures that are responsible for the properties of a phenomenon, but rather at detecting useful correlations that optimize the predictive or classificatory task at hand.⁴ Toy surrogate models in ML, in turn, focus on the statistical correlations in the main model, which they aim to approximate and present in simpler and understandable terms. They are models of models, i.e., metamodels (Alaa & van der Schaar, 2019). As I will be discussing models of (...truncated)