Internal and collective interpretation for improving human interpretability of multi-layered neural networks

International Journal of Advances in Intelligent Informatics
Vol. 5, No. 3, November 2019, pp. 179-192
ISSN 2442-6571

Ryotaro Kamimura a,b,1,*
a Kumamoto Drone Technology and Development Foundation, Techno Research Park, Techno Lab 203, 1155-12, Japan
b IT Education Center, Tokai University, 4-1-1 Kitakaname, Hiratsuka, Kanagawa 259-1292, Japan
* corresponding author

ARTICLE INFO
Article history: Received July 1, 2019; Revised October 21, 2019; Accepted October 29, 2019; Available online October 29, 2019
Keywords: Mutual information; Internal interpretation; Collective interpretation; Inference mechanism; Generalization

ABSTRACT
The present paper proposes a new type of information-theoretic method to interpret the inference mechanism of neural networks. We interpret the internal inference mechanism in itself, without any external methods such as symbolic or fuzzy rules. In addition, we make the interpretation process as stable as possible. This means that we interpret the inference mechanism by considering all internal representations created under different initial conditions and input patterns. To make the internal interpretation possible, we compress multi-layered neural networks into the simplest ones without hidden layers. The information that is naturally lost in the process of compression is then complemented by introducing a mutual information augmentation component. The method was applied to two data sets, namely, the glass data set and the pregnancy data set. In both data sets, the information augmentation and compression methods improved generalization performance. In addition, the compressed or collective weights from the multi-layered networks tended, ironically, to be similar to the linear correlation coefficients between inputs and targets, while conventional methods such as logistic regression analysis failed to produce such weights.
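As a rough illustration of the two ideas summarized in the abstract, the following Python sketch collapses a multi-layered network into a single input-to-output weight matrix, averages the result over several training runs with equal importance, and computes the input-target correlation coefficients against which such weights could be compared. It is only a sketch under stated assumptions: compression is taken here to be a plain product of the layer weight matrices, ignoring activation functions and biases, which this part of the paper does not confirm as the actual procedure. The function names and the NumPy-based layout are hypothetical.

```python
import numpy as np


def compress_network(weights):
    """Collapse layer weight matrices into one input-to-output matrix.

    ASSUMPTION: compression is a plain matrix product of the layers,
    ignoring activation functions and biases. weights[k] has shape
    (units in layer k, units in layer k + 1).
    """
    compressed = weights[0]
    for w in weights[1:]:
        compressed = compressed @ w
    return compressed


def collective_weights(runs):
    """Average the compressed weights of all runs, each with equal importance.

    ASSUMPTION: 'collective' is read here as a simple mean over runs.
    """
    return np.mean([compress_network(w) for w in runs], axis=0)


def input_target_correlations(x, t):
    """Pearson correlation between each input column of x and the target t."""
    return np.array([np.corrcoef(x[:, j], t)[0, 1] for j in range(x.shape[1])])


# Two hypothetical runs of a 2-2-1 network (made-up numbers, not paper data).
run_a = [np.array([[1.0, 2.0], [3.0, 4.0]]), np.array([[1.0], [1.0]])]
run_b = [np.eye(2), np.array([[1.0], [3.0]])]
collective = collective_weights([run_a, run_b])

# Toy inputs and target, used only to show the correlation computation.
x = np.array([[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]])
t = np.array([2.0, 4.0, 6.0])
correlations = input_target_correlations(x, t)
```

Under these assumptions, `collective` has one row per input and one column per output, so its entries can be set side by side with `correlations` to judge how closely the two agree.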
This is an open access article under the CC–BY-SA license.
http://dx.doi.org/10.26555/ijain.v5i3.420

1. Introduction

Machine learning has been used in many areas of our daily life, and it has caused some problems along the way. As the underlying techniques become larger and more complex, it becomes harder to interpret the main inference mechanism and to explain why and how decisions made by machine learning techniques reach their final conclusions. Because these methods have had a serious influence on our safety [1], and because users of the techniques should have the right to receive an explanation of how decisions are made, there has been an urgent need to develop methods to interpret and explain the main inference mechanism of machine learning techniques [2].

Thus, many types of interpretation methods have been developed in machine learning, which can be classified into two types: internal and external interpretation. In internal interpretation, the methods aim to produce models whose components can be directly inspected and interpreted [3]–[5]. In external interpretation, by contrast, the models are treated as black boxes, and the methods try to interpret the inference mechanism externally [6]–[8]. In neural networks, as in machine learning in general, the interpretation methods have been classified as “decompositional” or “pedagogic” [9]. The pedagogic approach treats the network as a black box and tries to infer the relations between inputs and outputs only by inspecting the inputs and outputs externally. The decompositional approach tries to analyze components such as connection weights and neuron activations directly. Thus, the method can be considered a form of the above-mentioned internal interpretation.
However, in the decompositional approach, many external methods, such as symbolic rules, fuzzy rules, and decision trees, have usually been used to analyze and represent the components [10]–[12]. In addition, many techniques, such as the digitization of inputs and outputs, have been applied to extract the rules [9]. Thus, those methods cannot be called “internal interpretation” methods. They have tried to interpret the final results by some external methods, and it is more appropriate to call them “external interpretation.”

As is known, the objective of interpretation is two-fold. First, and naturally, the interpretation method can be used to explain the inference mechanism in humanly intelligible ways. Second, the clarification of the inference mechanism can be used to improve general properties of neural networks, such as generalization performance. With respect to both of these aspects, the interpretation methods developed so far have depended on methods not related to the real inference mechanism of neural networks. Thus, when we need to improve the performance of neural networks further, it is necessary to interpret the main inference mechanism internally.

In addition to the problem of external interpretation, we have faced another problem, that of unstable interpretation. Ordinarily, machine learning models, including neural networks, are trained with many different data sets and initial conditions, in particular when evaluating generalization performance. Thus, even for the same data set, we can have completely different internal representations due to different initial conditions. The problem is to select which representation among the many we should interpret. One practical solution is to interpret the representation with the best generalization performance. This means that we try to see the ability of neural networks only from the single aspect of improved generalization.
We think that all representations created by different data sets and initial conditions should be taken into account to uncover the fundamental properties of data sets. Then, to address the instability of interpretation, we should collectively interpret all internal representations created by learning, where each representation should have equal importance. It seems to us that the problem of collective interpretation has not been fully examined in machine learning or in neural networks, except in a few cases involving ensemble methods [9], [13]. In this context, the present paper proposes a new type of interpretation called “collective interpretation,” in which all representations from neural networks are taken into account with equal importance.

We have argued that interpretable neural networks should be internally interpreted and that all different types of internal representations should be collectively interpreted. Let us consider how to create neural networks with those properties for interpretation. As mentioned, in neural networks, there have been many types of interpretation methods, and the majority of those (...truncated)