The application of machine learning to structural health monitoring

https://rsta.royalsocietypublishing.org/content/365/1851/515.full.pdf

Dynamics Research Group, Department of Mechanical Engineering, University of Sheffield, Mappin Street, Sheffield S1 3JD, UK

In broad terms, there are two approaches to damage identification. Model-driven methods establish a high-fidelity physical model of the structure, usually by finite element analysis, and then establish a comparison metric between the model and the measured data from the real structure. If the model is for a system or structure in normal (i.e. undamaged) condition, any departures indicate that the structure has deviated from normal condition and damage is inferred. Data-driven approaches also establish a model, but this is usually a statistical representation of the system, e.g. a probability density function of the normal condition. Departures from normality are then signalled by measured data appearing in regions of very low density. The algorithms that have been developed over the years for data-driven approaches are mainly drawn from the discipline of pattern recognition or, more broadly, machine learning. The object of this paper is to illustrate the utility of the data-driven approach to damage identification by means of a number of case studies.

1. Introduction

As the title suggests, this paper is concerned with a specific class of algorithms that are applicable to damage detection problems. Owing to space limitations, the paper will not attempt to discuss the desirability of structural health monitoring (SHM); the interested reader will be directed elsewhere within this theme issue. The assumption here is that SHM is a good thing and one should only be concerned with how it is to be accomplished. Even within this philosophy, the remit of this paper will be limited to a discussion of pattern recognition and machine learning algorithms; competing approaches will simply be indicated in the references.

The fundamental problem of SHM, the question of damage detection, is simply posed.
The object is just to identify if and when the system departs from normal condition. This is the most basic question that can be addressed. At a slightly more sophisticated level, the problem of damage identification can be approached. This seeks to determine a much finer diagnosis and can even address issues of prognosis. The broader problem can be regarded as a hierarchy of levels, as follows (Rytter 1993).

Level 1. (Detection.) The method gives a qualitative indication that damage might be present in the structure.
Level 2. (Localization.) The method gives information about the probable position of the damage.
Level 3. (Assessment.) The method gives an estimate of the extent of the damage.
Level 4. (Prediction.) The method offers information about the safety of the structure, e.g. estimates a residual life.

One contribution of 15 to a Theme Issue 'Structural health monitoring'.

The main body of this paper will argue that machine learning theory offers a natural framework in which to address these problems (at least at levels 1–3). Before this can begin, it is necessary to specify the remit of machine learning. This is a body of knowledge that attempts to construct computational relationships between quantities on the basis of observed data and rules. It is characterized by the fact that computational rules are inferred or learned on the basis of observational evidence. This contrasts with the classical view of computation, where the algorithmic rules are imposed in the form of a sequence of serially enacted instructions. It is sometimes stated that learning theory is designed to address the following three main problems (Cherkassky & Mulier 1998).

Classification, i.e. the association of a class or set label with a set or vector of measured quantities. The set of observations may be sparse and/or noisy.
Regression, i.e. the construction of a map between a group of continuous input variables and a continuous output variable on the basis of a set of (again, potentially noisy) samples.
Density estimation, i.e. the estimation of probability density functions from samples of measured data.

A further division of learning algorithms may be made between unsupervised and supervised learning. The former is concerned with characterizing a set on the basis of measurements and perhaps determining underlying structure. The latter requires examples of input and output data for a postulated relationship so that associations might be learnt and errors corrected. Although it is not universally so, regression and classification problems usually require supervised learning, while density estimation can be conducted in an unsupervised framework.

It has been proposed (Vapnik 1998) that the three learning problems listed above are given in order of increasing difficulty. A commonly stated rule of machine learning is that one should never replace a problem with one of a more difficult type. For example, one should not solve a classification problem by learning the densities of the individual classes. One might suspect that the hierarchy proposed above for learning problems could be brought into correspondence with Rytter's level-based system for damage identification on the basis of difficulty. In fact, this is not necessarily the case. Level 1 is often addressed by using density estimation techniques (albeit usually with restrictive assumptions), while level 3 is naturally posed as a regression problem in many cases.

The other important factor in using machine learning is that it requires an organizing principle. This is sometimes implicit in the analysis, but can be made explicit, for example, in the 'data to decision' process of Lowe (2000) or in the embedding of the problem in the general framework of data fusion (Worden & Staszewski 2003).
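As an illustration of how level 1 (detection) can be posed as density estimation under a restrictive assumption, the following minimal sketch fits a single Gaussian to features measured in the normal condition and flags any new measurement whose squared Mahalanobis distance exceeds the largest value seen in training (i.e. a point lying in a region of lower density than any training point). This is not taken from the paper's case studies: the two features (imagined here as two natural frequencies in Hz), the synthetic data, the function names and the choice of threshold are all assumptions made for illustration.

```python
import random


def fit_gaussian(samples):
    """Estimate the mean and covariance of 2-D feature vectors."""
    n = len(samples)
    m1 = sum(s[0] for s in samples) / n
    m2 = sum(s[1] for s in samples) / n
    # Unbiased sample covariance entries of the symmetric 2x2 matrix.
    c11 = sum((s[0] - m1) ** 2 for s in samples) / (n - 1)
    c22 = sum((s[1] - m2) ** 2 for s in samples) / (n - 1)
    c12 = sum((s[0] - m1) * (s[1] - m2) for s in samples) / (n - 1)
    return (m1, m2), (c11, c12, c22)


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance; larger means lower Gaussian density."""
    c11, c12, c22 = cov
    det = c11 * c22 - c12 ** 2
    d1, d2 = x[0] - mean[0], x[1] - mean[1]
    # Closed-form inverse of the 2x2 covariance applied to the deviation.
    return (c22 * d1 * d1 - 2.0 * c12 * d1 * d2 + c11 * d2 * d2) / det


# Hypothetical normal-condition data: synthetic values, not measurements.
rng = random.Random(0)
normal_data = [(rng.gauss(10.0, 0.05), rng.gauss(25.0, 0.1))
               for _ in range(200)]

mean, cov = fit_gaussian(normal_data)

# Assumed threshold: the largest distance observed in the training set.
threshold = max(mahalanobis_sq(s, mean, cov) for s in normal_data)


def is_novel(x):
    """Return True if x lies further from normality than any training point."""
    return mahalanobis_sq(x, mean, cov) > threshold


healthy_sample = (10.0, 25.0)   # close to the normal-condition mean
damaged_sample = (9.0, 25.0)    # hypothetical drop in the first feature
print(is_novel(healthy_sample), is_novel(damaged_sample))
```

Only normal-condition data are needed to build the detector, so the learning is unsupervised. In practice the threshold would more usually be set from a chosen confidence level rather than the training maximum, and the single-Gaussian assumption would need to be checked against the data.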
In general, the actual machine learning step may only be a part of the required analysis. It is usually necessary to convert measured data into features, i.e. quantities that make the rule to be learned explicit. Alternatively, feature selection can be regarded as a process of amplification. In the context of damage detection, one transforms the data in such a way as to retain only the information necessary for a diagnosis; any other redundant information is discarded. This is clearly desirable. Another frequent aim of feature selection is to produce quantities with low (vector) dimension. The reason for this is that the data requirements of learning algorithms usually grow explosively with the dimension of the problem: the so-called 'curse of dimensionality'. Before feature selection, it may be necessary to clean the data or attend to it in other ways: filtering might be employed as a means of noise rejection, missing values may need to be estimated, etc.

As an example of an organizing principle, one might cite the waterfall model (Bedworth & O'Brien 2000), as depicted in figure 1. The waterfall model is one of the simpler structures in data fusion; more sophisticated examples can be fou (...truncated)