The application of machine learning to structural health monitoring
Dynamics Research Group, Department of Mechanical Engineering, University of Sheffield, Mappin Street, Sheffield S1 3JD, UK
In broad terms, there are two approaches to damage identification. Model-driven methods establish a high-fidelity physical model of the structure, usually by finite element analysis, and then establish a comparison metric between the model and the measured data from the real structure. If the model is for a system or structure in normal (i.e. undamaged) condition, any departures indicate that the structure has deviated from normal condition and damage is inferred. Data-driven approaches also establish a model, but this is usually a statistical representation of the system, e.g. a probability density function of the normal condition. Departures from normality are then signalled by measured data appearing in regions of very low density. The algorithms that have been developed over the years for data-driven approaches are mainly drawn from the discipline of pattern recognition, or more broadly, machine learning. The object of this paper is to illustrate the utility of the data-driven approach to damage identification by means of a number of case studies.
1. Introduction
As its title suggests, this paper is concerned with a specific class of algorithms
that are applicable to damage detection problems. Owing to space limitations, the
paper will not attempt to discuss the desirability of structural health monitoring
(SHM); the interested reader will be directed elsewhere within this theme issue. The
assumption here is that SHM is a good thing and one should only be concerned with
how it is to be accomplished. Even within this philosophy, the remit of this paper will
be limited to a discussion of pattern recognition and machine learning algorithms;
competing approaches will simply be indicated in the references.
The fundamental problem of SHM, the question of damage detection, is simply
posed. The object is just to identify if and when the system departs from normal
condition. This is the most basic question that can be addressed. At a slightly more
sophisticated level, the problem of damage identification can be approached. This
seeks to determine a much finer diagnosis and can even address issues of prognosis.
The broader problem can be regarded as a hierarchy of levels which are as follows
(Rytter 1993).
One contribution of 15 to a Theme Issue 'Structural health monitoring'.
Level 1. (Detection.) The method gives a qualitative indication that damage might
be present in the structure.
Level 2. (Localization.) The method gives information about the probable position
of the damage.
Level 3. (Assessment.) The method gives an estimate of the extent of the damage.
Level 4. (Prediction.) The method offers information about the safety of the
structure, e.g. estimates a residual life.
The main body of this paper will argue that machine learning theory offers a
natural framework in which to address these problems (at least at levels 1–3).
Before this can begin, it is necessary to specify the remit of machine learning. This
is a body of knowledge that attempts to construct computational relationships
between quantities on the basis of observed data and rules. It is characterized by
the fact that computational rules are inferred or learned on the basis of
observational evidence. This contrasts with the classical view of computation,
where the algorithmic rules are imposed in the form of a sequence of serially
enacted instructions. It is sometimes stated that learning theory is designed to
address the following three main problems (Cherkassky & Mulier 1998).
Classification, i.e. the association of a class or set label with a set or vector of
measured quantities. The set of observations may be sparse and/or noisy.
Regression, i.e. the construction of a map between a group of continuous input
variables and a continuous output variable on the basis of a set of (again,
potentially noisy) samples.
Density estimation, i.e. the estimation of probability density functions from
samples of measured data.
A further division of learning algorithms may be made between unsupervised
and supervised learning. The former is concerned with characterizing a set on the
basis of measurements and perhaps determining underlying structure. The latter
requires examples of input and output data for a postulated relationship so that
associations might be learnt and errors corrected. Although it is not universally
so, regression and classification problems usually require supervised learning,
while density estimation can be conducted in an unsupervised framework.
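The three problem types can be illustrated with a minimal sketch in Python, using only the standard library. Everything in it is an assumption made for illustration and does not come from the case studies: the scalar feature (imagined as a natural frequency), the synthetic data, the nearest-mean classifier, the straight-line regression and the Gaussian density model. Classification and regression use labelled or paired data (supervised), while the density is fitted to normal-condition samples alone (unsupervised).

```python
import math
import random
import statistics

random.seed(0)

# Hypothetical scalar feature (e.g. a natural frequency in Hz) for two conditions.
normal = [random.gauss(10.0, 0.2) for _ in range(200)]
damaged = [random.gauss(9.0, 0.2) for _ in range(200)]

# Classification (supervised): assign the label of the nearest class mean.
centroids = {
    "normal": statistics.fmean(normal),
    "damaged": statistics.fmean(damaged),
}

def classify(x):
    return min(centroids, key=lambda label: abs(x - centroids[label]))

# Regression (supervised): least-squares line from the feature to a continuous
# target. The target is an invented damage extent d, with feature = 10 - 2*d + noise.
extent = [i / 100 for i in range(50)]
feature = [10.0 - 2.0 * d + random.gauss(0.0, 0.05) for d in extent]

def fit_line(xs, ys):
    """Closed-form simple linear regression; returns (slope, intercept)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

slope, intercept = fit_line(feature, extent)  # map: feature -> extent

def predict_extent(x):
    return slope * x + intercept

# Density estimation (unsupervised): Gaussian fitted to normal-condition data only.
mu, sigma = statistics.fmean(normal), statistics.stdev(normal)

def density(x):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
```

In this toy setting, `classify` and `predict_extent` could not be built without examples of the damaged condition, whereas `density` needs none; this is one reason density-based methods are attractive when damage data are unavailable.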
It has been proposed (Vapnik 1998) that the three learning problems shown
above are given in order of difficulty. A commonly stated rule of machine learning
is that one should never replace a problem with one of a more difficult type. For
example, one should not solve a classification problem by learning the densities of
the individual classes. One might suspect that the hierarchical system proposed
above for learning problems might be brought into correspondence with Rytter's
level-based system for damage identification on the basis of difficulty. In fact, this
is not necessarily the case. Level 1 is often addressed by using density estimation
techniques (albeit usually with restrictive assumptions), while level 3 is naturally
posed as a regression problem in many cases.
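As a hedged illustration of level 1 detection by density estimation, the sketch below fits a single Gaussian to features from the normal condition (the kind of restrictive assumption mentioned above) and flags a new observation as novel when its squared Mahalanobis distance, which is large exactly where the Gaussian density is low, exceeds a threshold. The two-dimensional features, the synthetic data and the 99th-percentile threshold are all assumptions made for this example; it is not presented as the method used in the case studies.

```python
import random
import statistics

random.seed(1)

# Hypothetical 2-D feature vectors measured on the undamaged structure.
train = [(random.gauss(10.0, 0.2), random.gauss(25.0, 0.5)) for _ in range(500)]

def fit_gaussian(data):
    """Return the sample mean and the inverse 2x2 sample covariance."""
    n = len(data)
    mx = statistics.fmean([p[0] for p in data])
    my = statistics.fmean([p[1] for p in data])
    sxx = sum((p[0] - mx) ** 2 for p in data) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in data) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in data) / (n - 1)
    det = sxx * syy - sxy * sxy
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inv

def mahalanobis_sq(point, mean, inv):
    """Squared Mahalanobis distance of a 2-D point from the fitted Gaussian."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))

mean, inv_cov = fit_gaussian(train)

# Threshold: 99th percentile of the training distances (an assumed choice).
train_d = sorted(mahalanobis_sq(p, mean, inv_cov) for p in train)
threshold = train_d[int(0.99 * len(train_d))]

def is_novel(point):
    """True if the point lies in a low-density region of the normal model."""
    return mahalanobis_sq(point, mean, inv_cov) > threshold
```

Only normal-condition data are needed to train this detector, which is why the problem can be treated in an unsupervised framework. The price is that the detector signals only that something has changed, not where or by how much.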
The other important factor in using machine learning is that it requires an
organizing principle. This is sometimes implicit in the analysis, but can be made
explicit, for example, in the 'data to decision' process of Lowe (2000) or in the
embedding of the problem in the general framework of data fusion (Worden &
Staszewski 2003). In general, the actual machine learning step may only be a part
of the required analysis. It is usually necessary to convert measured data into
features, i.e. quantities that make the rule to be learned explicit. Alternatively,
feature selection can be regarded as a process of amplification. In the context of
damage detection, one transforms the data in such a way so as to retain only the
information necessary for a diagnosis; any other redundant information is
discarded. This is clearly desirable. Another frequent aim of feature selection is to
produce quantities with low (vector) dimension. The reason for this is that the
data requirements of learning algorithms usually grow explosively with the
dimension of the problem, the so-called 'curse of dimensionality'. Before feature
selection, it may be necessary to clean the data or attend to it in other ways:
filtering might be employed as a means of noise rejection, missing values may
need to be estimated, etc. As an example of an organizing principle, one might
cite the waterfall model (Bedworth & O'Brien 2000), as depicted in figure 1.
The waterfall model is one of the simpler structures in data fusion; more
sophisticated examples can be fou (...truncated)