LMC and SDL Complexity Measures: A Tool to Explore Time Series

http://downloads.hindawi.com/journals/complexity/2019/2095063.pdf

Hindawi
Complexity
Volume 2019, Article ID 2095063, 8 pages
https://doi.org/10.1155/2019/2095063

Research Article

José Roberto C. Piqueira (1) and Sérgio Henrique Vannucchi Leme de Mattos (2)

(1) Escola Politécnica da Universidade de São Paulo, Avenida Prof. Luciano Gualberto, travessa 3, n. 158, 05508-900 São Paulo, SP, Brazil
(2) Universidade Federal de São Carlos, Rod. Washington Luís km 235, SP-310, 13565-905 São Carlos, SP, Brazil

Correspondence should be addressed to José Roberto C. Piqueira

Received 21 September 2018; Revised 24 November 2018; Accepted 10 December 2018; Published 2 January 2019

Guest Editor: Jose Garcia-Rodriguez

Copyright © 2019 José Roberto C. Piqueira and Sérgio Henrique Vannucchi Leme de Mattos. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This work is a generalization of the López-Ruiz, Mancini, and Calbet (LMC) and Shiner, Davison, and Landsberg (SDL) complexity measures, considering that the state of a system or process is represented by a continuous temporal series of a dynamical variable. As the two complexity measures are based on the calculation of informational entropy, an equivalent information source is defined by using partitions of the dynamical variable range. During the time intervals, the information associated with the measured dynamical variable is the seed to calculate instantaneous LMC and SDL measures. To show how the methodology generates indicators, two examples are presented and discussed, one concerning meteorological data and the other concerning economic data.

1. Introduction

The word complexity, in its common-sense meaning, refers to systems that are difficult to describe, design, or understand.
However, since Kolmogorov presented the concept of computational complexity [1], new ideas have been associated with this word, mainly in the life sciences [2], relating complexity and information [3]. As a consequence, complexity came to be associated with systems and with the emergence of unexpected behaviors due to nonlinearities [4, 5]. In systems theory [6], a new meaning was carved out, postulating that complexity lies halfway between equilibrium and disequilibrium [7].

Developing this idea, in a seminal paper [8], López-Ruiz, Mancini, and Calbet proposed the LMC (López-Ruiz, Mancini, and Calbet) complexity measure for a random distribution, using informational entropy [9] to evaluate equilibrium and the quadratic deviation from the uniform distribution to evaluate disequilibrium. However, the LMC measure has been criticized on the grounds that it is inaccurate for some classes of systems described by Markov chains and cannot be considered to represent an extensive variable. Feldman and Crutchfield [10] proposed a correction for the disequilibrium term, replacing it by the relative entropy with respect to the uniform distribution. Shiner, Davison, and Landsberg proposed another modification of the LMC measure, replacing the disequilibrium term by the complement of the equilibrium term. This measure is called SDL (Shiner, Davison, and Landsberg) [11] and leads to conclusions similar to those obtained by using LMC for the majority of the usual statistical distributions [2]. The main objection to the LMC and SDL complexity measures is due to Crutchfield, Feldman, and Shalizi, who argue that an equilibrium system can be structurally complex [12]; this problem could be addressed by weighting order and disorder according to the specific problem being analyzed.
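The verbal definitions of the two measures can be made concrete with a minimal Python sketch. It rests on several assumptions that are not stated in the text: entropy is taken in bits (base-2 logarithm), the LMC entropy term is left unnormalized, and the SDL measure uses the simplest form, normalized entropy times its complement, with no weighting exponents. The function names are illustrative only.

```python
import math


def shannon_entropy(p):
    """Shannon entropy of a discrete distribution p, in bits (assumed base 2)."""
    # Terms with zero probability contribute nothing to the sum.
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def disequilibrium(p):
    """Quadratic deviation of p from the uniform distribution over len(p) states."""
    n = len(p)
    return sum((pi - 1.0 / n) ** 2 for pi in p)


def lmc_complexity(p):
    """LMC-style measure: entropy (equilibrium term) times disequilibrium.

    Assumption: the entropy is not normalized by its maximum value.
    """
    return shannon_entropy(p) * disequilibrium(p)


def sdl_complexity(p):
    """SDL-style measure: normalized entropy times its complement.

    Assumption: both terms have weight 1 (no exponents).
    """
    n = len(p)
    if n < 2:
        return 0.0  # a single-state source has no disorder to normalize
    delta = shannon_entropy(p) / math.log2(n)  # normalized entropy in [0, 1]
    return delta * (1.0 - delta)


if __name__ == "__main__":
    for dist in ([0.25, 0.25, 0.25, 0.25], [0.5, 0.5, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]):
        print(dist, lmc_complexity(dist), sdl_complexity(dist))
```

Under these assumptions both measures vanish for the uniform distribution (maximum disorder) and for a distribution concentrated on a single state (maximum order), and are positive in between, which reflects the idea of complexity lying between equilibrium and disequilibrium.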
Since the early 2000s, the idea of adapting LMC and SDL to dynamical systems has been successfully applied to different types of time evolution problems: bird songs [13], neural plasticity [14], interactions between species in ecological systems [2], physiognomies of landscapes [15], economic series [16], spread depression [17], and quantum information [18].

With these ideas in mind, this article presents a systematization of the methodology used in the referred papers, based on LMC and SDL measures, to be applied to temporal series by defining and calculating the dynamic complexity measures. The procedure, applied to a temporal series representing some organizational or functional aspect of a system, provides insights regarding the evolution of its complexity.

As the LMC and SDL dynamical measures are based on informational entropy [16], the first task, described in the next section, is to define an alphabet source, associating a probability distribution with the possible system states. Following the definition of the probability distribution, a subsequent section defines how dynamical LMC and SDL measures can be calculated at each time, based on the individual information associated with the system state at that time, generating temporal series for the LMC and SDL measures.

To illustrate the calculation procedure, two examples are presented: one related to a meteorological time series and the other to an economic time series. In both cases, a practical discussion about how to divide the range of values assumed by the system state is presented. The examples were chosen to show that the methodology can be applied to different types of phenomena: precipitation (first example), which has a strong periodic component, and an economic time series (second example), which appears to be random.
The work closes with a conclusion section, emphasizing that the same procedure can be applied to any real-valued time series, even with a different temporal scale, to calculate complexity measures.

2. Defining Source and Probability Distribution for a Temporal Series

Considering Shannon's model [9] for an information source, a time series $x(n)$ is considered to be a function of the nonnegative integers into a real interval, i.e., $x(n) : \mathbb{Z}^{+} \to (a, b)$, associating with each time $t_0 + nT$ a real number belonging to $(a, b)$, with $t_0 > 0$ being the initial instant and $T > 0$ an arbitrary period, depending on the data availability. The set $x(t_0), x(t_0 + T), \ldots, x(t_0 + nT)$ is assumed to be a sequence of independent random variables, and the stochastic process $x(n)$ as a whole is assumed to be stationary [19].

The first step is to divide the interval $(a, b)$ into $N$ subintervals. For the sake of simplicity, $N$ is chosen equal to $2^k$, $k \in \mathbb{Z}^{+}$. At this point, it could be asked how to choose $N$, as there is a compromise between precision (high values of $N$) and speed of calculation (low values of $N$). This question will not be addressed theoretically; however, in the example section, practical hints about this choice are presented.

Consequently, the source alphabet is defined by the intervals $A_i$, $i = 1, \ldots, N$, with $\bigcup_{i=1}^{N} A_i = (a, b)$ and $A_i \cap A_j = \emptyset$, $\forall i \neq j$. Then, a time interval defined by a given $n$ must be chosen, and for the time se (...truncated)