Prediction, Retrodiction, and the Amount of Information Stored in the Present

Journal of Statistical Physics, Sep 2009

We introduce an ambidextrous view of stochastic dynamical systems, comparing their forward-time and reverse-time representations and then integrating them into a single time-symmetric representation. The perspective is useful theoretically, computationally, and conceptually. Mathematically, we prove that the excess entropy—a familiar measure of organization in complex systems—is the mutual information not only between the past and future, but also between the predictive and retrodictive causal states. Practically, we exploit the connection between prediction and retrodiction to directly calculate the excess entropy. Conceptually, these lead one to discover new system measures for stochastic dynamical systems: crypticity (information accessibility) and causal irreversibility. Ultimately, we introduce a time-symmetric representation that unifies all of these quantities, compressing the two directional representations into one. The resulting compression offers a new conception of the amount of information stored in the present.

Full text (PDF): https://link.springer.com/content/pdf/10.1007%2Fs10955-009-9808-z.pdf


Christopher J. Ellison, John R. Mahoney, James P. Crutchfield
Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA

1 Introduction

Predicting time series encapsulates two notions of directionality. Prediction, making a claim about the future based on the past, is directional. Time evokes images of rivers, clocks, and actions in progress. Curiously, though, when one writes a time series as a lattice of random variables, any necessary dependence on time's inherent direction is removed; at best it becomes convention. When we analyze a stochastic process to determine its correlation function, block entropy, entropy rate, and the like, we have already shed our commitment to the idea of "forward": these quantities are defined independently of any perceived direction of the process.

Here we explore this ambivalence. In making it explicit, we consider not only predictive models, but also retrodictive models. We then demonstrate that it is possible to unify these two viewpoints and, in doing so, we discover several new properties of stationary stochastic dynamical systems. Along the way, we also rediscover, and recast, old ones.

We first review minimal causal representations of stochastic processes, as developed by computational mechanics [1, 2]. We extend its (implied) forward-time representation to reverse time. Then, we prove that the mutual information between a process's past and future, the excess entropy, is the mutual information between its forward- and reverse-time representations.

Excess entropy and related mutual information quantities are widely used diagnostics for complex systems. They have been applied to detect the presence of organization in dynamical systems [3–6], in spin systems [7–9], in neurobiological systems [10, 11], and even in language, to mention only a few applications. For example, in natural language the excess entropy $E$ diverges with the number of characters $L$ as $E \propto L^{1/2}$. The claim is that this reflects the long-range and strongly non-ergodic organization necessary for human communication [12, 13].
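Since the excess entropy plays the central role in what follows, a brief numerical illustration may help fix ideas. Below is a minimal sketch in Python, not taken from the paper: it samples the golden mean process (a standard binary example that forbids consecutive 0s) and estimates the entropy rate $h_\mu$ and the excess entropy via the familiar block-entropy relation $E = \lim_{L\to\infty} [H(L) - h_\mu L]$. The helper names (golden_mean_sequence, block_entropy) and parameter choices are ours, for illustration only.

```python
# Minimal sketch (not from the paper): estimating the excess entropy
# E = lim_{L->oo} [H(L) - h_mu * L] from empirical block entropies.
# Example process: the golden mean process, a binary process forbidding "00".
import math
import random
from collections import Counter

def golden_mean_sequence(n, seed=42):
    """Sample n symbols of the golden mean process: after a 0, the next
    symbol must be 1; after a 1, emit 0 or 1 with probability 1/2 each."""
    rng = random.Random(seed)
    out, prev = [], 1
    for _ in range(n):
        sym = 1 if prev == 0 else rng.randint(0, 1)
        out.append(sym)
        prev = sym
    return out

def block_entropy(seq, L):
    """Shannon entropy (in bits) of the empirical length-L block distribution."""
    counts = Counter(tuple(seq[i:i + L]) for i in range(len(seq) - L + 1))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

seq = golden_mean_sequence(200_000)
L = 10
h_est = block_entropy(seq, L) - block_entropy(seq, L - 1)  # entropy-rate estimate
E_est = block_entropy(seq, L) - L * h_est                  # excess-entropy estimate
print(f"h_mu ~ {h_est:.3f} bits/symbol (exact value: 2/3)")
print(f"E    ~ {E_est:.3f} bits    (known value: ~0.25)")
```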
The net result is a unified view of information processing in stochastic processes. For the first time, we give an explicit relationship between the internal (causal) state information, the statistical complexity [1], and the observed information, the excess entropy. Another consequence is that the forward and reverse representations are two projections of a unified time-symmetric representation. From the latter it becomes clear that there are important system properties that control how accessible internal state information is and how irreversible a process is. Moreover, the methods are sufficiently constructive that one can calculate the excess entropy in closed form for finite-memory processes.

Before embarking, we delineate the present contribution's role within a collection of recent results. An announcement appears in [14], and [15] provides complementary results that address measure-theoretic relationships between the above information quantities. A new classification scheme for stochastic processes, based on information accessibility, appears in [16]. Here, we lay out the theory behind [14] in detail, giving step-by-step proofs of the main results and the calculational methods.

2 Optimal Causal Models

The approach starts with a simple analogy. Any process $\mathcal{P}$ is a joint probability distribution over the past and future observation symbols, $\Pr(\overleftarrow{X}, \overrightarrow{X})$. This distribution can be thought of as a communication channel with a specified input distribution $\Pr(\overleftarrow{X})$: it transmits information from the past $\overleftarrow{X} = \ldots X_{-3} X_{-2} X_{-1}$ to the future $\overrightarrow{X} = X_0 X_1 X_2 \ldots$ by storing it in the present. Here $X_t$ is the random variable for the measurement outcome at time $t$, and infinite-length quantities should be interpreted as shorthand for using length-$L$ blocks $X^L$ and then taking an appropriate limit, such as $\lim_{L \to \infty}$ or $\lim_{L \to \infty} \frac{1}{L}$. Our goal is also simply stated: We wish to predict the future using […]
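The channel analogy suggests a direct numerical check: for a stationary process, the mutual information between a length-$L$ past and a length-$L$ future reduces to block entropies, $I[\overleftarrow{X}^L; \overrightarrow{X}^L] = 2H(L) - H(2L)$, since the joint past/future block is just a contiguous block of length $2L$. This quantity grows toward the excess entropy $E$ as $L \to \infty$. Continuing the hypothetical sketch above (reusing seq and block_entropy):

```python
# Continuing the sketch above: estimate I[past; future] directly.
# For a stationary process, the joint block (past_L, future_L) is a
# contiguous 2L-block, so I[past_L; future_L] = 2*H(L) - H(2L) <= E.
for L in (1, 2, 4, 6):
    I_L = 2 * block_entropy(seq, L) - block_entropy(seq, 2 * L)
    print(f"L = {L}: I[past_L; future_L] ~ {I_L:.3f} bits")
# For this order-1 Markov example the estimate sits near E ~ 0.25 bits
# already at L = 1; longer-memory processes climb toward E only gradually.
```

That slow convergence for longer-memory processes is one motivation for the closed-form calculation of the excess entropy developed in the paper.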



Christopher J. Ellison, John R. Mahoney, James P. Crutchfield, "Prediction, Retrodiction, and the Amount of Information Stored in the Present", Journal of Statistical Physics 136(6), 1005 (2009). DOI: 10.1007/s10955-009-9808-z