Modeling financial interval time series

PLOS ONE, Feb 2019

In financial economics, a large number of models are developed based on the daily closing price. When using only the daily closing price to model the time series, we may discard valuable intra-daily information, such as maximum and minimum prices. In this study, we propose an interval time series model, including the daily maximum, minimum, and closing prices, and then apply the proposed model to forecast the entire interval. The likelihood function and the corresponding maximum likelihood estimates (MLEs) are obtained by using a stochastic differential equation and the Girsanov theorem. To capture the heteroscedasticity of volatility, we consider a stochastic volatility model. The efficiency of the proposed estimators is illustrated by a simulation study. Finally, based on real data for the S&P 500 index, the proposed method outperforms several alternatives in terms of forecast accuracy.

https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0211709&type=printable

... EGSPC. The variable names include "Date", "Open", "High", and "Low". The date ranges are from January ...

Liang-Ching Lin (1), Li-Hsien Sun (2)

(1) Department of Statistics, National Cheng Kung University, Tainan, Taiwan
(2) The Graduate Institute of Statistics, National Central University, Taoyuan, Taiwan

Editor: Cathy W.S. Chen, Feng Chia University, Taiwan

Funding: This research was funded by the Ministry of Science and Technology (MOST 105-2628-M006-001-MY3 to LCL and MOST 106-2118-M008-001 to LHS). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Introduction

A large number of models have been developed to analyze financial data. Conventionally, most established models are built on the daily closing price. In doing so, valuable intra-daily information, such as the maximum and minimum prices, may be discarded. Following the recent literature, we can treat the maximum and minimum prices as interval-valued observations.
Symbolic data methodologies are applied to deal with this type of data. For instance, Billard and Diday [1, 2] propose the evaluation of the mean, variance, and covariance, along with regression analysis, based on interval-valued observations. By integrating the time dependency factor, their method evolves into the analysis of interval time series. Recent research pays more attention to modeling and forecasting interval time series processes. In this study, we propose an interval time series model and apply it to forecast the subsequent interval.

A naïve approach to interval time series is to treat the maximum and minimum processes as a vector. This leads to the vector autoregressive (VAR) model. However, uncontrolled noise terms can produce a predicted lower value that is larger than the predicted upper value. To deal with this problem, one can transform the interval time series into a bivariate time series based on the center and the radius. For example, Neto and Carvalho [3] fit autoregressive models to the center and radius processes separately, which may ignore the correlation between the center and the radius. Arroyo et al. [4] consider a VAR model based on the first-order differenced center process and the radius process. Similarly, Rodrigues and Salish [5] introduce the centered returns, defined as the difference between the current interval and the center value of the previous interval, and propose the center-radius self-exciting threshold autoregressive (CR-SETAR) model. Related research on interval time series includes González-Rodríguez et al. [6], Blanco et al. [7], and González-Rivera et al. [8]. However, the parameters in the above models are estimated by traditional least squares estimators, and the strictly positive radius may violate the normality assumption for the innovation terms.
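To make the ordering problem concrete, the following minimal Python sketch converts an interval to the center and radius representation and back. The function names and the numerical values are illustrative and are not taken from the cited papers. It shows that the reconstructed interval is ordered only when the predicted radius is non-negative, which is why the radius has to be treated as a positive process.

```python
def to_center_radius(lower, upper):
    """Map an interval [lower, upper] to (center, radius)."""
    return (upper + lower) / 2.0, (upper - lower) / 2.0


def to_interval(center, radius):
    """Map (center, radius) back to (lower, upper)."""
    return center - radius, center + radius


def is_ordered(lower, upper):
    """True when the lower bound does not exceed the upper bound."""
    return lower <= upper


# Hypothetical log-price interval.
center, radius = to_center_radius(7.20, 7.26)

# A non-negative predicted radius always yields an ordered interval ...
print(is_ordered(*to_interval(center, 0.02)))
# ... whereas a negative predicted radius (possible under Gaussian noise) does not.
print(is_ordered(*to_interval(center, -0.01)))
```

Modeling the two bounds directly avoids the sign restriction on the radius but does not by itself guarantee the ordering, which is the trade-off discussed above.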
Moreover, the parameters in the above models may lack an intuitive interpretation, since the structure of the intervals is already destroyed. Alternatively, Teles and Brito [9] propose the space-time autoregressive (STAR) model. By constraining the parameters, STAR can ensure that the predicted maximum value is larger than the predicted minimum value. However, we find by simulation that a lower value larger than the upper value may still occur when generating the interval observations. In addition, Chou [10, 11] and Chen et al. [12] estimate the dynamic volatilities by using the ranges (the difference between the logarithmic maximum and minimum prices). For the former, based on the assumption that the asset is driven by a geometric Brownian motion with stochastic volatility, Chou [10, 11] considers that the range, as well as the upward and downward ranges (the differences between the logarithmic maximum/minimum and opening prices), follows a GARCH model. The parameters are obtained by quasi-maximum likelihood estimation, in which the innovation term follows an exponential distribution with unit mean. For the latter, Chen et al. [12] further consider a threshold heteroskedastic model for the high/low ranges of asset prices, where the innovation term is assumed to follow a Weibull distribution.

In this study, we propose a model for financial interval time series. Instead of following the practice in the literature, we regard the process as a continuous path in which all observations are latent except for the opening, maximum, minimum, and closing prices. According to this notion, the continuous path can be treated as high frequency data. Referring to Andersen et al. [13] and Aït-Sahalia et al. [14], we adopt a stochastic differential equation to characterize this continuous path.
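As an illustration of this notion, the sketch below simulates one day of a latent log-price path on a fine grid and keeps only the four observed quantities. It assumes a constant drift `mu` and volatility `sigma` within the day and an Euler grid of `steps` points; the function name and all numerical values are illustrative assumptions.

```python
import math
import random


def observe_day(open_price, mu, sigma, steps, rng):
    """Simulate dY = mu dt + sigma dW on [0, 1] and return (O, U, L, C)."""
    dt = 1.0 / steps
    y = high = low = open_price
    for _ in range(steps):
        # Euler step of the latent intra-daily log price.
        y += mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        high = max(high, y)
        low = min(low, y)
    return open_price, high, low, y


rng = random.Random(0)
# Hypothetical inputs: opening log price 7.0, zero drift, daily variance 0.24 / 250.
o, u, l, c = observe_day(7.0, 0.0, math.sqrt(0.24 / 250), 5000, rng)
print(o, u, l, c)
```

Everything between the four returned values is discarded, mirroring the assumption that the intra-daily data are latent. By construction the simulated maximum is never below the opening or closing value, and the minimum is never above them.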
To construct the likelihood function of the maximum and terminal values when the process is a standard Brownian motion, the reflection principle and the Girsanov theorem in Shreve [15] provide a technique. We can derive the conditional likelihood function in an analogous way. The advantages of our approach are:

1. the predicted/fitted maximum values are always larger than the predicted/fitted minimum values;
2. no constraint on the parameters is required to ensure that the predicted maximum value is larger than the minimum value;
3. the assumption of a strictly positive process can be avoided, since we do not transform the observations into a (positive) radius process.

Based on the proposed likelihood function, we obtain estimates of the parameters and predict the one-step maximum and minimum values. Compared to Chou [10, 11] and Chen et al. [12], we derive the exact joint distribution of the maximum, minimum, and closing prices, and can therefore obtain more accurate parameter estimates. To capture the heteroscedasticity in volatility, we also consider a stochastic volatility model. In particular, the volatility follows a daily stochastic differential equation where the marginal distribution of the volatility is inverse gamma. From the simulations, we find that, in terms of the relative error (RE), our estimator is more efficient than the conventional sample covariance in the case of constant volatility, and more efficient than the estimator proposed by Chou [10, 11] in the stochastic volatility model. This advantage is likely due to using all observable information instead of the closing price only. We also compare our approach with frequently used alternatives to demonstrate its advantages.

Main results

Referring to Andersen et al. [13] and Aït-Sahalia et al. [14], the intra-daily log price, a.k.a. the high frequency data, on the $i$-th day follows the stochastic differential equation

$$dY_t = \mu\,dt + \sigma\,dW_t, \quad i-1 < t < i, \tag{1}$$

where $W_t$ is a standard Brownian motion. In this study, we assume that all high frequency data are latent, except for the opening, maximum, minimum, and closing prices. Denote $X_i = (O_i, U_i, L_i, C_i)$ as the observed random vector on the $i$-th day, where $O_i$, $U_i$, $L_i$, and $C_i$ are the log opening, maximum, minimum, and closing prices, respectively. The log maximum and minimum values are given by $U_i = \max_{i-1<t<i} Y_t$ and $L_i = \min_{i-1<t<i} Y_t$. Applying the Girsanov theorem to $Y_t$ and the connection between the maximum and the closing price expressed by Theorem 3.7.3 of Shreve [15], we have the following result.

Theorem 1. Suppose that the log price $Y_t$ satisfies the stochastic differential Eq (1), and let $O := O_i = Y_{i-1}$, $C := C_i = Y_i$, and $U := U_i = \max_{i-1 \le t \le i} Y_t$. Then the joint density of $(U, C)$ conditional on $O = o$ is

$$f_{U,C|O}(u, c \mid o) = \frac{2(2u - o - c)}{\sqrt{2\pi}\,\sigma^3} \exp\left\{ -\frac{(2u - o - c)^2}{2\sigma^2} - \frac{\mu^2}{2\sigma^2} + \frac{\mu(c - o)}{\sigma^2} \right\}, \quad u \ge o,\ u \ge c. \tag{2}$$

Analogously, we obtain the joint density of the minimum and the closing log prices.

Theorem 2. Suppose that the log price $Y_t$ satisfies (1), and let $O := O_i = Y_{i-1}$, $C := C_i = Y_i$, and $L := L_i = \min_{i-1 \le t \le i} Y_t$. Then the joint density of $(L, C)$ conditional on $O = o$ is

$$f_{L,C|O}(\ell, c \mid o) = \frac{2(o + c - 2\ell)}{\sqrt{2\pi}\,\sigma^3} \exp\left\{ -\frac{(o + c - 2\ell)^2}{2\sigma^2} - \frac{\mu^2}{2\sigma^2} + \frac{\mu(c - o)}{\sigma^2} \right\}, \quad \ell \le o,\ \ell \le c.$$

In addition, according to Choi and Roh [16], denoting $W_t^u = \sup_{0 \le s \le t} W_s$ and $W_t^l = \inf_{0 \le s \le t} W_s$, the joint distribution of $(W_t, W_t^u, W_t^l)$ is given by

$$P\left(a \le W_t^l \le W_t^u \le b,\ W_t \in dx\right) = \frac{1}{\sqrt{2\pi t}} \sum_{k=-\infty}^{\infty} \left[ \exp\left\{ -\frac{(x - 2k(b - a))^2}{2t} \right\} - \exp\left\{ -\frac{(x - 2b - 2k(b - a))^2}{2t} \right\} \right] dx, \tag{3}$$

with $a \le 0$ and $b \ge 0$. Applying the Girsanov theorem, we obtain the joint density of the maximum, minimum, and closing log prices in the following theorem.

Theorem 3. Assume that the log price $Y_t$ satisfies (1), and the conditions of Theorem 1 and Theorem 2 hold. Then the joint density of $(U, L, C)$ conditional on $O = o$ is

$$f_{U,L,C|O}(u, \ell, c \mid o) = \sum_{k=-\infty}^{\infty} \left[ \frac{4k(k+1)}{\sqrt{2\pi}\,\sigma^3} \left( 1 - \frac{(c + o - 2u - 2k(u - \ell))^2}{\sigma^2} \right) \exp\left\{ -\frac{(c + o - 2u - 2k(u - \ell))^2}{2\sigma^2} \right\} - \frac{4k^2}{\sqrt{2\pi}\,\sigma^3} \left( 1 - \frac{(c - o - 2k(u - \ell))^2}{\sigma^2} \right) \exp\left\{ -\frac{(c - o - 2k(u - \ell))^2}{2\sigma^2} \right\} \right] \exp\left\{ -\frac{\mu^2}{2\sigma^2} + \frac{\mu(c - o)}{\sigma^2} \right\},$$

for $\ell \le \min(o, c)$ and $u \ge \max(o, c)$.

According to the results of Theorem 1 and Theorem 2, we can obtain the maximum likelihood estimators (MLEs) for the drift term $\mu$ and for the volatility $\sigma^2$ as follows.

Proposition 1. Suppose that the conditions of Theorem 1 and Theorem 2 hold. Let $X_i = (O_i, U_i, L_i, C_i)$ for $i = 1, \ldots, n$ be the observed data on the $i$-th day for the realization $Y$. The likelihood function of $(\mu, \sigma^2)$ based on Theorem 1 is given by

$$L(\mu, \sigma^2; o_1, u_1, c_1, \ldots, o_n, u_n, c_n) = \prod_{i=1}^{n} \frac{2(2u_i - o_i - c_i)}{\sqrt{2\pi}\,\sigma^3} \exp\left\{ -\frac{(2u_i - o_i - c_i)^2}{2\sigma^2} - \frac{\mu^2}{2\sigma^2} + \frac{\mu(c_i - o_i)}{\sigma^2} \right\},$$

and the MLEs of $\mu$ and $\sigma^2$ are

$$\hat\mu = \frac{1}{n} \sum_{i=1}^{n} (c_i - o_i), \qquad \hat\sigma_u^2 = \frac{1}{3n} \sum_{i=1}^{n} \left[ (2u_i - o_i - c_i)^2 - \hat\mu^2 \right].$$

Similarly, using Theorem 2, the MLEs of $\mu$ and $\sigma^2$ are

$$\hat\mu = \frac{1}{n} \sum_{i=1}^{n} (c_i - o_i), \qquad \hat\sigma_\ell^2 = \frac{1}{3n} \sum_{i=1}^{n} \left[ (o_i + c_i - 2\ell_i)^2 - \hat\mu^2 \right].$$

Proposition 2. From Proposition 1 and Remark 1, the one-step forecasts of the log maximum and minimum values, $U_{t+1}(1) = E[U_{t+1} \mid X_1, \ldots, X_t]$ and $L_{t+1}(1) = E[L_{t+1} \mid X_1, \ldots, X_t]$, are

$$U_{t+1}(1) = \sqrt{\frac{2\hat\sigma^2}{\pi}} \exp\left\{ -\frac{\hat\mu^2}{2\hat\sigma^2} \right\} + O_{t+1} + \hat\mu\,\Phi\left(\frac{\hat\mu}{\hat\sigma}\right) - \sqrt{\frac{\hat\sigma^2}{2\pi}} \exp\left\{ -\frac{\hat\mu^2}{2\hat\sigma^2} \right\} - \frac{\hat\sigma^2}{2\hat\mu} \left[ 1 - 2\Phi\left(\frac{\hat\mu}{\hat\sigma}\right) \right],$$

$$L_{t+1}(1) = -\sqrt{\frac{2\hat\sigma^2}{\pi}} \exp\left\{ -\frac{\hat\mu^2}{2\hat\sigma^2} \right\} + O_{t+1} + \hat\mu\,\Phi\left(-\frac{\hat\mu}{\hat\sigma}\right) + \sqrt{\frac{\hat\sigma^2}{2\pi}} \exp\left\{ -\frac{\hat\mu^2}{2\hat\sigma^2} \right\} + \frac{\hat\sigma^2}{2\hat\mu} \left[ 1 - 2\Phi\left(\frac{\hat\mu}{\hat\sigma}\right) \right],$$

where $\Phi$ is the standard normal distribution function and $\hat\mu$ and $\hat\sigma^2$ are the MLEs based on $X_1, \ldots, X_t$.

Note that from Proposition 1 and Remark 1, we have the candidates for the MLE of $\mu$ (written as $\hat\mu$ and $\hat\mu_{all}$) and for the MLE of $\sigma^2$ (written as $\hat\sigma_\ell^2$, $\hat\sigma_u^2$, and $\hat\sigma_{all}^2$); see also the further discussion in Section: Simulations. Note also that the quantity $O_{t+1}$ can be set to $C_t$, or it can be treated as known. The latter means that we can make any decision after $O_{t+1}$ is revealed. In real-life applications, it is reasonable to assume that the mean return of each day equals zero. Then, we can obtain a simplified form for the one-step prediction.

Corollary 1. Let the assumptions of Proposition 1 hold, and further assume that $\mu = 0$. Then we have

$$U_{t+1}(1) = E[U_{t+1} \mid X_1, \ldots, X_t] = O_{t+1} + \sqrt{\frac{2\hat\sigma^2}{\pi}}, \qquad L_{t+1}(1) = E[L_{t+1} \mid X_1, \ldots, X_t] = O_{t+1} - \sqrt{\frac{2\hat\sigma^2}{\pi}}.$$

Stochastic volatility model

A stochastic volatility model is constructed such that the logarithmic price follows a stochastic diffusion equation and the volatility satisfies another diffusion process. See, for instance, Hull and White [17], Stein and Stein [18], and Heston [19]. Define the stochastic volatility model as follows:

$$\begin{cases} dY_t = \mu\,dt + \sigma_t\,dW_t, & Y_0 = O, \\ d\sigma_t^2 = b(\sigma_t^2)\,dt + \sqrt{v(\sigma_t^2)}\,dB_t, & \sigma_0^2 = z, \end{cases}$$

where $(B_t, W_t)_{t>0}$ is a two-dimensional standard Brownian motion, $O$ is the initial log price, and $z$ is a random variable from the stationary distribution of $\sigma_t^2$ and independent of $(B_t, W_t)$. Referring to Bibby et al. [20], we assume that the drift function $b(\cdot)$ is mean reverting, that is, $b(\sigma_t^2) = \rho(\theta - \sigma_t^2)$. Then, the non-negative diffusion function $v(\cdot)$ is uniquely specified by the invariant density of $\sigma_t^2$. For example, if $v(x)$ is proportional to a constant, $x$, or $x^2$, the invariant density of $\sigma_t^2$ is respectively a normal, gamma, or inverse gamma distribution. However, if the intra-daily volatility is a stochastic process, the Girsanov theorem cannot be applied straightforwardly. In this section, we consider that $\sigma_t^2$ is stochastic on the discrete times $i = 1, 2, \ldots, n$, but has a stationary distribution during a fixed time interval $t \in [i-1, i]$.

To illustrate, we study a particular model. Referring to Bibby et al. [20], for $i = 1, \ldots, n$, the volatility $\sigma_i^2$ satisfies the diffusion process (4), where $\tilde Z_i$, $i = 1, \ldots, n$, are standard normal random variables. By Bibby et al. [20], the stationary distribution of $\sigma_i^2$ is inverse gamma. Then, given the $i$-th day volatility $\sigma_i^2$, the intra-daily log price $Y_t$ on the $i$-th day satisfies the following stochastic volatility model, ...

Simulations

... $\Delta = 1/5000$. The log opening, maximum, minimum, and closing prices are denoted by $Y_i = (O_i, U_i, L_i, C_i)$, where $1 \le i \le n$ with $n = 250$, say.
Set $C_i = O_{i+1}$, and repeat the above procedure for $i = 1, 2, \ldots, n-1$. We consider three practically oriented experiments based on real observable data. According to the empirical evidence, the higher annualized market volatility is around 0.24, whereas the lower one is around 0.04. We also consider a moderate case with an annualized market volatility of 0.12, and two more violent cases with annualized market volatilities of 0.36 and 0.48. The daily volatilities are therefore 0.04/250, 0.12/250, 0.24/250, 0.36/250, and 0.48/250. In addition, for the drift term, we study two cases for the coefficient of variation: $\sigma/\mu = 1$ (unit dispersion) and $\sigma/\mu = 2$ (over dispersion).

We propose the MLE $\hat\mu$ for $\mu$ and $(\hat\sigma_u^2, \hat\sigma_\ell^2)$ for $\sigma^2$ in Proposition 1. Theorem 3 provides the MLEs $\hat\mu_{all}$ and $\hat\sigma_{all}^2$ for $\mu$ and $\sigma^2$, respectively. For comparison, we consider the conventional MLE $s^2$ for $\sigma^2$ based on discrete-time closing prices, computed from the log returns of closing prices $R_i = C_i - C_{i-1}$ and their mean $\bar R = (n-1)^{-1} \sum_{i=2}^{n} R_i$. After 1000 replications, the relative error (RE; see, for instance, Helfrick and Cooper [21]) is evaluated from the root mean square error (RMSE) between the estimators and the true values.

The REs of $\hat\mu$, $\hat\mu_{all}$, $\hat\sigma_u^2$, $\hat\sigma_\ell^2$, $\hat\sigma_{all}^2$, and $s^2$ are shown in Tables 1 and 2. We can see that the RE of $\hat\mu$ is slightly less than that of $\hat\mu_{all}$. This is possibly caused by truncating the infinite series into a finite sum of $\pm 20$ terms. On the other hand, the REs of $\hat\sigma_u^2$, $\hat\sigma_\ell^2$, and $\hat\sigma_{all}^2$ are much smaller than that of $s^2$. In particular, $\hat\sigma_{all}^2$ performs best, with the smallest RE. Meanwhile, the relative efficiencies of $s^2$ compared to $\hat\sigma_u^2$ and $\hat\sigma_\ell^2$, written as $RE_{s^2}/RE_{\hat\sigma_u^2}$ and $RE_{s^2}/RE_{\hat\sigma_\ell^2}$, range from 1.55 to 1.69. Furthermore, the relative efficiencies of $s^2$ compared to $\hat\sigma_{all}^2$, given by $RE_{s^2}/RE_{\hat\sigma_{all}^2}$, range from 1.74 to 2.11.
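The comparison between the maximum-based estimator and the close-to-close estimator can be illustrated with a small Monte Carlo sketch. The code below is not the simulation design of this study. Instead of discretizing the path, it draws the daily maximum exactly from the conditional law of the maximum of a Brownian bridge, and it uses the divisor n - 1 for the close-to-close variance; both are assumptions made for brevity, as are the function names and the drift value. `sigma2_u` is the closed-form maximizer of the likelihood built from the Theorem 1 density.

```python
import math
import random


def simulate_days(n, mu, sigma, rng):
    """Exact draws of (O, U, C) per day for dY = mu dt + sigma dW, with O_{i+1} = C_i."""
    days, o = [], 0.0
    for _ in range(n):
        c = o + mu + sigma * rng.gauss(0.0, 1.0)
        # Given O and C, P(U > u) = exp(-2 (u - o)(u - c) / sigma^2) for u >= max(o, c).
        e = -math.log(1.0 - rng.random())
        u = 0.5 * (o + c + math.sqrt((c - o) ** 2 + 2.0 * sigma ** 2 * e))
        days.append((o, u, c))
        o = c
    return days


def sigma2_u(days):
    """MLE of sigma^2 from the joint density of (U, C) given O."""
    n = len(days)
    mu_hat = sum(c - o for o, _, c in days) / n
    return sum((2 * u - o - c) ** 2 - mu_hat ** 2 for o, u, c in days) / (3 * n)


def close_to_close(days):
    """Variance of closing-price log returns (divisor n - 1: an assumption)."""
    closes = [c for _, _, c in days]
    r = [b - a for a, b in zip(closes, closes[1:])]
    r_bar = sum(r) / len(r)
    return sum((x - r_bar) ** 2 for x in r) / len(r)


def rmse(estimates, truth):
    """Root mean square error of a list of estimates around the true value."""
    return math.sqrt(sum((x - truth) ** 2 for x in estimates) / len(estimates))


def relative_efficiency(reps=300, n=250, sigma2=0.24 / 250, seed=1):
    """RMSE of the close-to-close estimator divided by RMSE of sigma2_u."""
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    runs = [simulate_days(n, 0.5 * sigma, sigma, rng) for _ in range(reps)]
    range_based = [sigma2_u(d) for d in runs]
    conventional = [close_to_close(d) for d in runs]
    return rmse(conventional, sigma2) / rmse(range_based, sigma2)


print(relative_efficiency())
```

Under these assumptions each day contributes roughly three times as much information about the variance through the maximum as through the closing price alone, so the printed ratio should be roughly the square root of 3, which is of the same order as the relative efficiencies reported above.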
The results indicate that the proposed estimators ($\hat\sigma_u^2$, $\hat\sigma_\ell^2$, and $\hat\sigma_{all}^2$) display substantial improvements and are stable across the different scenarios. We conclude that the easy-to-implement estimator $\hat\mu$ has a lower relative error than $\hat\mu_{all}$. In addition, the estimator $\hat\sigma_{all}^2$ based on all observations has better accuracy than $\hat\sigma_u^2$, $\hat\sigma_\ell^2$, and the conventional $s^2$ in terms of relative error. For instance, for $\sigma/\mu = 2$ and $\sigma^2 = 0.24/250$, the REs of $\hat\mu$, $\hat\mu_{all}$, $\hat\sigma_u^2$, $\hat\sigma_\ell^2$, $\hat\sigma_{all}^2$, and $s^2$ are 12.57, 12.76, 5.32, 5.33, 4.99, and 9.02, respectively, and the relative efficiencies of $s^2$ compared to $\hat\sigma_u^2$, $\hat\sigma_\ell^2$, and $\hat\sigma_{all}^2$ are 1.69, 1.69, and 1.81.

For the stochastic volatility model, the parameters for the volatility term (cf. (4)) are given by $\mu = 0$, $\rho = 1$, $\alpha = 5$, and $\beta = 4V$ with $V = 0.04/250$ (low volatility case), $0.12/250$ (moderate volatility case), and $0.24/250$ (high volatility case). Note that the term $V = \beta/(\alpha - 1)$ represents the long-term mean of the volatility process. We obtain the MLEs $\hat\alpha$ and $\hat\beta$ via Theorem 4, and the MLE $\hat\sigma_v$ is given by $\hat\sigma_v = \hat\beta/(\hat\alpha - 1)$. For comparison, we then consider the conventional estimator $s^2$, the estimator $\hat\sigma_{all}^2$ discussed in Theorem 3 for the constant volatility case, and the volatility estimator proposed by Chou [10, 11], denoted as $\hat\sigma_C^2$. Since the volatility estimators proposed by Chou [10, 11] based on the ranges, upward ranges, and downward ranges are quite similar, we only discuss one particular case among them. For simplicity, we fit a GARCH(1,1) model for $\hat\sigma_C^2$, and the results are shown in Table 3.

In the high volatility case, the relative efficiencies of $s^2$ compared to $\hat\sigma_v$, $\hat\sigma_{all}^2$, and $\hat\sigma_C^2$ are 2.19, 1.60, and 1.81, respectively. As expected, more information (maximum/minimum prices) improves the accuracy of the volatility estimation. Besides, the estimators $\hat\sigma_v$ and $\hat\sigma_C^2$ estimated under the stochastic volatility model perform better than $\hat\sigma_{all}^2$, which is estimated under the constant volatility model. Meanwhile, $\hat\sigma_v$ has the lowest relative error since it is obtained from the exact likelihood function instead of the quasi-likelihood function.
For the moderate and low volatility cases, the estimator $\hat\sigma_v$ is still the best one, with the lowest relative errors. Note that the estimator $\hat\sigma_C^2$ performs worse in the moderate and low volatility cases. This may be due to fewer observations or inadequate lags for the GARCH model. This is beyond the scope of this study, and we omit further discussion.

Real application

We present the one-step predictions of an interval-valued time series for the S&P 500 index. Following Arroyo et al. [4], the daily high/low prices of the S&P 500 index are used to compare the prediction performance of various methods. We make one-step predictions by applying a rolling window in which the historical data of the previous year are used to estimate the parameters. The most challenging period is the financial crisis of 2008. Therefore, we first study the performance of the various methods in one-step prediction for 2008. Besides, we want to investigate the effect of the historical data, and select the periods of 2006 and 2017. For the former, the historical data of the previous year (2005) have a pattern similar to that of the current year (2006). For the latter, the volatility in the historical data (2016) is violent compared to the predicted period (2017). Therefore, the prediction and estimation time periods are set to be:

1. similar volatility period: prediction in 2006, estimation in 2005;
2. high volatility period: prediction in 2008, estimation in 2007;
3. dissimilar volatility period: prediction in 2017, estimation in 2016.

Fig 1 depicts the maximum/minimum prices with the corresponding centralized maximum/minimum returns (defined as the difference between the logarithmic maximum/minimum and opening prices) in these three periods. From Fig 1, we can see that the volatility at the beginning of 2016 is higher than throughout 2017. Meanwhile, the volatilities show no significant difference between 2005 and 2006. At the end of 2008, of course, the volatility is much more violent than usual.

Fig 1. The maximum and minimum prices with the corresponding centralized maximum and minimum returns for the similar volatility period (upper panel), high volatility period (middle panel), and dissimilar volatility period (lower panel).

In order to quantify the accuracy of the one-step forecast, we adopt the mean distance error (MDE), defined as

$$\mathrm{MDE} = \left( \sum_{t=1}^{T} \frac{(U_t - \hat U_t)^2 + (L_t - \hat L_t)^2}{2T} \right)^{1/2},$$

where $X_t = [L_t, U_t]$ is the true interval-valued observation and $\hat X_t = [\hat L_t, \hat U_t]$ is the estimated one. Following Rodrigues and Salish (2011), the following descriptive statistics are also evaluated:

1. coverage rate: $R_C = T^{-1} \sum_{t=1}^{T} w(X_t \cap \hat X_t)/w(X_t)$,
2. efficiency rate: $R_E = T^{-1} \sum_{t=1}^{T} w(X_t \cap \hat X_t)/w(\hat X_t)$,
3. normalized symmetric difference: $R_N = T^{-1} \sum_{t=1}^{T} w(X_t \cap \hat X_t)/w(X_t \cup \hat X_t)$,

where $w(X)$ represents the length of an interval $X$.

By using Proposition 2, we obtain the high and low prices by one-step forecasting. Then we can compare our results with those of the Naïve method, EWMA, k-NN, VAR(3), and VECM(3) (cf. Arroyo et al. [4]), and CR-SETAR (cf. Rodrigues and Salish [5]). Let $[U_1, L_1], \ldots, [U_n, L_n]$ be the observations; our goal is to forecast the interval on day $n+1$, i.e., $[\hat U_{n+1}, \hat L_{n+1}]$.

The Naïve method predicts the interval by the previous one, that is, $[\hat U_{n+1}, \hat L_{n+1}] = [U_n, L_n]$. EWMA forms the predicted interval as an exponentially weighted moving average of past intervals (cf. Arroyo et al. [4]).

The k-NN method first finds the $k$ sequences of $d$ points that are closest to the current one in terms of MDE, and then evaluates the average of the intervals that follow these $k$ closest sequences. Let $[U, L]_t^{(d)} = ([U_t, L_t], \ldots, [U_{t-(d-1)}, L_{t-(d-1)}])'$ be the $d$-dimensional interval-valued vectors. Then, we locate $[U, L]_{t_1}^{(d)}, \ldots, [U, L]_{t_k}^{(d)}$ for $t_1 < \cdots < t_k < n$ with the smallest MDEs compared to $[U, L]_n^{(d)}$. The predicted interval based on the equal-weights k-NN method, denoted as k-NN(eq), is given by

$$[\hat U_{n+1}, \hat L_{n+1}] = \frac{1}{k} \sum_{j=1}^{k} [U_{t_j+1}, L_{t_j+1}].$$

Further, the predicted interval based on the proportional-weights k-NN method, denoted as k-NN(prop), is given by

$$[\hat U_{n+1}, \hat L_{n+1}] = \sum_{j=1}^{k} w_j\,[U_{t_j+1}, L_{t_j+1}],$$

where $w_j = \psi_j / \sum_{m=1}^{k} \psi_m$, and $\psi_j$ is defined as the inverse of the MDE between $[U, L]_n^{(d)}$ and $[U, L]_{t_j}^{(d)}$ plus a small constant, say $10^{-8}$. Referring to Arroyo et al. [4], we set $d = 2$ for the three periods under consideration and $k = 23$, $18$, and $26$ for the similar volatility, high volatility, and dissimilar volatility periods, respectively.

According to the results in Arroyo et al. [4], the VAR(3) model based on the vector of differenced center and radius time series can be written as

$$\begin{pmatrix} \Delta C_t \\ R_t \end{pmatrix} = \begin{pmatrix} b_c \\ b_r \end{pmatrix} + \sum_{j=1}^{3} \begin{pmatrix} b_{cc,j} & b_{cr,j} \\ b_{rc,j} & b_{rr,j} \end{pmatrix} \begin{pmatrix} \Delta C_{t-j} \\ R_{t-j} \end{pmatrix} + \begin{pmatrix} \varepsilon_{c,t} \\ \varepsilon_{r,t} \end{pmatrix},$$

where $C_t = (U_t + L_t)/2$, $R_t = (U_t - L_t)/2$, and $\Delta C_t = C_t - C_{t-1}$. Using the historical observations to fit the VAR(3) model and obtain all of the parameter estimates, the predicted interval is $[\hat U_{n+1}, \hat L_{n+1}] = [\hat C_{n+1} + \hat R_{n+1}, \hat C_{n+1} - \hat R_{n+1}]$, where $\hat C_{n+1} = \Delta\hat C_{n+1} + C_n$ and $(\Delta\hat C_{n+1}, \hat R_{n+1})$ satisfies

$$\begin{pmatrix} \Delta\hat C_{n+1} \\ \hat R_{n+1} \end{pmatrix} = \begin{pmatrix} \hat b_c \\ \hat b_r \end{pmatrix} + \sum_{j=1}^{3} \begin{pmatrix} \hat b_{cc,j} & \hat b_{cr,j} \\ \hat b_{rc,j} & \hat b_{rr,j} \end{pmatrix} \begin{pmatrix} \Delta C_{n+1-j} \\ R_{n+1-j} \end{pmatrix}.$$

Next, assuming that $(U_t, L_t)$ satisfies the VECM(3) model, this implies

$$\begin{pmatrix} \Delta U_t \\ \Delta L_t \end{pmatrix} = \begin{pmatrix} b_u \\ b_l \end{pmatrix} + \Pi \begin{pmatrix} U_{t-1} \\ L_{t-1} \end{pmatrix} + \sum_{j=1}^{3} \begin{pmatrix} b_{uu,j} & b_{ul,j} \\ b_{lu,j} & b_{ll,j} \end{pmatrix} \begin{pmatrix} \Delta U_{t-j} \\ \Delta L_{t-j} \end{pmatrix} + \begin{pmatrix} \varepsilon_{u,t} \\ \varepsilon_{l,t} \end{pmatrix},$$

where $\Delta U_t = U_t - U_{t-1}$ and $\Delta L_t = L_t - L_{t-1}$. Using the historical observations to fit the VECM(3) model and obtain all of the parameter estimates, the predicted interval is $[\hat U_{n+1}, \hat L_{n+1}] = [\Delta\hat U_{n+1} + U_n, \Delta\hat L_{n+1} + L_n]$, where $(\Delta\hat U_{n+1}, \Delta\hat L_{n+1})$ satisfies

$$\begin{pmatrix} \Delta\hat U_{n+1} \\ \Delta\hat L_{n+1} \end{pmatrix} = \begin{pmatrix} \hat b_u \\ \hat b_l \end{pmatrix} + \hat\Pi \begin{pmatrix} U_n \\ L_n \end{pmatrix} + \sum_{j=1}^{3} \begin{pmatrix} \hat b_{uu,j} & \hat b_{ul,j} \\ \hat b_{lu,j} & \hat b_{ll,j} \end{pmatrix} \begin{pmatrix} \Delta U_{n+1-j} \\ \Delta L_{n+1-j} \end{pmatrix}.$$

Following Rodrigues and Salish [5], the two-regime CR-SETAR model based on the center and radius time series is

$$\begin{pmatrix} C_t \\ R_t \end{pmatrix} = \left[ \begin{pmatrix} a_c \\ a_r \end{pmatrix} + \sum_{j=1}^{p} \begin{pmatrix} a_{cc,j} & a_{cr,j} \\ a_{rc,j} & a_{rr,j} \end{pmatrix} \begin{pmatrix} C_{t-j} \\ R_{t-j} \end{pmatrix} \right] I\{R_{t-d} \le \gamma\} + \left[ \begin{pmatrix} b_c \\ b_r \end{pmatrix} + \sum_{j=1}^{q} \begin{pmatrix} b_{cc,j} & b_{cr,j} \\ b_{rc,j} & b_{rr,j} \end{pmatrix} \begin{pmatrix} C_{t-j} \\ R_{t-j} \end{pmatrix} \right] I\{R_{t-d} > \gamma\} + \begin{pmatrix} \varepsilon_{c,t} \\ \varepsilon_{r,t} \end{pmatrix},$$

where $I\{\cdot\}$ represents the indicator function. We choose $p = 6$, $q = 8$, and $d = 1$, the same as in Rodrigues and Salish [5]. Using the historical observations to fit the CR-SETAR model and obtain all of the parameter estimates, the predicted interval is $[\hat U_{n+1}, \hat L_{n+1}] = [\hat C_{n+1} + \hat R_{n+1}, \hat C_{n+1} - \hat R_{n+1}]$, where $(\hat C_{n+1}, \hat R_{n+1})$ satisfies the fitted version of the above equation, with the parameters replaced by their estimates, $t$ replaced by $n+1$, and the innovation term omitted.

Based on the results of Tables 4, 5, and 6, our proposed method gives better predictions in terms of the MDE, that is, smaller MDE values. Compared to the Naïve method, the improvement in MDE is around 26%, 23%, and 38% for the similar volatility period, high volatility period, and dissimilar volatility period, respectively. Meanwhile, compared to the best one among the other methods, the improvements are 17%, 20%, and 35% for the similar volatility period, high volatility period, and dissimilar volatility period, respectively. In addition, the measurements $R_E$ and $R_N$ of our proposed prediction method are also the largest. The above results show that the proposed model yields more accurate interval forecasts for financial time series in the real world.

Conclusion

We propose the joint densities of the daily log opening, maximum, and closing prices and of the daily log opening, minimum, and closing prices based on stochastic differential equations. Simulation studies show that the proposed estimators have higher efficiency than the conventional one in terms of RE. In the real data analysis for the S&P 500 index, the one-step forecasts of the proposed method outperform several alternatives in terms of MDE, $R_E$, and $R_N$.

The proposed methodology has several interesting extensions. In this paper, we study the stochastic volatility model in discrete time, where the stochastic volatility is driven by a stationary distribution during a fixed time interval. In the literature, it is natural to consider that the intra-daily volatility is governed by a stochastic process. However, owing to the stochastic nature of the volatility, the Girsanov theorem cannot be applied straightforwardly. Based on Akahori et al. [22], over a small time interval, asymptotic results can be used to simplify the Girsanov theorem by using the Taylor expansion. Then the likelihood function can be derived and the corresponding maximum likelihood estimators can be obtained.
We leave this issue as a future project. Alternatively, from the investment strategy point of view, it is also interesting to study high dimensional financial interval time series for multiple assets, which leads to the corresponding estimation problem for the proposed high dimensional model.

Appendix: Proofs

Proof of Theorem 1

Let $Y_t = \log S_t$ be given by the dynamics

$$dY_t = \mu\,dt + \sigma\,dW_t, \quad Y_0 = o. \tag{A.1}$$

Let $M_t = \sup_{0 \le s \le t} Y_s$. The joint cdf of $Y_t$ and $M_t$ is written as

$$P(M_t < m, Y_t < y \mid Y_0 = o) = E\left[ I\{M_t < m, Y_t < y\} \mid Y_0 = o \right] = E\left[ I\left\{ \tfrac{M_t - o}{\sigma} < \tfrac{m - o}{\sigma},\ \tfrac{Y_t - o}{\sigma} < \tfrac{y - o}{\sigma} \right\} \,\Big|\, Y_0 = o \right].$$

Let $\tilde Y_t = (Y_t - o)/\sigma = (\mu/\sigma)t + W_t$ and $\tilde M_t = \sup_{0 \le s \le t} \tilde Y_s$. Applying the Girsanov theorem implies

$$E\left[ I\left\{ \tfrac{M_t - o}{\sigma} < \tfrac{m - o}{\sigma},\ \tfrac{Y_t - o}{\sigma} < \tfrac{y - o}{\sigma} \right\} \,\Big|\, Y_0 = o \right] = E\left[ I\left\{ \tilde M_t < \tfrac{m - o}{\sigma},\ \tilde Y_t < \tfrac{y - o}{\sigma} \right\} \,\Big|\, Y_0 = o \right] = E\left[ I\left\{ \tilde M_t < \tfrac{m - o}{\sigma},\ W_t < \tfrac{y - o}{\sigma} \right\} e^{\frac{\mu}{\sigma} W_t - \frac{\mu^2}{2\sigma^2} t} \,\Big|\, Y_0 = o \right],$$

where $W_t$ is a standard Brownian motion and, in the last expectation, $\tilde M_t$ denotes its running maximum. Hence, using the joint pdf of $W_t$ and $\sup_{0 \le s \le t} W_s$, we obtain that the joint density of $Y_t$ and $M_t$ is

$$f_{M_t, Y_t}(m, y \mid o) = \frac{2(2m - o - y)}{t\,\sigma^3 \sqrt{2\pi t}} \exp\left\{ -\frac{(2m - o - y)^2}{2\sigma^2 t} - \frac{\mu^2 t}{2\sigma^2} + \frac{\mu(y - o)}{\sigma^2} \right\}, \quad m \ge o,\ m \ge y.$$

Setting $t = 1$ gives (2).

Proof of Theorem 2

... By differentiating Eq (A.3) with respect to $l$ and $w$, we obtain

$$f_{m,W}(l, w) = \frac{2(w - 2l)}{t\sqrt{2\pi t}} \exp\left\{ -\frac{(w - 2l)^2}{2t} \right\}, \quad w \ge l,\ l \le 0.$$

The rest of the proof follows the same procedure as that of Theorem 1.

Proof of Theorem 3

Let $M_t^u = \sup_{0 \le s \le t} Y_s$ and $M_t^l = \inf_{0 \le s \le t} Y_s$. Similarly, applying the Girsanov theorem and using $\tilde Y_t = (Y_t - o)/\sigma$, where $Y_t$ is given by $dY_t = \mu\,dt + \sigma\,dW_t$, $Y_0 = o$, the joint density of $(Y_t, M_t^u, M_t^l)$ obtained from (3) is

$$f(y, m_u, m_l \mid o) = \sum_{k=-\infty}^{\infty} \left[ \frac{4k(k+1)}{t\,\sigma^3 \sqrt{2\pi t}} \left( 1 - \frac{(y + o - 2m_u - 2k(m_u - m_l))^2}{\sigma^2 t} \right) \exp\left\{ -\frac{(y + o - 2m_u - 2k(m_u - m_l))^2}{2\sigma^2 t} \right\} - \frac{4k^2}{t\,\sigma^3 \sqrt{2\pi t}} \left( 1 - \frac{(y - o - 2k(m_u - m_l))^2}{\sigma^2 t} \right) \exp\left\{ -\frac{(y - o - 2k(m_u - m_l))^2}{2\sigma^2 t} \right\} \right] \exp\left\{ -\frac{\mu^2 t}{2\sigma^2} + \frac{\mu(y - o)}{\sigma^2} \right\}. \tag{A.4}$$

Setting $t = 1$ completes the proof.

Proof of Proposition 1

Based on the joint density of Theorem 1, the likelihood function of $(\mu, \sigma^2)$ based on the observations $(\tilde u, \tilde o, \tilde c)$ is

$$L(\mu, \sigma^2; \tilde u, \tilde o, \tilde c) = \prod_{i=1}^{n} \frac{2(2u_i - o_i - c_i)}{\sqrt{2\pi}\,\sigma^3} \exp\left\{ -\frac{(2u_i - o_i - c_i)^2}{2\sigma^2} - \frac{\mu^2}{2\sigma^2} + \frac{\mu(c_i - o_i)}{\sigma^2} \right\}, \tag{A.5}$$

and the log-likelihood function is

$$\ell(\mu, \sigma^2) = \sum_{i=1}^{n} \left[ \log\left( 2(2u_i - o_i - c_i) \right) - \frac{1}{2}\log(2\pi) - \frac{3}{2}\log\sigma^2 - \frac{(2u_i - o_i - c_i)^2 + \mu^2 - 2\mu(c_i - o_i)}{2\sigma^2} \right]. \tag{A.6}$$

Differentiating Eq (A.6) with respect to $\mu$ implies

$$\sum_{i=1}^{n} \frac{-\mu + (c_i - o_i)}{\sigma^2} = 0.$$

Then, the maximum likelihood estimator of $\mu$ is

$$\hat\mu = \frac{1}{n} \sum_{i=1}^{n} (c_i - o_i). \tag{A.7}$$

Next, differentiating Eq (A.6) with respect to $\sigma^2$, we obtain

$$-\frac{3n}{2\sigma^2} + \sum_{i=1}^{n} \frac{(2u_i - o_i - c_i)^2 + \mu^2 - 2\mu(c_i - o_i)}{2\sigma^4} = 0. \tag{A.8}$$

Plugging (A.7) into (A.8), the maximum likelihood estimator of $\sigma^2$ is

$$\hat\sigma_u^2 = \frac{1}{3n} \sum_{i=1}^{n} \left[ (2u_i - o_i - c_i)^2 - \hat\mu^2 \right]. \tag{A.9}$$

The Hessian matrix $H$ of (A.6) has entries

$$\frac{\partial^2 \ell}{\partial \mu^2} = -\frac{n}{\sigma^2}, \qquad \frac{\partial^2 \ell}{\partial \mu\,\partial \sigma^2} = \frac{n\mu - \sum_{i=1}^{n} (c_i - o_i)}{\sigma^4}, \qquad \frac{\partial^2 \ell}{\partial (\sigma^2)^2} = \frac{3n}{2\sigma^4} - \sum_{i=1}^{n} \frac{(2u_i - o_i - c_i)^2 + \mu^2 - 2\mu(c_i - o_i)}{\sigma^6}.$$

Plugging (A.7) and (A.9) into $H$, we obtain

$$H\big|_{\mu = \hat\mu,\ \sigma^2 = \hat\sigma_u^2} = \begin{pmatrix} -\dfrac{n}{\hat\sigma_u^2} & 0 \\ 0 & -\dfrac{3n}{2\hat\sigma_u^4} \end{pmatrix},$$

and it is clear that $H$ is a negative definite matrix. The maximum likelihood estimator of $(\mu, \sigma^2)$ based on Theorem 2 can be derived analogously, and the proof is omitted here.

Proof of Proposition 2

We derive the one-step forecast for the log maximum value. The one-step prediction of the log minimum value can be obtained by using the same technique. By the joint density of Theorem 1, the marginal density of the log maximum given the log opening value is

$$f(u \mid o) = \int_{-\infty}^{u} \frac{2(2u - o - c)}{\sqrt{2\pi}\,\sigma^3} \exp\left\{ -\frac{(2u - o - c)^2}{2\sigma^2} - \frac{\mu^2}{2\sigma^2} + \frac{\mu(c - o)}{\sigma^2} \right\} dc = \exp\left\{ \frac{2\mu(u - o)}{\sigma^2} \right\} \int_{u - o + \mu}^{\infty} \frac{2(x - \mu)}{\sqrt{2\pi}\,\sigma^3} \exp\left\{ -\frac{x^2}{2\sigma^2} \right\} dx$$

$$= \sqrt{\frac{2}{\pi\sigma^2}} \exp\left\{ -\frac{(u - o - \mu)^2}{2\sigma^2} \right\} - \frac{2\mu}{\sigma^2} \exp\left\{ \frac{2\mu(u - o)}{\sigma^2} \right\} \left[ 1 - \Phi\left( \frac{u - o + \mu}{\sigma} \right) \right]$$

for $u > o$. Then the expectation of $U$ given $O = o$ is

$$E[U \mid O = o] = \int_{o}^{\infty} u \sqrt{\frac{2}{\pi\sigma^2}} \exp\left\{ -\frac{(u - o - \mu)^2}{2\sigma^2} \right\} du - \int_{o}^{\infty} \frac{2\mu u}{\sigma^2} \exp\left\{ \frac{2\mu(u - o)}{\sigma^2} \right\} \left[ 1 - \Phi\left( \frac{u - o + \mu}{\sigma} \right) \right] du = L_1 - L_2 \ \text{(say)}. \tag{A.10}$$

For the term $L_1$, we obtain the result by a change of variable:

$$L_1 = \sqrt{\frac{2\sigma^2}{\pi}}\, e^{-\frac{\mu^2}{2\sigma^2}} + 2(o + \mu)\,\Phi\left( \frac{\mu}{\sigma} \right). \tag{A.11}$$

To tackle the term $L_2$, we exchange the order of integration, which gives

$$L_2 = \sqrt{\frac{\sigma^2}{2\pi}}\, e^{-\frac{\mu^2}{2\sigma^2}} + \left( o + \mu - \frac{\sigma^2}{2\mu} \right) \Phi\left( \frac{\mu}{\sigma} \right) + \left( \frac{\sigma^2}{2\mu} - o \right) \left[ 1 - \Phi\left( \frac{\mu}{\sigma} \right) \right]. \tag{A.12}$$

Combining (A.11) and (A.12), the conditional expectation (A.10) becomes

$$E[U \mid O = o] = \sqrt{\frac{2\sigma^2}{\pi}}\, e^{-\frac{\mu^2}{2\sigma^2}} + o + \mu\,\Phi\left( \frac{\mu}{\sigma} \right) - \sqrt{\frac{\sigma^2}{2\pi}}\, e^{-\frac{\mu^2}{2\sigma^2}} - \frac{\sigma^2}{2\mu} \left[ 1 - 2\Phi\left( \frac{\mu}{\sigma} \right) \right]. \tag{A.13}$$

Finally, plugging the maximum likelihood estimators of $\mu$ and $\sigma^2$ into (A.13), we obtain the claim.

Proof of Corollary 1

By L'Hôpital's rule, the limit of the final term of $U_{t+1}(1)$ is

$$\lim_{\mu \to 0} \frac{\sigma^2 \left[ 1 - 2\Phi(\mu/\sigma) \right]}{2\mu} = \lim_{\mu \to 0} \frac{-2\sigma\,\phi(\mu/\sigma)}{2} = -\frac{\sigma}{\sqrt{2\pi}},$$

where $\phi$ is the standard normal density. Therefore,

$$\lim_{\mu \to 0} U_{t+1}(1) = \sqrt{\frac{2\sigma^2}{\pi}} + O_{t+1} - \sqrt{\frac{\sigma^2}{2\pi}} + \frac{\sigma}{\sqrt{2\pi}} = O_{t+1} + \sqrt{\frac{2\sigma^2}{\pi}}.$$

A similar procedure can be applied to $L_{t+1}(1)$, which completes the proof.

Proof of Theorem 4

By Theorem 1, we have the following joint density of $(U, C)$ conditional on $O = o$ and $\sigma_i^2 = s$:

$$f_{U,C|O,\sigma_i^2}(u, c \mid o, s) = \frac{2(2u - o - c)}{\sqrt{2\pi}\,s^{3/2}} \exp\left\{ -\frac{(2u - o - c)^2}{2s} - \frac{\mu^2}{2s} + \frac{\mu(c - o)}{s} \right\}. \tag{A.14}$$

Since $\sigma_i^2$ follows (4), the stationary distribution of $\sigma_i^2$ is the inverse gamma distribution, i.e.,

$$f_{\sigma_i^2}(s) = \frac{\beta^\alpha}{\Gamma(\alpha)}\, s^{-\alpha - 1} \exp\{ -\beta/s \}. \tag{A.15}$$

Then, integrating out the volatility, we obtain the joint density of $(U, C)$ conditional on $O = o$ by combining (A.14) and (A.15) as follows:

$$f_{U,C|O}(u, c \mid o) = \int_0^\infty f_{U,C|O,\sigma_i^2}(u, c \mid o, s)\, f_{\sigma_i^2}(s)\, ds = \frac{2(2u - o - c)\,\beta^\alpha\,\Gamma(\alpha + 3/2)}{\sqrt{2\pi}\,\Gamma(\alpha)} \left[ \frac{(2u - o - c)^2 + \mu^2 - 2\mu(c - o) + 2\beta}{2} \right]^{-(\alpha + 3/2)}.$$

Analogously, by using Theorem 2, we can obtain the joint density of $(L, C)$ conditional on $O = o$. Finally, by using Theorem 3, the joint density of $(U, L, C)$ conditional on $O = o$ is obtained by integrating the density of Theorem 3 against (A.15) term by term, which completes the proof.

Author Contributions

Writing - original draft: Liang-Ching Lin, Li-Hsien Sun.

References

1. Billard L., and Diday E. From the Statistics of Data to the Statistics of Knowledge: Symbolic Data Analysis. Journal of the American Statistical Association. 2003; 98(462): 470-487. https://doi.org/10.1198/016214503000242
2. Billard L., and Diday E. Symbolic Data Analysis: Conceptual Statistics and Data Mining. 1st ed. Chichester, UK: John Wiley & Sons; 2006.
3. Neto E.A.L., and De Carvalho F.A.T. Centre and Range method for fitting a linear regression model to symbolic interval data. Computational Statistics & Data Analysis. 2007 Jan; 52(3): 1500-1515. https://doi.org/10.1016/j.csda.2007.04.014
4. Arroyo J., González-Rivera G., and Maté C. Forecasting with Interval and Histogram Data: Some Financial Applications. Handbook of Empirical Economics and Finance. 2011; 247-279.
5. Rodrigues P.M., and Salish N. Modeling and forecasting interval time series with Threshold models: An application to S&P500 Index returns. Working paper. 2011.
6. González-Rodríguez G., Blanco A., Corral N., and Colubi A. Least squares estimation of linear regression models for convex compact random sets. Advances in Data Analysis and Classification. 2007 Mar; 1(1): 67-81. https://doi.org/10.1007/s11634-006-0003-7
7. Blanco A., Colubi A., Corral N., and Gonzalez-Rodriguez G. On a linear independence test for interval-valued random sets. Soft Methods for Handling Variability and Imprecision. Advances in Soft Computing. 2008; 111-117.
8. Gonzalez-Rivera G., Lee T.-H., and Mishra S. Jumps in cross-sectional rank and expected returns: a mixture model. Journal of Applied Econometrics. 2008 Aug; 23(5): 585-606. https://doi.org/10.1002/jae.1015
9. Teles P., and Brito P. Modeling Interval Time Series with Space-Time Processes. Communications in Statistics-Theory and Methods. 2015; 44(17): 3599-3627. https://doi.org/10.1080/03610926.2013.782200
10. Chou R. Forecasting financial volatilities with extreme values: The conditional autoregressive range (CARR) model. Journal of Money, Credit and Banking. 2005; 37: 561-582. https://doi.org/10.1353/mcb.2005.0027
11. Chou R. Modeling the asymmetry of stock movements using price ranges. Econometric Analysis of Financial and Economic Time Series (Advances in Econometrics, Volume 20 Part 1), Terrell Dek, Fomby Thomas B. (eds.). Emerald Group Publishing Limited. 2006; 231-257.
12. Chen C.W.S., Gerlach R., and Lin E.M.H. Volatility forecast using threshold heteroskedastic models of the intra-day range. Computational Statistics & Data Analysis. 2008; 52: 2990-3010. https://doi.org/10.1016/j.csda.2007.08.002
13. Andersen T.G., Bollerslev T., Diebold F.X., and Ebens H. The Distribution of Realized Stock Return Volatility. Journal of Financial Economics. 2001 Jul; 61(1): 43-76. https://doi.org/10.1016/S0304-405X(01)00055-1
14. Aït-Sahalia Y., Mykland P.A., and Zhang L. How Often to Sample a Continuous-Time Process in the Presence of Market Microstructure Noise. Review of Financial Studies. 2005 Jul; 18(2): 351-416. https://doi.org/10.1093/rfs/hhi016
15. Shreve S.E. Stochastic Calculus for Finance II: Continuous-Time Models. 2nd ed. New York: Springer; 2004.
16. Choi B., and Roh J. On the trivariate joint distribution of Brownian motion and its maximum and minimum. Statistics and Probability Letters. 2013 Apr; 83(4): 1046-1053. https://doi.org/10.1016/j.spl.2012.12.015
17. Hull J., and White A. The pricing of options on assets with stochastic volatilities. Journal of Finance. 1987; 42: 281-300. https://doi.org/10.1111/j.1540-6261.1987.tb02568.x
18. Stein E.M., and Stein J.C. Stock price distributions with stochastic volatility: an analytic approach. Review of Financial Studies. 1991; 4: 727-752. https://doi.org/10.1093/rfs/4.4.727
19. Heston S.L. A Closed-Form Solution for Options with Stochastic Volatility with Applications to Bond and Currency Options. The Review of Financial Studies. 1993; 6: 327-343. https://doi.org/10.1093/rfs/6.2.327
20. Bibby B.M., Skovgaard I.M., and Sørensen M. Diffusion-type models with given marginal distribution and autocorrelation function. Bernoulli. 2005; 11: 191-220. https://doi.org/10.3150/bj/1116340291
21. Helfrick A.D., and Cooper W.D. Modern Electronic Instrumentation and Measurement Techniques. Prentice-Hall of India Pvt. Ltd., New Delhi. 1996.
22. Akahori J., Song X., and Wang T.-H. Bridge representation and modal-path approximation. Stochastic Processes and their Applications. Forthcoming.


Liang-Ching Lin, Li-Hsien Sun. Modeling financial interval time series, PLOS ONE, 2019, DOI: 10.1371/journal.pone.0211709