## Financial Statistics | Forecasting

In this section we consider forecasting and validation of the proposed model. Usually two data sets are used to test the forecasting ability of a model: one (in-sample) used for estimation, and the other (out-of-sample) used for comparing forecasts with true values. Volatility models bring an extra complication: there is no unique definition of volatility. Andersen and Bollerslev (1998) show that if wrong estimates of volatility are used, the evaluation of forecasting accuracy is compromised. We could use the realized volatility as a basis for comparison, or use some trading system.

We could, for example, have a model for hourly returns and use the realized volatility computed from 15 min returns for comparisons. In general, we can compute $v_{h, t}=\sum_{i=1}^{a_{h}} r_{t-i}^{2}$, where $a_{h}$ is the aggregation factor (4, in the case of 15 min returns). We can then use some measure based on $s_{h}=\tilde{v}_{h, t}-v_{h, t}$, for example the mean squared error, where $\tilde{v}_{h, t}$ is the volatility predicted by the proposed model. See Taylor and Xu (1997), for example.

J. Risk Financial Manag. 2020, 13, 38

Now consider Model (3). The forecast of volatility at origin $t$ and horizon $l$ is given by
$$\begin{aligned} \hat{\sigma}_{t}^{2}(l) &=E\left(\sigma_{t+l}^{2} \mid X_{t}\right) \\ &=E\left(C_{0}+C_{1}\left(r_{t+l-1}+\ldots+r_{t+l-a_{1}}\right)^{2}+\ldots\right. \\ &\quad\left.+C_{m}\left(r_{t+l-1}+\ldots+r_{t+l-a_{m}}\right)^{2}+b_{1} \sigma_{t+l-1}^{2}+\ldots+b_{p} \sigma_{t+l-p}^{2} \mid X_{t}\right), \end{aligned}$$
where $X_{t}=\left(r_{t}, \sigma_{t}, r_{t-1}, \sigma_{t-1}, \ldots\right)$, for $l=1,2, \ldots$.
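The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the parameter values and the sample returns are all hypothetical. The one-step forecast assumes that Model (3) has the form implied by the forecast expression, so that for $l=1$ every term inside the expectation is already contained in $X_{t}$ and the forecast reduces to a direct computation.

```python
def realized_volatility(returns, a_h):
    """v_{h,t} = sum_{i=1}^{a_h} r_{t-i}^2, for t = a_h, ..., len(returns)."""
    return [
        sum(r * r for r in returns[t - a_h:t])
        for t in range(a_h, len(returns) + 1)
    ]


def mean_squared_error(predicted, realized):
    """Mean of s_h^2, where s_h = predicted volatility - realized volatility."""
    if len(predicted) != len(realized):
        raise ValueError("predicted and realized must have the same length")
    return sum((p - v) ** 2 for p, v in zip(predicted, realized)) / len(realized)


def one_step_forecast(returns, sigma2, c0, c, a, b):
    """One-step forecast (l = 1) under the assumed form of Model (3).

    returns : past returns, most recent last (r_t is returns[-1])
    sigma2  : past variances, most recent last (sigma_t^2 is sigma2[-1])
    c, a    : coefficients C_1..C_m and aggregation lengths a_1..a_m
    b       : coefficients b_1..b_p
    """
    # C_j * (r_t + ... + r_{t-a_j+1})^2 for each aggregation length a_j
    aggregated = sum(c_j * sum(returns[-a_j:]) ** 2 for c_j, a_j in zip(c, a))
    # b_i * sigma_{t+1-i}^2 for i = 1..p
    lagged = sum(b_i * sigma2[-i] for i, b_i in enumerate(b, start=1))
    return c0 + aggregated + lagged


# Hypothetical 15 min returns; a_h = 4 aggregates them to an hourly measure.
sample_returns = [0.001, -0.002, 0.0015, 0.0005, -0.001, 0.002, 0.0, -0.0005]
hourly_rv = realized_volatility(sample_returns, a_h=4)
```

Given a list of model forecasts aligned with `hourly_rv`, `mean_squared_error` gives the out-of-sample criterion mentioned in the text. Forecasts for $l>1$ would need the conditional expectations of future squared aggregated returns and are not covered by this sketch.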

## Financial Statistics | High Frequency Data

In this section we further elaborate on high frequency data and introduce the series that will be analyzed later. High frequency data are very important in the financial environment, mainly because large movements occur in short intervals of time. This aspect represents an interesting opportunity for trading. Furthermore, it is well known that volatilities at different frequencies have significant cross-correlation. We can even say that coarse volatility predicts fine volatility better than the reverse, as shown in Dacorogna et al. (2001).

As an example, take the tick-by-tick foreign exchange (FX) time series Euro-Dollar, from January 1, 1999 to December 31, 2002. Returns are calculated using bid and ask prices, as
$$r_{t}=\ln \left(\left(p_{t}^{\text{bid}}+p_{t}^{\text{ask}}\right) / 2\right)-\ln \left(\left(p_{t-1}^{\text{bid}}+p_{t-1}^{\text{ask}}\right) / 2\right) .$$
We discard Saturdays and Sundays, and we replace holidays with the means of the last ten observations of the returns for each respective hour and day. After cleaning the data (see Dacorogna et al. (2001), for details) we will consider equally spaced returns, with sampling interval $\Delta t=15 \mathrm{~min}$. This seems to be adequate, as many studies indicate.
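As a rough illustration of the return formula and the weekend filter, the sketch below computes log mid-price returns from bid and ask quotes and drops Saturday and Sunday observations. The function names and the quote values are hypothetical, the quotes are assumed to be already on an equally spaced grid, and the holiday replacement and the cleaning steps of Dacorogna et al. (2001) are not reproduced.

```python
import math
from datetime import datetime


def mid_price_log_returns(bid, ask):
    """r_t = ln((bid_t + ask_t) / 2) - ln((bid_{t-1} + ask_{t-1}) / 2)."""
    log_mid = [math.log((b + a) / 2.0) for b, a in zip(bid, ask)]
    return [log_mid[i] - log_mid[i - 1] for i in range(1, len(log_mid))]


def drop_weekends(timestamps, values):
    """Keep only observations stamped Monday to Friday."""
    # datetime.weekday(): Monday is 0, Saturday is 5, Sunday is 6
    return [(ts, v) for ts, v in zip(timestamps, values) if ts.weekday() < 5]


# Hypothetical quotes, for illustration only.
bid = [1.1700, 1.1702, 1.1698]
ask = [1.1704, 1.1706, 1.1702]
returns = mid_price_log_returns(bid, ask)
```

Whether the weekend filter is applied to the quotes before computing returns or to the returns afterwards is a choice the text does not spell out, so the two steps are kept as separate functions here.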

Figure 2 shows Euro-Dollar returns calculated as above. The length of this time series is 95,317. The figure shows that the absolute returns present a seasonal pattern. This is because physical time does not necessarily follow the same pattern as business time. This is typical behavior for a financial time series, and we will use a seasonal adjustment procedure similar to that of Martens et al. (2002). However, we will use absolute returns instead of squared returns; that is, we will compute the seasonal pattern as
$$S_{d, s, h}=\frac{1}{s} \sum_{j=1}^{s}\left|r_{d, j, h}\right|,$$
where $r_{d, s, h}$ is the return on weekday $d$, week $s$ and hour $h$, and $s$ is the number of weeks from the beginning of the series. Therefore, $S_{d, s, h}$ is the rolling window mean of the absolute returns with the beginning fixed, that is, an expanding-window mean.
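The seasonal pattern can be computed with a running sum per (weekday, hour) slot. The sketch below is one possible way to do it, assuming a single return per weekday, week and hour, supplied in chronological order; the function name and the sample values are hypothetical.

```python
from collections import defaultdict


def seasonal_pattern(observations):
    """Expanding mean of absolute returns for each (weekday, hour) slot.

    observations: iterable of (d, j, h, r) tuples in chronological order,
    with exactly one return r per weekday d, week j and hour h (assumed).
    Returns a dict mapping (d, s, h) to S_{d,s,h}.
    """
    running_sum = defaultdict(float)
    weeks_seen = defaultdict(int)
    pattern = {}
    for d, _week, h, r in observations:
        slot = (d, h)
        running_sum[slot] += abs(r)
        weeks_seen[slot] += 1
        s = weeks_seen[slot]  # number of weeks observed so far for this slot
        pattern[(d, s, h)] = running_sum[slot] / s
    return pattern


# Hypothetical returns: Monday (d=0) at hour 9 over two weeks, Tuesday over one.
obs = [(0, 1, 9, 0.02), (1, 1, 9, 0.01), (0, 2, 9, -0.04)]
pattern = seasonal_pattern(obs)
```

Because the start of the window is fixed, each new week only updates the running sum and the count, so the whole pattern is obtained in one pass over the data.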


