Which of the following statements about forecasting is not correct?

Chapter 5

1

Which of the following is a typical characteristic of financial asset return time-series?

a) Their distributions are thin-tailed
b) They are not weakly stationary
c) They are highly autocorrelated
d) They have no trend

Correct! Most asset return distributions are leptokurtic - that is, they are "fat-tailed", or have more of the distribution in the tails than would a normal distribution with the same mean and variance. They are usually more peaked at the mean as well. Although price series (or log-price series) are usually best characterised as unit root processes, returns definitely have no trend - either deterministic or stochastic. Therefore d is correct. Asset returns usually show very low autocorrelation. This is to be expected since if the returns were highly dependent upon their previous values, it would be easy to generate a trading rule that exploited this feature, such that the autocorrelation would quickly disappear.


2

Which of the following is a DISADVANTAGE of using pure time-series models (relative to structural models)?

a) They are not theoretically motivated
b) They cannot produce forecasts easily
c) They cannot be used for very high frequency data
d) It is difficult to determine the appropriate explanatory variables for use in pure time-series models

Correct! a is the right answer: pure time-series models don't have a strong theoretical motivation. Why should the current value of, say, a stock return, be related to its previous values and to the previous values of some random error process? It is much easier to justify why stock returns should be related to the current and previous values of some macroeconomic variables that affect the profitability, and therefore the valuation, of firms. Time-series models can produce forecasts easily, and this is one of their main advantages. This is covered in a later topic, but producing forecasts from time-series models is simply a matter of iterating with the conditional expectations operator. On the other hand, producing forecasts from structural models would require forecasts of all of the structural variables in the equation. By definition, time-series models don't use any explanatory variables, so the issue of variable choice doesn't arise. Finally, time-series models can be applied whatever the frequency of the data, since the methodology would be the same.


3

Which of the following conditions are necessary for a series to be classifiable as a weakly stationary process?

(i) It must have a constant mean

(ii) It must have a constant variance

(iii) It must have constant autocovariances for given lags

(iv) It must have a constant probability distribution

a) (ii) and (iv) only
b) (i) and (iii) only
c) (i), (ii), and (iii) only
d) (i), (ii), (iii), and (iv)

Correct! c is correct. (i) to (iii) are all required for a process to be classifiable as a weakly stationary (or covariance stationary - the two terms are equivalent) process. The final condition of having a constant probability distribution is a stronger condition than the first three, since it applies to the whole distribution whereas the first three conditions only apply to the first two moments of the distribution (the mean and the variance). This last condition thus encompasses the first three and is only required for a process to be classifiable as a strictly stationary process.


4

A white noise process will have

(i) A zero mean

(ii) A constant variance

(iii) Autocovariances that are constant

(iv) Autocovariances that are zero except at lag zero

a) (ii) and (iv) only
b) (i) and (iii) only
c) (i), (ii), and (iii) only
d) (i), (ii), (iii), and (iv)

Correct! (ii) and (iv) are correct. A white noise process must have a constant mean, a constant variance and no autocovariance structure (except at lag zero, which is the variance). It is not necessary for a white noise process to have a zero mean - it only has to be constant. A white noise process with a zero mean is called a zero mean white noise process (!) but this is a special case. Having autocovariances that are constant is not a sufficiently strong condition for it to be white noise - they must be zero (except for at lag zero again).
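These white noise properties are easy to check by simulation. The sketch below (a hypothetical Gaussian white noise series with constant mean 0.5, chosen to emphasise that the mean only needs to be constant, not zero) estimates the sample mean and autocovariances with numpy:

```python
import numpy as np

# Hypothetical illustration: Gaussian white noise with a constant,
# non-zero mean of 0.5 (a zero mean is not required, only a constant one).
rng = np.random.default_rng(42)
T = 100_000
u = 0.5 + rng.standard_normal(T)

def autocovariance(x, k):
    """Sample autocovariance of x at lag k (lag 0 gives the variance)."""
    xbar = x.mean()
    return ((x[k:] - xbar) * (x[: len(x) - k] - xbar)).mean()

gamma0 = autocovariance(u, 0)   # close to the variance, 1
gamma1 = autocovariance(u, 1)   # close to 0
gamma5 = autocovariance(u, 5)   # close to 0
```

With 100,000 observations the sample mean comes out close to 0.5, the lag-0 autocovariance close to the variance of 1, and the autocovariances at lags 1 and 5 close to zero, exactly as conditions (ii) and (iv) require.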


5

Consider the following sample autocorrelation estimates obtained using 250 data points:

Lag          1     2      3
Coefficient  0.2  -0.15  -0.1

Assuming that the coefficients are approximately normally distributed, which of the coefficients are statistically significant at the 5% level?

a) 1 only
b) 1 and 2 only
c) 1, 2 and 3 only
d) It is not possible to determine the statistical significance since no standard errors have been given

Correct! Recall that an autocorrelation coefficient is termed statistically significant if it lies outside of +/- 1.96/sqrt(T), where T is the number of observations, 250 in this case. Thus a coefficient would be defined as significant if it is smaller than -0.124 or larger than 0.124. The autocorrelation coefficients for lags 1 and 2 are therefore both statistically significant, while the lag 3 coefficient is not. In fact, we do not need to be given standard errors for the coefficients since, under the null hypothesis of no autocorrelation, the sample autocorrelations approximately follow a normal distribution with zero mean and variance 1/T.
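The rule above is simple to apply in code; a minimal sketch for this question's coefficients:

```python
import numpy as np

# Two-sided 5% significance bound for sample autocorrelations: +/- 1.96/sqrt(T).
T = 250
coefficients = {1: 0.2, 2: -0.15, 3: -0.1}   # lag -> sample autocorrelation

bound = 1.96 / np.sqrt(T)   # approximately 0.124
significant = {lag: abs(tau) > bound for lag, tau in coefficients.items()}
# significant is {1: True, 2: True, 3: False}: lags 1 and 2 only, answer b.
```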


6

Consider again the autocorrelation coefficients described in question 5. The value of the Box-Pierce Q-statistic is

a) 0.12
b) 37.50
c) 18.12
d) 18.09

Correct! The correct answer is c. Note that the question asks for the Box-Pierce Q-statistic, not the modified Ljung-Box Q* version. Therefore, the correct formula is Q = 250 x (0.2^2 + (-0.15)^2 + (-0.1)^2) = 250 x 0.0725 = 18.125, which matches answer c (18.12) after rounding.
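The calculation can be reproduced directly from the Box-Pierce formula, Q = T x (sum of the squared sample autocorrelations):

```python
import numpy as np

# Box-Pierce statistic: Q = T * sum of squared sample autocorrelations.
T = 250
tau = np.array([0.2, -0.15, -0.1])

Q = T * np.sum(tau ** 2)   # 250 * 0.0725 = 18.125
```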


7

Which of the following statements is INCORRECT concerning a comparison of the Box-Pierce Q and the Ljung-Box Q* statistics for linear dependence in time series?

a) Asymptotically, the values of the two test statistics will be equal
b) The Q test has better small-sample properties than the Q*
c) The Q test is sometimes over-sized for small samples
d) As the sample size tends towards infinity, both tests will show a tendency to always reject the null hypothesis of zero autocorrelation coefficients.

Correct! It is correct that, asymptotically, the two test statistics will be equal in value: in the Ljung-Box formulation, the (T+2)/(T-k) adjustment factor tends to one as the sample size T tends to infinity. It is also true that the original Box-Pierce Q-statistic is sometimes over-sized for small samples, which is what motivated the formulation of a modified version (the Q* statistic). The Box-Pierce Q will thus tend to reject the null hypothesis of no autocorrelation more frequently than it should, given the nominal significance level assumed: if a 5% significance level is used, the null is sometimes rejected more than 5% of the time by chance alone when there is in fact no autocorrelation. Clearly, then, answer b is the incorrect one. It is also true that, as the sample size tends to infinity, the size of the autocorrelation coefficients required to lead to rejection of the null hypothesis will fall. In other words, when the sample size gets very big, even an autocorrelation coefficient (on a -1 to 1 scale) of 0.01 can lead to rejection. This arises since calculating the test statistic involves multiplying by the sample size.
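The asymptotic equivalence is easy to see numerically: holding a hypothetical set of sample autocorrelations fixed, the ratio Q*/Q shrinks towards one as T grows.

```python
import numpy as np

# Box-Pierce Q = T * sum(tau_k^2); Ljung-Box Q* = T(T+2) * sum(tau_k^2 / (T-k)).
# The per-lag adjustment (T+2)/(T-k) tends to one as T grows.
tau = np.array([0.2, -0.15, -0.1])
lags = np.arange(1, len(tau) + 1)

ratios = {}
for T in (50, 250, 10_000):
    Q = T * np.sum(tau ** 2)
    Q_star = T * (T + 2) * np.sum(tau ** 2 / (T - lags))
    ratios[T] = Q_star / Q
# Q* always exceeds Q in finite samples, but the ratio falls towards 1.
```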


8

Consider the following MA(3) process

y_t = μ + ε_t + θ_1ε_(t-1) + θ_2ε_(t-2) + θ_3ε_(t-3), where ε_t is a zero-mean white noise process with variance σ^2.

Which of the following statements are true?

i) The process yt has zero mean

ii) The autocorrelation function will have a zero value at lag 5

iii) The process yt has variance σ2

iv) The autocorrelation function will have a value of one at lag 0

a) (ii) and (iv) only
b) (i) and (iii) only
c) (i), (ii), and (iii) only
d) (i), (ii), (iii), and (iv)

Correct! (ii) and (iv) only are true. Even though epsilon_t has a zero mean, y_t will have a mean equal to the intercept in the MA equation, mu. Also, even though the disturbance term has a variance of sigma squared, the series y will have a variance that is also a function of the MA coefficients (the thetas). In fact, the variance will be (1+theta1^2 + theta2^2 + theta3^2) x sigma^2. Recall that an MA(q) process only has memory of length q. This means that all of the autocorrelation coefficients will have a value of zero beyond lag q. This can be seen by examining the MA equation, and seeing that only the past q disturbance terms enter into the equation, so that if we iterate this equation forward through time by more than q periods, the current value of the disturbance term will no longer affect y. Finally, since the autocorrelation function at lag zero is the correlation of y at time t with y at time t (i.e. the correlation of y_t with itself), it must be one by definition.
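These moment results can be checked by simulating an MA(3) with hypothetical coefficients (theta1, theta2, theta3) = (0.5, 0.3, 0.2), mu = 1 and unit disturbance variance:

```python
import numpy as np

# Simulate y_t = mu + eps_t + 0.5 eps_(t-1) + 0.3 eps_(t-2) + 0.2 eps_(t-3).
rng = np.random.default_rng(0)
mu, theta = 1.0, np.array([0.5, 0.3, 0.2])

T = 200_000
eps = rng.standard_normal(T + 3)   # zero-mean, unit-variance white noise
y = mu + eps[3:] + theta[0] * eps[2:-1] + theta[1] * eps[1:-2] + theta[2] * eps[:-3]

def sample_acf(x, k):
    """Sample autocorrelation of x at lag k."""
    xbar = x.mean()
    return ((x[k:] - xbar) * (x[: len(x) - k] - xbar)).mean() / x.var()

var_theory = 1.0 + np.sum(theta ** 2)   # (1 + 0.25 + 0.09 + 0.04) * sigma^2 = 1.38
# y.mean() is close to mu = 1, not zero; y.var() is close to 1.38, not sigma^2 = 1;
# sample_acf(y, 5) is close to zero (beyond lag q = 3); sample_acf(y, 0) equals 1.
```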


9

Consider a series that follows an MA(1) with zero mean and a moving average coefficient of 0.4. What is the value of the autocovariance at lag 1?

a) 0.4
b) 1
c) 0.34
d) It is not possible to determine the value of the autocovariances without knowing the disturbance variance.

Correct! In fact, d is correct. The autocovariance at lag 1 of an MA(1) with coefficient θ is θσ^2 = 0.4σ^2, which depends on the disturbance variance σ^2, so it cannot be calculated without this information. (The autocorrelation at lag 1, by contrast, would be θ/(1+θ^2) = 0.34, which is answer c; the question, however, asks for the autocovariance, not the autocorrelation.)


10

For an autoregressive process to be considered stationary

a) The roots of the characteristic equation must all lie inside the unit circle
b) The roots of the characteristic equation must all lie on the unit circle
c) The roots of the characteristic equation must all lie outside the unit circle
d) The roots of the characteristic equation must all be less than one in absolute value

Correct! A stationary autoregressive process will have all the roots of its characteristic equation lying outside the unit circle - this is equivalent to saying that the roots must all be larger than one in absolute value. The "unit circle" terminology is sufficiently flexible that the same definition can still be applied if the roots of the characteristic equation are not real (i.e. they are complex). To offer a simple illustration, consider the following explosive AR(1) model:

y_t = 2 y_(t-1) + u_t.

The characteristic equation will be 1 - 2z = 0. Therefore the root of the characteristic equation will be 0.5, which is not outside the unit circle and hence the process y is non-stationary.
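A quick way to apply this root condition in practice is to compute the roots numerically. The sketch below (using numpy, and the convention above that the AR(p) characteristic polynomial is 1 - a1 z - ... - ap z^p) checks the explosive AR(1) from the explanation:

```python
import numpy as np

def ar_is_stationary(a):
    """a = [a1, ..., ap] from y_t = a1 y_(t-1) + ... + ap y_(t-p) + u_t.
    Stationary iff all roots of 1 - a1 z - ... - ap z^p lie outside the unit circle."""
    a = np.asarray(a, dtype=float)
    coeffs = np.concatenate([-a[::-1], [1.0]])   # highest power of z first
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))

explosive = ar_is_stationary([2.0])    # root of 1 - 2z = 0 is 0.5: not stationary
stationary = ar_is_stationary([0.5])   # root of 1 - 0.5z = 0 is 2: stationary
```

Because the check uses the modulus of each (possibly complex) root, it covers the complex-root case mentioned above as well.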


11

Consider the following AR(2) process:

y_t = 1.5 y_(t-1) - 0.5 y_(t-2) + u_t

This is a

a) Stationary process
b) Unit root process
c) Explosive process
d) Stationary and unit root process

Correct! The characteristic equation for this AR(2) is 1 - 1.5z + 0.5z^2 = 0, which factorises to (1 - z)(1 - 0.5z) = 0. Thus the roots are z = 1 and z = 2. The first of these is clearly a unit root (it lies on the unit circle), while the other is a stationary root (it lies outside the unit circle). In the context of a stochastic process, the smallest root dominates. That is, if one of the roots is a unit root and the other stationary, the series will behave as a unit root process. Also, if one of the roots is explosive and one a unit root, the series will behave as an explosive process. Thus it only takes one of the roots of the characteristic equation to be non-stationary for the series to be non-stationary. More generally, an AR(p) model will have p roots (although two or more of the roots may be the same), and in the case of p > 2, any good econometrics computer package (e.g., EViews) can calculate the roots for you.
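The factorisation can be verified numerically (np.roots takes the polynomial coefficients with the highest power of z first):

```python
import numpy as np

# Roots of the characteristic equation 1 - 1.5z + 0.5z^2 = 0.
roots = np.roots([0.5, -1.5, 1.0])   # coefficients of 0.5z^2 - 1.5z + 1
# The (real) roots are 1.0 and 2.0: one root on the unit circle (unit root),
# one outside it (stationary), so the process behaves as a unit root process.
```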


12

Consider the following AR(1) model with the disturbances having zero mean and unit variance

y_t = 0.2 + 0.4 y_(t-1) + u_t

The (unconditional) mean of y will be given by

a) 0.2
b) 0.4
c) 0.5
d) 0.33
Correct! For an AR(1) process, the (unconditional) mean of y will be given by the intercept divided by (1 minus the autoregressive coefficient), which in this case is 0.2 / (1 - 0.4) = 0.33.

13

The (unconditional) variance of the AR(1) process for y given in question 12 will be

a) 1.19
b) 2.5
c) 1
d) 0.33

Correct! The (unconditional) variance of an AR(1) process is given by the variance of the disturbances divided by (1 minus the square of the autoregressive coefficient), which in this case is 1 / (1 - 0.4^2) = 1.19.
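Both closed-form moments for this AR(1) can be computed in a couple of lines:

```python
# Unconditional moments of y_t = 0.2 + 0.4 y_(t-1) + u_t with var(u) = 1.
intercept, a1, sigma2_u = 0.2, 0.4, 1.0

mean_y = intercept / (1 - a1)       # 0.2 / 0.6 = 0.333... (question 12)
var_y = sigma2_u / (1 - a1 ** 2)    # 1 / 0.84  = 1.190... (question 13)
```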


14

The value of the autocovariance function at lag 3 for the AR(1) model given in question 12 will be

a) 0.4
b) 0.064
c) 0
d) 0.076

Correct! The value of the autocovariance function at lag k for any AR(1) process with autoregressive coefficient a1 is given by a1^k multiplied by sigma^2 divided by (1 minus a1^2), which in this case is 0.4^3 x 1 / (1 - 0.4^2) = 0.076.


15

The value of the autocorrelation function at lag 3 for the AR(1) model given in question 12 will be

a) 0.4
b) 0.064
c) 0
d) 0.076
Correct! The value of the autocorrelation function at lag k for any AR(1) process with autoregressive coefficient a1 is simply given by a1^k, which in this case is 0.4^3 = 0.064.
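The lag-3 autocovariance and autocorrelation for the AR(1) of question 12 follow directly from the closed-form expressions used in questions 14 and 15:

```python
# AR(1) with a1 = 0.4 and unit disturbance variance (as in question 12).
a1, sigma2_u, k = 0.4, 1.0, 3

autocorr_k = a1 ** k                              # 0.4^3        = 0.064 (question 15)
autocov_k = a1 ** k * sigma2_u / (1 - a1 ** 2)    # 0.064 / 0.84 = 0.076 (question 14)
```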

16

Which of the following statements are true concerning the autocorrelation function (acf) and partial autocorrelation function (pacf)?

i) The acf and pacf will always be identical at lag one whatever the model

ii) The pacf for an MA(q) model will in general be non-zero beyond lag q

iii) The pacf for an AR(p) model will be zero beyond lag p

iv) The acf and pacf will be the same at lag two for an MA(1) model

a) (ii) and (iv) only
b) (i) and (iii) only
c) (i), (ii), and (iii) only
d) (i), (ii), (iii), and (iv)

Correct! The pacf measures the correlation between y_t and y_(t-k) after controlling for (removing) the effects of the intermediate lags on the current value. For example, the pacf at lag 3 measures corr(y_t, y_(t-3)) after removing the effects of y_(t-1) and y_(t-2) on y_t. Therefore, since at lag 1 there are no intermediate lags to remove, the acf and pacf will always be identical at lag 1 whatever the model, so (i) is correct. For an MA(q) model, the acf will be zero at all lags beyond q, but the MA(q) can be written as an AR(infinity). Therefore, the pacf will never be zero, but will decline geometrically, and thus (ii) is correct. For an AR(p), however, once the effects of y_(t-1), y_(t-2), ..., y_(t-p) are removed, the correlation between y_t and y_(t-p-j) will be zero for all positive integer values of j. So, whilst the acf for an AR(p) will decline geometrically, the pacf will be zero after p lags, and thus (iii) is true. Finally, although the acf will be zero at lag 2 for an MA(1), the pacf will not be, so (iv) is false.
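Statement (iv) can be made concrete with a small calculation. For an MA(1) with a hypothetical coefficient theta = 0.5, the theoretical acf at lag 1 is theta/(1 + theta^2) and is zero at lag 2, while the Durbin-Levinson recursion gives the lag-2 pacf as (rho2 - rho1^2)/(1 - rho1^2):

```python
# MA(1) with hypothetical coefficient theta = 0.5.
theta = 0.5
rho1 = theta / (1 + theta ** 2)   # acf (and pacf) at lag 1: 0.4
rho2 = 0.0                        # acf truncates to zero beyond lag q = 1

# Durbin-Levinson: pacf at lag 2 from the first two autocorrelations.
pacf2 = (rho2 - rho1 ** 2) / (1 - rho1 ** 2)   # -0.16 / 0.84, clearly non-zero
```

So the acf and pacf coincide at lag 1 (statement (i)) but differ at lag 2 for the MA(1), as claimed.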


17

An ARMA(p,q) (p, q are integers bigger than zero) model will have

a) An acf and pacf that both decline geometrically
b) An acf that declines geometrically and a pacf that is zero after p lags
c) An acf that declines geometrically and a pacf that is zero after q lags
d) An acf that is zero after p lags and a pacf that is zero after q lags

Correct! For an ARMA(p,q), the AR part will dominate the acf in the sense that it will decline geometrically and will not be zero after either p or q lags. Obviously, the MA part will have an impact on the acf up to q lags, and the acf thereafter will be identical to that of an AR(p). Similarly, the MA part will dominate the pacf in the sense that it will decline geometrically and will not be zero after either p or q lags. Obviously, the AR part will have an impact on the pacf up to p lags, and the pacf thereafter will be identical to that of an MA(q).


18

The pacf is necessary for distinguishing between

a) An AR and an MA model
b) An AR and an ARMA model
c) An MA and an ARMA model
d) Different models from within the ARMA family

Correct! The pacf is not required to distinguish between an AR and an MA process. This can be achieved using the acf, since the AR(p) will have a geometrically declining acf while the MA(q) will have an acf that truncates after q lags. Similarly, the acf is all that is required to distinguish between an ARMA(p,q) and an MA(q) process, since the ARMA(p,q) will have a geometrically declining acf while the MA(q) will have an acf that truncates after q lags. It would be very difficult to use either the acf or the pacf to distinguish between models from within the ARMA(p,q) family, since any model with p and q both at least one will have a geometrically declining acf and pacf. So the most important use of the pacf is in distinguishing between AR(p) and ARMA processes, since for the former the pacf would be zero after p lags, while for the latter the decline in the pacf would be geometric.


19

The characteristic roots of the MA process

y_t = -3u_(t-1) + 2u_(t-2) + u_t

are

a) 1 and 2
b) 1 and 0.5
c) 2 and -0.5
d) 1 and -3

Correct! The roots of the characteristic equation are found for an MA process by first using the lag operator notation and gathering all of the terms in u together as y_t = (1 - 3L + 2L^2) u_t. The characteristic equation is then 1 - 3z + 2z^2 = 0, which factorises to (1 - z)(1 - 2z) = 0, giving roots of 1 and 0.5 (so b is correct). Out of interest, this MA process is non-invertible, since invertibility would require both roots to lie outside the unit circle, while in this case there is one unit root and one explosive root. This MA process would therefore "blow up" under the AR(infinity) representation, with the coefficients getting bigger and bigger on lags further and further back into the past.
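As with the AR examples, the quoted roots of 1 and 0.5 can be confirmed numerically from the factorised polynomial (1 - z)(1 - 2z) = 2z^2 - 3z + 1:

```python
import numpy as np

# np.roots takes coefficients with the highest power of z first: 2z^2 - 3z + 1.
roots = np.roots([2.0, -3.0, 1.0])
# The (real) roots are 0.5 and 1.0, matching answer b.
```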


20

Consider the picture below and suggest the model from the following list that best characterises the process:

a) An AR(1)
b) An ARMA(2,1)
c) An MA(2)
d) An AR(2)

Correct! This is in fact an AR(2) process. The picture was generated from a simulated sample of data of length 100,000 observations. Therefore, the appropriate model can be determined much more easily than would be the case using a real sample of data. Not only is a real sample of data likely to be much smaller, but it is also likely to be "contaminated" with other features that make it much harder to determine the appropriate model order (e.g. jumps, structural breaks, volatility changes, etc.). Also, of course, no real data set would be generated by a model from the ARMA family anyway - all we would be trying to do is select the most appropriate model from this class to describe the important features of the data. The picture given in this question has two significant peaks in the pacf (the fact that the second one is negative is irrelevant, and simply indicates that the signs of the coefficients in the model differ from one another), while the acf declines fairly slowly (although it drops rapidly after lag 4). Overall, this information is consistent with an AR(2) process. In fact, the DGP was y_t = 0.5 y_(t-1) - 0.2 y_(t-2) + u_t.
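The identification logic can be checked by simulation; a minimal numpy sketch (the seed and sample construction are illustrative, not from the original) that recovers the quoted DGP coefficients and shows the pacf cutting off after lag 2:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate the stated DGP: y_t = 0.5*y_(t-1) - 0.2*y_(t-2) + u_t
T = 100_000
y = np.zeros(T)
u = rng.standard_normal(T)
for t in range(2, T):
    y[t] = 0.5 * y[t - 1] - 0.2 * y[t - 2] + u[t]

# The pacf at lag k is the last coefficient of a fitted AR(k), so fitting an
# AR(3) by least squares should recover roughly 0.5 and -0.2 with a third
# coefficient near zero - i.e. the pacf "cuts off" after lag 2.
X = np.column_stack([y[2:-1], y[1:-2], y[:-3]])   # lags 1, 2, 3
phi, *_ = np.linalg.lstsq(X, y[3:], rcond=None)
print(np.round(phi, 3))
```

With a sample this large the estimates sit very close to the true coefficients, which is exactly why identification is far easier here than with real data.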


21

Consider the picture below and suggest the model from the following list that best characterises the process:

a) An MA(2)
b) An AR(2)
c) An ARMA(1,1)
d) An AR(1)

Correct! Here, the acf is significant only for the first two lags before rapidly dropping to zero, while the pacf declines geometrically. This information is therefore consistent with an MA(2) process. The DGP was y_t = 1.2 u_(t-1) - 1.5 u_(t-2) + u_t. In fact, this is a non-invertible MA, although it still has the characteristic shapes of the acf and pacf that any MA process would have.
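The MA(2) acf cut-off described here can likewise be verified by simulating the quoted DGP; a minimal numpy sketch (seed illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate the stated DGP: y_t = 1.2*u_(t-1) - 1.5*u_(t-2) + u_t
T = 100_000
u = rng.standard_normal(T + 2)
y = u[2:] + 1.2 * u[1:-1] - 1.5 * u[:-2]

def acf(x, lag):
    """Sample autocorrelation at a given lag."""
    x = x - x.mean()
    return (x[lag:] * x[:-lag]).sum() / (x * x).sum()

# acf is clearly non-zero at lags 1 and 2 and roughly zero from lag 3 onward:
# the signature cut-off of an MA(2), invertible or not
for k in range(1, 6):
    print(k, round(acf(y, k), 3))
```

The theoretical values are rho_1 = (1.2 - 1.8)/4.69 and rho_2 = -1.5/4.69, with zero beyond lag 2, so the sample acf drops to (sampling) noise after two lags.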


22

Which of the following statements are true concerning the acf and pacf?

(i) The acf and pacf are often hard to interpret in practice

(ii) The acf and pacf can be difficult to calculate for some data sets

(iii) Information criteria represent an alternative approach to model order determination

(iv) If applied correctly, the acf and pacf will always deliver unique model selections

a) (ii) and (iv) only
b) (i) and (iii) only
c) (i), (ii), and (iii) only
d) (i), (ii), (iii), and (iv)

Correct! (i) and (iii) only are correct. The acf and pacf are often hard to interpret in practice (implying that (i) is correct and (iv) is wrong). This arises for two reasons. First, it is very difficult to use the acf and pacf to select models from within the ARMA(p,q) family, since all models with p and q both bigger than zero will have broadly the same characteristics. Second, it is quite often the case, when the acf or pacf are calculated, that the coefficients become insignificant and then significant again as the lag length is increased. For example, suppose that the acf coefficients are significant but declining from lags 1 through 5, while the pacf is significant for lags 1, 3 and 4 but not for 2 or 5. This is consistent with an AR(1) if we ignore lags 3 and 4, an AR(4) if we ignore the fact that the pacf for lag 2 is not significant, or an ARMA(p,q) if we believe that both the acf and pacf are declining geometrically. An alternative approach to model order determination would be to select the model which minimised the value of an information criterion. Finally, the acf and pacf can be calculated fairly easily using a standard set of formulae, whatever the data series, so (ii) is wrong.


23

Which of the following statements are true concerning the Box-Jenkins approach to diagnostic testing for ARMA models?

(i) The tests will show whether the identified model is either too large or too small

(ii) The tests involve checking the model residuals for autocorrelation, heteroscedasticity, and non-normality

(iii) If the model suggested at the identification stage is appropriate, the acf and pacf for the residuals should show no additional structure

(iv) If the model suggested at the identification stage is appropriate, the coefficients on the additional variables under the overfitting approach will be statistically insignificant

a) (ii) and (iv) only
b) (i) and (iii) only
c) (i), (ii), and (iii) only
d) (i), (ii), (iii), and (iv)

Correct! First, it is worth noting that the research by Box and Jenkins pre-dated all of the work done in the 1970s and 1980s on diagnostic testing for econometric models. Thus, in the Box-Jenkins world, diagnostic testing had a much more limited mandate - only checking whether the model suggested was sufficient to capture the linear dependence in the data. Box and Jenkins proposed two similar approaches to diagnostic testing of ARMA models, and in both cases, if the model proposed were adequate, no further structure should remain in the residuals of the estimated model. The first was "deliberate overfitting", which involved estimating a model larger than that suggested at the identification stage, and examining the statistical significance of the additional coefficients. If the model has captured all of the linear structure in the data, these should all be statistically insignificant. The other approach is termed "residual diagnostic testing", and this involved calculating the acf and pacf of the residuals from the estimated model. Again, if the model specified has captured all of the linear structure in the data, the acf and pacf of the residuals should suggest an ARMA(0,0) as the optimal model. In this framework, if a model that is too large (i.e. has too many MA or AR terms in it) has been suggested at the first stage, the Box-Jenkins approach to diagnostic testing could not inform you of this and there would be no way to detect it.


24

Which of the following statements are true concerning information criteria?

(i) Adjusted R-squared is an information criterion

(ii) If the residual sum of squares falls when an additional term is added, the value of the information criterion will fall

(iii) Akaike's information criterion always leads to model orders that are at least as large as those of Schwarz's information criterion

(iv) Akaike's information criterion is consistent

a) (ii) and (iv) only
b) (i) and (iii) only
c) (i), (ii), and (iii) only
d) (i), (ii), (iii), and (iv)

Correct! An information criterion is simply a measure of model fit that trades off a closer fit to the data against increasing numbers of parameters. Therefore, adjusted R-squared clearly falls under this definition. It is not, however, a very widely used criterion since it is very lax (the penalty term for adding extra parameters is too weak) and would typically select very large model orders. Clearly, since information criteria trade off the RSS and the value of the penalty term, if the RSS falls by only a very small amount when a new parameter is added, the value of the information criterion will rise. It can be shown that AIC always leads to model orders that are at least as big as those selected by SBIC; AIC would never choose a model with a smaller number of parameters than that chosen by SBIC. Finally, AIC is in fact NOT consistent. That is, even as the sample size increases towards infinity, AIC will still on average deliver a model that is too big. An obvious question, therefore, is why is it used at all?! The answer is that the objective may be to fit all of the linear structure in the data, and therefore, since in practice we don't know what the true DGP is, to err on the side of forming too large a model. Also, AIC can be more efficient than SBIC, so that different samples from within the population can lead to similar model orders under AIC but more variable model orders under SBIC.
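The trade-off can be made concrete with one standard textbook form of the criteria, AIC = ln(RSS/T) + 2k/T and SBIC = ln(RSS/T) + k ln(T)/T; a toy numpy sketch in which the RSS figures are invented purely for illustration:

```python
import numpy as np

T = 200  # assumed sample size

def aic(rss, k):
    return np.log(rss / T) + 2 * k / T

def sbic(rss, k):
    return np.log(rss / T) + k * np.log(T) / T

# Hypothetical residual sums of squares for AR(0)..AR(3): each extra lag
# lowers the RSS, but by less and less (illustrative numbers only)
rss = {0: 120.0, 1: 100.0, 2: 98.5, 3: 98.4}

best_aic = min(rss, key=lambda k: aic(rss[k], k))
best_sbic = min(rss, key=lambda k: sbic(rss[k], k))
print(best_aic, best_sbic)  # AIC picks the larger order
```

Because ln(T) > 2 for any T > 7, SBIC's penalty per parameter is the stiffer one, which is why AIC's chosen order is always at least as large as SBIC's.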


25

Consider the following ARMA(2,1) equation (with standard errors in parentheses) that has been estimated as part of the Box-Jenkins overfitting strategy for testing the adequacy of the chosen AR(1) model.

Which model do you think, given these results, is the most appropriate for the data?

a) An AR(1)
b) An AR(2)
c) An ARMA(2,1)
d) The appropriate response to this set of diagnostic results would be to go back to the identification stage and propose a larger model.

Correct! In fact, d is the best response. Clearly the additional "overfitted" terms are statistically significant, so the original AR(1) model cannot be viewed as adequate. However, even though both the y_(t-2) and u_(t-1) terms are statistically significant, suggesting evidence for an ARMA(2,1) structure, once the model selected at the identification stage has been rejected, we need to go back to the drawing board. Although an ARMA(2,1) is probably better than an AR(1) in this case, it may be that some other, larger model would capture the features of the data better still, and whether this is true or not could only be determined by a re-examination of the acf and pacf of the original data.


26

Which of the following statements are true concerning the class of ARIMA(p,d,q) models?

(i) The "I" stands for independent

(ii) An ARIMA(p,1,q) model estimated on a series of logs of prices is equivalent to an ARIMA(p,0,q) model estimated on a set of continuously compounded returns

(iii) It is plausible for financial time series that the optimal value of d could be 2 or 3.

(iv) The estimation of ARIMA models is incompatible with the notion of cointegration

a) (ii) and (iv) only
b) (i) and (iii) only
c) (i), (ii), and (iii) only
d) (i), (ii), (iii), and (iv)

Correct! (ii) and (iv) only are correct. The "I" in ARIMA stands for integrated. This is clearly related to the concept of a unit root, and d is the number of times that the series must be differenced to make it stationary. Usually, therefore, for financial time-series, the required value of d will be 0 or probably 1. It will almost certainly not be 2 and never 3. The way that ARIMA modelling is usually approached is to take the required number of differences first and then to estimate an ARMA model on the resulting differenced series. Thus an ARIMA(p,d,q) model is equivalent to an ARMA(p,q) model estimated on a series that has been differenced d times. So, estimating an ARMA(p,q) model on a series of log-returns (continuously compounded returns) is equivalent to estimating an ARIMA(p,1,q) model on the logs of the prices. Again, the specification of ARIMA models pre-dated Engle and Granger's work on cointegration, and if we believe that the long-run relationship between series is important, we should not estimate ARIMA models. Under the ARIMA model-building approach, each series would be differenced the required number of times and then ARMA models applied separately to each series. Therefore, all of the long-run relationships between series would be lost.
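The equivalence in (ii) rests on the fact that the first difference of log prices is the continuously compounded return; a one-line numpy check (the prices are hypothetical):

```python
import numpy as np

# A few hypothetical daily prices (illustrative only)
p = np.array([100.0, 101.5, 100.8, 102.3])

# Continuously compounded returns computed two equivalent ways:
r_from_diff = np.diff(np.log(p))       # first difference of log prices (d = 1)
r_direct = np.log(p[1:] / p[:-1])      # log of gross returns

print(np.allclose(r_from_diff, r_direct))  # True
```

So an ARMA(p,q) fitted to r is, term for term, an ARIMA(p,1,q) fitted to log p.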


27

Which of the following statements is true concerning forecasting in econometrics?

a) Forecasts can only be made for time-series data
b) Mis-specified models are certain to produce inaccurate forecasts
c) Structural forecasts are simpler to produce than those from time series models
d) In-sample forecasting ability is a poor test of model adequacy

Correct! Forecasts can be made for cross-sectional as well as time-series data. For example, we could produce a forecast for the price of a house that is just going to be put onto the market, given its characteristics. Or we could predict the credit rating of a newly privatised company that wishes to issue debt. In both of those cases, the data from which the relevant models would be built is cross-sectional. Although it is likely that mis-specified models will produce poor forecasts, it is still possible that a mis-specified model could produce better forecasts when evaluated on an out-of-sample portion of data than a well-specified model. This will arise, for example, if a model is "overfitted" to the in-sample data and therefore captures some irrelevant specific features of that in-sample data. Structural forecasts are typically harder to produce than time-series forecasts since, to form structural forecasts, we usually require predictions of the structural variables as well. It is true that in-sample forecasting ability is a poor test of a model. This arises since using the same set of data both to estimate the model and to evaluate the forecasts means that we have cheated. It is very easy to improve the in-sample forecast accuracy by fitting a larger model. It would be rather like a magician guessing which playing card you have selected from a pack if he has already seen it!


28

If a series, y, follows a random walk, what is the optimal one-step ahead forecast of y?

a) The current value of y
b) Zero
c) One
d) The average value of y over the in-sample period

Correct! If y follows a random walk, the optimal guess of the next value of y will be the most recently available value, so a is correct. The random walk for y can be written y_t = y_(t-1) + u_t. Now suppose that the current period is time (t-1). Then the forecast, made at this time for the next period, t, will be given by taking expectations: E[y_t] = E[y_(t-1) + u_t]. So, E[y_t] = E[y_(t-1)] + E[u_t]. But the expected value of the next period disturbance term is zero, and the expected value of y_(t-1) is the actual value that we have already observed. Therefore, E[y_t] = y_(t-1).

29

If a series, y, follows a random walk with drift b, what is the optimal one-step ahead forecast of the change in y?

a) The current value of y
b) Zero
c) One
d) The average value of the change in y over the in-sample period

Correct! Note that we now want a forecast for the first difference of y, and not the level of y, and now the model has a drift, b: y_t = b + y_(t-1) + u_t. We can write this model in first differenced form as y_t - y_(t-1) = b + u_t, and we can write the first difference as dy_t = y_t - y_(t-1). Applying the same logic as above, and taking expectations at time (t-1) of the values in time t, E(dy_t) = E[b + u_t] = E[b] + E[u_t]. But the expected value of the constant b is b, and the expected value of the next period disturbance is zero, so E(dy_t) = b. b is the drift, which would simply be estimated as the average value of the change in y over the in-sample period.
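This can be illustrated by simulating a drifting random walk and estimating the drift as the mean first difference; a minimal numpy sketch (the drift value, sample size and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate a random walk with drift b: y_t = b + y_(t-1) + u_t
b, T = 0.5, 10_000
u = rng.standard_normal(T)
y = np.cumsum(b + u)

# Optimal one-step forecast of the CHANGE in y is the drift, estimated
# in-sample as the average first difference; the forecast of the LEVEL
# is the last observed value plus that drift estimate.
b_hat = np.diff(y).mean()
forecast_change = b_hat
forecast_level = y[-1] + b_hat

print(round(b_hat, 2))  # close to the true drift of 0.5
```

Setting b = 0 recovers the previous question: the forecast change collapses to zero and the level forecast to the current value of y.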


30

An "ex ante" forecasting model is one which

a) Includes only contemporaneous values of variables on the RHS
b) Includes only contemporaneous and previous values of variables on the RHS
c) Includes only previous values of variables on the RHS
d) Includes only contemporaneous values of exogenous variables on the RHS

Correct! By definition, an "ex ante" ("before the event") model is one that uses only lagged values of variables on the RHS. These models are particularly useful for time-series forecasting since, even if the RHS variables are exogenous variables, we can produce forecasts from the model without requiring forecasts for the exogenous variables for as many steps ahead as the RHS variables are lagged. For example, if the RHS variables are all one-period lagged values while the LHS variable is a current value, we can produce one-step ahead forecasts without needing to have forecasts of the explanatory variables.

Incorrect! By definition, an "ex ante" ("before the event") model is one that uses only lagged values of variables on the RHS. These models are particularly useful for time-series forecasting since, even if the RHS variables are exogenous variables, we can produce forecasts from the model without requiring forecasts for the exogenous variables for as many steps ahead as the RHS variables are lagged. For example, if the RHS variables are all one-period lagged values while the LHS variable is a current value, we can produce one-step ahead forecasts without needing to have forecasts of the explanatory variables.


31

Consider the following MA(2) model

yt = 0.3 + 0.5ut-1 - 0.4ut-2 + ut

What is the optimal two-step ahead forecast from this model, made at time t, if the values of the residuals from the model at time t and t-1 were 0.6 and -0.1 respectively and the values of the actual series y at time t-1 was -0.4?

a) 0
b) 0.3
c) 0.24
d) 0.64

Correct! What we want is a forecast for y_(t+2). Iterating the model forward in time for one and two time-steps, we would have

y_(t+1) = 0.3 + 0.5u_t - 0.4u_(t-1) + u_(t+1)

y_(t+2) = 0.3 + 0.5u_(t+1) - 0.4u_(t) + u_(t+2).

Taking expectations of these two equations gives respectively

E[y_(t+1)] = 0.3 + 0.5E[u_t] - 0.4E[u_(t-1)] + E[u_(t+1)]

E[y_(t+2)] = 0.3 + 0.5E[u_(t+1)] - 0.4E[u_(t)] + E[u_(t+2)]

If expectations are taken at time t, only quantities up to and including time t are known, so E[u_(t+1)] = 0 and E[u_(t+2)] = 0, while E[u_t] = u_t = 0.6 and E[u_(t-1)] = u_(t-1) = -0.1. Plugging these values into the last two equations above gives E[y_(t+1)] = 0.3 + (0.5 x 0.6) - (0.4 x -0.1) = 0.64, and E[y_(t+2)] = 0.3 - (0.4 x 0.6) = -0.24. Therefore the optimal 2-step ahead forecast is 0.24, and c is correct. Note that the value of the actual series at time t-1 is a "red herring" - that is, it is a useless piece of information, since under an MA specification the current value of the series depends on current and previous values of an error term, and not on previous values of the series itself.


32

What is the optimal three-step ahead forecast from the MA(2) model given in question 31?

a) 0
b) 0.3
c) 0.24
d) 0.64

Correct! Iterating the model given in question 31 forward three time steps gives

y_(t+3) = 0.3 + 0.5u_(t+2) - 0.4u_(t+1) + u_(t+3)

and taking expectations of this leads to

E[y_(t+3)] = 0.3 + 0.5E[u_(t+2)] - 0.4E[u_(t+1)] + E[u_(t+3)].

Since all of the terms on the RHS of this equation except the intercept involve disturbance terms dated (t+1) or later, their expectations are all zero. Thus the 3-step ahead forecast collapses to the intercept (0.3) and b is correct. This is to be expected, since an MA(q) has a memory of only q periods, so any forecast made more than q steps into the future will collapse to the intercept (or to zero if there is no intercept).
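The collapse of MA(2) forecasts beyond the model's memory can be reproduced mechanically; a small sketch (the forecast helper is illustrative, not from the original):

```python
# Forecasts from y_t = 0.3 + 0.5*u_(t-1) - 0.4*u_(t-2) + u_t, made at time t
# with observed shocks u_t = 0.6 and u_(t-1) = -0.1.  Future shocks have
# expectation zero, so only shocks already observed contribute.
c, theta1, theta2 = 0.3, 0.5, -0.4
u_t, u_tm1 = 0.6, -0.1

def forecast(h):
    """h-step ahead conditional expectation: a known shock enters only
    while the MA(2) memory still reaches back to it."""
    known = {0: u_t, -1: u_tm1}                # shocks dated t and t-1
    return (c
            + theta1 * known.get(h - 1, 0.0)   # coefficient on u_(t+h-1)
            + theta2 * known.get(h - 2, 0.0))  # coefficient on u_(t+h-2)

print(round(forecast(1), 2))  # 0.64: both observed shocks still in memory
print(round(forecast(3), 2))  # 0.3: beyond the memory, only the intercept remains
```

Every horizon beyond q = 2 returns the same value, the intercept, exactly as argued above.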


33

Which one of the following statements is true concerning alternative forecast accuracy measures?

a) Mean squared error is usually highly correlated with trading rule profitability
b) Mean absolute error provides a quadratic loss function
c) Mean absolute percentage error is a useful measure for evaluating asset return forecasts
d) Mean squared error penalises large forecast errors disproportionately more than small forecast errors

Correct! Neither mean squared error (MSE) nor mean absolute error (MAE), nor any other statistical measure, will typically be highly correlated with the performance of trading rules based on statistical forecasts. Sometimes, the percentage of times that the forecast correctly predicts the sign of the next return is a more useful guide as to the performance of the forecasts when used in a financial trading rule. One possible explanation for this is that trading rules usually only require a buy or sell indicator (given by the predicted sign of the next return), while statistical measures such as MSE or MAE depend also upon how far the forecast is away from the actual value. MSE squares the forecast errors, and therefore it is this rather than MAE that provides a quadratic loss function. Mean absolute percentage error (MAPE) is pretty useless for any series when the actual series, y, can take on values close to zero. This arises since MAPE divides the forecast error by the actual value, so if the actual value is much smaller than the forecast error, the MAPE will blow up, while if the actual value is ever zero, the MAPE will by construction rise to infinity. Finally, it is true that MSE penalises large forecast errors disproportionately more than small errors, since the forecast errors are squared! Whether this is useful or not will depend on whether large forecast errors are disproportionately more serious for the forecaster.

Incorrect! Neither mean squared error (MSE) nor mean absolute error (MAE), nor any other statistical measure, will typically be highly correlated with the performance of trading rules based on statistical forecasts. Sometimes, the percentage of times that the forecast correctly predicts the sign of the next return is a more useful guide to the performance of the forecasts when used in a financial trading rule. One possible explanation for this is that trading rules usually only require a buy or sell indicator (given by the predicted sign of the next return), while statistical measures such as MSE or MAE also depend upon how far the forecast is from the actual value. MSE squares the forecast errors, and therefore it is this rather than MAE that provides a quadratic loss function. Mean absolute percentage error (MAPE) is pretty useless for any series when the actual series, y, can take on values close to zero. This arises since MAPE divides the forecast error by the actual value, so if the actual value is much smaller than the forecast error, the MAPE will blow up, while if the actual value is ever zero, the MAPE will by construction rise to infinity. Finally, it is true that MSE penalises large forecast errors disproportionately more than small errors since the forecast errors are squared! Whether this is useful or not will depend on whether large forecast errors are disproportionately more serious for the forecaster or not.
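These properties are easy to check numerically. The forecasts and actual returns below are made-up illustrative numbers; note how a single near-zero actual value dominates the MAPE even though every forecast error is small:

```python
import numpy as np

# Hypothetical forecasts and actual returns; the third actual is close to zero.
actual = np.array([0.02, -0.01, 0.0005, 0.03])
forecast = np.array([0.01, 0.00, 0.02, 0.01])

errors = forecast - actual
mse = np.mean(errors ** 2)               # quadratic loss: errors are squared
mae = np.mean(np.abs(errors))            # linear (absolute) loss
mape = np.mean(np.abs(errors / actual))  # blows up when actual is near zero

# Proportion of correct sign predictions - the kind of measure that maps
# more directly onto a buy/sell trading rule.
sign_hit_rate = np.mean(np.sign(forecast) == np.sign(actual))

print(mse, mae, mape, sign_hit_rate)
```

Here the MAPE exceeds 1000% despite no forecast error being larger than 0.02 in absolute value, while the sign-based measure gives a more sensible 75% hit rate.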


34

Which one of the following factors is likely to lead to a relatively high degree of out-of-sample forecast accuracy?

a) A model that is based on financial theory
b) A model that contains many variables
c) A model whose dependent variable has recently exhibited a structural change
d) A model that is entirely statistical in nature with no room for judgmental modification of forecasts

Correct! A model that is based on a solid economic or financial theory is likely to produce good forecasts since there is a strong reason for thinking such a model will work in the out-of-sample period as well as for the sample of data that has been used to estimate the model parameters. A model that contains many variables is likely to have fitted to the "noise" as well as the "signal" in the in-sample data, meaning that it will not easily be able to generalise. To the extent that these patterns in the in-sample data will not be repeated in the out-of-sample period, such a large model could produce poor forecasts. Good out-of-sample forecasts are usually obtained from compact models. If a variable has recently exhibited a structural change, this could mean that the relationship it had with a set of other variables has broken down. If that is the case, any model estimated on a sample of data before or during the structural break is likely not to forecast well after the structural break. Finally, statistical forecasting models can and do break down, so pure statistical forecasts are likely not to work as well as those where there is room for a judgmental modification of the forecast if an expert thinks that the model may not be working well for some reason.

Incorrect! A model that is based on a solid economic or financial theory is likely to produce good forecasts since there is a strong reason for thinking such a model will work in the out-of-sample period as well as for the sample of data that has been used to estimate the model parameters. A model that contains many variables is likely to have fitted to the "noise" as well as the "signal" in the in-sample data, meaning that it will not easily be able to generalise. To the extent that these patterns in the in-sample data will not be repeated in the out-of-sample period, such a large model could produce poor forecasts. Good out-of-sample forecasts are usually obtained from compact models. If a variable has recently exhibited a structural change, this could mean that the relationship it had with a set of other variables has broken down. If that is the case, any model estimated on a sample of data before or during the structural break is likely not to forecast well after the structural break. Finally, statistical forecasting models can and do break down, so pure statistical forecasts are likely not to work as well as those where there is room for a judgmental modification of the forecast if an expert thinks that the model may not be working well for some reason.
