3.2 Quarterly Macro and Financial Dataset
Although the monthly financial dataset has some advantages, it is missing a great many
of the variables that researchers claim have predictive power for exchange rates. To
include these, I switch to quarterly data, and give up on the “real-time” feature of the
monthly asset price dataset.
This larger dataset contains all the same variables as the monthly data, aggregated
to quarterly frequency. In addition it includes (i) relative GDP (foreign-US) (logs and log
differences), (ii) relative money supply (logs and log differences), (iii) the relative price
level (logs and log differences), (iv) the relative ratio of current account to GDP (level
and cumulated) and (v) the monetary fundamentals as defined by Mark (1995).
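For reference, the monetary fundamentals of Mark (1995) are conventionally written (in logs, with asterisks denoting foreign variables and a unit income elasticity; sign conventions in the paper's exact specification may differ):

```latex
f_t = (m_t - m_t^{*}) - (y_t - y_t^{*}), \qquad z_t = f_t - s_t ,
```

where s_t is the log exchange rate and the deviation z_t serves as the predictor.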
The data cover the quarters 1973:1 to 2002:4. The models I consider are the
driftless random walk model and all linear regression models in which the exchange rate
return is predicted by any one of these variables (plus a constant). This gives a total of 20
models. The pseudo-out-of-sample prediction exercise involves forecasting the exchange
rate for 1993:1 to 2002:4 as of h quarters previously, for h = 1, 2, 3, 4.
3.3 Results for Equal Weighted Model Averaging
I first considered the out-of-sample mean square prediction error of the forecast obtained
by averaging the predictions across all the different models, giving all models equal
weight, relative to the out-of-sample mean square prediction error for the forecast
assuming that the exchange rate is a driftless random walk. Table 1 shows this
equal-weighted relative out-of-sample root mean square prediction error (RMSPE) in both the
monthly and quarterly datasets. A number greater than 1 means that equal-weighted
model averaging is forecasting less well than a random walk. Except for the Canadian
dollar, most entries in these tables are greater than 1.[3] Simple equal-weighted model
averaging, which is such an effective strategy in many forecasting contexts, does not seem
to buy us very much in exchange rate forecasting, at least not with these models.
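As a rough illustration of this exercise, the equal-weighted relative RMSPE computation can be sketched in Python. The data here are simulated stand-ins for the actual series, and `ols_forecast` is a hypothetical helper, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: T periods of exchange-rate returns and K candidate
# predictors (dimensions are illustrative, not the paper's actual series).
T, K = 120, 19
returns = rng.normal(scale=0.05, size=T)   # log exchange-rate changes
predictors = rng.normal(size=(T, K))       # lagged candidate variables

def ols_forecast(y, x, x_next):
    """One-step forecast from a regression of y on a constant and one predictor."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[0] + beta[1] * x_next

# Pseudo-out-of-sample exercise: forecast the last 40 periods recursively.
start = T - 40
ew_errors, rw_errors = [], []
for t in range(start, T):
    fcasts = [ols_forecast(returns[:t], predictors[:t, k], predictors[t, k])
              for k in range(K)]
    fcasts.append(0.0)                     # the driftless RW forecasts no change
    ew_errors.append(returns[t] - np.mean(fcasts))
    rw_errors.append(returns[t])

# Relative RMSPE: > 1 means equal weighting forecasts worse than the RW.
rel_rmspe = np.sqrt(np.mean(np.square(ew_errors)) / np.mean(np.square(rw_errors)))
print(round(rel_rmspe, 3))
```

The random walk's forecast enters the average as a zero predicted change, so the averaged forecast pool has 20 models, matching the count in the text.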
3.4 Results for Bayesian Model Averaging
I now turn to Bayesian Model Averaging, which weights the forecasts from different
models by their posterior probabilities. Table 2 shows the out-of-sample RMSPE for
Bayesian Model Averaging. In the monthly dataset, for sterling, the out-of-sample
RMSPE is uniformly slightly above 1 indicating that the random walk gives better
forecasts. But for the other three currencies, the RMSPE is nearly uniformly below 1 in
the monthly dataset, indicating that Bayesian Model Averaging gives better forecasts.
Similar results are obtained in the quarterly dataset, also shown in Table 2. The
addition of the macro variables in the quarterly dataset does little on net either to
improve or to worsen predictive performance.
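The weighting scheme can be sketched as follows. The BIC approximation to the marginal likelihood used here is a stand-in for the paper's conjugate-prior calculation, and all data are simulated:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup: one dependent variable and K one-regressor candidate models.
T, K = 100, 5
y = rng.normal(size=T)
X = rng.normal(size=(T, K))

def log_marginal_bic(y, xk):
    """BIC approximation to a model's log marginal likelihood (a stand-in
    for an exact conjugate-prior calculation)."""
    Z = np.column_stack([np.ones(len(y)), xk])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    sigma2 = resid @ resid / len(y)
    loglik = -0.5 * len(y) * (np.log(2 * np.pi * sigma2) + 1)
    return loglik - 0.5 * Z.shape[1] * np.log(len(y))

# Posterior model probabilities, starting from equal prior probabilities.
logml = np.array([log_marginal_bic(y, X[:, k]) for k in range(K)])
weights = np.exp(logml - logml.max())
weights /= weights.sum()

# The BMA forecast is the probability-weighted average of model forecasts.
model_forecasts = rng.normal(size=K)   # stand-in for each model's point forecast
bma_forecast = weights @ model_forecasts
print(bma_forecast)
```

Because the weights are nonnegative and sum to one, the combined forecast always lies between the most pessimistic and most optimistic individual model forecasts.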
Although Bayesian Model Averaging can give good results for some currency-horizon
pairs with a large value of φ, overall the best results are obtained with a smaller
value of φ (e.g. φ = 1). In other words, a fairly informative prior with substantial
shrinkage improves the forecasting performance of model averaging. In this sense, it
does not pay to try to make the prior as uninformative as possible.
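To illustrate the role of φ, consider a Zellner-style g-prior centered on zero, an assumption about the general form of the prior rather than the paper's exact specification. Under that setup the posterior mean shrinks the OLS coefficients toward zero by the factor φ/(1 + φ), so a small φ imposes heavy shrinkage:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated regression; beta_ols is the unrestricted estimate.
T = 80
x = rng.normal(size=T)
y = 0.3 * x + rng.normal(size=T)
X = np.column_stack([np.ones(T), x])
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Posterior mean under a zero-centered g-prior with scale phi (assumed form):
# beta_post = phi / (1 + phi) * beta_ols.
for phi in (1.0, 20.0):
    beta_post = (phi / (1.0 + phi)) * beta_ols
    print(phi, beta_post)
```

With φ = 1 the coefficients are halved; with φ = 20 they are close to unrestricted OLS, which matches the text's characterization of large φ as a relatively uninformative prior.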
For small φ it is fair to say that Bayesian Model Averaging can help quite a bit,
but cannot hurt much. For example, if φ = 1, it can lower mean square prediction error by
up to 12%, while the worst case is that it raises mean square prediction error by 2%,
relative to the random walk benchmark.

[3] I do not show results for the out-of-sample RMSPE for the individual models but, not surprisingly, although the RMSPE is below 1 for some models and currency-horizon pairs, there is no model for which it is below 1 on average across all currency-horizon pairs, in either the monthly or quarterly datasets.
Bootstrap p-values of the hypothesis that the out-of-sample RMSPE is one are
reported in Table 3. In each bootstrap sample an artificial dataset is generated in which
the exchange rate is by construction a driftless random walk, using the bootstrap
methodology described in Appendix B. The p-values in Table 3 represent the proportion
of bootstrap samples for which the RMSPE is smaller than that which was actually
observed in the data. These are therefore one-sided p-values, testing the null of equal
predictability against the alternative that Bayesian Model Averaging gives a significant
improvement over the driftless random walk. The null is rejected for several currency-
horizon pairs in both the monthly and quarterly datasets, at conventional significance levels.
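The bootstrap p-value calculation can be sketched as follows, with a simple recursive-mean forecast standing in for Bayesian Model Averaging and simulated returns standing in for the data:

```python
import numpy as np

rng = np.random.default_rng(3)

def relative_rmspe(returns):
    """Relative RMSPE of a toy forecasting scheme versus the driftless
    random walk (which always forecasts a zero change)."""
    errs_model, errs_rw = [], []
    for t in range(20, len(returns)):
        errs_model.append(returns[t] - returns[:t].mean())  # recursive-mean forecast
        errs_rw.append(returns[t])
    return np.sqrt(np.mean(np.square(errs_model)) / np.mean(np.square(errs_rw)))

observed = relative_rmspe(rng.normal(size=120))

# Bootstrap: generate artificial samples in which the exchange rate is a
# driftless random walk by construction, recompute the statistic each time,
# and report the proportion of samples beating the observed value.
B = 200
count = sum(relative_rmspe(rng.normal(size=120)) < observed for _ in range(B))
p_value = count / B
print(p_value)
```

A small p-value means few random-walk samples produced an RMSPE as low as the one observed, which is the one-sided test described in the text.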
Researchers are rightly suspicious of significant p-values in a test of the
hypothesis that a particular model forecasts the exchange rate better than a random walk.
The key reason is that these p-values ignore the data mining that was implicit in choosing
the particular model to use. Researchers publish the results of these tests only if they find
a model which forecasts the exchange rate significantly better than a random walk, and
thus “significant” results can be expected to crop up from time to time even if the
exchange rate is totally unpredictable. But to the extent that the Bayesian Model
Averaging approach starts out with a set of models that spans the space of all models
researchers would ever want to consider, the results, and specifically the p-values in the
forecast comparison test, are immune to any such data-mining critique.