The track record of survey-based macroeconomic forecasting
I’m interested in forecasting, and one of the areas where plenty of forecasting has been done is macroeconomic indicators. This post looks at what’s known about macroeconomic forecasting.
Macroeconomic indicators such as total GDP, GDP per capita, inflation, unemployment, etc. are reported through direct measurement every so often (on a yearly, quarterly, or monthly basis). A number of organizations publish forecasts of these values, and the forecasts can eventually be compared against the actual values. Some of these forecasts are consensus forecasts: they involve polling a number of experts on the subject and aggregating the responses (for instance, by taking an arithmetic mean or geometric mean or appropriate weighted variant of either). We can therefore try to measure the usefulness of the forecasts and the rationality of the forecasters.
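As a toy illustration of the aggregation step (the numbers below are invented, not drawn from any actual survey), here is how an arithmetic and a geometric mean of individual forecasts might be computed:

```python
# Sketch of how a consensus forecast can be aggregated from
# individual forecasts (hypothetical numbers, not from any survey).
import statistics

individual_forecasts = [2.1, 2.4, 1.9, 2.6, 2.0]  # e.g., GDP growth, in percent

arithmetic_consensus = statistics.mean(individual_forecasts)
geometric_consensus = statistics.geometric_mean(individual_forecasts)

print(round(arithmetic_consensus, 2))  # 2.2
print(round(geometric_consensus, 2))
```

Weighted variants would simply replace the plain means with weighted ones (for instance, weighting forecasters by past accuracy).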
Why might we want to measure this usefulness and rationality? There could be two main motivations:
A better understanding of macroeconomic indicators and whether and how we can forecast them well.
A better understanding of forecasting as a domain as well as the rationality of forecasters and the inherent difficulties in forecasting.
My interest in the subject stems largely from (2) rather than (1): I’m trying to understand just how valuable forecasting is. However, the research I cite has motivations that involve some mix of (1) and (2).
Within (2), our interest might be in studying:
The usefulness and rationality of individual forecasts (that are part of the consensus) in absolute terms.
The usefulness and rationality of the consensus forecast.
The usefulness and rationality of individual forecasts relative to the consensus forecasts (treating the consensus forecast as a benchmark for how easy the forecasting task is).
The macroeconomic forecasting discussed here generally falls in the near but not very near future category in the framework I outlined in a recent post.
Here is a list of regularly published macroeconomic consensus forecasts. The table is taken from Wikipedia (I added the table to Wikipedia).
Organization name | Forecast name | Number of individuals surveyed | Number of countries covered | List of countries/regions covered | Frequency | How far ahead the forecasts are made for | Start date |
---|---|---|---|---|---|---|---|
Consensus Economics[2][3] | Consensus ForecastsTM | More than 700[2][3] | 85[2][3] | Member countries of the G-7 industrialized nations, Asia Pacific, Eastern Europe, and Latin America.[2][3] | Monthly[2][3] | 24 months | October 1989[4] |
FocusEconomics[5] | FocusEconomics Consensus Forecast[6] | Several hundred[6] | More than 70[6] | Asia, Eastern Europe, Euro Area, Latin America, Nordic economies[6] | Monthly[6] | ? | 1998[7] |
Blue Chip Publications division of Aspen Publishers[8] | Blue Chip Economic Indicators[8] | 50+[8] | 1 | United States | Monthly[8] | ? | 1976[8] |
Federal Reserve Bank of Philadelphia | Survey of Professional Forecasters[9][10] | a few hundred | 1 | United States | Quarterly[9] | 6 quarters, plus a few more long-range forecasts | 1968[9][10] |
European Central Bank | ECB Survey of Professional Forecasters[11][12] | ? | ? | Europe | Quarterly[11] | Two quarters and six quarters from now, plus the current and next two years | 1999[11][12] |
Federal Reserve Bank of Philadelphia | Livingston Survey[13] | ? | 1 | United States[13] | Bi-annually (June and December every year)[13] | Two bi-annual periods (6 months and 12 months from now), plus some forecasts for two years | 1946[13] |
Strengths and weaknesses of the different surveys
Time series available: The surveys that have been around longer, such as the Livingston Survey (started 1946), Survey of Professional Forecasters (started 1968) and the Blue Chip Economic Indicators (started 1976) have accumulated a larger time series of data. This allows for more interesting analysis.
Number of regions for which macroeconomic indicators are forecast: The surveys that cover a larger number of countries, such as the Consensus ForecastsTM (85 countries) and the FocusEconomics Consensus Forecast (over 70 countries) can be used to study hypotheses about differences in the accuracy and bias in forecasts based on country.
Time that people are asked to forecast ahead, frequency of forecast, and number of different forecasts (at different points in time) for the same indicator: Surveys differ in how far ahead people have to forecast, how frequently the forecasts are published, and the number of different times a particular quantity is forecast. For instance, the Consensus ForecastsTM includes forecasts for the next 24 months, and is published monthly. So we have 24 different forecasts of any given quantity, with the forecasts made at time points separated by a month each. This is at the upper end. The Survey of Professional Forecasters publishes at a quarterly frequency and includes macroeconomic indicator forecasts for the next 6 quarters. This is a similar time interval to the Consensus ForecastsTM but a smaller number of forecasts for the same quantity because of a lower frequency of publication.
Evaluation of individual versus consensus forecasts: For some forecasts (such as those published by the Survey of Professional Forecasters), the published information includes individual forecasts, so we can measure the usefulness and rationality of individual forecasts rather than that of the consensus forecast. For others, such as Consensus ForecastsTM, only the consensus is available, so only more limited tests are possible. Note that the question of the value of individual forecasts and the question of the value of the consensus forecast are both important questions.
The history of research based on consensus forecast sources
There has been a gradual shift in what consensus forecasts are used in research studying forecasts:
Early research on macroeconomic forecasting, in the 1970s, began with a few people collecting their own data by polling experts.
In the 1980s, the Livingston bi-annual survey was used as a major data source by researchers.
In the late 1980s and through the 1990s, researchers switched to the Survey of Professional Forecasters and the Blue Chip Economic Indicators Survey, with the focus shifting to the latter more over time. Note that the Blue Chip Economic Indicators had been started only in 1976, so it’s natural that it took some time for people to have enough data from it to publish research.
In the 2000s, research based on Consensus ForecastsTM was added to the mix. Note that Consensus Economics started out in 1989, so it’s understandable that research based on it took a while to start getting published.
There has also been a gradual shift in views about forecast accuracy:
Early literature in the 1970s and early 1980s found evidence of inaccuracy and bias in forecasts.
In the 1990s, as the literature started looking at forecasts that polled more people and were published at higher frequency, the view shifted in the direction of consensus forecasts having very little inaccuracy and bias, while the question of bias in individual forecasts remained more hotly contested.
Tabulated bibliography (not comprehensive, but intended to cover a reasonably representative sample)
Paper | Forecast used | Conclusion about efficiency and bias of individual and consensus forecast |
---|---|---|
McNees (1978) | Own data (3 people, 4 quarterly forecasts) | Some forecasts are biased, and forecasters are not rational |
Figlewski and Wachtel (1981) | Livingston Survey | Inflationary expectations are more consistent with the adaptive expectations hypothesis than the rational expectations hypothesis. The paper was critiqued by Dietrich and Joines (1983), and the authors responded in Figlewski and Wachtel (1983). |
Keane and Runkle (1990) | Survey of Professional Forecasters (called the ASA-NBER survey at the time) | Individual forecasters appear rational, although rationality is not established conclusively. Methodological problems are noted with past literature arguing for irrationality and bias in individual forecasts. |
Swidler and Ketchler (February 1990) | Blue Chip Economic Indicators | Consensus forecasts are unbiased and efficient. Does not appear to look at individual forecasts. |
Batchelor and Dua (November 1991) | Blue Chip Economic Indicators | Consensus forecasts are unbiased, but some individual forecasts are biased. |
Ehrbeck and Waldmann (1996) | North-Holland Economic Forecasts | The abstract: “Professional forecasters may not simply aim to minimize expected squared forecast errors. In models with repeated forecasts the pattern of forecasts reveals valuable information about the forecasters even before the outcome is realized. Rational forecasters will compromise between minimizing errors and mimicking prediction patterns typical of able forecasters. Simple models based on this argument imply that forecasts are biased in the direction of forecasts typical of able forecasters. Our models of strategic bias are rejected empirically as forecasts are biased in directions typical of forecasters with large mean squared forecast errors. This observation is consistent with behavioral explanations of forecast bias.” |
Stark (1997) | Survey of Professional Forecasters | Attempts to replicate, for the Survey of Professional Forecasters, the results of Lamont (1995) for the Business Week survey that forecasters get more radical as they gain experience. Finds that the results do not replicate, and posits an explanation for this. |
Laster, Bennett, and Geoum (1999) | Blue Chip Economic Indicators | Individual forecasters are biased. The paper describes a theory for how such bias might be rational given the incentives facing forecasters. The empirical data is a sanity check rather than the focus of the paper. |
Batchelor (2001) (ungated early draft here) | Consensus ForecastsTM | Does not discuss bias in Consensus ForecastsTM per se, but notes that it is better than the IMF and OECD forecasts and that incorporating information from those forecasts does not improve upon Consensus ForecastsTM. |
Ottaviani and Sorensen (2006) | (none, discusses general theoretical model) | Abstract: “We develop and compare two theories of professional forecasters’ strategic behavior. The first theory, reputational cheap talk, posits that forecasters endeavor to convince the market that they are well informed. The market evaluates their forecasting talent on the basis of the forecasts and the realized state. If the market expects forecasters to report their posterior expectations honestly, then forecasts are shaded toward the prior mean. With correct market expectations, equilibrium forecasts are imprecise but not shaded. The second theory posits that forecasters compete in a forecasting contest with pre-specified rules. In a winner-take-all contest, equilibrium forecasts are excessively differentiated.” |
Batchelor (2007) | Consensus ForecastsTM | Consensus forecasts are unbiased, some individual forecasts are biased. But the persistent optimism and pessimism of some forecasters seems inconsistent with existing models of rational bias. |
Ager, Kappler, and Osterloh (2009) (ungated version) | Consensus ForecastsTM | There are consistently biased forecasts for some countries, but not for all. A lack of information efficiency is more severe for GDP forecasts than for inflation forecasts. |
The following overall conclusions seem to emerge from the literature:
For mature and well-understood economies such as that of the United States, consensus forecasts are not notably biased or inefficient. In cases where they miss the mark, this can usually be attributed to issues of insufficient information or shocks to the economy.
There may, however, be some countries, particularly those whose economies are not sufficiently well understood, where the consensus forecasts are more biased.
The evidence on whether individual forecasts are biased or inefficient is more murky, but the research generally points in the direction of some individual forecasts being biased. Some people have posited a “rational bias” theory where forecasters have incentives to choose a value that is plausible but not the most likely in order to maximize their chances of getting a successful unexpected prediction. We can think of this as an example of product differentiation. Other sources and theories of rational bias have also been posited, but there is no consensus in the literature on whether and how these are sufficient to explain observed individual bias.
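The unbiasedness claims above are typically backed by a Mincer-Zarnowitz-style regression: regress realized values on forecasts and check whether the intercept is close to 0 and the slope close to 1. A minimal sketch of that test, using made-up data rather than any actual survey series:

```python
# Minimal sketch of a Mincer-Zarnowitz unbiasedness test:
# regress actuals on forecasts and check intercept ~ 0, slope ~ 1.
# The data here are simulated for illustration only.
import numpy as np

rng = np.random.default_rng(0)
forecasts = rng.normal(2.0, 0.5, size=200)            # hypothetical forecasts
actuals = forecasts + rng.normal(0.0, 0.3, size=200)  # unbiased case: error is pure noise

# Least-squares fit of: actual = a + b * forecast
X = np.column_stack([np.ones_like(forecasts), forecasts])
(a, b), *_ = np.linalg.lstsq(X, actuals, rcond=None)

print(round(a, 2), round(b, 2))  # close to 0 and 1 for an unbiased forecast
```

A real study would add standard errors and a formal joint test of (a, b) = (0, 1), and efficiency tests additionally check that forecast errors are uncorrelated with information available at forecast time.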
Some addenda
A Forbes article recommends that business people who need economic forecasts for their business plans use the standard forecast sources rather than aiming for something fancier.
There are some other forecasts I didn’t list here, such as the Greenbook forecasts, IMF’s World Economic Outlook, and OECD Economic Outlook. As far as I could make out, these are not generated through a consensus forecast procedure. They involve some combination of models and human judgment and discussion. The bibliography I tabulated above includes Batchelor (2001), that found that the Consensus ForecastsTM outperformed the OECD and IMF forecasts. Some research on the Greenbook forecasts can be found in the footnotes on the Wikipedia page about Greenbook. I didn’t think these were sufficiently germane to be included in the main bibliography.
“For mature and well-understood economies such as that of the United States, consensus forecasts are not notably biased or inefficient. In cases where they miss the mark, this can usually be attributed to issues of insufficient information or shocks to the economy.”
Maybe it’s the allure of alarmism, but aren’t we mostly concerned with predicting catastrophe? This is kind of like saying you can predict the weather except for typhoons and floods.
I think the analogy goes the other way. A weather forecast that didn’t cover catastrophes would still be useful. I like knowing if it’s going to be rainy or sunny, wet or dry.
Similarly, I find it useful to know in a general sense which way short-term interest rates are going, how much inflation to expect over the next few years, and whether the job market is getting better or worse from quarter to quarter.
Yes, sometimes there are external shocks or surprising internal developments, but an imperfect prediction is still better than none.
Except that the shocks usually have a disproportionate effect on the economy. The forecasting is useful, but any strategy contingent upon the forecasting will have to take into account that when your forecasts fail, it won’t just be a little, it will be massive.
Actually, such numbers are usually determined through sampling. They are also subject to changes in definitions and methodology (see e.g. inflation).
What does this mean? Specifically, what are your definitions and criteria of being “biased” and “inefficient” in this context?
Sounds like No True Scotsman :-/
Yes, you’re right that it’s not possible to measure everything, so sampling is often used in lieu of direct measurement. I had mentioned sampling in my earlier post.
I’m using the same definitions as used in the literature. The “bias” concept is discussed in the cited papers, plus in my earlier post http://lesswrong.com/lw/k2a/the_usefulness_of_forecasts_and_the_rationality/
The “efficiency” criterion is more difficult to define, but here it means roughly “makes use of all the available information”—sort of synonymous with rationality.
The meanings of the terms are of course up for debate, and the different papers don’t quite agree on the right meaning.
It’s certainly a flaw that they can’t predict shocks, but to the extent that a few shocks explain most forecasting error, that would have different implications than if the forecasts were wrong in all sorts of small ways.
The “insufficient information” refers to the quality of existing data they have access to. In some cases, people made wrong forecasts because the data about current indicator values that they were working with had errors, or was incomplete (e.g., they didn’t have information on a particular indicator value for a particular month).
How do you know? Or, more explicitly, on the basis of which evidence are you willing to make the claim that consensus macro forecasts “make use of all the available information”?
Besides, just having information is necessary but not sufficient. You also need models which will take this information as inputs and will output the forecasts. These models can easily be wrong. Is the correctness of models used included in your definition of efficiency?
It is difficult to conclusively demonstrate efficiency, but it is easy to rule out specific ways that forecasts could be inefficient. That’s what the papers do.