Author: Andrew Berger-Gross
Our work as researchers is greatly enriched by the wealth of economic data published by federal statistical agencies such as the Bureau of Labor Statistics. The government is able to reliably monitor economic conditions using random sample surveys, administrative data, and statistical modeling, sparing taxpayers the enormous expense (and inconvenience) of conducting a population-wide census every month. However, a byproduct of these cost-saving measures is uncertainty about the accuracy of our economic indicators (or, in statistical terms, “error”).
Unemployment rates for states and local areas are produced using an estimation model. We know that, based on the uncertainty inherent in the model and its inputs, the resulting numbers are ballpark estimates. The unemployment rate for North Carolina currently has a 90 percent confidence interval spanning approximately 1.3 percentage points. In plain English, this means that if the published unemployment rate is 6.3 percent, we can say with 90 percent confidence that the actual unemployment rate is somewhere between 5.7 percent and 7.0 percent. There is a 10 percent chance that the rate is even higher (or lower) than that.
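To make that interval arithmetic concrete, here is a minimal sketch in Python. The standard error is an assumed, back-of-the-envelope value (not a published BLS figure), chosen only so that the resulting 90 percent interval is roughly 1.3 percentage points wide, as in the example above.

```python
# Minimal sketch of the confidence-interval arithmetic described above.
# The standard error is an assumed value for illustration only; it is
# back-calculated so the 90 percent interval is about 1.3 points wide.

Z_90 = 1.645                                     # normal critical value for 90% confidence
interval_width = 1.3                             # approximate width cited above, in percentage points
assumed_std_error = (interval_width / 2) / Z_90  # roughly 0.4 percentage points

def confidence_interval(estimate, std_error, z=Z_90):
    """Return the (lower, upper) bounds of a z-based confidence interval."""
    margin = z * std_error
    return estimate - margin, estimate + margin

low, high = confidence_interval(6.3, assumed_std_error)
print(f"Published rate: 6.3%  ->  90% interval: {low:.2f}% to {high:.2f}%")
# Roughly 5.65% to 6.95%, consistent with the 5.7-to-7.0 percent range above.
```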
There are several other, less predictable sources of error in official economic statistics, in addition to the uncertainty introduced by model-based error (or its close cousin, “sampling error,” which results from surveying only a portion of the population). Dr. Charles F. Manski of Northwestern University wrote an excellent article earlier this year on uncertainty in economic statistics, describing in particular several sources of “nonsampling error,” such as:
- Transitory uncertainty, which occurs when the government has insufficient data (or resources) to produce accurate estimates immediately, but then revises these estimates at a future date in order to incorporate new information or more resource-intensive methodology. This is why state and local unemployment rates are revised on a monthly and annual basis.
- Permanent uncertainty, which occurs when there is a defect in the estimation process — such as a significant portion of households failing to respond to a survey — that is not eventually resolved.
- Conceptual uncertainty, which occurs when there is disagreement about how to define the very concept being measured. This is the case for seasonal adjustment, where we are trying to eliminate “normal seasonal variation” from economic data, but there is often uncertainty about what constitutes “normal” (especially after the Great Recession).
These sources of error are (for the most part) unavoidable in the data business. The takeaway for data users is to be aware of both the strengths and limitations of economic statistics. Remember to always consider published error measures (such as “margins of error”) and other sources of error that are not measured (such as conceptual uncertainty) when drawing your own conclusions from the data. In particular, focus on the long-term trends in economic indicators rather than their volatile (and error-prone) month-to-month movements.
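As a rough illustration of that last point, the sketch below uses made-up numbers (the margin of error and unemployment rates are hypothetical, not published figures) to show why a small month-to-month movement can be indistinguishable from estimation error, while an over-the-year change is a more reliable signal of the trend.

```python
# Illustrative sketch with made-up numbers (not actual published data).
# A month-to-month move smaller than the margin of error could easily be
# noise; an over-the-year change is harder for estimation error to swamp.

margin_of_error = 0.65      # hypothetical 90% margin of error, in percentage points

rate_one_year_ago = 7.5     # hypothetical estimates, in percent
rate_last_month = 6.4
rate_this_month = 6.3

one_month_change = rate_this_month - rate_last_month
over_the_year_change = rate_this_month - rate_one_year_ago

if abs(one_month_change) < margin_of_error:
    print(f"One-month change ({one_month_change:+.1f} pts) is within the "
          f"margin of error ({margin_of_error} pts): it may be nothing but noise.")

print(f"Over-the-year change ({over_the_year_change:+.1f} pts) exceeds the "
      "margin of error and is a better guide to the underlying trend.")
```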