Missing Survey Data in End-Use Energy Models: An Overlooked Problem

Abstract

Although missing data are found in all types of data sets, surveys are particularly prone to producing data sets in which values of some respondent variables are missing (see, e.g., Cochran, 1977; Ericson, 1967; Kalton, 1983; and Hutcheson and Prather, 1977). Survey data collected for end-use energy demand models are no exception; high frequencies of nonresponse occur for many variables. This issue is, however, generally disregarded in the end-use literature, and analysts working with end-use models often discard cases in which values are missing for variables required by their models (see, e.g., U.S. Government, 1983; Pacific Gas and Electric, 1983; Hirst and Carney, 1978; and EPRI, 1977). Discarding cases with missing values has important consequences. It implicitly assumes that the missing values occur randomly rather than systematically. If, however, missing values do not occur randomly, discarding cases with missing values will result in misspecified models and biased forecasts. Furthermore, by discarding cases, the detail appropriate for a given end-use model can be lost.
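The claim that discarding incomplete cases is harmless only when values are missing at random can be illustrated with a small simulation. The sketch below is not taken from the article: the household usage distribution, the response probabilities, and the function name `simulate_case_deletion` are all hypothetical, chosen only to show how systematic nonresponse shifts the mean of the retained cases.

```python
import random
import statistics


def simulate_case_deletion(n=20000, seed=0):
    """Compare the full-sample mean with complete-case means (illustrative only)."""
    rng = random.Random(seed)

    # Hypothetical annual electricity use (kWh) for n surveyed households.
    usage = [rng.gauss(10000, 2500) for _ in range(n)]

    # Random nonresponse: every household answers with the same probability.
    random_cases = [u for u in usage if rng.random() < 0.6]

    # Systematic nonresponse (assumed pattern): households with high usage
    # are half as likely to answer as households with low usage.
    systematic_cases = [
        u for u in usage if rng.random() < (0.8 if u < 10000 else 0.4)
    ]

    return (
        statistics.fmean(usage),
        statistics.fmean(random_cases),
        statistics.fmean(systematic_cases),
    )


full, random_mean, systematic_mean = simulate_case_deletion()
print(f"full sample mean:                 {full:.0f} kWh")
print(f"complete cases, random missing:   {random_mean:.0f} kWh")
print(f"complete cases, systematic missing: {systematic_mean:.0f} kWh")
```

Under these assumptions, the complete-case mean stays close to the full-sample mean when nonresponse is random, but falls well below it when high-usage households respond less often. A forecast built only on the retained cases would inherit that downward bias.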

Keywords

End-use energy demand models; missing data; misspecified models; biased forecasts

References

Boen, J.R., and Zahn, D.A. (1982). The Human Side of Statistical Consulting. Belmont, California: Lifelong Learning Publications, Wadsworth Press.

Cochran, W.G. (1977). Sampling Techniques, 3rd Edition. New York: John Wiley & Sons, Inc.

Deming, W.E. (1953). “On a Probability Mechanism to Attain an Economic Balance Between the Resultant Error of Non-Response and the Bias of Non-Response.” Journal of the American Statistical Association 48 (December): 743-72.

EPRI (1977). Forecasting and Modelling Time-of-Day and Seasonal Electricity Demand. EPRI EA-578-SR (December). Palo Alto, California: Electric Power Research Institute.

Ericson, W.A. (1967). “Optimal Design with Non-Response.” Journal of the American Statistical Association 62 (March): 63-78.

Fellegi, I.D., and Holt (1976). “A Systematic Approach to Automatic Edit and Imputation.” Journal of the American Statistical Association 71 (March): 17-35.

Greenlees, J.S., Reece, W.S., and Zieschang, K.D. (1982). “Imputation of Missing Values When the Probability of Response Depends on the Variables Being Imputed.” Journal of the American Statistical Association 77 (June): 251-61.

Hirst and Carney (1978). The ORNL Engineering-Economic Model of Residential Energy Use. ORNL/CON-24. Oak Ridge, Tennessee: Oak Ridge National Laboratory.

Hutcheson, J.D., and Prather, J.E. (1977). “Assessing the Effects of Missing Data.” Proceedings of the Social Statistics Section, American Statistical Association, pp. 279-83.

Kalton (1983). Compensating for Missing Survey Data. Ann Arbor: Survey Research Center, Institute for Social Research, The University of Michigan.

Little, R.J.A. (1982). “Models for Nonresponse in Sample Surveys.” Journal of the American Statistical Association 77 (June): 237-50.

National Academy of Sciences (1979). Proceedings of the Symposium on Incomplete Data. Washington, D.C.: National Academy of Sciences.

Pacific Gas and Electric Company (1983). 1981 RASS Analysis Project Technical Documentation. Berkeley, Calif.: Minimax Research Corporation.

Rockwell, R.C. (1975). “An Investigation of Imputation and Differential Quality of Data in the 1970 Census.” Journal of the American Statistical Association 70 (March): 39-41.

Sande, I.G. (1982). “Imputation in Surveys: Coping with Reality.” The American Statistician 36 (August): 146-52.

U.S. Government (1983). Analysis of Electric Utility Load Forecasting. General Accounting Office. RCED-83-170 (June 22). Washington, D.C.: U.S. Government Printing Office.