Abstract
The Consumer Price Index (CPI) is a critical economic indicator that reflects the overall price level of goods and services. While surveys of household consumption behavior have been reliable to date, the survey-based method of compiling the CPI has recently come under strain. Price measurement has become more complex, increasing the demands on the data needed to attain accuracy, coverage, and timeliness. This study introduces an integrated data pipeline that incorporates a time-sensitive composite similarity model, tailored for multivariate time-series correlation analysis, to generate an alternative CPI from price data collected from an e-commerce portal. The pipeline also implements the preferred CPI forecasting model selected by an automated machine learning tool. The composite e-commerce index is 90.75% correlated with the official national food and non-alcoholic beverages CPI. Furthermore, ridge regression achieved an R-squared value of 0.74 and a mean absolute error (MAE) of 1.269, indicating that this novel approach can offer a robust complement to traditional methods for deriving a composite index and forecasting the CPI.
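To make the reported metrics concrete, the following is a minimal sketch of how a correlation between an alternative index and an official CPI series, and the R-squared and MAE of a ridge regression forecast, could be computed. It is not the study's pipeline: the data are synthetic, the function names (`pearson_r`, `fit_ridge`, `r_squared`, `mae`) are hypothetical, and the one-step lagged feature design and the penalty `alpha=1.0` are assumptions made purely for illustration.

```python
import numpy as np


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.corrcoef(a, b)[0, 1])


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the coefficients.
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(gram, Xc.T @ yc)
    intercept = float(y_mean - x_mean @ coef)
    return coef, intercept


def r_squared(y_true, y_pred):
    """Coefficient of determination."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


# Synthetic monthly series (assumption: NOT the study's data).
rng = np.random.default_rng(0)
n_months = 60
trend = np.linspace(100.0, 115.0, n_months)
official_cpi = trend + rng.normal(0.0, 0.5, n_months)
ecommerce_index = trend + rng.normal(0.0, 1.0, n_months)

# Correlation between the alternative index and the official series.
corr = pearson_r(ecommerce_index, official_cpi)

# Assumed design: forecast CPI at month t from both series at month t-1.
X = np.column_stack([ecommerce_index[:-1], official_cpi[:-1]])
y = official_cpi[1:]
split = 48  # chronological train/test split, no shuffling
coef, intercept = fit_ridge(X[:split], y[:split], alpha=1.0)
pred = X[split:] @ coef + intercept

print(f"correlation: {corr:.4f}")
print(f"R-squared:   {r_squared(y[split:], pred):.4f}")
print(f"MAE:         {mae(y[split:], pred):.4f}")
```

The split is chronological rather than random so that the held-out months always follow the training months, which mirrors how a forecast would be evaluated in practice.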
