Abstract
Background
Retail involves directly delivering goods and services to end consumers. Natural disasters and epidemics/pandemics have significant potential to disrupt supply chains, leading to shortages, forecasting errors, price increases, and substantial financial strains on retailers. The COVID-19 pandemic highlighted the need for retail sectors to prepare for crisis impacts on sales forecasts by regularly assessing and adjusting sales volumes, consumer behavior, and forecasting models to adapt to changing conditions.
Methods
This study explores strategies for adapting sales forecasts and retail approaches in response to such crises. By employing different machine learning (ML) methods, we analyze consumer behavior changes and sales impacts across various product categories, including bottom wear, top wear, one piece, accessories, outwear, and shoes during the COVID-19 pandemic.
Results
The gradient boosting and CatBoost algorithms excelled in product groups with significant sales changes during the pandemic. The Multi-Layer Perceptron (MLP) algorithm performed well in low-volume categories like accessories and footwear. Meanwhile, MLP, LightGBM, and XGBoost were effective in medium-volume categories such as outerwear and underwear.
Conclusion
The findings highlight the efficacy of these models in adapting sales forecasts to crisis conditions, offering a practical approach to enhancing retail resilience against future disruptions. This study offers an effective approach for adapting sales forecasting to shifting consumer behaviors during crises.
Introduction and background
Crisis periods can profoundly affect the retail sector, presenting various business challenges. It is crucial to analyze pre-crisis and crisis period data together to meet customer needs better and make forecasting models sensitive to these changes. The COVID-19 pandemic can be considered the most significant crisis in modern times. 2020 was extremely dynamic for the world economy, with lockdowns, job losses, and supply chain disruptions being among the main issues caused by the pandemic. 2 During this crisis, COVID-19 also significantly impacted global stock markets. 3 Experts struggled to obtain reliable economic forecasts, reducing prediction performance. 4 The retail sector is one of the most adversely affected by the COVID-19 pandemic, facing challenges such as store closures and changing consumer demand. 5 During COVID-19 restrictions, consumers quickly identified categories like shoes and accessories as “non-essential” and significantly reduced their spending on such items. 6 This situation has led to issues in inventory management, 7 pricing, and marketing activities. 8
A crisis can deeply impact the retail industry, as seen during the COVID-19 pandemic, which caused global disruptions and shifted consumer demand for non-essential items. To navigate such challenges, it is crucial for retailers to understand pre-crisis conditions and employ accurate sales forecasting models. Using machine learning algorithms like Gradient Boosting, CatBoost, and MLP can improve predictions by accounting for rapid changes in consumer behavior. This study combines pre-crisis and crisis data to refine forecasting models, helping retailers better prepare for future crises.
The machine learning (ML) algorithms used in this study were selected to predict consumer behavior and sales trends more accurately during crises. Specifically, algorithms like Gradient Boosting, CatBoost, and Multi-Layer Perceptron (MLP) are known for high performance across different datasets and problem types. There are several reasons for selecting these algorithms: Gradient Boosting and CatBoost perform well on complex and variable datasets. They can capture intricate patterns and accurately predict product categories with high sales volume and variability. Gradient Boosting algorithms can easily adapt to the specific needs of applications by learning according to different loss functions. 9 MLP algorithm can solve regression and classification problems using artificial neural networks (ANN). This algorithm shows high success in product categories with more straightforward and stable sales patterns. The layered structure of MLP allows for better processing of different features in the dataset. 10 Mitra et al. (2022) compared five different ML methods (Random Forest (RF), extreme gradient boosting (XGBoost), gradient boosting, adaptive gradient boosting (AdaBoost), and ANN algorithms) and a hybrid model for sales forecasting with weekly sales data of a USA-based retail company. 10 It is observed that the hybrid model predicts more accurately than the other models according to various performance metrics. Krishna et al. (2018) analyzed forecasting a retail store's sales using different ML techniques. 12 They presented Regression Analysis and Boosting techniques comparatively and stated that boosting algorithms give better results. Cunha et al. (2023) analyzed the effects of the COVID-19 pandemic on the retail experience of e-commerce customers in the supermarket industry. 13 Their study aimed to understand how the pandemic has affected customer satisfaction in this sector. Ilieva (2022), in his research on e-commerce customer satisfaction, examined how the COVID-19 pandemic has shaped customer behavior in various retail sectors. 14 Similarly, studies examining the contribution of machine learning techniques to retail sales forecasting during crisis periods have increased. For example, Sleiman et al. (2022) used machine learning models to forecast sales in the fashion sector in France during the COVID-19 pandemic and analyzed the effects of the crisis in detail. 15 Likewise, Kim et al. (2023) examined the impacts of the pandemic on different retail categories in South Korea and compared the performance of traditional time series models with machine learning methods. 16
This study addresses the gap in the literature regarding sales forecasting during crises, as most existing research focuses on normal conditions. The need for this study arises from the inadequacy of traditional models in predicting consumer behavior during unexpected crises like COVID-19. This research demonstrates how machine learning can offer more accurate and adaptable forecasts by integrating pre-crisis and crisis data. In doing so, it fills a critical gap in the literature on crisis management and sales forecasting, helping retail businesses better prepare for future disruptions.
This study combines pre-crisis and crisis (COVID-19) data, offering more accurate results than traditional methods. Data from a women's clothing company with 284 stores and six product categories were used. The primary objective of this study is to analyze changes in consumer behavior in the retail sector during crisis periods (e.g. during the COVID-19 pandemic) and to forecast sales using different machine-learning methods. The study aims to enhance prediction accuracy under crisis conditions by combining pre-pandemic and pandemic data. The research hypotheses include that the accuracy of machine learning models will vary across different product groups during crisis periods and that some algorithms will perform better than others.
The Turkish retail sector was chosen for this study due to its broad and diversified market structure, store types catering to various income groups, and significant shifts in consumer behavior during the COVID-19 pandemic. Turkiye is a major player in the global women's clothing retail sector in terms of production capacity and export potential. As of 2021, it held a 3.7% share in apparel and textile exports, making it the 6th largest exporter in the world. 1 These characteristics provide an ideal setting for observing the differential impacts of the pandemic on luxury and standard consumption and for analyzing the performance of machine learning algorithms across various product categories.
Methodology
Used machine learning algorithms
ML algorithms are widely used in forecasting and problem-solving during various crisis periods. In this study, sales predictions were performed for a company operating in the retail sector with ML algorithms using data from the pandemic period. Gradient Boosting, CatBoost, and LightGBM algorithms from boosting methods, CART algorithm from decision tree methods, MLP from ANN methods, and KNN methods were used to predict sales amounts depending on product groups.
Gradient boosting algorithms are powerful ML techniques that have significantly succeeded in various applications. These algorithms can be easily adapted to the specific needs of applications, e.g. they can be learned according to different loss functions. 11 CatBoost is a member of boosting ML algorithms and a powerful tool for classification and regression tasks. CatBoost is a suitable algorithm for categorical and heterogeneous data. 17 Cat-Boost uses weighted voting to increase the weight of data associated with minor errors, improving the predicted results’ accuracy. 18 LightGBM is a boosting-based algorithm and aims to improve prediction accuracy by combining weak classifiers. Compared to standard boosting tree algorithms, LightGBM uses histogram optimization to segment continuous features, which saves memory and speeds up computation. 19 The CART algorithm is one of the most successful classification and regression analysis algorithms. It is flexible and robust and does not depend on the distribution type because it is non-parametric. 20 MLP is a kind of ANN trained using back-propagation algorithms. It can be used to solve both regression and classification problems. In an MLP structure, there are many neurons with different functions. The input layer is the first layer and contains the input variables for processing. Based on the data obtained from the hidden layers, the output layer calculates the output or requested values. 11 Regression problems are concerned with predicting the outcome of the dependent variable given a set of independent variables. KNN predictions are also based on a voting scheme where the winner is used to label the query. 21 The methodology of this study involves analyzing pre-crisis and crisis period data to make retail sales forecasts through ML algorithms. The study utilizes data from a retail company operating in Turkiye's women's clothing sector, with 284 stores. The company's data covers six main product types: top wear, one-piece wear, outwear, shoes, accessories, and bottom wear.
Data management policy
Data collection
Sales data were collected daily from January 2017 to December 2022, covering both pre-pandemic (2017-2020) and pandemic (2020-2022) periods. The data includes sales figures from 284 stores in 47 cities across Turkiye, where approximately 67% of the country's population resides. The data includes daily sales figures across six main product categories from 2017 to 2022, covering pre-crisis and crisis periods, including the COVID-19 pandemic. The authenticity and reliability of the data are ensured as it represents real-world sales transactions and store performance over an extended period, providing a solid foundation for the analysis. The company was chosen for its comprehensive sales data, covering various product categories and pre-crisis and crisis periods, offering valuable insights into consumer behavior shifts during the COVID-19 pandemic. While this study primarily focuses on traditional brick-and-mortar retail, the analysis does not include cross-border e-commerce data. The exclusion of cross-border e-commerce from the scope of this study is due to the focus on physical store sales data, which provides a more comprehensive and representative analysis of consumer behavior in this context. However, the importance of e-commerce, especially post-pandemic, is acknowledged, and this limitation is noted. In future research, cross-border e-commerce data may be considered for a more holistic analysis. Additionally, three new parameters thought to impact the model were included: GDP per capita, population size of the target audience in the store region, and storage area in the stores as three new independent variables. GDP per capita indicates economic prosperity, directly influencing consumer purchasing power and, consequently, store sales; stores in regions with higher GDP per capita typically achieve greater sales. The population size of the target audience in the store region shapes consumer behavior and store sales by determining the potential customer base; densely populated areas offer higher customer traffic and sales potential. The storage area enhances a store's stocking capacity and product variety, enabling quick responses to consumer demand and supporting sales.
Data cleaning and preparation
The data pre-processing steps are as follows:
Step 1: Encoding was conducted for categorical variables such as which brand the store sells and the type of store.
Step 2: GDP per capita in the regions where the stores are located was added to the data as an input variable.
Step 3: The female population aged 18–45 in the target group of the stores was added to the model as an input variable.
Step 4: Warehouse areas of the stores were added to the model as an input variable.
Step 5: 25 product groups were grouped under 6 main product groups.
Pre-processing steps addressed missing data analysis and inconsistencies. Once the data were made suitable for analysis, normalization procedures were conducted. Following this, three ML algorithms known for their high performance across different datasets and problem types were selected: Gradient Boosting, CatBoost, and MLP.
Model training and evaluation
The data were divided into training and test sets. The training set was used to train the models, while the test set evaluated their performance. Each algorithm was trained separately for different product categories, with cross-validation to optimize parameters. Model performance was assessed using metrics like R-square, MAE, RMSE, and MSE, and differences between product categories and store types were analyzed. The results highlight the most accurate algorithms, offering valuable insights into changing consumer behavior and helping retailers improve sales forecasts and strategic decisions during crises.
Performance measurement of models
The difference between classification and regression is the values that the dependent variable can take. In classification, the dependent variable can have only two values, usually coded as 0 and 1, whereas in regression, it has a continuous value. Although regression analysis has been used in many ML studies, there has yet to be a consensus on a standard measure to evaluate the regression results. Many studies use Mean Squared Error (MSE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE). 22 Table 1 shows the metrics used in performance metrics in prediction studies.
Metrics used in prediction studies.
In this study, MAE, MSE, RMSE, R2, and MAPE metrics were used to evaluate the performance of the algorithms.
Mean Squared Error (MSE) is a widely used metric in regression analysis representing the average squared difference between the predicted and actual values. Similarly, Root Mean Squared Error (RMSE) is another common metric that measures the differences between the values predicted by a model (or an estimator) and the values observed).
Application
Data collection
This study uses 2017-2022 sales data of a company operating in the retail sector. Monthly sales data for 25 different products were acquired. The retail company sells three brands (I, T, and M). At the same time, the retail company has exclusive (E), high (H), standard (S), and outlet (O) type stores. Raw data are categorized as follows: Store Name (283 stores) as String; Year (2017-2022) as Integer; Month (monthly based) as Integer; Product Type (25 different products) as Categorical; Sales Amount (currency) as Float; Sales Quantity (number of products) as Integer; Indicator (I, T, M) as Categorical; Store Type (E, S, H, O) as Categorical.
Data analysis
Changes to store types
When the data based on store type is analyzed, the average sales in (E)-Exclusive type stores pre-pandemic period was 13409.84. In contrast, the average sales decreased by 23% to 10311.58 during the pandemic. Average sales in (H)-High-type stores decreased by 21% from 9435.38 to 7414.53 during the pandemic. The (O)-Outlet store type experienced a smaller decline, with average store sales falling by 19% from 12357.08 to 9974.471. Average sales in (S)-Standard type stores decreased by 15% to 6750,784 from 7952,456 before the pandemic. (E)-Exclusive type stores experienced the highest decline in average sales, while (S)-Standard type stores experienced the lowest drop. These results indicate that people cut down on their luxury needs more than their standard needs.
Changes according to indicator
The change in indicator-based sales of stores in the pre-pandemic period and the pandemic period is given in Table 2.
Indicator-based average sales (quantity-based) changes.
When the change in average sales amounts in Table 2 is analyzed, only the stores selling I, T, and M brands simultaneously have seen a 31% increase in average sales amount. The average sales amounts of all other stores decreased. The most significant decrease was in the stores selling the brand, which was selling the most luxurious in type M. Type Stores, with indicator type “O” representing outlet sales. Outlet sales can be one of the three brands. There was a 19% decrease in “O” type sales.
Changes according to products
Pre-pandemic and pandemic average sales by product are given in Table 3.
Changes according to products.
Table 3 shows sales decreased by 20% on average during the pandemic. The decrease in demand for products with above-average decreases shows that there has been a change in the purchasing behavior of customers during the pandemic period. Other outerwear, boots, and cosmetics showed a significant decrease of over 50%. In addition, beach products were not offered to the market by the company, considering that there would be no sales.
While knitwear sales only fell by 3%, much lower than the general decrease, demand remained steady. Sales of sweatshirts and textile accessories (scarves, shawls, hats, etc.) increased by over 35%, reflecting a shift in customer preferences during the pandemic as more people worked from home. Products like overalls, slippers, jewelry, and knitwear saw demand decrease below the average, but the decline was insignificant.
Sales prediction by machine learning algorithms
Adebola and Onyekwelu (2019) used ML methods to detect trends in customer behavior. 44 They developed a model with information such as invoice number, stock code, product information, sales amount, invoice day, unit price, customer number, and country of sale as variables. This study used variables such as product information, product sales amount, and sales amount.
In this part of the study, ML algorithms were used to predict 2023 sales based on 2017-2022 data. Key steps include encoding categorical variables like store type and brand. Additional independent variables, such as store type, monthly product sales, GDP per capita, target group population, and warehouse space, were added to the model. These variables help capture correlations affecting sales. Finally, 25 product groups were categorized into six main groups for analysis: Accessories (Leather, Textile and Other Accessories, Jewelry, Cosmetics, and Beachwear Staff), Bottom wear (Skirts, Jeans, Pants, Shorts), Out wear (Other outwear, Coat, Topcoat), One Piece (Jacket, Dresses, Jumpsuit), Top wear (Blouse, Basics, Sweatshirt, Knitwear), and Shoes (Shoes, Boots, Slipper, Sandals).
Several ML algorithms have been used to model crisis/natural disaster periods. CART, 45 MLP, 4 XGBooost, 46 LightGBM, 47 Catboost, 48 KNN, 49 Gradient Boosting, 50 and ML algorithms were used for crisis periods such as COVID-19 pandemic, terrorist attacks, and earthquakes. In this study, CART, LIGHTGBM, CatBoost, KNN, Gradient Boost, XGBoost, and MLP algorithms were used as in the above studies. 2023`s sales data were predicted on a product group basis with seven different ML algorithms. 2023`s sales values and the performance measurement metrics calculated accordingly by ML algorithms and error values are presented in Table 4.
Performance metrics.
The values where each model performed the best according to the five performance metrics are highlighted in bold.
When the results in Table 4 are analyzed, the highest R2 value was 94% in the Gradient Boosting algorithm in the top wear category product group. On the other hand, the MLP algorithm gave the lowest MAPE value of 0.20 in the shoe product group. CatBoost, Gradient Boosting, XGBoost, and MLP algorithms gave the lowest RMSE value, 0.04.
In the accessories product group, the MLP algorithm gave the best prediction results with a MAPE value of 0.20, and the MLP algorithm had the highest R2 value of 0.68. In the Bottomwear product group, the MLP algorithm with an R2 value of 0.74, XGBoost with a MAPE value of 0.38, and the MLP algorithm with 0.42 gave the lowest error. In the Outerwear product category, the MLP algorithm had an R2 value of 0.93, and the LightGBM algorithm had a MAPE value of 0.11. In the one-piece product category, the XGBoost algorithm had an R2 value of 0.92, and the CatBoost algorithm had a MAPE value of 0.33. In the Topwear product category, significant metric values were obtained by the Gradient Boosting algorithm with an R2 value of 0.94 and the CatBoost algorithm with a MAPE value of 0.21. In the Shoe product category, the significant performance metric values belong to the MLP algorithm with an R2 value of 0.92 and the MLP algorithm with a MAPE value of 0.20.
Product group-based average sales values for pre-pandemic and pandemic periods were analyzed to evaluate the data better. In Figure 1, product-based average sales amounts are presented.

Product-based sales average for pre-pandemic vs pandemic.
Figure 1 shows that shoes and accessories had the lowest sales, while topwear and one-pieces had the highest. The largest sales difference between the pre-pandemic and pandemic periods occurred in topwear and one-piece. Gradient Boosting and CatBoost performed best for product groups with the biggest sales changes, while MLP excelled in the accessories and shoes category, where sales remained steady. MLP, LightGBM, and XGBoost performed well for moderate sales changes in outwear and bottomwear.
Comparison of different machine learning algorithms for sales forecasting and their performances are provided in Appendix 1 and 2, Table A1 and Table A2.
Algorithms like Gradient Boosting and XGBoost provide fast and effective results for large datasets, while MLP, a neural network-based method, excels in learning complex data relationships and performs well with large datasets. Although LightGBM delivers fast results on large datasets, it may underperform on smaller ones. CatBoost demonstrates strong results when working with categorical data, whereas the KNN algorithm, despite being simple and interpretable, tends to be slow with large datasets and computationally expensive. The advantages and disadvantages of each algorithm should be considered based on the characteristics of the application and the dataset being used.
Gradient Boosting, MLP, and CatBoost are the algorithms that generally deliver the best performance. Particularly, Gradient Boosting and MLP are notable for their high R2 and low RMSE values in categories with high sales volumes.
Variation by categories: We observe that each algorithm performs better in certain categories. For example, MLP excels in Shoes and Accessories, while Gradient Boosting stands out in the Top Wear and One Piece categories.
Weaker algorithms: KNN and LightGBM generally perform less than other algorithms. Although these algorithms might be more effective with larger datasets, their performance is limited for the data used in this study.
In conclusion, for predicting sales across various product categories in your system, algorithms such as Gradient Boosting, MLP, and CatBoost appear to be the most reasonable choices. Additionally, the high performance shown by algorithms such as Gradient Boosting and CatBoost in certain product groups indicates that retailers could strategically benefit from using these algorithms for inventory management and demand forecasting. This provides an advantage in terms of flexibility and quick response capabilities during crisis periods.
Discussion and conclusions
In the first year of the COVID-19 pandemic, tens of millions officially contracted COVID-19. This global pandemic has affected the lives of billions of people and has been the subject of academic studies in many different fields. Academic studies have been published on the effects of the pandemic on the retail sector, from stocking products to using personal shopping and distribution services. 51 Firms that do not consider changing consumer behavior cannot analyze demand trends correctly and fail to respond to customer needs. Companies that cannot make accurate forecasts will encounter difficulties. 52
This study analyzed product groups affected differently during the pandemic, using sales data from a major women's clothing retailer in Turkiye. Overall sales decreased by 20%, with outerwear, boots, and cosmetics dropping by over 50%. Beach products were not offered for sale. Knitwear saw a smaller decline of 3%, while textile accessories and sweatshirts increased by over 30%. Daily use items like overalls, slippers, and accessories had smaller decreases, reflecting a shift towards practical, everyday clothing over luxury and business attire. Analyzing sales by store type, E-type luxury stores saw a 23% drop, indicating customers avoided luxury purchases during the pandemic. S-type stores offering affordable seasonal products had the lowest sales loss. O-type outlet stores experienced a 19% decrease, while H-type stores, selling more luxurious items, saw a 21% drop. The most significant decline, 41%, occurred in the M brand, which sells luxury fabrics like silk and cashmere. ITM stores, selling all three brands, had a 31% loss due to their indoor locations. Stores selling I and M brands had the least impact, as they catered to middle-income consumers in outdoor shopping centers. Overall, demand shifted towards everyday products rather than luxury or handmade items like silk and cashmere.
Sousa et al. (2023) compared three Gradient Boosting algorithms (CatBoost, LightGBM, and XGBoost) for predicting sales demand for a company in the cosmetics, perfumes, and toiletries market and made predictions for three different periods (one, five, and ten periods, ahead). 53 XGBoost algorithm performed more consistently in all prediction periods and achieved the lowest error rate. In our study, Gradient Boost, Catboost, and LightGBM algorithms were also selected as boosting algorithms. In general, boosting algorithms performed better in our dataset. MLP algorithm has shown significant success in the accessories and shoes product category with low sales volume change. The gradient boosting and CatBoost algorithms provided better results than others in product groups whose sales volume changes the most. MLP, LightGBM, and XGBoost algorithms performed better in product groups with medium sales volumes, such as outwear and boot wear.
This study offers theoretical contributions to understanding consumer behavior changes and the applicability of sales forecasting models in the retail sector during crises. Unlike previous literature, which often focuses on normal conditions, this research combines pre-crisis and crisis data to show how machine learning methods can generate more accurate forecasts during crises. By analyzing the effects on different product categories, the study provides deeper insights into consumer behavior shifts and contributes significantly to crisis management and sales forecasting literature.
This study's findings indicate that using machine learning algorithms can enhance the accuracy of retail sales forecasting during crisis periods. Retailers can optimize inventory management and improve demand forecasting accuracy by using algorithms such as Gradient Boosting, CatBoost, and MLP, especially during crises. This can aid in making more effective strategic decisions and responding quickly to customer demands. Integrating such models into retail management systems can provide flexibility during fluctuating sales, minimizing revenue losses and increasing customer satisfaction.
Theoretical contributions
The theoretical contributions of this study stand out through the new perspectives it provides on retail sales forecasting during crisis periods. While the effectiveness of machine learning methods in sales forecasting during the pandemic aligns with previous studies, the novelty of this research lies in combining crisis period data with normal period data to produce more accurate forecasts. Unlike many studies in the literature that focus on forecasting methods under normal conditions, this study offers an integrated approach specifically for crisis periods.
Sleiman et al. utilized sales data from a French fashion retailer for 2019–2020 to model the impact of the pandemic on sales. They focused solely on pandemic data, dividing the pandemic period into three phases: the normal period, the lockdown period, and the recovery period. By implementing their methodology, they conducted short- and mid-term forecasting and compared the error rates of their model with traditional forecasting methods
Limitations and future research
This study has some limitations. First, the dataset analyzed belongs to a single women's clothing retailer operating in Turkiye, which may limit the generalizability of the findings to other geographical regions or retail sectors. Second, the study focuses solely on physical store sales data; online sales data were not included in the analysis. Future studies could consider examining the impacts of other factors beyond the pandemic, such as political instability and supply chain disruptions.
Footnotes
Author contributions
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
