Two-step model based on XGBoost for predicting artwork prices in auction markets

Abstract

Art markets have grown globally, making artwork a notable investment option. Precise valuation is pivotal for optimal returns. We introduce a two-step model with a two-level regressor, utilizing extreme gradient boosting (XGBoost) for accurate artwork price prediction. The model comprises a price-class classifier and separate regressors for the individual price classes. This structure captures the differing influences of factors across price classes, and the predictions of all regressors are combined to reduce the risk of misclassification. Visual features further enhance accuracy through the second-step two-level regressor. Experiments on Korean art auction data demonstrate the superiority of our two-step model with the two-level regressor over one-step and two-step alternatives, as well as the hedonic pricing model. Although adding visual features degraded the one-step and conventional two-step models, they improved performance when integrated into the second-level decision tree, where they reduce the first-level residuals. This emphasizes the two-level regressor's efficacy in incorporating visual elements for artwork valuation. Our study highlights the potential of our approach in the field of artwork valuation.

Keywords

Korean art market, art investment, XGBoost, two-step model, price prediction

1. Introduction

In recent years, the sizes of art markets have increased, and artwork has served globally as an investment option [1]. Global sales of art and antiques reached an estimated 50.1 billion USD in 2020, corresponding to a decrease of 22% compared with 2019, which may have been due to the COVID-19 pandemic [2]. However, the global art market recovered in 2021. Global sales of art and antiques reached an estimated 65.1 billion USD – up by 29% from 2020 – and surpassed the pre-pandemic level in 2019 [3]. A similar trend was observed in the Korean art auction market. The sales of art and antiques for eight Korean art auction companies reached 329.4 billion KRW¹ (approximately 253 million USD) in 2021, which exceeded the 2020 total of 115.3 billion KRW and even the 2019 total of 156.5 billion KRW [4]. Consequently, in Korea and the rest of the world, artwork has attracted increasing attention as an investment opportunity.

¹ As of March 2023, 1000 KRW was approximately 0.77 USD.

Studies on artwork appraisal or returns on art investments have been conducted for decades. In previous studies on the valuation of artwork through art price index estimation, a hedonic pricing model was generally employed to estimate the prices of artworks according to their characteristics [5, 6]. This model has been extensively applied to art markets in many countries, artworks by different artists, and art trends (e.g., Impressionism and Cubism). [7] estimated the price indices of artworks created by English, Dutch, and Italian artists using the hedonic pricing model. Another traditional approach for the valuation of artwork is the repeat sales model. This model systematically analyzes price changes over time for each work; thus, it can control the uniqueness of each artwork, in contrast to the hedonic pricing model [1]. In [8], the repeat sales model was used to obtain the price indices of artworks in the Chinese art market. In [9], a pseudo-repeat sales model was proposed to address the limitation that the repeat sales model requires transaction information about artwork sold multiple times. The pseudo-repeat sales model introduced imperfect matching to increase the sample size for model training, and it was validated using data from South African art auctions [9].

Recently, research has been conducted using machine learning techniques to estimate the prices of artworks more accurately based on the various characteristics of artworks and information related to artists and auction houses. In [10], neural networks were used to incorporate both visual and non-visual characteristics of artworks in the valuation of artworks. To extract the features from images of artworks, ResNet, which is a well-known structure for convolutional neural networks (CNNs) [11], was used, and a multilayer perceptron was employed to estimate the prices of artworks by using the extracted image features and non-visual features as inputs. In [12], a CNN was used for images of artworks, similar to [10], and the bidirectional encoder representations from transformers (BERT) model was used to extract features from the text descriptions of artworks.

Although research using machine-learning techniques to increase the predictability of art prices has been gradually increasing, few studies have been performed on machine-learning techniques for price prediction in the art market relative to the amount of research focusing on economic phenomena in the art market. In addition, neural networks have proven to be superior to other machine-learning algorithms for some applications using images and text, but their complex structures necessitate large training datasets and result in high computational costs. Moreover, according to previous studies [10], the contribution of the visual features of artworks to the estimation of art prices was limited. Therefore, a specific methodology to utilize both non-visual and visual features effectively in the price prediction of artworks should be developed for improving the predictability of art prices using machine-learning algorithms.

To address the limitations of previous studies related to predicting artwork prices in the auction market, we propose a two-step model based on extreme gradient boosting (XGBoost) that combines classification and regression models to increase the prediction accuracy for artwork prices. By building separate regression models for artwork belonging to different price classes, the proposed model can capture the different effects of various factors on the valuation of artwork depending on the price class. Moreover, the proposed two-step model combines the predicted values of all the regression models corresponding to different price classes to avoid the risk of price class misclassification, which can cause large errors in price prediction. To incorporate the image features of artwork effectively, we propose a two-level regression model that utilizes non-visual and visual features separately in each level for the second step. The performance of the proposed algorithm in predicting artwork prices in the auction market was validated based on the artworks sold in the Korean art market. In addition, we verified the effectiveness of key parts of the proposed two-step method with the two-level regressor by comparing it with other methods missing some of the key parts.

The remainder of this paper is organized as follows. Section 2 provides a comprehensive review of research on the art market. Section 3 describes the data used in this study as well as the proposed two-step model. The results of experiments that involved predicting the prices of artworks in Korean art auctions are presented in Section 4. Section 5 concludes the study and provides suggestions for future research.

2. Literature review

Research on the art market has been conducted for many years, and it has been analyzed from different perspectives. In several studies, the returns on art investments were evaluated according to the price indices of artworks. In such research, artwork has been treated as an asset, similar to stocks and real estate. Many researchers have attempted to estimate the returns on art investments in different countries and have compared the returns with those of other assets. In [13], the price indices for pre-World War II paintings were calculated using the capital-asset pricing model, and it was found that the returns on the paintings in the United States and United Kingdom were lower than those of stocks for the same period. However, [14] reported that the annual return on art investment between 1900 and 1986 was 17.5%, which exceeded the capital appreciation of stocks, total returns on bonds, and inflation in the United Kingdom, according to the price indices computed using the repeat sales model. In [15], the returns on major paintings of various styles, e.g., Contemporary, French Impressionist, and Modern European, over the period of 1976–2001 were examined, and it was found that the returns on paintings were far lower and the risk was far higher than those of the conventional investment markets for corporate and government bonds and company stocks.

To obtain the price indices of artworks precisely for comparison with other assets in terms of returns, the repeat sales and hedonic pricing models were used as representative models [16, 17, 1], as mentioned in Section 1. The repeat sales model measures the price change of the same artwork between two periods; thus, it requires multiple transactions for the same work [18]. In addition, this model does not require the attributes of the work to obtain its price index, which is advantageous in that it makes the model easy to understand but disadvantageous in that the sample-selection bias is large and there are many unrepresentative data samples. In the hedonic pricing model, the log of the artwork price is typically explained by linear relationships with various attributes related to the artist, artwork, and market [18]. In other words, it is assumed that the values of artworks can be determined using numerous factors, such as the size, the subject matter, the style, the reputation and nationality of the artist, whether the work contains the signature of the artist, and the auction house where the sale occurs [19, 18]. The hedonic pricing model has been applied to auction markets in several countries, such as the United States, many European countries, and China [20, 21, 22, 23].

Although research on the art market from an economic and financial perspective has been conducted for decades, the use of machine-learning techniques to estimate the price indices of artworks is relatively new. In the literature, to improve the price prediction performance while giving up some explanatory power for explaining and understanding the economic phenomena in the art market, machine-learning algorithms more complex than the traditional methods for obtaining the price indices of artworks have been utilized. One approach is to use neural networks to reflect the visual characteristics of artworks. CNNs have been widely applied to extract the image features of artwork [10, 12]. However, in [10, 12], the use of the visual characteristics of artworks failed to improve the prediction performance, and in [12], it was found that image-based prediction may not be as effective as text-based prediction. In [24], various decision tree-based ensemble algorithms, such as random forest, gradient-boosted trees, and XGBoost, were used for the price prediction of artworks in the Korean auction market. It was found that the prices predicted by the trained models were more accurate than the presale auction estimates.

3. Data and methodology

This section describes the data and explanatory variables and outlines the proposed methodology and experimental procedure.

3.1 Data

Korean art auction market data were collected from the K-Artprice site (http://kartprice.net/). The K-Artprice site was the first website in South Korea to provide the auction prices of various artworks sold in eight Korean auction houses. From this site, we collected the auction information of artworks sold from January 2016 to June 2021, including the price, author, bidding date, auction house, title, size, genre, and medium. This dataset included information on a few artworks created during the Joseon Dynasty, but we only used data for modern and contemporary artworks. In addition, we only selected paintings and prints and excluded other types of artworks, such as sculptures and craftwork, from the collected data. To utilize visual features of the artworks, we also excluded artworks whose images were not provided on the K-Artprice site. Finally, the total number of artworks used in this study was 20,071, which accounted for approximately 92% of the total data.

To increase the explanatory power of the prediction model for the auction sales prices of artworks, instead of introducing dummy variables for authors, we collected additional information related to different authors, such as solo or group exhibition records and award histories. Table 1 presents the explanatory non-visual variables used in this study. In addition, we extracted the visual features of individual artworks from images of artworks, and the method to extract the visual features is described in Section 3.2.2.

Table 1

Description of input features

Category	Variable	Description
Artwork	is_painting	Variable indicating whether the artwork is a painting (painting $=$ 1, print $=$ 0)
	support_[X]	Variable indicating the support medium of the artwork (X represents the support medium, e.g., canvas, paper, fabric, hardboard, wood, silver paper, or metal)
	media_[X]	Variable indicating the medium of the artwork (X represents the medium, e.g., oil, pencil, ink, watercolor, enamel, oil pastel, pigment, or acrylic paint)
	elapsed	Time (in years) from the production of the artwork to the day of the auction
	area	Size of the artwork (cm²)
	height	Height of the artwork (cm)
	width	Width of the artwork (cm)
Artist	is_death	Variable indicating whether the artist is deceased as of the day of the auction (deceased $=$ 1, alive $=$ 0)
	award	Number of awards received before the day of the auction
	exhb_solo	Number of solo exhibitions before the day of the auction
	exhb_group	Number of group exhibitions before the day of the auction
	historical_price	Average unit price for artworks sold before the day of the auction
	historical_sales_freq	Number of artworks sold in the auction market before the day of the auction
	age	Age of the artist on the day of the auction
	era	Variable indicating whether the artist was mainly active during the Japanese colonial period (born before 1925 $=$ 1, born after 1925 $=$ 0)
Auction house	is_online	Variable indicating whether the auction is an online auction (online $=$ 1, offline $=$ 0)
	auction_house[X]	Variable to denote different auction houses (X is an integer between 1 and 8, with different numbers representing different auction houses in Korea)

3.2 Methodology

The main objective of this study was to develop an accurate prediction model for the prices of artworks in the Korean art auction market. To increase the accuracy of the predicted artwork prices, we used XGBoost, an improved ensemble algorithm based on gradient-boosted trees [25] that has exhibited good generalization performance and accuracy compared with other tree-based ensemble algorithms, such as the random forest and gradient-boosted tree algorithms [26, 27, 28]. Moreover, a study indicated that XGBoost was superior to the random forest and gradient-boosted tree techniques for the price prediction of artworks [24]. Similar to other boosting algorithms, XGBoost combines weak learners through the sequential training process of a single decision tree, and it has the advantage of avoiding overfitting owing to a regularization term in the objective function. One advantage of XGBoost, similar to other tree-based ensemble algorithms, is its ability to detect important features to estimate the target. Hence, it is possible to identify the most useful factors for predicting the price of artwork using XGBoost.

In the Korean art market, the prices of paintings are commonly evaluated according to their sizes; i.e., the price of a painting is roughly proportional to its size [29]. Therefore, in this study, we aimed to predict the price of a painting per unit area, i.e., the price in KRW divided by the size in cm² (denoted as “unit price” herein). Figure 1a presents a histogram of the unit prices of the artworks. As shown, the distribution of unit prices is highly right-skewed. To reduce the skewness of the unit prices and bring the target variable closer to a normal distribution, we used the log-transformed unit prices as the targets, as follows:

$\displaystyle y_{i}=\log_{10}(\textit{price}_{i}+1)$ (1)

where $\textit{price}_{i}$ represents the unit price of the $i$-th artwork. Figure 1b shows a histogram of the log-transformed unit prices of artworks. The distribution of the log-transformed unit prices is more symmetric than that of the original unit prices, as expected. Table 2 presents statistics for the unit prices and log-transformed unit prices. In this table, “Std. dev.” denotes the standard deviation, and “Q1,” “Q2,” and “Q3” refer to the first, second, and third quartiles, respectively.

Table 2

Summary statistics for prices of artworks

Case	Mean	Std. dev.	Min	Q1	Q2	Q3	Max
Unit price	9472.35	68741.73	12.93	363.64	1069.52	3631.40	3,679,175.86
Log-transformed unit price	3.11	0.73	1.14	2.56	3.03	3.56	6.57
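
The transformation in Eq. (1) can be checked against Table 2 with a short sketch. Because the base-10 logarithm is monotonic, the minimum, quartiles, and maximum of the unit prices should map directly onto the corresponding statistics of the log-transformed unit prices (the mean and standard deviation do not transform this way). The helper name `log_unit_price` is hypothetical and is not taken from the original study.

```python
import math

def log_unit_price(unit_price):
    """Eq. (1): base-10 logarithm of the unit price plus one."""
    return math.log10(unit_price + 1)

# Order statistics of the unit price, copied from Table 2.
table2_unit_prices = {
    "Min": 12.93,
    "Q1": 363.64,
    "Q2": 1069.52,
    "Q3": 3631.40,
    "Max": 3679175.86,
}

# Apply Eq. (1) and round to two decimals, as in the second row of Table 2.
transformed = {name: round(log_unit_price(value), 2)
               for name, value in table2_unit_prices.items()}
print(transformed)
```

Under these assumptions, the rounded values agree with the "Log-transformed unit price" row of Table 2, which supports reading the logarithm in Eq. (1) as base 10.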

Figure 1.

The histogram of unit prices.

3.2.1 Proposed two-step prediction model

Herein, we propose a two-step prediction model for estimating the sales prices of artwork in the auction market. The model uses both classification and regression models to compute the predicted $y_{i}$ , which is denoted as $\hat{y}_{i}$ . The first step is to calculate the probabilities of an artwork belonging to different price classes. The second step is to estimate the log-transformed unit price of the artwork using the predicted values computed from different regression models corresponding to different price classes. Our motivation for proposing the two-step model is that the factors that significantly affect the value of artwork may depend on the price class of the artwork; i.e., the value of expensive artwork may be influenced by different factors from that of low-priced artwork. In the proposed model, the classifier classifies artwork into different price classes, and each regression model corresponding to a specific price class estimates the target value under the assumption that the artwork belongs to the corresponding price class. Another key point of the proposed two-step model is that $\hat{y}_{i}$ is computed using all the predicted values of the different regression models, as follows:

$\displaystyle\hat{y}_{i}=\sum_{j=1}^{K}\hat{p}_{ij}\cdot\hat{y}_{ij}$ (2)

where $\hat{p}_{ij}$ represents the probability that artwork $i$ belongs to the $j$ -th price class and $\hat{y}_{ij}$ represents the predicted value of artwork $i$ obtained using the regression model for the $j$ -th price class. The reason for using the ensemble prediction method that combines the prediction values obtained from all the regressors in the proposed model using Eq. (2) is that the prediction accuracy can be significantly reduced if the classifier incorrectly classifies the price class of the artwork. $\hat{p}_{ij}$ indicates the degree of confidence that artwork $i$ belongs in the $j$ -th price class. Therefore, when the probability of a certain price class is not dominant over the other price classes, prediction using one regression model may be risky. Equation (2) can alleviate the risk of misclassification.
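
As a minimal sketch of Eq. (2), the snippet below contrasts the probability-weighted combination with a prediction that relies on the single most probable class. The probabilities and per-class predictions are made-up illustrative numbers, and the function names are hypothetical.

```python
def soft_prediction(class_probs, class_preds):
    """Eq. (2): probability-weighted sum of the per-class regressor outputs."""
    return sum(p * y for p, y in zip(class_probs, class_preds))

def hard_prediction(class_probs, class_preds):
    """Use only the regressor of the most probable price class."""
    best = max(range(len(class_probs)), key=lambda j: class_probs[j])
    return class_preds[best]

# Hypothetical classifier output in which no price class is dominant,
# and the outputs of the K = 4 per-class regressors (log-transformed unit prices).
probs = [0.05, 0.45, 0.40, 0.10]
preds = [2.2, 2.8, 3.4, 3.9]

y_soft = soft_prediction(probs, preds)  # blends all four regressors
y_hard = hard_prediction(probs, preds)  # trusts the second class only
print(y_soft, y_hard)
```

In this example the second and third classes are almost equally likely, so the weighted value lies between their two predictions, whereas the single-regressor value commits fully to the second class.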

Figure 2 illustrates the process of the proposed two-step model.

Figure 2.

Proposed two-step prediction model.

According to the results of preliminary experiments, including visual features in the classifier training degraded the classification accuracy; thus, only the non-visual features were used to build the classifier for the proposed model. A similar result was observed for the regression model in the second step of the proposed model. According to similar studies that built boosting models using image features extracted from deep learning models such as CNNs [30, 31], the accuracy of CNNs could be improved by replacing fully connected layers with tree-based boosting algorithms. Nevertheless, adding the visual features failed to improve the prediction accuracy here, possibly because non-visual features have greater impacts on the prices of artworks than visual features; this explanation is supported by previous studies [10, 12], which showed that the visual characteristics of artworks were not effective for estimating their prices. Additionally, if non-visual and visual features are used simultaneously in a decision tree model, only a few of the visual features may be selected, which may lower the generalization performance of XGBoost.

Figure 3.

Two-level XGBoost regressor in the proposed two-step model.

Hence, we propose a two-level XGBoost regressor for the two-step model. Figure 3 illustrates the key concept of the proposed regressor. In this figure, “$x_{nv}$” and “$x_{v}$” represent the non-visual and visual features, respectively. In each iteration of XGBoost, a decision tree is first trained using the non-visual features. Then, the residuals between the observed values and the values predicted by this tree are computed as follows:

$\displaystyle r_{i,t}=y_{i,t}-\hat{y}_{i,t,nv}$ (3)

where $r_{i,t}$ is the residual of the $i$ -th artwork at iteration $t$ and $y_{i,t}$ and $\hat{y}_{i,t,nv}$ are the true target and predicted target obtained from the tree based on the non-visual features at iteration $t$ , respectively. These residuals are used as target values for the second-level decision tree at each iteration. For the second-level tree, only visual features are utilized. In other words, the visual features are used to reduce the prediction errors and explain the remaining part of artwork prices that cannot be explained by the non-visual features. At each iteration, the final predicted values based on the two-level model can be computed as follows:

$\displaystyle\hat{y}_{i,t}=\hat{y}_{i,t,nv}+\hat{y}_{i,t,v}$ (4)

where $\hat{y}_{i,t,v}$ denotes the predicted value of the $i$ -th artwork by the second-level tree at iteration $t$ . Based on Eq. (4), residuals for the corresponding iteration are computed and used as targets at iteration $t+1$ of XGBoost.
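
The iteration described by Eqs. (3) and (4) can be illustrated with a simplified sketch. It is not XGBoost: it assumes plain squared-error boosting with depth-1 trees (stumps), no regularization term, and a shrinkage factor applied to the sum of the two trees, which the text does not specify. All function names and the toy data are hypothetical.

```python
def fit_stump(X, r):
    """Fit a depth-1 regression tree to targets r by least squares.

    Returns (feature, threshold, left_value, right_value). If no split is
    possible, the stump predicts the mean of r everywhere.
    """
    n = len(r)
    mean = sum(r) / n
    best_sse, best = float("inf"), (0, float("inf"), mean, mean)
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r[i] for i in range(n) if X[i][j] <= thr]
            right = [r[i] for i in range(n) if X[i][j] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if sse < best_sse:
                best_sse, best = sse, (j, thr, lm, rm)
    return best

def predict_stump(stump, x):
    j, thr, left, right = stump
    return left if x[j] <= thr else right

def fit_two_level(X_nv, X_v, y, n_rounds=100, lr=0.1):
    """Boosting in which each round fits two trees on separate feature sets."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    rounds = []
    for _ in range(n_rounds):
        target = [yi - pi for yi, pi in zip(y, pred)]          # y_{i,t}
        tree_nv = fit_stump(X_nv, target)                      # first level
        out_nv = [predict_stump(tree_nv, x) for x in X_nv]
        resid = [t - o for t, o in zip(target, out_nv)]        # Eq. (3)
        tree_v = fit_stump(X_v, resid)                         # second level
        out_v = [predict_stump(tree_v, x) for x in X_v]
        # Eq. (4), with shrinkage lr applied to the summed output (assumption).
        pred = [p + lr * (a + b) for p, a, b in zip(pred, out_nv, out_v)]
        rounds.append((tree_nv, tree_v))
    return base, rounds, lr

def predict_two_level(model, x_nv, x_v):
    base, rounds, lr = model
    return base + lr * sum(predict_stump(a, x_nv) + predict_stump(b, x_v)
                           for a, b in rounds)

# Toy data: the target depends mainly on one non-visual feature and
# slightly on one visual feature (all values are made up).
X_nv = [[0], [0], [1], [1], [2], [2]]
X_v = [[0], [1], [0], [1], [0], [1]]
y = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]

model = fit_two_level(X_nv, X_v, y, n_rounds=200, lr=0.1)
fitted = [predict_two_level(model, a, b) for a, b in zip(X_nv, X_v)]
mse = sum((t - p) ** 2 for t, p in zip(y, fitted)) / len(y)
print(mse)
```

The point of the design is that the visual features never compete with the non-visual features for the same split; they are only asked to explain what the first-level tree leaves unexplained.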

3.2.2 Visual feature extraction method

In this study, we decided to leverage a pre-trained image model for feature extraction, because the number of images was approximately 20,000, which may not be sufficient to train an accurate image model for feature extraction from scratch. This study utilized VGG-16, a very deep convolutional network for large-scale image recognition and one of the most popular pre-trained models for image classification tasks [32]. VGG-16 was developed for image classification, which was not the goal of this study. Hence, we fine-tuned the pre-trained model for the price prediction task. The target variable for fine-tuning was set as the log-transformed price, which is the same as the target variable of the proposed two-step prediction model.

Figure 4.

Fine-tuning of the VGG-16.

Figure 4 shows the architecture of the VGG-16 for fine-tuning. For fine-tuning, the pre-trained convolutional and pooling layers were frozen and the fully connected layers were re-trained. We added two dense fully connected layers before an output layer. The first and second dense layers had 4,096 and 1,024 nodes, respectively, which was the best configuration among those we tested for this study. In addition, the rectified linear unit was used as the activation function of the dense layers. No activation function was applied to the output layer, which had one node.

3.3 Experimental design

For the proposed two-step prediction model, we defined the price range for each price class for the classifier target. Considering the distribution of the log-transformed unit prices shown in Fig. 1b, we divided the artworks into four price classes according to their unit prices: (1) PC1: $\leqslant$ 300, (2) PC2: 300–1,000, (3) PC3: 1,000–5,000, and (4) PC4: $>$ 5,000. The numbers of samples in each price class are presented in Table 3. As shown, the imbalance in the sizes of the price classes is not severe, and each price class includes at least 3,000 samples. With this definition of the price classes, the classifier of the proposed model was trained to classify artworks into price classes using all the samples. Then, the individual regression model for each price class was trained using only the artworks belonging to that price class. In total, we trained one classifier and four regressors for the proposed two-step model.

Table 3
The number of samples for each price class

	PC1	PC2	PC3	PC4
# of samples	4,184	5,574	6,339	3,974
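
A minimal sketch of the price-class assignment is given below. It assumes that the thresholds 300, 1,000, and 5,000 apply to the unit price rather than to its logarithm, and that each upper bound is inclusive; the text only states the inclusive bound explicitly for PC1. The function name `price_class` is hypothetical.

```python
from bisect import bisect_left

# Upper bounds of PC1, PC2, and PC3; anything above the last bound is PC4.
PRICE_CLASS_BOUNDS = [300, 1000, 5000]

def price_class(unit_price):
    """Return the 1-based price class (1 = PC1, ..., 4 = PC4)."""
    return bisect_left(PRICE_CLASS_BOUNDS, unit_price) + 1

# Order statistics of the unit price from Table 2 and their classes.
for name, value in [("Min", 12.93), ("Q1", 363.64), ("Q2", 1069.52),
                    ("Q3", 3631.40), ("Max", 3679175.86)]:
    print(name, "-> PC%d" % price_class(value))

# Class sizes from Table 3; their sum should equal the dataset size in Section 3.1.
table3_counts = {"PC1": 4184, "PC2": 5574, "PC3": 6339, "PC4": 3974}
total_samples = sum(table3_counts.values())
print(total_samples)
```

The class sizes add up to the 20,071 artworks reported in Section 3.1, so every sample is assigned to exactly one price class.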

We compared the proposed two-step model with the two-level regressor against the following four machine-learning approaches:

One-step model (M1): One regression model was trained using XGBoost with all samples.

Hard two-step model without the two-level regressor (MH2): The target price was estimated by the regressor of the price class predicted by the classifier of the proposed model, instead of Eq. (2), and the regressor for each price class was a typical XGBoost model.

Soft two-step model without the two-level regressor (MS2): The target price was estimated using Eq. (2), but regression models corresponding to individual price classes were trained using the typical XGBoost approach.

Hard two-step model with the two-level regressor (MH2-2LR): The classifier and regressors were trained to build the two-step model as in the proposed method, but the target price was estimated by the regressor of the price class predicted by the classifier of the proposed model, instead of Eq. (2).

The key difference between the hard and soft two-step models was the prediction step. To reduce the prediction errors caused by misclassification of the price classes, we proposed the ensemble prediction method, i.e., the weighted average of the predicted values of all regressors using the probabilities of the price classes estimated by the classifier. The soft two-step model calculated the predicted target values using this ensemble prediction method, whereas the hard two-step model used the classifier to select a single regressor and predicted the target values with that regressor alone. For both hard and soft two-step models without the two-level regressor, the regressors were trained using the typical XGBoost procedure, which learns a single decision tree at each iteration.

To verify the impact of the visual features on artwork pricing in the auction market, we trained the three comparison methods without the two-level regressor (M1, MH2, and MS2) using feature sets with and without the visual features. As with the two-step models without the two-level regressor, we also evaluated the hard approach, in which the target is predicted by a single regressor instead of Eq. (2), for the two-step model with the two-level regressor (MH2-2LR). In addition to the XGBoost-based models, we compared the proposed model with the hedonic pricing model.

We used three different evaluation metrics: the root-mean-square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE), which are defined as follows:

$\displaystyle\textit{RMSE}=\sqrt{\frac{\sum_{i=1}^{N}(y_{i}-\hat{y}_{i})^{2}}{N}}$ (5)

$\displaystyle\textit{MAE}=\frac{\sum_{i=1}^{N}|y_{i}-\hat{y}_{i}|}{N}$ (6)

$\displaystyle\textit{MAPE}=\frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{y_{i}-\hat{y}_{i}}{y_{i}}\right|$ (7)

where $N$ represents the total number of samples in the test dataset. To compare the different models, we employed 10-fold cross-validation. In each of the 10 folds, we fine-tuned VGG-16 and trained the proposed and comparison methods using the training set, and then computed the evaluation metrics using the validation set.
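
The three metrics in Eqs. (5)–(7) can be written directly from their definitions. The sketch below uses only the standard library; the target and predicted values are made-up log-transformed unit prices for illustration.

```python
import math

def rmse(y_true, y_pred):
    """Eq. (5): root-mean-square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mae(y_true, y_pred):
    """Eq. (6): mean absolute error."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n

def mape(y_true, y_pred):
    """Eq. (7): mean absolute percentage error, in percent."""
    n = len(y_true)
    return 100.0 / n * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))

# Hypothetical targets and predictions (log-transformed unit prices).
y_true = [2.0, 3.0, 4.0]
y_pred = [2.2, 2.7, 4.0]

print(rmse(y_true, y_pred), mae(y_true, y_pred), mape(y_true, y_pred))
```

Note that MAPE divides by the target value, so it is defined only when every $y_{i}$ is nonzero; this holds for the targets summarized in Table 2, whose minimum is 1.14.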

For both classification and regression, we set the number of base estimators to 100 and the learning rate to 0.1 for XGBoost. The maximum depth of the base estimators was selected from $\{10,12,14,16,18,20\}$ by a grid search with 10-fold cross-validation on the training sets, again for both classification and regression.
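
The search space and the fold structure described above can be sketched as follows. The keys `n_estimators`, `learning_rate`, and `max_depth` follow the parameter names of XGBoost's scikit-learn-style wrapper; whether the study used that wrapper is an assumption. The contiguous, unshuffled fold splitter `k_fold_indices` is a hypothetical helper, since the text does not say how the folds were drawn.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, val_idx) pairs for k contiguous folds (no shuffling)."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size

# Settings stated in the text: fixed values plus the grid for the tree depth.
fixed_params = {"n_estimators": 100, "learning_rate": 0.1}
max_depth_grid = [10, 12, 14, 16, 18, 20]
candidates = [{**fixed_params, "max_depth": depth} for depth in max_depth_grid]

# Fold sizes for the 20,071 artworks used in the study.
folds = list(k_fold_indices(20071, k=10))
print(len(candidates), [len(val) for _, val in folds])
```

Each candidate dictionary would be passed to the model constructor and scored on the validation part of every fold, after which the depth with the best average score would be kept.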

4. Experimental results

4.1 Prediction performance

Table 4 presents the evaluation results for different models based on 10-fold cross-validation. Here, HP and M1 denote the hedonic pricing and one-step models, respectively, and MH2 and MS2 denote the hard and soft two-step models without the two-level regressor, respectively. The proposed two-step models with the two-level regressor are denoted as either MH2-2LR or Prop, depending on whether the class probability values are used to predict the target. MH2-2LR estimated the prices of artworks using the two-level regressor of the estimated price class according to the classifier in the proposed model, whereas Prop estimated the prices of artworks by combining the predicted values of all the two-level regressors using Eq. (2). In addition, “NV” and “NV $+$ V” denote two different feature sets. “NV” only includes the non-visual features, and “NV $+$ V” consists of both non-visual and visual features. “All” represents the evaluation results based on all samples, whereas “PC1,” “PC2,” “PC3,” and “PC4” denote the evaluation results for different price classes.

Table 4
Evaluation results

	Model	HP	M1		MH2		MS2		MH2-2LR	Prop
	Data	NV	NV	NV $+$ V	NV	NV $+$ V	NV	NV $+$ V	NV $+$ V	NV $+$ V
RMSE	All	0.3971	0.2339	0.3733	0.2463	0.3900	0.2270	0.3698	0.2388	0.2201
	PC1	0.4036	0.2190	0.3700	0.2212	0.3872	0.2087	0.3685	0.2145	0.2025
	PC2	0.3189	0.2128	0.2967	0.2360	0.3217	0.2113	0.3092	0.2288	0.2049
	PC3	0.3660	0.2285	0.3230	0.2411	0.3677	0.2176	0.3443	0.2338	0.2110
	PC4	0.5149	0.2804	0.5176	0.2882	0.4968	0.2753	0.4718	0.2794	0.2670
MAE	All	0.2979	0.1679	0.2685	0.1716	0.2794	0.1613	0.2697	0.1699	0.1596
	PC1	0.3131	0.1508	0.2515	0.1476	0.2577	0.1422	0.2512	0.1461	0.1408
	PC2	0.2491	0.1535	0.2190	0.1654	0.2364	0.1504	0.2295	0.1637	0.1488
	PC3	0.2719	0.1686	0.2477	0.1722	0.2776	0.1596	0.2629	0.1704	0.1579
	PC4	0.3927	0.2049	0.3879	0.2050	0.3648	0.1996	0.3558	0.2028	0.1975
MAPE	All	9.9516	5.5850	8.8453	5.7021	9.2790	5.3557	8.9598	5.5284	5.1934
	PC1	14.5357	7.0094	11.8526	6.7894	12.0889	6.5804	11.8079	6.5830	6.3813
	PC2	9.1238	5.6156	8.0183	6.0475	8.6469	5.5017	8.3949	5.8628	5.3348
	PC3	8.1669	5.0714	7.4332	5.1907	8.3248	4.8039	7.8863	5.0324	4.6578
	PC4	9.1481	4.8617	9.0696	4.8939	8.7052	4.7472	8.4475	4.7452	4.6038

Regardless of the evaluation metric and price class, the proposed two-step model with the two-level regressor using the ensemble prediction method (Prop) outperformed the one- and two-step models without the two-level regressor. However, the difference between MS2 based on the non-visual features and Prop is smaller than that between MS2 using the non-visual features only and MS2 using the non-visual and visual features together. From this result, it can be inferred that the visual features had only a marginal positive impact on artwork pricing in the auction market. Moreover, MH2-2LR showed slightly better performance than MH2 based on the non-visual features only, which implies that the two-level regressor was effective in improving the prediction accuracy of the regressors. In other words, the image features helped slightly reduce the errors that remained after prediction with the non-visual features. This limited contribution of the visual features to price prediction contrasts with the success of visual features obtained by deep learning techniques such as CNNs in classifying the styles of artworks and identifying artists [33, 34, 35]. The degradation caused by adding visual features in this study may be because the artworks used in this study consisted only of modern works in South Korea, which are typically far from masterpieces with distinct artistic styles.

In general, the hedonic pricing model exhibited the worst performance for all the evaluation metrics. In particular, the MAPEs of the hedonic pricing model were approximately 1.7 to 2.3 times those of the proposed two-step model with the two-level regressor (Prop) across the price classes. Moreover, the one-step models generally achieved smaller RMSEs, MAEs, and MAPEs than the hard two-step model without the two-level regressor when the same feature set was used. This difference may exist because the hard two-step model predicted the target values with relatively large errors for the samples whose price classes were misclassified. In contrast, the soft two-step model without the two-level regressor utilized all the regressors in the prediction, which may have reduced the risk of misclassification. In terms of the feature set, including the visual features generally degraded the prediction performance regardless of the model. When the visual features were used to train the models in addition to the non-visual features, the soft two-step model without the two-level regressor no longer outperformed the one-step model in terms of the overall MAE and MAPE (Table 4), possibly because the classification accuracy decreased when both non-visual and visual features were used.

Table 5 presents the classification accuracy of the classifier in the two-step model depending on the feature set. Overall, the classification accuracy for all the samples was close to 80% when only the non-visual features were used. However, it decreased to approximately 65% when the visual features were added to the non-visual features. This result implies that the visual features did not help discriminate the price classes of artworks and in fact hindered the classification. Although PC1 and PC4 had fewer samples than PC2 and PC3, the classification accuracies for PC1 and PC4 were higher than those for PC2 and PC3. This result implies that artworks that are inexpensive or expensive per unit area have relatively distinct characteristics compared with other artworks.

Table 5

Classification accuracy

Feature set	All	PC1	PC2	PC3	PC4
NV	0.7897	0.8300	0.7279	0.7945	0.8259
NV + V	0.6497	0.6713	0.6150	0.6508	0.6751
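For readers who wish to reproduce this kind of breakdown, the following sketch computes overall and per-class accuracy from true and predicted price classes. It assumes that the per-class accuracy is the share of samples of a given true class that were assigned to that class. The labels are hypothetical toy data, not the study's samples.

```python
# Minimal sketch: overall and per-class classification accuracy.
# ASSUMPTION: per-class accuracy = correctly classified samples of a true
# class divided by the number of samples in that true class.
from collections import defaultdict

def per_class_accuracy(true_labels, predicted_labels):
    """Return a dict with accuracy per true class plus an 'All' entry."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(true_labels, predicted_labels):
        total[t] += 1
        correct[t] += int(t == p)
    accuracy = {label: correct[label] / total[label] for label in sorted(total)}
    accuracy["All"] = sum(correct.values()) / len(true_labels)
    return accuracy

# Hypothetical toy labels.
true_labels = ["PC1", "PC1", "PC2", "PC2", "PC2", "PC3", "PC3", "PC4"]
pred_labels = ["PC1", "PC1", "PC2", "PC3", "PC2", "PC3", "PC2", "PC4"]

accuracy = per_class_accuracy(true_labels, pred_labels)
```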

Table 6

Ideal performance of the hard two-step model with and without the two-level regressor

Price class	MH2 (NV)			MH2 (NV + V)			MH2-2LR (NV + V)
	RMSE	MAE	MAPE	RMSE	MAE	MAPE	RMSE	MAE	MAPE
All	0.1577	0.1176	3.8765	0.2020	0.1504	4.9456	0.1530	0.1140	3.7595
PC1	0.1418	0.1053	4.9266	0.1804	0.1351	6.4379	0.1375	0.1020	4.7747
PC2	0.1250	0.0979	3.5812	0.1530	0.1201	4.3938	0.1213	0.0950	3.4752
PC3	0.1559	0.1216	3.6554	0.1978	0.1560	4.6605	0.1512	0.1179	3.5445
PC4	0.2094	0.1521	3.5490	0.2764	0.2005	4.6060	0.2031	0.1476	3.4430

Figure 5. Feature importance: Classification.

Figure 6. Feature importance: Regression.

To determine whether the classification performance affected the prediction performance, we evaluated the ideal performance of the hard two-step models. In the hard two-step models, the regressor is selected according to the price class predicted by the classifier. To measure the ideal performance, the regressor for estimating the target value of a specific artwork was instead determined by the real price class of the artwork; i.e., the target was predicted by the regressor for PC1 if the artwork belonged to PC1. The ideal performance of the hard two-step model is presented in Table 6.
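The ideal-performance evaluation can be sketched as routing each sample to a regressor by its true price class rather than by its predicted class. The data, the two-class setup, and the helper names (`route`, `mae`) are hypothetical and serve only to illustrate the comparison.

```python
# Minimal sketch: hard routing by predicted class versus ideal routing by
# true class. All values are hypothetical toy data.

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def route(per_class_preds, classes):
    """For each sample, pick the output of the regressor named by `classes`."""
    return [preds[c] for preds, c in zip(per_class_preds, classes)]

# Outputs of the PC1 and PC4 regressors for three artworks.
per_class_preds = [
    {"PC1": 2.1, "PC4": 3.5},
    {"PC1": 2.4, "PC4": 4.1},
    {"PC1": 1.9, "PC4": 3.9},
]
targets = [2.0, 4.0, 4.0]
true_classes = ["PC1", "PC4", "PC4"]
predicted_classes = ["PC1", "PC4", "PC1"]  # the third artwork is misclassified

mae_hard = mae(targets, route(per_class_preds, predicted_classes))
mae_ideal = mae(targets, route(per_class_preds, true_classes))
```

In this toy example the single misclassified artwork dominates the error of the hard routing. This mirrors the gap between the actual and ideal performance.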

As shown in Table 6, the ideal performance was far better than the performance of the hard and soft two-step models. The difference in the ideal performance between MH2 using the non-visual features and MH2-2LR is slightly smaller than the difference between MS2 using the non-visual features and Prop. This result implies that if the exact price class of an artwork can be determined, the visual features become less effective for determining the prices of artworks. Similar to the results shown in Table 4, adding the visual features to the training set for MH2 degraded the prediction performance. Therefore, training the regressors using the non-visual and visual features simultaneously does not help improve the prediction performance.
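As a worked check on the "All" row of Table 6, the relative change in RMSE with respect to MH2 trained on the non-visual features can be computed directly from the reported values. The helper `relative_change` is introduced here for illustration only.

```python
# Worked arithmetic on the "All" row of Table 6 (RMSE values as reported).
rmse_mh2_nv = 0.1577     # MH2, non-visual features only
rmse_mh2_nv_v = 0.2020   # MH2, non-visual + visual features
rmse_mh2_2lr = 0.1530    # MH2-2LR, visual features at the second level

def relative_change(baseline, value):
    """Percentage change of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100.0

# Positive value: the error increased. Negative value: the error decreased.
change_adding_visual = relative_change(rmse_mh2_nv, rmse_mh2_nv_v)
change_two_level = relative_change(rmse_mh2_nv, rmse_mh2_2lr)
```

Adding the visual features directly to the training set raises the RMSE by roughly 28%, whereas using them in the second-level regressor lowers it by roughly 3%.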

4.2 Feature importance

Furthermore, we evaluated the feature importance of the classifier and regressors in the proposed two-step model with the two-level regressor for the non-visual features. The feature importance of the classifier indicates which factors affect the price classes of artworks, whereas that of the regressors indicates which factors determine the prices of artworks within each price class.

Figure 5 shows the feature importance of the top 20 features for the classifier, and Fig. 6 presents the feature importance of the top 20 features for the individual first-level regressors corresponding to the different price classes. Except for the regressor for PC4, the variable that indicates whether the artwork is a painting (is_painting) is the most important feature. The data demonstrate that the prices of prints are generally lower than those of paintings. For prints, approximately 43.7% of artworks are in PC1, whereas for paintings, only 13.5% of artworks are in PC1. This difference may exist because prints can have several copies, in contrast to paintings. Whether the auction is an online auction is also an important feature for determining the price classes and prices of artworks. The results indicate that artworks sold in online auctions are typically less expensive than those sold in offline auctions. Another result observed in Fig. 6 is that historical_price is more important for PC3 and PC4 than for PC1 and PC2. It is the second-most important feature for PC4 and the third-most important feature for PC3. The historical prices of artworks in PC3 and PC4, particularly PC4, are higher than those of artworks in PC1 and PC2. This result implies that artworks created by artists whose previous works are regarded as highly valuable can be sold at high prices.

5. Conclusion

In recent years, interest in the art market has increased. However, research on the use of machine-learning techniques for the valuation of artworks has been scarce. To address this research gap, we developed a two-step model with a two-level regressor based on XGBoost for accurately predicting the prices of artworks in the auction market. The main underlying assumption of the proposed model is that the value-determining characteristics of artworks may differ depending on the price class. To apply this assumption in price prediction, the proposed model builds a classifier that assigns artworks to price classes and a regressor for each price class, and then calculates the price of an artwork by combining the predicted values of all the regressors. Additionally, to incorporate the visual features of artworks into the price prediction effectively, we proposed a two-level XGBoost regressor, considering previous studies showing that the contributions of the visual features of artworks to the estimation of art prices are limited.

The proposed two-step model with the two-level regressor was validated using data for artworks in the Korean auction market between January 2016 and June 2021. To validate the effectiveness of the two-step approach, we compared the proposed algorithm with the one-step model. In addition, the effect of the two-level regressor in the proposed model was verified by comparing it with the two-step model without the two-level regressor. For the comparison methods, such as the one-step model and two-step model without the two-level regressor, we tested two different feature sets depending on whether the visual features were included. The experimental results indicate that the proposed two-step model with the two-level regressor outperformed the other methods and that the XGBoost-based models were more accurate than the hedonic pricing model, which implies that the hedonic pricing model is unsuitable for the accurate valuation of artworks. The ensemble prediction method was effective in improving the prediction accuracy. The ideal performance of the hard two-step model implies that the proposed two-step model can be enhanced by increasing the accuracy of the classifier. However, it was observed that even though the artwork price classification is not perfect, the risk of misclassification may be alleviated through the ensemble prediction method. Interestingly, including the visual features for model training degraded the prediction performance for both the one- and two-step models. Nevertheless, the evaluation results based on the proposed two-step model with the two-level regressor showed that the visual features could be utilized to reduce the prediction errors further.

The proposed method offers several advantages. Firstly, by leveraging XGBoost, it can effectively capture nonlinear relationships between different factors related to artworks and their corresponding prices. This enhances the model’s ability to grasp complex interactions between these factors and artwork prices. Secondly, the method enables the identification of significant factors that strongly influence artwork prices, providing valuable insights for both researchers and market participants. Lastly, the approach of building separate regression models for distinct price classes allows for a nuanced understanding of how diverse factors impact prices across different segments of the art market. This flexibility in modeling contributes to a more accurate and adaptable valuation framework. However, this method also has its limitations. Firstly, the visual features were obtained from a model trained for image recognition, making it challenging to precisely capture the relationship between visual features and prices. Furthermore, the implementation of a single classifier and multiple regressors per price class in the proposed two-step model introduces a higher computational overhead compared to the one-step model.

This study also has limitations that suggest directions for future work. Firstly, we exclusively validated the proposed two-step model using XGBoost, though the classifier and regressors within the model could potentially be implemented with various machine-learning algorithms. Consequently, evaluating the proposed model with alternative machine-learning techniques is warranted to enhance model accuracy. Moreover, such research will ascertain whether the proposed two-step prediction framework reliably improves prediction performance over the one-step model across different machine-learning algorithms. Secondly, the effectiveness of the proposed model might be influenced by how the price range is segmented into price classes. Thus, identifying the optimal division of price ranges is essential. Thirdly, our study omitted the integration of textual descriptions or critical reviews of artworks. In subsequent research, we plan to gather textual data relevant to artworks and employ deep-learning models like BERT to extract textual features, assessing whether such features enhance artwork valuation precision. Lastly, we selected VGG-16 from several pre-trained image models to extract visual features from artworks. Hence, alternative image models could potentially yield superior visual features. A comparative analysis of the proposed model’s performance based on the image model used for feature extraction will be undertaken.

Footnotes

Acknowledgments

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. RS-2023-00239374).

References

1. Renneboog. Pricing art and the art of pricing: On returns and risk in art auction markets. European Financial Management. 2022; 28(5): 1139–1198. Available from: doi: 10.1111/eufm.12348.

2. McAndrew. The Art Market 2021. Art Basel & UBS. 2021.

3. McAndrew. The Art Market 2022. Art Basel & UBS. 2022.

4. The Korean Art Auction Market Broke Its Record in 2021. K-ARTNOWCOM. 2022. Available from: https://k-artnow.com/the-korea-art-auction-market-broke-its-record-in-2021/.

5. Anderson. Paintings as an Investment. Economic Inquiry. 1974 Mar; 12(1): 13.

6. Chanel, Gerard-Varet, Ginsburgh. Prices and returns on paintings: An exercise on how to price the priceless. The Geneva Papers on Risk and Insurance Theory. 1994; 19(1): 7–21. Available from: doi: 10.1007/BF01112011.

7. Buelens, Ginsburgh. Revisiting Baumol’s ‘art as floating crap game’. European Economic Review. 1993; 37(7): 1351–1371. Available from: https://www.sciencedirect.com/science/article/pii/001429219390060N.

8. Wang, Zheng. The comparison of the hedonic, repeat sales, and hybrid models: Evidence from the Chinese paintings market. Cogent Economics & Finance. 2018 Jan; 6(1): 1443372. Available from: doi: 10.1080/23322039.2018.1443372.

9. Binge, Boshoff. Measuring alternative asset prices in an emerging market: The case of the South African art market. Emerging Markets Review. 2021; 47: 100788. Available from: https://www.sciencedirect.com/science/article/pii/S1566014120305975.

10. Aubry, Kräussl, Manso, Spaenjers. Biased Auctioneers. The Journal of Finance. 2023. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1111/jofi.13203.

11. Zhang, Ren, Sun. Deep Residual Learning for Image Recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016.

12. Liu. An Analysis of Multi-Modal Deep Learning for Art Price Appraisal. In: 19th IEEE International Symposium on Parallel and Distributed Processing with Applications, 11th IEEE International Conference on Big Data and Cloud Computing, 14th IEEE International Conference on Social Computing and Networking and 11th IEEE Internation. 2021. pp. 1509–1513.

13. Stein. The monetary appreciation of paintings. Journal of Political Economy. 1977 Jan; 85(5): 1021–1035. Available from: http://www.jstor.org/stable/1830343.

14. Goetzmann. Accounting for taste: Art and the financial markets over three centuries. The American Economic Review. 1993 Jan; 83(5): 1370–1376. Available from: http://www.jstor.org/stable/2117568.

15. Worthington, Higgs. Art as an investment: Risk, return and portfolio diversification in major painting markets. Accounting & Finance. 2004 Jul; 44(2): 257–271. Available from: doi: 10.1111/j.1467-629X.2004.00108.x.

16. Collins, Scorcu, Zanola. Reconsidering hedonic art price indexes. Economics Letters. 2009; 104(2): 57–60. Available from: https://www.sciencedirect.com/science/article/pii/S0165176509001165.

17. Wang. Which Part of the Chinese Art Market Is More Worth Investing In? Applying the Quantile Regression to Analyze Chinese Oil Paintings 2000–2014. Emerging Markets Finance and Trade. 2017 Jan; 53(1): 44–53. Available from: doi: 10.1080/1540496X.2016.1145113.

18. Kräussl. Art Price Indices. In: Fine Art and High Finance. 2012. pp. 63–86. Available from: doi: 10.1002/9781119204688.ch3.

19. Agnello. Investment returns and risk for art: Evidence from auctions of American paintings. Eastern Economic Journal. 2002 Jan; 28(4): 443–463. Available from: http://www.jstor.org/stable/40325391.

20. Agnello, Pierce. Financial returns, price determinants, and genre effects in American art investment. Journal of Cultural Economics. 1996; 20(4): 359–383. Available from: doi: 10.1007/s10824-005-0383-0.

21. Higgs, Worthington. Financial returns and price determinants in the Australian art market, 1973–2003*. Economic Record. 2005 Jun; 81(253): 113–123. Available from: doi: 10.1111/j.1475-4932.2005.00237.x.

22. Witkowska. An application of hedonic regression to evaluate prices of Polish paintings. International Advances in Economic Research. 2014; 20(3): 281–293. Available from: doi: 10.1007/s11294-014-9468-x.

23. Renneboog, Spaenjers. Buying beauty: On prices and returns in the art market. Management Science. 2013; 59(1): 36–53.

24. Jang, Park. Art price prediction using decision tree-based machine learning methods. Korean Management Review. 2021; 50(2): 357–381. Available from: http://www.dbpia.co.kr/journal/articleDetail?nodeId=NODE10547363.

25. Chen, Guestrin. XGBoost: A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’16. New York, NY, USA: ACM. 2016. pp. 785–794. Available from: doi: 10.1145/2939672.2939785.

26. Fan, Yue, Zhang, Cai, Wang, et al. Evaluation of SVM, ELM and four tree-based ensemble models for predicting daily reference evapotranspiration using limited meteorological data in different climates of China. Agricultural and Forest Meteorology. 2018; 263: 225–241. Available from: https://www.sciencedirect.com/science/article/pii/S0168192318302855.

27. Omer, Shareef. Comparison of decision tree based ensemble methods for prediction of photovoltaic maximum current. Energy Conversion and Management: X. 2022; 16: 100333. Available from: https://www.sciencedirect.com/science/article/pii/S2590174522001568.

28. Sahin. Assessing the predictive capability of ensemble tree methods for landslide susceptibility mapping using XGBoost, gradient boosting machine, and random forest. SN Applied Sciences. 2020; 2(7): 1308. Available from: doi: 10.1007/s42452-020-3060-1.

29. Nahm. Price determinants and genre effects in the Korean art market: A partial linear analysis of size effect. Journal of Cultural Economics. 2010; 34(4): 281–297.

30. Liu, Wang, Chen, Zhang, Xiao. A hyperspectral image classification approach based on feature fusion and multi-layered gradient boosting decision trees. Entropy. 2021; 23(1).

31. Bui, Chou, Hoang, Fang, Huang, et al. Gradient Boosting Machine and Object-Based CNN for Land Cover Classification. 2021; 13(14).

32. Simonyan, Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. In: Bengio Y, LeCun Y, editors. 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7–9, 2015, Conference Track Proceedings. 2015. Available from: http://arxiv.org/abs/1409.1556.

33. Sandoval, Pirogova, Lech. Two-stage deep learning approach to the classification of fine-art paintings. IEEE Access. 2019; 7: 41770–41781.

34. Jiang, Wang, Jin, Han, Sun. DCT-CNN-based classification method for the Gongbi and Xieyi techniques of Chinese ink-wash paintings. Neurocomputing. 2019; 330: 280–286. Available from: https://www.sciencedirect.com/science/article/pii/S0925231218313201.

35. Chen, Yang. Recognizing the Style of Visual Arts via Adaptive Cross-Layer Correlation. In: Proceedings of the 27th ACM International Conference on Multimedia. MM ’19. New York, NY, USA: Association for Computing Machinery. 2019. pp. 2459–2467. Available from: doi: 10.1145/3343031.3350977.


¹
As of March 2023, 1000 KRW was approximately 0.77 USD.

²
http://kartprice.net/.
