Abstract
Using 34 products from China’s commodity futures market, this study examines the impact of social network attention and sentiment on futures market returns. A machine learning text analysis algorithm was used to construct social network investor sentiment, which was analyzed alongside three search volume indices. We find that social network sentiment is a good predictor of commodity futures returns, that investor attention has a significant positive impact on returns and absolute returns, and that the Baidu index is better at forecasting returns than the Sogou and 360 indices. In addition, we examine how social network sentiment affects returns at different sentiment levels. We find that extremely high market-wide social network sentiment changed the predicted results significantly, thereby altering the basis of investors’ trading strategies. Regulators should therefore incorporate investor sentiment into regulatory targets and enhance retail investor education.
Introduction
Financial modeling applications and forecasting face many challenges. Research has begun to focus on return forecasting possibilities, and the relationship between investor behavior and financial market movements (Coval & Shumway, 2001; Dumas et al., 2009; Fang et al., 2020; Liu et al., 2022; Tetlock, 2007). The findings from these studies indicate that investor search volume and investor social network sentiment, which are unrelated to the fundamentals, could significantly impact financial market movements.
Few studies have examined the effect of investor sentiment and attention on price movements in the Chinese commodity futures market. This study hypothesizes that investor social network sentiment and investor search volume can be used to forecast returns in this market. To test whether investor behavior affects commodity futures market prices in China, this study addresses the following questions: How does investors’ online behavior affect China’s commodity futures returns? Does investor sentiment have an asymmetric effect on commodity futures market returns? Does the impact of investor attention on Chinese commodity futures market returns change when the level of investor sentiment changes?
This study focuses on China’s commodity futures markets. As demand for commodities continues to rise, the Chinese commodity futures market has come to play an increasingly important global role in stabilizing prices and the pricing of benchmarked commodities. In 2021, China’s commodity futures trading volume had been at the forefront of the global futures market for eight consecutive years. Therefore, examining the price formation mechanisms that affect China’s commodity futures market has gained importance.
This study makes several contributions to the field. First, it examines the impact of investor behavior on Chinese commodity futures markets. We studied many of China’s futures market varieties and extracted 34 for further examination. Chinese markets are highly speculative, with individual and institutional futures market transactions accounting for 57.8% and 42.2% of all transactions, respectively. The structure of Chinese investors differs from that in developed countries. Because of the high proportion of small- and medium-sized investors, their investment behaviors can impact market volatility significantly.
Second, network sentiments were constructed for each commodity futures variety to study their impacts on the commodity futures returns. Many previous studies have used Baker and Wurgler’s (2006, 2007) classic BW index; but as it is constructed from monthly low-frequency data, its use is inappropriate for this study due to the nature of the commodity futures market. With the development of the Internet and big data, many scholars have considered the use of Internet data to construct online investor sentiments. Investors’ trading behavior may be influenced by online public opinion, which can ultimately affect the markets (Antweiler & Frank, 2004; Li & Liu, 2021; L. Sun et al., 2016).
Therefore, to build a social network sentiment (SNS) for the Chinese futures market, an innovative natural language processing (NLP) machine learning approach was developed to crawl the Eastmoney forum for daily investor commentary related to commodity futures. Eastmoney is a social network for Chinese investors and has a dedicated comments section for each futures market commodity. Because of the increase in the number of individual retail investors in China, the Eastmoney forums attract millions of users. Therefore, 160 million reviews were collected, after which the Tencent Natural Language Processing (NLP) platform was employed to develop daily SNS indicators for each Chinese futures market commodity. It was found that the SNS had better prediction effects for all commodity futures returns and that, in general, SNSs in all markets had a significant positive impact on the metal and energy futures markets. The impact of a single SNS on absolute commodity futures returns was found to be better than that of all market sentiments.
Third, this study used three search indices (the Baidu, Sogou, and 360 indices) to study the impact of investor attention on commodity futures market returns. We find that investor attention has a significant positive impact on futures returns, implying that higher search intensity is followed by higher futures returns. The Baidu index was found to be better at forecasting futures returns than the Sogou and 360 indices.
Fourth, the interaction effects between investor SNS and attention were comprehensively investigated. It was found that the SNS was closely related to investor attention; for example, when many investors were actively discussing asset A online, investors reading the online discussion began to pay attention to asset A and search for information on it. This increase in online investor SNS toward a certain asset appears to lead to increased investor attention; therefore, this study specifically focuses on the impact of investor attention and investor SNS on market movements. The correlations between investor attention and SNS were then examined. It was found that extremely high market sentiment changed the predicted direction of the results. Owing to the development and structural characteristics of China’s futures market, these findings could be a useful reference for investors when making Chinese commodity futures market decisions. The conceptual framework of this study is shown in Figure 1.

Conceptual framework.
The remainder of this paper is organized as follows. Section 2 reviews the existing literature on the subject, and Section 3 describes the methodological background, data, and variables. Section 4 presents and discusses the results, and Section 5 concludes and provides future research directions.
Literature Review
Efficient market hypothesis (EMH) theory assumes that all investors are rational, and that prices respond immediately and correctly to new information. However, psychological research has found that in addition to obtaining relevant information, emotions, sentiment, and attention can also significantly impact human decision-making (Bechara et al., 1994; Camerer, 2003; Dolan, 2002; Kahneman, 1973; Kahneman & Tversky, 1979). Therefore, according to behavioral finance theory, retail investors are insensitive to information, and their irrational behavior significantly impacts the market. When market future uncertainty increases, retail traders exhibit significant overreaction, which results in greater market volatility (Coval & Shumway, 2001; Dumas et al., 2009; Kumar, 2009). Baker and Stein (2004) found that investor sentiment is negatively correlated with asset portfolio returns and has less capacity to react to information in bull markets. Investor sentiment can also have a significant impact on market liquidity; as market conditions change, investor sentiment impacts liquidity differently (Fang et al., 2020; Liu et al., 2022; Tetlock, 2007).
Owing to developments in information technology, social networking sites have become important places for information dissemination. Some social networking sites (such as Google, Eastmoney, Baidu, etc.) provide important channels for both professional and retail investors to search for and exchange information. Thus, the financial consequences of information technology have received extensive research attention. Studies focusing on attention theory have concluded that social networking information can drive changes in investor attention and affect investor decision-making (Andrei & Hasler, 2015; Ben-Rephael et al., 2017; Pham & Huynh, 2020; Wang et al., 2021). Other studies examining investor sentiment, investor comments, and shared social networking experiences have concluded that these can reflect the current investor moods regarding the underlying asset (Bollen et al., 2011; Naeem et al., 2021; Naughton et al., 2019; Y. Sun et al., 2021).
Impact of Investor Attention on Financial Markets
To study the impact of investor behavior on financial markets, it is necessary to identify proxies for investor behavior. Many studies have addressed the measurement of investor sentiment and attention, and both direct and indirect metrics exist. Indirect proxies for investor attention include trading volume, advertising expenses, news and headlines, media coverage, and extreme returns (Baker & Stein, 2004; Barber & Odean, 2008; Chemmanur & Yan, 2019; Engelberg & Parsons, 2011; Gervais et al., 2001; Grullon et al., 2004; Hou et al., 2008). However, Da et al. (2011) argued that in the information age, these indirect proxies may fail to capture factors relevant to investor attention. They therefore proposed a direct measure based on aggregate Google search frequencies, the Search Volume Index (SVI), and found it more suitable for measuring investor attention. Consequently, a substantial body of research has used SVIs to analyze the impact of investor attention. Zheng et al. (2022) constructed an indicator of carbon market attention (CMA) using the Google SVI and found a lagged inverse relationship between the CMA and the returns on European Union carbon emission allowances. X. Zhang, Lu et al. (2021) used the Google SVI for the keyword “Bitcoin” to measure internet attention and found significant Granger causality between Bitcoin returns and internet attention. Chinese studies that used the Baidu SVI to measure investor social networking attention found that Chinese stock returns were significantly correlated with it (Ying et al., 2015).
Impact of Investor Sentiment on Financial Markets
Psychological research claims that sentiment plays an important role in decision-making. For example, behavioral finance research finds a strong relationship between sentiment and financial decisions. Some market-based indices, such as liquidity (Baker & Stein, 2004), BW (Baker & Wurgler, 2006), and confidence indices (Lemmon & Portniaguina, 2006), have been used to assess investor sentiment. Recent studies extract social sentiment from online sources to study the implications for financial markets. For example, to assess Twitter’s ability to predict changes in the Dow Jones Industrial Average (DJIA) over time, Bollen et al. (2011) determined the public’s mood using daily Twitter posts. They found that the Twitter mood time series correlated with the DJIA. Naeem et al. (2020) measured investor sentiment using the Twitter happiness index and found that it significantly affected the future volatility of the country’s VIX indices. X. Zhang et al. (2018) applied the Naïve Bayes algorithm to extract social sentiment from Xueqiu (a specialized social network for Chinese stock market investors) and found that the accuracy of Chinese stock return predictions could be significantly improved by including social sentiment analyses.
While previous studies have provided insights into investor attention and sentiment from different perspectives, no study has yet investigated the effect of social network investor attention and sentiment on commodity futures market returns.
Methodology
The vector autoregression (VAR) model has been widely used to examine the factors that influence time series returns (Baker & Stein, 2004; Da et al., 2011; Pham & Huynh, 2020; Piccoli, 2022; P. Zhu et al., 2021). The advantage of the VAR model is that it can assess the relationships between two or more variables without requiring a large number of control variables, which reduces the data demand. Therefore, this study examines the impact of investor attention and SNS on Chinese commodity futures returns using the following VAR models.
where
To capture changes in volatility, absolute returns were used following Kou et al. (2018), and the effect of the abnormal search volume index (ASVI) on absolute futures returns was specified as follows.
where
It was surmised that extreme individual investor SNS could affect investor behavior, that is, excessive optimistic or pessimistic moods could push investors to look for more relevant information. Therefore, it was necessary to account for the ASVI effect on the commodity futures returns under different SNS levels, for which a set of dummy variables was introduced, as follows:
where
An interaction term was then applied to test whether there was a difference in the ASVI effect on the futures returns (absolute returns) when the past emotion had been excessive, for which the following models in Equations 6 and 7 were developed:
where
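The dummy-variable and interaction-term construction described above can be sketched as follows. This is only an illustration: the cut-offs `low_cut` and `high_cut` are hypothetical values chosen for the example (the thresholds that define "extreme" SNS are not stated in this section), and the function and key names are invented here. The caller is assumed to pass SNS values that are already lagged relative to the ASVI, since the models condition on past sentiment.

```python
def interaction_terms(asvi, lagged_sns, low_cut=0.2, high_cut=0.8):
    """Build extreme-sentiment dummies and ASVI interaction terms.

    asvi       : list of abnormal search volume values
    lagged_sns : list of past SNS scores in [0, 1], aligned with asvi
    low_cut, high_cut : hypothetical thresholds for "extreme" sentiment
    """
    if len(asvi) != len(lagged_sns):
        raise ValueError("asvi and lagged_sns must have the same length")
    rows = []
    for a, s in zip(asvi, lagged_sns):
        d_high = 1 if s >= high_cut else 0  # extremely optimistic period
        d_low = 1 if s <= low_cut else 0    # extremely pessimistic period
        rows.append({
            "asvi": a,
            "d_high": d_high,
            "d_low": d_low,
            "asvi_x_high": a * d_high,  # ASVI effect under high SNS
            "asvi_x_low": a * d_low,    # ASVI effect under low SNS
        })
    return rows


rows = interaction_terms([0.5, -0.2, 0.1], [0.9, 0.5, 0.1])
```

In a regression, the coefficients on `asvi_x_high` and `asvi_x_low` would then measure how the ASVI effect differs in extreme-sentiment periods relative to normal periods.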
Data
The sample comprises 34 Chinese commodity futures from the metal, energy, agricultural product, and chemical sectors, against which three data types were examined: investor attention, SNS, and transaction data.
Investor Attention
Investors’ internet search volume is an appropriate indicator of investor attention. Three of China’s most popular search engines, Baidu, 360, and Sogou, were queried. Similar to the Google SVI, each index is the weighted sum of the search volumes for certain keywords. The names of the futures being examined were entered as search keywords, as these are directly related to pricing.
As quarterly and monthly data could result in seasonal effects and inaccurately portray investor attention, only weekly investor attention data were chosen, as shown in Figure 2a to c. The Baidu index provides search volume index data from January 2011 to July 2020, and the 360 and Sogou indices provide search volume index data from January 2016 to July 2020 (the 360 and Sogou indices have only been available since 2016).

Search volume index time series for 32 commodity futures: (a) Baidu Index, (b) Sogou Index, and (c) 360 Index.
Table 1 provides the descriptive SVI statistics, demonstrating that the mean search index value was significantly different for the different futures markets. Overall, the Baidu and Sogou search index volumes were greater than that of the 360 search index. To ensure that the ASVI was robust to recent jumps, Da et al.’s (2011) methodology was adopted, in which the ASVI is defined as:
where
Summary Statistics for the SVI.
Note. This table presents the sample summary statistics for the weekly Baidu, Sogou, and 360 SVI. Our sample consists of 34 Chinese commodity futures spanning the period January 2016 to July 2020.
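For readers who want to reproduce the attention measure, the ASVI of Da et al. (2011) is the log of the current week's SVI minus the log of the median SVI over the preceding eight weeks. The sketch below follows that definition; the function name and the sample series are illustrative, and it assumes strictly positive SVI values.

```python
import math
from statistics import median


def abnormal_svi(svi, window=8):
    """ASVI_t = log(SVI_t) - log(median(SVI_{t-1}, ..., SVI_{t-window})).

    Observations without a full look-back window are returned as None.
    Assumes every SVI value is strictly positive.
    """
    out = []
    for t, value in enumerate(svi):
        if t < window:
            out.append(None)  # not enough history yet
            continue
        baseline = median(svi[t - window:t])  # prior weeks only
        out.append(math.log(value) - math.log(baseline))
    return out


# Illustrative weekly search volumes (made-up numbers)
weekly_svi = [100, 110, 90, 105, 95, 100, 120, 80, 200, 100]
asvi = abnormal_svi(weekly_svi)
```

Using the median of the look-back window rather than the mean is what keeps the baseline from being pulled up by a single recent spike.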
Social Network Sentiment
Investor SNS was extracted from the Eastmoney Guba Financial Social Network Forum, which is similar to Twitter. Each commodity future has its own forum in which individual investors can express their opinions. Individual investor posts were collected from Eastmoney Guba using web crawler technology, and the SNS in investor opinions was analyzed using the Tencent NLP, which is a deep learning textual analysis program supported by TencentCloud. The TencentCloud NLP deeply integrates Tencent’s top NLP technology and relies on the accumulation of hundreds of billions of Chinese language corpora. The NLP platform provides sentiment analysis and text classification. Therefore, the sentiment analysis function of the NLP platform was used to partially solve the problem of processing text data generated by online users. The SNS scores from the word segments were compiled based on the Tencent dictionary, which includes sentiment scores for Chinese-language words; they were given a numerical score ranging from a strongly positive mood (1.0) to an extremely negative mood (0.0).
Daily investor comments related to all Chinese commodity futures were crawled from January 1, 2011, to July 1, 2020, from which 160 million comments were extracted. The daily, whole commodity futures markets SNS was then constructed using Tencent’s NLP platform. Figure 3a shows the SNS for commodity futures market.

Social network sentiment: (a) SNS for all commodity futures markets and (b) histogram of posts by category.
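As a rough illustration of how per-comment scores could be turned into a daily indicator, the sketch below averages sentiment scores by calendar day. The averaging rule is an assumption made for this example (the aggregation step is not spelled out in the text), the scores are made-up numbers standing in for sentiment platform output on the 0.0 to 1.0 scale, and no call to the Tencent service is shown.

```python
from collections import defaultdict
from datetime import date


def daily_sns(scored_comments):
    """Average per-comment sentiment scores into one value per day.

    scored_comments: iterable of (date, score) pairs, score in [0, 1].
    Returns a dict mapping each date to its mean score, in date order.
    """
    buckets = defaultdict(list)
    for day, score in scored_comments:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range: {score}")
        buckets[day].append(score)
    return {day: sum(s) / len(s) for day, s in sorted(buckets.items())}


# Hypothetical scored comments for one futures forum
comments = [
    (date(2020, 1, 2), 0.9),
    (date(2020, 1, 2), 0.5),
    (date(2020, 1, 3), 0.2),
]
sns_by_day = daily_sns(comments)
```

Days with no comments simply do not appear in the result, which matters for inactive forums.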
Different commodity futures markets have different activity levels, with some being extremely inactive. Therefore, some commodity futures were eliminated from the analysis. Finally, 20 main commodity futures from July 1, 2019 to July 1, 2020 were retained, and each daily SNS was processed using Tencent’s NLP platform. Figure 3b shows the posted volumes for the different commodity futures, from which it is observed that the metal and energy futures attracted the most and least discussions, respectively. Table 2 gives the descriptive SNS statistics; the average network sentiment for some futures varieties was above 0.5, indicating that these varieties were highly speculative. The maximum sentiment for all futures varieties was close to 1, indicating that each variety had experienced periods of abnormally high sentiment.
Summary Statistics for Daily Social Network Sentiment.
Note. This table presents the mean, maximum, minimum, skewness, kurtosis, and Jarque-Bera statistic of the daily SNS for each commodity future. Because some futures forums are inactive, the comment data for those futures are incomplete. In this section, our sample therefore consists of 20 Chinese commodity futures spanning the period January 2019 to July 2020.
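The descriptive statistics named in the table note can be computed as in the following sketch. It assumes population (biased) moments and non-excess kurtosis, with the standard Jarque-Bera form JB = n/6 × (S² + (K − 3)²/4); the paper does not state which convention its software used, and the function name is illustrative.

```python
def summary_stats(x):
    """Mean, max, min, skewness, kurtosis and Jarque-Bera statistic.

    Uses population central moments; kurtosis is not excess kurtosis,
    so a normal distribution has kurtosis 3.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n  # variance
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    if m2 == 0:
        raise ValueError("series is constant")
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    return {"mean": mean, "max": max(x), "min": min(x),
            "skewness": skew, "kurtosis": kurt, "jarque_bera": jb}


stats = summary_stats([1, 2, 3, 4, 5])
```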
Relationship Between SNS and Investor Attention
As shown in Figure 4, the relationship between the commodity futures market SNS and the aggregate search volumes, both raw (SVI) and abnormal (ASVI), is negative.

Association between SNS and the SVI (Baidu/Sogou/360).
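One simple way to quantify such an association is the Pearson correlation coefficient. The sketch below is an illustration only: the paper presents the relationship graphically and does not say which statistic underlies the figure, and the two short series are made-up numbers.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical weekly market SNS and Baidu ASVI values
sns = [0.62, 0.58, 0.55, 0.50, 0.47]
asvi = [-0.10, 0.00, 0.05, 0.12, 0.20]
r = pearson(sns, asvi)  # negative for these made-up series
```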
Commodity Futures Market Data and Summary Statistics
Daily and weekly trading data for the Chinese commodity futures market were used to ensure consistency with the weekly SVI and daily social sentiment. The daily sample period was January 2019 to July 2020, and the weekly sample period was January 2016 to July 2020. The daily and weekly returns for 34 commodity futures were calculated based on logarithmic closing prices. Data were extracted from the Wind Data Feed Service. Table 3 shows the summary statistics of the daily returns for each futures market.
Daily Returns Summary Statistics for Each Futures.
Note. This table presents the mean, maximum, minimum, skewness, kurtosis, and Jarque-Bera statistic of the daily returns for each commodity future. The daily returns span the period 2019 to July 2020.
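Returns based on logarithmic closing prices are conventionally computed as the difference of consecutive log closes, with the absolute return as its magnitude. A minimal sketch under that assumption (the prices are invented for illustration):

```python
import math


def log_returns(closes):
    """r_t = ln(P_t) - ln(P_{t-1}) for consecutive closing prices."""
    if any(p <= 0 for p in closes):
        raise ValueError("closing prices must be positive")
    return [math.log(closes[i] / closes[i - 1])
            for i in range(1, len(closes))]


def absolute_returns(returns):
    """|r_t|, used in the paper as the volatility-type outcome."""
    return [abs(r) for r in returns]


closes = [100.0, 110.0, 99.0]  # hypothetical settlement closes
r = log_returns(closes)
abs_r = absolute_returns(r)
```

Log returns add up across periods, so the sum of the daily values over a week equals the weekly log return computed from the first and last close.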
Empirical Results
The data were subjected to a two-part analysis. First, the impact of the three search volume indices and two social sentiment types on commodity futures returns and absolute returns were analyzed. Then, to account for the potential impact of investor social sentiment on investor attention, the impact of each SVI on the commodity futures returns and absolute returns was examined. Before the data analysis, stationarity was confirmed for all variables in the VAR models using a unit-root test.
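To make the idea of a lagged predictive relationship concrete, the sketch below fits a single equation in which this period's return is regressed on last period's ASVI by ordinary least squares. It is deliberately simplified and is not the paper's specification: a full VAR would include several lags of every variable in each equation and would report t-statistics. The function name and data are illustrative.

```python
def lagged_ols(y, x, lag=1):
    """Fit y_t = intercept + slope * x_{t-lag} by ordinary least squares.

    Returns (intercept, slope). A one-regressor stand-in for a single
    VAR equation, for illustration only.
    """
    if len(y) != len(x) or len(y) <= lag + 1:
        raise ValueError("series too short or of unequal length")
    ys = y[lag:]              # outcomes from period `lag` onward
    xs = x[:len(x) - lag]     # predictor shifted back by `lag`
    n = len(ys)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((v - mx) ** 2 for v in xs)
    if sxx == 0:
        raise ValueError("predictor has no variation")
    sxy = sum((u - mx) * (v - my) for u, v in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


# Synthetic data where y_t = 1 + 2 * x_{t-1} holds exactly
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.0, 3.0, 5.0, 7.0, 9.0]
intercept, slope = lagged_ols(y, x)
```

A positive slope in this setup corresponds to the paper's reading that higher abnormal search volume precedes higher returns.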
Effect of Social Network Attention on Commodity Futures Returns
Table 4 shows the regression results for Model (1) and Model (3). Thirty-four listed and traded varieties in China’s commodity futures market were examined; however, due to space limitations, Table 4 only lists the varieties for which network attention had the most significant impact on returns. Table 4 shows that the ASVI had a significant positive impact on the following week’s returns and absolute returns. These results imply a higher search intensity when futures returns and absolute returns are higher. Notably, the ASVI influence coefficients for the futures returns were smaller than those for the absolute futures returns; that is, the impact of the ASVI on absolute returns is stronger.
Abnormal SVI (ASVI) and Commodity Futures Returns/Absolute Returns.
Note. t-statistics are reported in parentheses under the corresponding coefficients. *p < .1, **p < .05, ***p < .01. The dependent variables in each vector autoregression are the Baidu ASVI, Sogou ASVI, and 360 ASVI; the independent variables are the weekly return and weekly absolute return of the commodity futures market.
The analysis of the metal futures market showed that the Baidu ASVI had the most significant positive impact on gold futures and ironstone futures returns, and a significant impact on the absolute rebar futures yields; the Sogou ASVI had a significant and positive impact on the return of the copper, nickel, and aluminum futures markets. The overall impact of the 360 ASVI on metal futures returns was smaller than that of the Baidu and Sogou ASVIs. The Baidu and 360 ASVIs had significant and positive impacts on the soybean meal, bean oil, rapeseed meal, rapeseed oil, and cotton futures markets returns. However, except for bean oil and cotton, the Sogou ASVI had little impact on the agricultural product futures returns. The Baidu and Sogou indices also had significant positive impacts on all six energy and chemical futures market returns.
While many studies have focused on the Baidu and Google search indices and general stock markets (Y. Zhang, Chu et al., 2021; Y. Zhang & Tao, 2019; Z. Zhu et al., 2020), few studies have focused on commodity futures markets. However, the conclusions in this section are consistent with previous research (Kou et al., 2018; Mišečka et al., 2019), in suggesting a short-run causal effect of attention-driven behavior in these commodity markets. This study examined three search indices and found that they had different effects in different commodity markets. The Baidu ASVI was found to be better than the Sogou and 360 ASVIs in forecasting futures returns and absolute returns. The Sogou and 360 ASVIs had particularly prominent impacts on metal and agricultural futures, respectively.
Effect of SNS on Commodity Futures Returns
De Long et al. (1990) find that traders’ optimism (pessimism) results in a temporary upward (downward) bias in stock prices. To determine the relationship between SNS and commodity futures returns, Equations 2 and 4 were applied, with five chosen as the VAR model’s largest lag because there are five trading days per week. The following subsections examine the relationship between aggregate market SNS and commodity futures returns, the relationship between individual futures market SNS and returns, and the relationship between attention and returns under extreme SNS.
Relationship Between sentmarket and Commodity Futures Returns
The sentiment indicator for the overall commodity futures market (
Commodity Futures Market SNS and Commodity Futures Returns.
Note. t-statistics are reported in parentheses under the corresponding coefficients. *p < .1, **p < .05, ***p < .01. The dependent variables in each vector autoregression are the different lag terms of the aggregate commodity futures market social sentiment; the independent variables are the daily return and daily absolute return of each commodity future.
Model (4) in Table 5 shows the effect of
Relationship Between senti and Individual Commodity Futures Returns
This section examines the effects of individual commodity futures market SNS (
Individual Commodity Futures Market SNS and the Returns.
Note. t-statistics are reported in parentheses under the corresponding coefficients. *p < .1, **p < .05, ***p < .01. The dependent variables in each vector autoregression are the different lag terms of each commodity future’s own social sentiment; the independent variables are the daily return and daily absolute return of each commodity future.
Effect of ASVI on the Commodities Futures Market Returns Under High and Low SNS
This section examines the effect of the ASVI on commodity futures market returns under high and low SNS. Table 7 shows that the interaction terms had both positive and negative effects at the first lag, which suggests that the excessive emotions influenced the ASVI impact direction, and that a higher ASVI under extreme SNS generated uncertain (higher or lower) returns. It was found that SNS was closely related to investor attention; for example, when many investors were actively discussing asset A online, investors reading the online discussion began to pay attention to asset A and search for information about it. This increase in online investor SNS for a certain asset could then lead to increased investor attention toward it.
ASVI Under Extreme Emotions and the Commodity Futures Returns.
Note. t-statistics are reported in parentheses under the corresponding coefficients. *p < .1, **p < .05, ***p < .01. The dependent variable in each vector autoregression is the ASVI under extreme emotions; the independent variables are the daily return and daily absolute return of each commodity future.
The significant coefficients for the ASVI influence under extremely high SNS on absolute returns were negative, implying that during periods of high SNS, the ASVI had a lower impact on futures prices and volatility. This is consistent with the overconfidence and self-attribution shown by uninformed investors. Corredor et al. (2015) suggest that when the market SNS is high, investors become overconfident and pay less attention to related information, which results in a reduction in the volatility of ASVI futures returns. In contrast, under extremely low SNS, the significant effect of the ASVI was positive, with these significant positive coefficients (Table 7) being larger than the corresponding coefficients in Table 4. This suggests that past excess emotions tended to strengthen the ASVI driver function for commodity futures returns (absolute returns). A possible explanation is that when the market SNS is extremely low, investors tend to doubt commodity futures market actions and submit frequent online search queries, which in turn increases the volatility of ASVI futures returns.
Conclusion
This study examines the impact of investor sentiment and investor attention on returns from the highly speculative commodity futures markets in China. Thirty-four commodity futures markets were examined, and the impacts of the SNS on each commodity futures variety were studied using 160 million reviews. Based on the Tencent NLP platform, an innovative NLP machine-learning approach was developed to construct the daily SNS indicators for each commodity. Three commonly used SVIs (Baidu, 360, and Sogou) were used as direct measures of investor attention. It was found that each SVI affected different market types differently.
A higher search intensity was found when there were higher futures returns and higher absolute returns. The ASVI influence coefficients for futures returns were smaller than those for absolute futures returns, that is, the impact of the ASVI on absolute returns was stronger. These results suggest a short-term causal effect of attention-driven behavior in these commodity markets. The Baidu ASVI was found to be better at forecasting futures and absolute returns than the Sogou ASVI and 360 ASVI. The Sogou and 360 ASVIs particularly impacted metal and agricultural futures, respectively; metal commodity futures were the most sensitive to ASVI changes.
The sentiment indicator for the overall commodity futures markets (
As commodity futures prices are mainly affected by fundamental factors such as supply and demand, in a more efficient market the prices of individual commodity futures should be less affected by the overall market SNS. Although the overall SNS of the commodity futures market is unrelated to any particular futures variety or its information, rises and falls in overall investor SNS were still found to significantly affect commodity futures prices. These results confirm that China’s futures market is highly speculative and significantly affected by investor SNS.
An extremely high market SNS was found to change the predicted direction of the significant results, and an extremely low market SNS strengthened the predictive ability of the significant results. Because of the structural characteristics of China’s futures market, these findings could assist investors in making commodity futures market decisions.
From the above empirical results, it is clear that the irrational behavior of retail investors has a significant impact on the Chinese commodity futures markets. The following practical policy implications may help reduce the impact of economic policy uncertainty and negative volatility in these markets. As the Chinese futures markets are highly speculative, individual investors can significantly influence China’s futures market returns. Network investor sentiment and investor attention have a significant impact on Chinese commodity futures market returns. As different search indices have significant positive predictive effects on returns for different types of futures products, regulators could use these three search indices to monitor commodity futures volatility. As the overall commodity futures market network sentiment index was found to positively impact returns, supervisors could monitor overall investor sentiment in the futures market; when investor sentiment is high, investors could be prevented from speculating excessively, which would help control market volatility. However, individual investors can obtain excess returns from these individual market sentiment indicators. Because the investor structure of the Chinese market is unbalanced, investors’ irrational behaviors have a significant impact on asset pricing. When the future of the economy is uncertain and investor sentiment is unstable, market risk can increase. Regulators should monitor the state of investor sentiment in the market intermittently and increase investor education. Regulators must also further enhance the transparency of information disclosure. To further strengthen the price discovery function of the Chinese futures market, it is necessary to improve the investor structure, encourage institutional investors to participate in market transactions, and enhance the information discovery function.
The limitations of this study are as follows. Owing to the limitations of many websites, more online investor statements could not be crawled. Investor SNS was only extracted from the Eastmoney Guba. Therefore, the range of the web data sources was relatively small. We did not further analyze the impact of network investor sentiment and search volume on liquidity either. Future researchers could examine these issues. Future studies could integrate investor discussions from multiple websites to comprehensively measure online investor sentiment and systematically analyze the impact of investor behavior on the microstructure of commodity futures markets.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Wenwen Liu is grateful for the National Natural Science Foundation of China (No.72203173, No.71971175).
