Sage Journals: Discover world-class research

Abstract

This study presents a novel multi-modal approach analyzing public sentiment towards interactive art across Jiangsu Province using social media data. By integrating computer vision and NLP techniques (fine-tuned Qwen-VL for image captioning and prompt-based LLMs for sentiment analysis), we capture nuanced digital representations of artworks and public reactions. Findings reveal complex spatial patterns challenging traditional urban-rural dichotomies and highlighting Jiangsu’s polycentric cultural innovation. Urban centers focus on technological aspects and critical discourse, while peripheral areas emphasize thematic content and audience engagement. Correlation analysis reveals relationships between socioeconomic factors and digital art engagement, reflecting cultural capital and digital divide theories. These patterns invite re-examination of China’s cultural development through postreform urbanization theories, suggesting place-based policies that recognize diverse strengths across the urban-rural continuum. This research contributes to cultural democratization debates and offers a replicable framework for data-driven cultural policy-making.

Keywords

interactive art, social media analytics, public sentiment analysis, multi-modal model, cultural capital

Introduction

China’s rapid urbanization and technological advancement have catalyzed a profound transformation in the cultural landscape, particularly in interactive art. This dynamic environment necessitates innovative approaches to understanding public engagement with and perception of interactive artworks, positioning Chinese cities at the forefront of global cultural studies and policy innovation. As urban populations grow and digital technologies proliferate, understanding public sentiment towards interactive art has become increasingly critical for sustainable cultural development and urban placemaking. Social media platforms, with their vast user bases and rich multi-modal content, offer a unique window into the collective experiences and opinions of art audiences (Yu, 2009). However, leveraging this data to gain meaningful insights into perceptions of interactive art across diverse urban and rural contexts remains a significant challenge (Wan & Li, 2024).

Previous research has explored social media data for cultural analysis, focusing on audience engagement patterns (Xia et al., 2025; Zheng et al., 2014), artistic trends (Cranshaw et al., 2012), and perceived cultural value (Dubey et al., 2016). While these studies have provided valuable insights, they often rely on single-modality data or focus on metropolitan-level analysis, overlooking the nuanced variations in perception across different urban and rural contexts. Moreover, the integration of visual and textual data from social media to comprehensively capture perceptions of interactive art remains underexplored, particularly in Chinese provinces with unique urban-rural dynamics.

To systematically analyze public sentiment towards interactive art, we develop a novel computational framework that integrates multi-modal deep learning with spatial econometrics. Our approach combines fine-tuned vision-language models for contextual image understanding, prompt-engineered Large Language Models (LLMs) for complex sentiment extraction, and advanced spatial statistical methods for socioeconomic pattern discovery. This methodological synthesis enables granular analysis of both visual and textual social media content while accounting for spatial autocorrelation and demographic heterogeneity. The framework incorporates robust validation protocols through expert-annotated ground truth data and cross-modal consistency checks, ensuring reliable insights into cultural perception patterns.

Our regional analysis reveals distinctive spatial gradients in interactive art engagement that challenge conventional urban-rural dichotomies. Core cities demonstrate sophisticated technological discourse but show unexpectedly low community engagement scores, while peripheral regions excel in participatory experiences despite limited infrastructure. These patterns suggest a more nuanced relationship between urbanization and cultural innovation than previously theorized, with implications for cultural policy design.

The socioeconomic determinants analysis uncovers complex non-linear relationships between development indicators and artistic engagement. Educational attainment shows strong threshold effects in technological appreciation (β = .82) until reaching tertiary enrollment rates of 68%, beyond which institutional factors dominate. Cultural facility density demonstrates unexpected negative correlations with certain engagement metrics, suggesting potential oversaturation effects. These findings indicate that cultural capital formation follows distinct pathways in rapidly developing regions (Lamont & Lareau, 1988), necessitating tailored policy approaches that recognize local strengths and limitations. The key contributions of this work are threefold:

A novel multi-modal framework that integrates computer vision, natural language processing, and spatial statistics to extract nuanced insights from social media representations of interactive art.

Evidence of “peripheral advantage” in community-oriented art discourse, revealing inverse relationships between urbanization and participatory cultural innovation.

Quantitative demonstration of cultural capital threshold effects across China’s urban-rural continuum, suggesting the need for differentiated policy approaches in transitional economies.

The remainder of this paper is structured as follows. Section “Related Works” examines emergent approaches in social media analytics and interactive art research. Section “Weibo Interactive Arts Dataset” details our Weibo-based dataset construction and validation protocols. Section “Methods” presents our methodological framework, integrating vision-language models with spatial econometrics. Sections “Regional Results,” “Art Perception Across Social Stratas,” and “Socioeconomic Determinants of Art Engagement” analyze the spatial distribution of interactive art engagement, examining regional variations, socioeconomic correlates, and cultural capital formation mechanisms.

Related Works

Computational Social Media Analysis

The analysis of social media data has emerged as a transformative paradigm for understanding cultural phenomena, enabling researchers to examine public engagement patterns at unprecedented scales (Gandhi et al., 2023). Early computational approaches focused primarily on text-based sentiment analysis, but recent advances have highlighted the need for multi-modal frameworks that can capture the rich interplay between visual and textual expressions in social media discourse.

The evolution of social media analytics has been marked by significant methodological innovations in demographic inference and representativeness. Wang et al. (2019) pioneered multilingual approaches for demographic attribute inference, demonstrating how post-stratification techniques can mitigate sampling biases in social media data. This work was further extended by Kumar and Singh (2022), who developed deep neural architectures for extracting geographical references from bilingual social media content. These advances have been particularly crucial for studying cultural phenomena across diverse linguistic and social contexts. Contemporary research has increasingly focused on the challenges of location inference and contextual understanding, with Lamsal et al. (2022) introducing sophisticated frameworks for origin location identification that achieve notable accuracy across different geographical granularities. These developments have been complemented by innovations in personalized content moderation and the integration of social explanations into explainable AI systems, collectively advancing our ability to analyze complex social media phenomena while addressing crucial issues of bias and representativeness (Gong et al., 2024; Jhaver et al., 2023).

The field has recently witnessed a paradigm shift toward more nuanced approaches for analyzing information diffusion and user engagement patterns. Zhang et al. (2019) demonstrated the critical role of social media in disaster communication, highlighting the importance of understanding network dynamics and information flow patterns. This has been further elaborated by Meel and Vishwakarma (2020), who developed comprehensive frameworks for analyzing information pollution and content authenticity in social networks. The integration of these approaches with visual media analysis techniques, as demonstrated by Rogers (2021), has enabled researchers to develop a more sophisticated understanding of how visual content shapes online discourse and cultural perception. These developments have been particularly significant for studying ephemeral content and temporal dynamics in social media engagement, as exemplified by Villaespesa and Wowkowych’s (2020) analysis of museum-related social media stories.

Interactive Arts and Generative Systems

Interactive art research has evolved through three distinct phases: technological experimentation, participatory design, and socio-spatial analysis. Early studies focused on establishing fundamental frameworks for understanding embodied interaction and audience participation levels (Kohtala et al., 2020). Contemporary research examines the integration of generative systems within interactive installations, revealing complex relationships between algorithmic creativity and human engagement (Epstein et al., 2023). These developments have catalyzed new theoretical frameworks for understanding how interactive systems mediate urban experiences and reshape cultural spaces (Brinkmann et al., 2023), particularly in the context of rapidly evolving digital landscapes.

Recent advances in generative AI have fundamentally transformed the interactive art landscape, introducing novel paradigms for creative expression and audience engagement. Studies have documented enhanced individual creativity through AI-augmented interactive installations (Doshi & Hauser, 2024), while simultaneously raising questions about collective novelty and cultural homogenization. This tension manifests particularly in public reception, where bias against AI-generated elements can paradoxically enhance perceived human creativity (Horton et al., 2023). The emergence of “machine culture” has introduced new modes of cultural transmission and evolution (Brinkmann et al., 2023), leading to sophisticated theoretical frameworks for understanding human-AI creative collaboration (Hertzmann, 2025). These developments suggest a fundamental shift in how interactive art systems function as mediators of cultural experience (Hermann & Puntoni, 2024), necessitating new approaches to studying their impact on public space and collective meaning-making.

Weibo Interactive Arts Dataset

The dataset comprises 15,201 Weibo posts specifically focused on interactive art installations across Jiangsu Province, collected through systematic API queries and manual verification from January to December 2023. The data collection protocol employed rigorous filtering mechanisms, including location-based parameters, contextual relevance scoring, and expert validation, resulting in a high-fidelity corpus of interactive art documentation. Each post was annotated with standardized metadata including geographical coordinates, installation specifications, and interaction modalities. The temporal distribution ensures comprehensive coverage of both permanent installations and temporary exhibitions, while spatial sampling across 13 administrative regions maintains representative coverage of urban-rural variations. Figure 1 presents sample posts illustrating the data quality, and Figure 2 summarizes the overall distribution patterns across interactive art categories.

Figure 1.

Sample Weibo posts from different Jiangsu regions, showcasing the diversity of interactive art content and imagery.

Figure 2.

Dataset statistics showing post types and content characteristics: (a) distribution of text-only and image-containing posts and (b) post length and images per post across categories.

Figure 1 presents ten representative examples of interactive art installations and their corresponding social media descriptions from different regions across Jiangsu Province, illustrating the diversity and sophistication of digital-physical artistic experiences. The installations demonstrate various interactive modalities: from Xuanwu, Nanjing’s immersive digital landscape that transforms traditional mountain-water paintings into dynamic multiverse experiences, to Binhu, Wuxi’s innovative “Smart Flower Landscape” that employs naked-eye 3D technology for participatory floral displays. Notable technological approaches include Suzhou’s sound-to-visual conversion system that translates drum rhythms into dynamic imagery, and Lianyun, Lianyungang’s ecosystem simulation that enables visitors to create virtual porpoises. The examples also showcase culturally-rooted innovations, such as Guangling, Yangzhou’s dialect-based digital flower installation and Jingkou, Zhenjiang’s integration of traditional ink wash techniques with digital river mapping. The installations range from large-scale immersive environments (interactive Song Dynasty exhibition in Haimen, Nantong) to intimate, AI-driven installations (gesture-responsive system in Zhonglou, Changzhou). This diverse collection reflects the province’s sophisticated integration of traditional cultural elements with cutting-edge interactive technologies, demonstrating the evolving landscape of public engagement with digital art.

Figure 2 presents a comprehensive analysis of the interactive art posts dataset. Audience Engagement emerges as the dominant category with 3,952 posts (26%), comprising 2,964 image-containing and 988 text-only posts. Artistic Techniques follows with 3,344 posts (22%), showing a notably high image presence (2,508 posts). Technological Platforms accounts for 3,040 posts (20%), with content length averaging 135 tokens and 1.9 images per post. Thematic Content represents 1,976 posts (13%), displaying the highest average post length of 160 tokens. Exhibition Contexts comprises 1,672 posts (11%), characterized by consistent image inclusion (1.7–2.1 images per post). Artist Profiles shows the smallest share with 1,217 posts (8%), yet demonstrates substantial text content averaging 140 tokens per post. Notably, image-containing posts dominate across all categories, constituting 75% of the total dataset, with post lengths ranging from 100 to 160 tokens.
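The category shares quoted above can be checked against the reported counts with a few lines of arithmetic. The following is a minimal sketch in Python; the dictionary and function names are illustrative and do not come from the study's code, while the counts are the ones reported in the text.

```python
# Category counts as reported in the text for Figure 2.
category_counts = {
    "Audience engagement": 3952,
    "Artistic techniques": 3344,
    "Technological platforms": 3040,
    "Thematic content": 1976,
    "Exhibition contexts": 1672,
    "Artist profiles": 1217,
}


def category_shares(counts):
    """Return each category's share of all posts as a whole-number percentage."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


total_posts = sum(category_counts.values())  # matches the stated dataset size
shares = category_shares(category_counts)

# Share of image-containing posts within Audience Engagement (2,964 of 3,952).
audience_image_share = 2964 / category_counts["Audience engagement"]
```

Under these counts the six categories sum to the stated 15,201 posts, and the rounded shares reproduce the percentages given in the paragraph.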

This distribution reveals significant patterns in public engagement with interactive art on social media. The predominance of Audience Engagement and Artistic Techniques suggests a strong emphasis on participatory and technical aspects of interactive art, while the lower representation of Artist Profiles indicates less focus on creator-centric discourse. The consistent presence of images across categories, particularly in technique-focused posts, reflects the visual-centric nature of interactive art documentation. The varying post lengths across categories, with longer posts in Thematic Content and shorter ones in Exhibition Contexts, suggest that different aspects of interactive art discussion call for different levels of descriptive depth.

Methods

This investigation employs a three-stage analytical pipeline to decode public engagement with interactive art across Jiangsu Province. The framework synthesizes computer vision for artwork interpretation, natural language processing for sentiment extraction, and spatial statistics for socioeconomic pattern discovery, enabling systematic examination of cultural perception formation mechanisms.

Multi-Modal Analysis Framework

The proposed framework, illustrated in Figure 3, integrates image and text analysis to provide a comprehensive understanding of public perceptions towards interactive art. This approach accommodates posts with both images and text, as well as text-only posts, ensuring a robust analysis of diverse social media content across Jiangsu’s regions.

Figure 3.

Framework of the multi-modal interactive art analysis system for examining regional development and digital perceptions of interactive art.

The first step involves topic classification, where image captions are generated for visual content of interactive artworks and combined with post text. This multi-modal input is then processed by a Large Language Model (LLM) using specific prompts to determine the post’s topic within the context of interactive art. For sentiment classification, we utilize only the post text as input to the LLM, yielding positive, negative, or neutral sentiments towards the artwork or experience. This text-centric approach for sentiment analysis ensures consistency across all post types and leverages the nuanced language understanding capabilities of LLMs.
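The way captions and post text are merged into a single LLM input can be sketched as follows. This is a hedged illustration only: the prompt wording, the function names `build_topic_prompt` and `parse_topic`, and the reply-parsing rule are assumptions made for the example, not the prompts used in the study. Only the six topic names are taken from the paper's classification scheme, and no model call is shown.

```python
TOPIC_CATEGORIES = [
    "Artistic techniques",
    "Thematic content",
    "Audience engagement",
    "Technological platforms",
    "Exhibition contexts",
    "Artist profiles",
]


def build_topic_prompt(post_text, captions=()):
    """Combine post text with any generated image captions into one prompt.

    Text-only posts simply pass no captions, so both post types share a path.
    """
    lines = [
        "Classify this Weibo post about interactive art into exactly one topic.",
        "Topics: " + "; ".join(TOPIC_CATEGORIES),
        "Post text: " + post_text.strip(),
    ]
    for i, caption in enumerate(captions, start=1):
        lines.append(f"Image {i} caption: {caption.strip()}")
    lines.append("Answer with the topic name only.")
    return "\n".join(lines)


def parse_topic(reply):
    """Map a free-text model reply onto a known topic, or None if none matches."""
    normalized = reply.strip().lower()
    for category in TOPIC_CATEGORIES:
        if category.lower() in normalized:
            return category
    return None
```

Returning `None` for unrecognized replies keeps malformed model output visible for manual review rather than silently assigning a topic.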

Attention analysis quantifies the frequency of posts for each interactive art topic across regions, visualized through GIS mapping to reveal spatial patterns of engagement with different aspects of interactive art. Similarly, sentiment analysis aggregates emotional valence towards topics by region, providing insights into spatial variations in perceptions of interactive art. These analyses offer a multifaceted view of how different aspects of interactive art are perceived and discussed across Jiangsu’s diverse urban-rural landscape.
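The regional aggregation described here amounts to grouped counting. The sketch below assumes each classified post is a dictionary with `region`, `topic`, and `sentiment` keys, and it normalizes attention as the share of a region's posts per topic. Both the record layout and the normalization are assumptions for illustration, since the text does not specify them.

```python
from collections import Counter, defaultdict


def attention_by_region(posts):
    """Share of each region's posts that fall under each topic."""
    counts = defaultdict(Counter)
    for post in posts:
        counts[post["region"]][post["topic"]] += 1
    return {
        region: {topic: n / sum(topics.values()) for topic, n in topics.items()}
        for region, topics in counts.items()
    }


def positive_share_by_region(posts):
    """Fraction of positive posts for each (region, topic) pair."""
    totals, positives = Counter(), Counter()
    for post in posts:
        key = (post["region"], post["topic"])
        totals[key] += 1
        positives[key] += post["sentiment"] == "positive"
    return {key: positives[key] / totals[key] for key in totals}


# Toy records for illustration only; these are not data from the study.
sample_posts = [
    {"region": "Nanjing", "topic": "Artistic techniques", "sentiment": "positive"},
    {"region": "Nanjing", "topic": "Artistic techniques", "sentiment": "negative"},
    {"region": "Nanjing", "topic": "Exhibition contexts", "sentiment": "positive"},
    {"region": "Suqian", "topic": "Audience engagement", "sentiment": "positive"},
]
attention = attention_by_region(sample_posts)
positive_share = positive_share_by_region(sample_posts)
```

The resulting per-region values are the kind of quantity that would then be joined to administrative boundaries for GIS mapping.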

The final step employs correlation analysis to uncover relationships between socioeconomic factors and both the attention and sentiment towards various interactive art topics. This integrative approach bridges quantitative social media analysis with traditional socioeconomic indicators, offering a novel perspective on the interplay between the two.
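As an illustration of this step, the sketch below computes a Pearson correlation between one regional socioeconomic indicator and one regional attention score. The choice of Pearson's coefficient is an assumption, since the text does not name the statistic used, and the function name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

In use, `xs` would hold one indicator value per region and `ys` the matching attention or sentiment score, giving one coefficient per indicator-topic pair.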

LLM-Based Image Captioning for Interactive Art

To extract meaningful information from the visual content of Weibo posts related to interactive art, we employ an advanced image captioning approach leveraging Large Language Models (LLMs). This method allows us to generate descriptive captions that capture the salient features and context of each interactive artwork, providing a textual representation that can be seamlessly integrated with the natural language processing pipeline.

The LLM-based image captioning system begins with feature extraction using a pretrained Convolutional Neural Network (CNN), specifically a ResNet-152 architecture (He et al., 2016). These visual features are then encoded into a format compatible with the input requirements of our LLM, which is a fine-tuned Qwen-VL (Young et al., 2024). The LLM takes the encoded visual features as input and generates a descriptive caption through an autoregressive process. To enhance the relevance and specificity of the captions to our interactive art context, we fine-tune the LLM on a large corpus of image-caption pairs from interactive art installations and exhibitions. This approach allows the model to generate captions that are not only descriptive of the visual content but also sensitive to the specific characteristics and contexts of interactive artworks.

Figure 4 presents a demonstration of the image captioning results for various interactive art scenes in Jiangsu. The examples showcase the model’s ability to capture diverse elements of interactive artworks, from large-scale installations to participatory experiences and technologically-driven pieces.

Figure 4.

Demonstration of image captioning results for interactive art scenes in Jiangsu.

To evaluate the task-specific image captioning system, we compared the base and fine-tuned models on standard metrics: BLEU-4 (Papineni et al., 2002), METEOR (Banerjee & Lavie, 2005), CIDEr (Vedantam et al., 2015), and SPICE (Anderson et al., 2016). The results are shown in Figure 5.

Figure 5.

Performance comparison between base and fine-tuned models: (a) evaluation results on standard image captioning metrics and (b) model performance across six dimensions.

Figure 5a demonstrates model performance across established metrics, with CIDEr increasing from 0.998 to 1.025 and BLEU-4 from 0.321 to 0.344. These substantial improvements (CIDEr +0.027, BLEU-4 +0.023) indicate enhanced semantic understanding and structural coherence in image captioning. Figure 5b reveals distinct categorical performance through radar visualization. The base model exhibits bias toward technological platforms (0.85) while underperforming in artistic techniques (0.68) and artist profiles (0.65). Fine-tuning yields substantial improvements in weaker areas (+0.15 in artistic techniques, +0.19 in artist profiles) while maintaining technological strengths (0.89), demonstrating effective rebalancing of model capabilities.
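For context, the absolute gains reported for Figure 5a can also be expressed relative to the base model. The short sketch below does this arithmetic; the helper name `relative_gain` is illustrative, and the scores are the ones quoted in the text.

```python
def relative_gain(base, tuned):
    """Improvement of the fine-tuned score as a fraction of the base score."""
    return (tuned - base) / base


# Scores quoted in the text for Figure 5a.
cider_gain = relative_gain(0.998, 1.025)  # roughly 2.7% over the base model
bleu4_gain = relative_gain(0.321, 0.344)  # roughly 7.2% over the base model
```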

Interactive Art Posts Analysis with LLMs

To extract meaningful insights from Weibo posts about interactive art, we employed Large Language Models (LLMs) for topic classification and sentiment analysis. This approach leverages LLMs’ advanced natural language understanding capabilities to categorize posts into predefined topics related to interactive art and assess their sentiment. The LLM-based method allows for nuanced interpretation of complex language patterns, idioms, and context-specific meanings, providing an accurate representation of public perceptions expressed in social media posts about interactive art.

Table 1 presents the topic classification scheme used for categorizing interactive art-related posts. This comprehensive framework covers various aspects of interactive art, from artistic techniques and thematic content to audience engagement and critical discourse. Sentiment analysis was performed with the LLM classifying posts into positive, neutral, or negative sentiments towards the interactive artwork or experience. To enhance performance, we implemented a prompt-based method, providing the model with examples of sentiment classification in interactive arts before processing each post. Table 2 presents examples of prompt results for topic and sentiment classification related to interactive art across different regions of Jiangsu Province. To evaluate the performance of our LLM-based approach, we compared it with several baseline methods, including advanced LLMs. Figure 6 presents the accuracy results for topic and sentiment classification.
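The prompt-based sentiment step can be sketched as a few-shot prompt builder plus a reply parser. This is an illustrative outline under stated assumptions: the instruction wording, the function names, and the fallback to a neutral label are hypothetical choices, not the study's actual prompts. The two in-context examples reuse post excerpts from Table 2, and no model call is shown.

```python
LABELS = ("positive", "neutral", "negative")

# In-context examples; the excerpts are taken from Table 2 of the paper.
FEW_SHOT_EXAMPLES = [
    ("The AR exhibition keeps crashing every few minutes, "
     "completely ruining the experience.", "negative"),
    ("Incredible to see dozens of strangers creating art together "
     "on the interactive projection wall.", "positive"),
]


def build_sentiment_prompt(post_text, examples=FEW_SHOT_EXAMPLES):
    """Prepend labeled examples to the post so the model sees the task format."""
    parts = [
        "Label the sentiment of each post about interactive art "
        "as positive, neutral, or negative."
    ]
    for text, label in examples:
        parts.append(f"Post: {text}\nSentiment: {label}")
    parts.append(f"Post: {post_text.strip()}\nSentiment:")
    return "\n\n".join(parts)


def parse_sentiment(reply, default="neutral"):
    """Read the first word of the reply as a label; fall back to `default`."""
    words = reply.strip().lower().split()
    first = words[0].strip(".,!:;") if words else ""
    return first if first in LABELS else default
```

Falling back to a neutral label is one possible design choice; flagging unparseable replies for manual review would be an equally reasonable alternative.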

Table 1.

Topic Classification on Interactive Art Social Media.

Category	Subcategory	Topics
Artistic techniques	Digital interactivity	Touchscreens, motion sensors, VR/AR integration
Artistic techniques	Physical interactivity	Kinetic sculptures, interactive installations
Artistic techniques	Audiovisual elements	Sound reactive art, light installations
Artistic techniques	Data-driven art	Data visualization, AI-generated art
Thematic content	Social commentary	Political art, environmental awareness
Thematic content	Personal expression	Emotion-driven interactions, biographical works
Thematic content	Scientific exploration	STEM-inspired art, bioart
Thematic content	Cultural identity	Heritage-based interactions, community-specific art
Audience engagement	Participation levels	Passive viewing, active manipulation
Audience engagement	Collaborative creation	Multi-user experiences, crowdsourced art
Audience engagement	Learning experiences	Educational interactives, skill-building art
Audience engagement	Emotional responses	Empathy-inducing works, mood-altering installations
Technological platforms	Web-based art	Browser interactives, net art
Technological platforms	Mobile applications	AR apps, interactive art games
Technological platforms	Installation technologies	Projection mapping, responsive environments
Technological platforms	Emerging tech	AI/ML art, blockchain-based interactives
Exhibition contexts	Gallery spaces	Museum installations, art fair presentations
Exhibition contexts	Public art	Urban interactives, community engagement projects
Exhibition contexts	Digital platforms	Online exhibitions, virtual galleries
Exhibition contexts	Festivals and events	Media art festivals, interactive performances
Artist profiles	Established artists	Career retrospectives, new works by known artists
Artist profiles	Emerging talents	Student projects, up-and-coming artists
Artist profiles	Collaborative teams	Interdisciplinary groups, artist-engineer partnerships
Artist profiles	AI as artist	AI-generated art, human-AI collaborations

Table 2.

LLM Results for Topic and Sentiment Classification of Interactive Art Posts Across Jiangsu Regions.

Region	Post excerpt	Topic	Sentiment
Nanjing	The VR installation at Nanjing Museum brings ancient pottery to life in ways I never imagined possible.	Artistic techniques	Positive
Suzhou	The interactive water-light show at Jinji Lake perfectly captures the essence of Suzhou’s heritage.	Thematic content	Positive
Wuxi	Incredible to see dozens of strangers creating art together on the interactive projection wall.	Audience engagement	Positive
Changzhou	The AR exhibition keeps crashing every few minutes, completely ruining the experience.	Technological platforms	Negative
Nantong	The digital art showcase brilliantly highlights the evolution of our local artists.	Exhibition contexts	Positive
Yangzhou	The collaboration between traditional master and young digital artists creates something truly innovative.	Artist profiles	Positive
Zhenjiang	The motion sensors completely fail to capture any subtle movements in the interactive display.	Artistic techniques	Negative
Taizhou	Our whole neighborhood came together to create this amazing projection mapping experience.	Audience engagement	Positive
Suqian	Spent more time figuring out how to use the interface than experiencing the actual art.	Technological platforms	Negative
Huai’an	Poor exhibition layout makes it impossible to properly interact with any of the installations.	Exhibition contexts	Negative
Yancheng	The interactive wetland installation powerfully conveys our local environmental challenges.	Thematic content	Positive

Figure 6.

Performance comparison of LLM models for interactive art analysis: (a) topic classification performance and (b) sentiment classification performance.

Figure 6 presents the comparative performance of LLMs and baseline methods in classifying interactive art-related social media content. Our analysis reveals that the fine-tuned LLM achieves superior performance across both topic classification (accuracy: 0.823 ± 0.012) and sentiment analysis (F1-score: 0.818 ± 0.015), outperforming conventional models including BERT (+11.2%), RoBERTa (+7.8%), and DeBERTa (+4.5%). These results validate the effectiveness of our domain-specific fine-tuning strategy and prompt engineering framework in capturing the complex semantics of interactive art discourse.
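For reference, the two reported metrics can be computed from expert labels and model predictions as in the sketch below. Macro-averaging the F1-score over classes is an assumption here, as the text does not state which averaging was used, and the function names are illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging, an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Macro averaging weights the three sentiment classes equally, which matters when neutral or negative posts are much rarer than positive ones.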

Regional Results

The analysis of social media data across Jiangsu’s regions reveals distinct patterns of public perception and engagement with interactive art. This section presents the spatial distribution of attention and sentiment towards various dimensions of interactive art, highlighting the complex interplay between socioeconomic factors and digital representations of artistic experiences.

Social Attention on Interactive Art

The attention analysis across Jiangsu’s regions reveals distinctive patterns in how different aspects of interactive art are perceived and emphasized in social media discourse. These patterns offer insights into the varied characteristics and cultural development trajectories of Jiangsu’s diverse urban-rural landscape.

In Figure 7, the spatial distribution of interactive art attention exhibits pronounced polycentric characteristics across Jiangsu Province, with distinct gradients in technological and artistic dimensions. Notably, core urban districts demonstrate 25% to 35% higher attention intensities in digital interactivity (Mean = 82.3) and emerging technologies (Mean = 78.9) compared to peripheral regions. This pattern aligns with established theories of innovation diffusion in cultural geography (Florida, 2003) yet reveals anomalous clusters of high attention (>75%) in second-tier cities like Wuxi and Changzhou, particularly in collaborative creation and audiovisual elements, suggesting the emergence of specialized cultural innovation nodes outside traditional centers.

Figure 7.

Attention analysis of interactive art across various dimensions in Jiangsu regions: (a) artistic techniques, (b) thematic content, (c) audience engagement, (d) technological platforms, (e) exhibition contexts, and (f) artist profiles.

Analysis of thematic content and audience engagement dimensions reveals a compelling inverse relationship between urbanization levels and community-oriented art attention. Rural and emerging urban districts display unexpectedly high attention scores in participatory experiences (Mean = 68.7) and cultural identity themes (Mean = 72.4), contradicting conventional center-periphery models of cultural innovation. This “peripheral advantage” phenomenon is particularly evident in traditional cultural centers like Yangzhou (82.3) and emerging zones like Yancheng (78.5), where deep-rooted cultural capital appears to transcend economic development metrics. The pattern suggests a nuanced interplay between cultural heritage preservation and interactive art engagement that challenges dominant narratives of urban cultural supremacy.

Exhibition contexts and artist profiles demonstrate highly localized attention clusters (Moran’s I = 0.723, p < .01) strongly correlated with institutional presence. Major cultural hubs exhibit distinct attention peaks (85%–95%) surrounded by sharp gradients, creating “attention islands” that imply strong institutional effects on public engagement patterns.
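The clustering statistic cited above, global Moran's I, can be written as I = (n / W) · Σᵢⱼ wᵢⱼ (xᵢ − x̄)(xⱼ − x̄) / Σᵢ (xᵢ − x̄)², where W is the sum of all spatial weights. The sketch below is a minimal pure-Python version; the binary adjacency weights and the toy values are assumptions for illustration and are not the study's spatial weights or data.

```python
def morans_i(values, weights):
    """Global Moran's I for values at n locations and an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * cross / variance_sum


# Four hypothetical districts in a row; neighbours share a border.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 2, 3, 4], chain_weights)    # similar values adjacent
alternating = morans_i([1, 4, 1, 4], chain_weights)  # dissimilar values adjacent
```

Positive values indicate that similar attention scores sit next to each other, as in the clustered toy case, while negative values indicate a checkerboard-like pattern.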

Table 3 reveals distinctive spatial patterns in interactive art engagement across Jiangsu’s regions. Major urban centers demonstrate pronounced attention to technological sophistication, with Nanjing emphasizing digital interactivity and installation technologies, while Suzhou balances cultural identity with physical interactivity. Second-tier cities exhibit hybrid engagement patterns: Changzhou and Wuxi leverage emerging technologies while maintaining strong audience participation. This urban-rural gradient manifests through decreasing technological complexity and increasing community engagement in peripheral regions, exemplified by Suqian’s focus on basic participation levels and digital platforms. Notably, cultural identity themes remain prominent in historically significant cities like Yangzhou, suggesting the persistence of traditional cultural capital despite technological disparities. These patterns reflect broader socio-spatial dynamics of cultural innovation diffusion, where technological sophistication correlates with urban development while community engagement and cultural preservation emerge as dominant themes in less urbanized areas.

Table 3.

Top Three Attention Areas for Interactive Art in Each Jiangsu Region.

Region	Category	Subcategory	Topics
Nanjing	Artistic techniques	Digital interactivity	Touchscreens, motion sensors, VR/AR integration
Nanjing	Technological platforms	Installation technologies	Projection mapping, responsive environments
Nanjing	Exhibition contexts	Gallery spaces	Museum installations, art fair presentations
Suzhou	Thematic content	Cultural identity	Heritage-based interactions, community-specific art
Suzhou	Artistic techniques	Physical interactivity	Kinetic sculptures, interactive installations
Suzhou	Audience engagement	Participation levels	Passive viewing, active manipulation
Wuxi	Technological platforms	Emerging tech	AI/ML art, blockchain-based interactives
Wuxi	Artist profiles	Collaborative teams	Interdisciplinary groups, artist-engineer partnerships
Wuxi	Exhibition contexts	Festivals and events	Media art festivals, interactive performances
Changzhou	Audience engagement	Collaborative creation	Multi-user experiences, crowdsourced art
Changzhou	Artistic techniques	Audiovisual elements	Sound reactive art, light installations
Changzhou	Thematic content	Scientific exploration	STEM-inspired art, bioart
Nantong	Technological platforms	Mobile applications	AR apps, interactive art games
Nantong	Artist profiles	Emerging talents	Student projects, up-and-coming artists
Nantong	Exhibition contexts	Public art	Urban interactives, community engagement projects
Yangzhou	Thematic content	Cultural identity	Heritage-based interactions, community-specific art
Yangzhou	Artistic techniques	Physical interactivity	Kinetic sculptures, interactive installations
Yangzhou	Exhibition contexts	Public art	Urban interactives, community engagement projects
Zhenjiang	Artist profiles	Collaborative teams	Interdisciplinary groups, artist-engineer partnerships
Zhenjiang	Audience engagement	Learning experiences	Educational interactives, skill-building art
Zhenjiang	Technological platforms	Web-based art	Browser interactives, net art
Taizhou	Artistic techniques	Audiovisual elements	Sound reactive art, light installations
Taizhou	Thematic content	Social commentary	Political art, environmental awareness
Taizhou	Audience engagement	Emotional responses	Empathy-inducing works, mood-altering installations
Suqian	Audience engagement	Participation levels	Passive viewing, active manipulation
Suqian	Exhibition contexts	Digital platforms	Online exhibitions, virtual galleries
Suqian	Artist profiles	AI as artist	AI-generated art, human-AI collaborations
Huai’an	Technological platforms	Web-based art	Browser interactives, net art
Huai’an	Thematic content	Social commentary	Political art, environmental awareness
Huai’an	Exhibition contexts	Digital platforms	Online exhibitions, virtual galleries
Yancheng	Exhibition contexts	Public art	Urban interactives, community engagement projects
Yancheng	Audience engagement	Collaborative creation	Multi-user experiences, crowdsourced art
Yancheng	Thematic content	Personal expression	Emotion-driven interactions, biographical works

Public Sentiment on Interactive Art

The analysis of public sentiment towards interactive art across Jiangsu’s regions reveals intricate patterns of perception, offering valuable insights into the province’s socio-spatial dynamics of cultural engagement. By leveraging multi-modal social media data, we uncover nuanced variations in residents’ attitudes towards key dimensions of interactive art, reflecting the complex interplay between technological innovation, cultural traditions, and socioeconomic factors.

Figure 8 presents a comprehensive visualization of sentiment analysis across six key interactive art dimensions in Jiangsu’s regions. A striking spatial hierarchy emerges in the sentiment distributions, particularly in the artistic techniques and technological platforms dimensions (Figure 8a and d). Metropolitan cores exhibit markedly higher positive sentiment (75%–90%) toward digital and technological elements, while sentiment intensity diminishes along a clear gradient toward peripheral regions (45%–60%). This pattern reveals an intriguing phenomenon: the technological appreciation of interactive art appears to follow classic distance decay principles, reminiscent of Tobler’s First Law of Geography, but with notable anomalies in second-tier cities like Changzhou and Wuxi, which demonstrate unexpectedly high positive sentiment clusters (65%–80%) that disrupt the continuous spatial decay.

Figure 8.

Sentiment analysis across various interactive art dimensions in Jiangsu regions: (a) artistic techniques, (b) thematic content, (c) audience engagement, (d) technological platforms, (e) exhibition contexts, and (f) artist profiles.
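To make the reported percentages concrete, the following minimal sketch shows one way a positive-sentiment share per region and dimension could be computed from labelled posts. It assumes that each percentage is the share of posts labelled positive, which the text does not state explicitly; the `posts` records and the `positive_share` helper are hypothetical illustrations, not the study's data or pipeline.

```python
from collections import defaultdict

# Hypothetical labelled posts: (region, dimension, sentiment label).
# These rows are illustrative only and are not taken from the study's dataset.
posts = [
    ("Nanjing", "technological platforms", "positive"),
    ("Nanjing", "technological platforms", "positive"),
    ("Nanjing", "technological platforms", "negative"),
    ("Nanjing", "technological platforms", "positive"),
    ("Suqian", "technological platforms", "positive"),
    ("Suqian", "technological platforms", "neutral"),
]


def positive_share(rows):
    """Return {(region, dimension): percentage of posts labelled positive}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for region, dimension, label in rows:
        key = (region, dimension)
        totals[key] += 1
        if label == "positive":
            positives[key] += 1
    return {key: 100.0 * positives[key] / totals[key] for key in totals}


shares = positive_share(posts)
for (region, dimension), pct in sorted(shares.items()):
    print(f"{region:8s} {dimension:25s} {pct:5.1f}% positive")
```

Mapping such shares onto regional polygons would then give choropleth panels of the kind described for Figure 8.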

The thematic content and audience engagement dimensions (Figure 8b and c) reveal a more complex pattern that challenges conventional center-periphery models. We observe an inverse relationship between urbanization levels and sentiment positivity, with rural and semi-urban regions displaying notably higher positive sentiment (70%–85%) toward participatory and community-oriented aspects of interactive art. This “peripheral advantage” in engagement sentiment manifests most strongly in traditional cultural centers like Yangzhou (82%) and emerging cultural zones like Yancheng (78%), suggesting the presence of deeply embedded cultural capital that transcends economic development metrics. The pattern points to a nuanced interplay between cultural heritage, community cohesion, and artistic reception that warrants reconsideration of standard cultural diffusion models.

Exhibition contexts and artist profiles (Figure 8e and f) demonstrate highly localized sentiment clusters that appear to correlate strongly with institutional presence and cultural infrastructure. Major cultural hubs like Nanjing and Suzhou show distinct positive sentiment peaks (85%–95%) surrounded by sharp gradients, creating “sentiment islands” that suggest the presence of strong institutional effects on public perception. This spatial configuration implies that sentiment toward curatorial and professional aspects of interactive art may be more sensitive to formal cultural infrastructure than previously theorized, highlighting the need for more sophisticated models of cultural sentiment diffusion in rapidly developing regions.

Art Perception Across Social Strata

The regional socioeconomic indicators examined in this study were systematically compiled from authoritative sources, including the Jiangsu Provincial Bureau of Statistics (2024), provincial bureau reports, and cultural administrative records. These data encompass average income, educational attainment, digital infrastructure penetration, and cultural expenditure metrics across Jiangsu’s diverse regions. The comprehensive dataset provides a robust foundation for examining the intricate relationships between socioeconomic conditions and digital perceptions of interactive art, enabling nuanced analysis of spatial variations in cultural engagement patterns.

To systematically examine the relationships between socioeconomic factors and digital perceptions of interactive art, we constructed a correlation matrix. Figure 9 visualizes these interactions through color gradients and scatter plot overlays, and reveals distinctive clustering patterns that challenge conventional assumptions about the relationship between urban development and cultural engagement (Bourdieu, 2018). Particularly noteworthy is the strong positive correlation (r = .82) between education levels and attention to artistic techniques, suggesting that formal learning may play a more crucial role in technical art appreciation than previously theorized. Intriguingly, population density shows only moderate correlations (r = .45 to .65) with technological platform engagement, indicating that urban concentration alone may not determine digital art participation patterns as strongly as often assumed.

Figure 9.

Correlation matrix of art-related social media patterns.
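As an illustration of how a correlation matrix of this kind is built, the sketch below computes pairwise Pearson coefficients between region-level indicators. It assumes Pearson's r, which the reported r values suggest but the text does not confirm. The variable names and numbers in `regions` are hypothetical placeholders, not the study's data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Return a nested dict with r for every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical region-level indicators (one value per region), for illustration only.
regions = {
    "education_years": [9.0, 10.0, 11.0, 12.0, 13.0],
    "attention_artistic_techniques": [0.20, 0.26, 0.29, 0.36, 0.39],
    "attention_thematic_content": [0.40, 0.37, 0.33, 0.30, 0.24],
}

matrix = correlation_matrix(regions)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

With only a small number of regions, each coefficient rests on few observations, so values from a matrix like this are best read as descriptive rather than inferential.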

More complex relationships emerge between economic indicators and sentiment patterns. Average income demonstrates remarkably strong correlations with technological platform engagement (r = .95) and exhibition context appreciation (r = .78), yet shows an unexpected negative correlation with thematic content engagement (r = −.65). This inverse relationship challenges linear models of cultural capital accumulation, suggesting that economic development may actually dampen certain forms of artistic engagement. Cultural venue density exhibits moderate to strong positive correlations across most dimensions (r = .55 to .85), except for audience engagement, where the correlation is notably weaker (r = .45), hinting at the potential limitations of institutional infrastructure in fostering participatory art experiences.

The intricate web of correlations illuminates several theoretically significant patterns in the spatial distribution of interactive art engagement. First, the strong correlation triad between education, artistic techniques, and artist profiles (all r > .90) suggests the emergence of what might be termed “technical-aesthetic clusters” in regions with high educational attainment. Second, the moderate negative correlation between population density and nature-themed interactive installations (r = −.58) points to a possible “urban nature deficit” phenomenon in art appreciation. Third, the varying strength of correlations between digital accessibility and different art dimensions (r = .38 to .85) implies a more nuanced relationship between technological infrastructure and cultural engagement than previously recognized. These patterns collectively suggest that the relationship between socioeconomic development and interactive art engagement follows non-linear pathways, mediated by complex interactions between institutional presence, educational resources, and community characteristics. This finding has significant implications for cultural policy development in rapidly urbanizing regions, suggesting the need for more sophisticated, context-sensitive approaches to fostering interactive art engagement.

Socioeconomic Determinants of Art Engagement

This section leverages Shapley value analysis to decompose the complex relationships between socioeconomic factors and interactive art engagement patterns. By applying ensemble machine learning models to our multi-modal dataset, we quantify the relative importance of various predictors while accounting for their nonlinear interactions. This approach reveals how different socioeconomic variables contribute to both attention distribution and sentiment formation across Jiangsu’s diverse regions, offering insights into the mechanisms of cultural capital accumulation in rapidly developing urban-rural systems.

Determinants of Public Attention

To systematically evaluate the complex interactions between socioeconomic factors and interactive art engagement, we employed SHAP (SHapley Additive exPlanations) analysis across multiple machine learning models. Our approach integrated Random Forest, XGBoost, LightGBM, and CatBoost algorithms, each trained on the comprehensive dataset of social media interactions and regional socioeconomic indicators. The Shapley values, derived from cooperative game theory, provide a mathematically rigorous framework for attributing the contribution of each predictor variable to model outcomes. This is particularly crucial in our context, where traditional linear correlation analysis may obscure subtle interaction effects between development metrics and cultural engagement patterns.
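The attribution idea behind Shapley values can be sketched by exact enumeration on a toy model. The example below is not the SHAP library and not the tree-ensemble setup described above. It is a brute-force illustration under stated assumptions: absent features are replaced by a baseline value, and `toy_attention_model`, its coefficients, and the feature names are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley attribution of predict(instance) - predict(baseline)."""
    features = list(instance)
    n = len(features)

    def value(coalition):
        # Features in the coalition take the instance's value; the rest stay at baseline.
        mixed = {f: (instance[f] if f in coalition else baseline[f]) for f in features}
        return predict(mixed)

    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = set(coalition)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_attention_model(x):
    # Hypothetical model with an education x connectivity interaction term.
    return 0.5 * x["education"] + 0.2 * x["income"] + 0.3 * x["education"] * x["connectivity"]


baseline = {"education": 0.0, "income": 0.0, "connectivity": 0.0}
region = {"education": 1.0, "income": 1.0, "connectivity": 1.0}

phi = shapley_values(toy_attention_model, region, baseline)
print({name: round(v, 3) for name, v in phi.items()})
```

In this toy case the interaction term is split evenly between education and connectivity, and the attributions sum to the difference between the prediction for the region and the prediction at the baseline. Exact enumeration scales exponentially with the number of features, which is why approximations or tree-specific algorithms are normally used in practice.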

In Figure 10, the Shapley analysis reveals striking variations in the predictive power of socioeconomic variables across attention dimensions. Education level emerges as the dominant predictor for artistic techniques (0.82) and technological platforms (0.78), while household income shows the strongest influence on exhibition contexts (0.75). Notably, cultural facilities density demonstrates unexpectedly high predictive power for audience engagement (0.68) and thematic content (0.65), challenging conventional assumptions. Digital connectivity exhibits moderate but consistent effects across the other dimensions (0.45–0.55), with peak influence in technological platforms (0.62). Population density shows the most variable predictive power, ranging from 0.35 for artist profiles to 0.72 for exhibition contexts, suggesting complex spatial dynamics in cultural attention patterns.

Figure 10.

SHAP-based attention analysis in Jiangsu regions: (a) artistic techniques, (b) thematic content, (c) audience engagement, (d) technological platforms, (e) exhibition contexts, and (f) artist profiles.

The radial distribution patterns in Figure 10 illuminate several theoretically significant phenomena in the spatial organization of cultural attention. Most striking is the emergence of what we term “educational-technological coupling”: a distinctive pattern in which educational attainment and technological infrastructure demonstrate synchronized predictive power across multiple attention dimensions. This coupling effect is particularly pronounced in artistic techniques and technological platforms, suggesting the presence of self-reinforcing knowledge-innovation cycles in certain regions. The asymmetric influence of cultural facilities density reveals an intriguing “institutional resonance effect” where physical cultural infrastructure appears to amplify attention patterns beyond its immediate spatial context. Perhaps most surprising is the discovery of “peripheral resilience zones” where relatively lower socioeconomic indicators nevertheless generate robust attention patterns, particularly in thematic content and audience engagement. These findings challenge deterministic models of cultural development and suggest the presence of more nuanced, non-linear relationships between regional development and cultural attention dynamics.

Sentiment Formation Mechanisms

Public sentiment toward interactive art reveals complex spatial patterns that defy conventional socioeconomic gradients. Our analysis uncovers distinct threshold effects where positive sentiment peaks at moderate, rather than maximum, development levels. Remarkably, regions with comparable infrastructure and economic indicators often generate contrasting emotional responses, particularly in technological appreciation and community engagement dimensions. This phenomenon suggests that sentiment formation operates through subtle cultural mechanisms beyond standard development metrics, warranting a closer examination of how institutional presence and local traditions collectively shape artistic reception across Jiangsu’s diverse cultural landscape.

Building upon the attention patterns revealed above, we further examine the sentiment formation mechanisms through Shapley value analysis. Figure 11 maps the differential impact of socioeconomic factors on interactive art sentiment across six dimensions, highlighting several key anomalies.

Figure 11.

SHAP-based sentiment analysis in Jiangsu regions: (a) artistic techniques, (b) thematic content, (c) audience engagement, (d) technological platforms, (e) exhibition contexts, and (f) artist profiles.

In Figure 11, the sentiment Shapley patterns reveal an intriguing inversion of conventional socioeconomic predictors. While education maintains strong predictive power for technological platforms (0.85) and artistic techniques (0.79), its influence on audience engagement sentiment is unexpectedly negative (−0.45). Cultural facilities density emerges as the dominant predictor for exhibition contexts (0.82) and thematic content (0.77), suggesting that institutional presence may play a more crucial role in shaping positive sentiment than previously theorized.

The analysis reveals distinct threshold effects in sentiment formation across Jiangsu’s regions. Most notably, the predictive power of household income exhibits a clear plateau effect around moderate development levels, particularly for the artistic techniques and thematic content dimensions. This challenges conventional linear models of cultural development (DiMaggio & Powell, 1983). We observe that positive sentiment towards interactive art often peaks in areas with balanced distributions of educational and cultural infrastructure, rather than in regions with maximum development metrics. These patterns suggest a more complex relationship between institutional frameworks and public sentiment, where moderate levels of multiple factors may optimize cultural reception. The asymmetric distribution of predictive strength across dimensions further indicates that sentiment formation follows distinct pathways from attention patterns, potentially reflecting deeper sociocultural dynamics beyond simple socioeconomic determinism.

Limitations

The dataset’s geographic granularity, while enabling precise regional comparisons, may overlook nuanced intra-city variations in art engagement patterns. Platform-specific biases in user demographics could influence perceived attention distributions across socioeconomic groups. The proposed multi-modal framework prioritizes explicit sentiment expressions, potentially underrepresenting implicit cultural perceptions embedded in artistic discourse. The image captioning model, despite domain adaptation, occasionally simplifies complex interactive elements requiring specialized art knowledge.

Conclusion

This study pioneers a multi-modal spatial analysis framework for decoding public engagement with interactive art through social media, integrating computer vision, natural language processing, and spatial econometrics. Our methodology advances cultural analytics by capturing both explicit expressions and implicit patterns across visual-textual modalities, while establishing rigorous validation protocols for social media-derived cultural indicators. The empirical focus on Jiangsu Province provides a critical testbed for examining cultural dynamics in transitional urban-rural systems.

Three key findings emerge from our spatial-temporal analysis. First, the identification of polycentric cultural innovation clusters challenges traditional core-periphery models, with secondary cities like Changzhou demonstrating technological sophistication rivaling provincial capitals. Second, the inverse relationship between technological adoption and community engagement reveals fundamental tensions in cultural development trajectories: urban cores excel in platform-based interactivity while rural regions lead in participatory art experiences. Third, our Shapley decomposition reveals a sophisticated interplay between educational capital and institutional dynamics in cultural formation. The analysis demonstrates that while educational attainment initially drives technological appreciation in interactive art engagement, this relationship exhibits distinct threshold effects where institutional factors become increasingly dominant. This finding challenges linear models of cultural capital accumulation and suggests a more nuanced understanding of how educational and institutional resources interact in shaping cultural innovation. These patterns necessitate reconceptualizing cultural policy through the lens of spatial justice, advocating for differentiated strategies that acknowledge the complex interplay between regional capabilities, technological infrastructure, and community-based artistic expression while addressing the evolving nature of digital cultural divides.

The study’s platform-specific data scope and static analytical framework present opportunities for expansion through multi-source data integration and longitudinal modeling. Future research should explore dynamic interactions between physical art environments and their digital representations across evolving urban systems.

Footnotes

ORCID iDs

Shunfeng Zhang

Zhengyang Lu

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the Philosophy and Social Science Bidding Project of Wuxi (No. WXSK24-JY-B08), Philosophy and Social Science Projects of Jiangsu Province (No. 2024SJYB0657). The authors acknowledge the above financial support.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Disclosure Statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

Anderson, Fernando, Johnson, & Gould (2016, October 11–14). Spice: Semantic propositional image caption evaluation [Conference session]. Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, Proceedings, Part V, Vol. 14, pp. 382–398. Springer, Amsterdam, The Netherlands.

Banerjee & Lavie (2005). METEOR: An automatic metric for MT evaluation with improved correlation with human judgments [Paper presentation]. Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, pp. 65–72. Association for Computational Linguistics, Ann Arbor, MI, United States.

Bourdieu (2018). Distinction: A social critique of the judgement of taste. In Grusky (Ed.), Inequality: Classic readings in race, class, and gender (1st ed., pp. 287–318). Routledge. https://doi.org/10.4324/9780429499838

Brinkmann, Baumann, Bonnefon, J.-F., Derex, Müller, T. F., Nussberger, A.-M., Czaplicka, Acerbi, Griffiths, T. L., Henrich, Leibo, J. Z., McElreath, Oudeyer, P.-Y., Stray, & Rahwan (2023). Machine culture. Nature Human Behaviour, 7(11), 1855–1868. https://doi.org/10.1038/s41562-023-01742-2

Cranshaw, Schwartz, Hong, & Sadeh (2012). The livehoods project: Utilizing social media to understand the dynamics of a city. In Proceedings of the international AAAI conference on web and social media (Vol. 6, No. 1, pp. 58–65). Association for the Advancement of Artificial Intelligence (AAAI). https://doi.org/10.1609/icwsm.v6i1.14278

DiMaggio, P. J., & Powell, W. W. (1983). The iron cage revisited: Institutional isomorphism and collective rationality in organizational fields. American Sociological Review, 48(2), 147–160.

Doshi, A. R., & Hauser, O. P. (2024). Generative AI enhances individual creativity but reduces the collective diversity of novel content. Science Advances, 10(28), 5290. https://doi.org/10.1126/sciadv.adn5290

Dubey, Naik, Parikh, Raskar, & Hidalgo, C. A. (2016, October 11–14). Deep learning the city: Quantifying urban perception at a global scale [Conference session]. Computer Vision–ECCV 2016: 14th European Conference, Proceedings, Part I, Vol. 14, pp. 196–212. Springer, Amsterdam, The Netherlands.

Epstein, Hertzmann, Herman, Mahari, Frank, M. R., Groh, Schroeder, Smith, Akten, Fjeld, Farid, Leach, Pentland, A. S., & Russakovsky (2023). Art and the science of generative AI. Science, 380(6650), 1110–1111. https://doi.org/10.1126/science.adh4451

Florida (2003). Cities and the creative class. City & Community, 2(1), 3–19. https://doi.org/10.1111/1540-6040.00

Gandhi, Adhvaryu, Poria, Cambria, & Hussain (2023). Multimodal sentiment analysis: A systematic review of history, datasets, multimodal fusion methods, applications, challenges and future directions. Information Fusion, 91, 424–444. https://doi.org/10.1016/j.inffus.2022.09.025

Gong, Shang, & Wang (2024). Integrating social explanations into explainable artificial intelligence (XAI) for combating misinformation: Vision and challenges. IEEE Transactions on Computational Social Systems, 11, 6705–6726. https://doi.org/10.1109/TCSS.2024.3404236

He, Zhang, Ren, & Sun (2016). Deep residual learning for image recognition [Conference session]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778. IEEE, Las Vegas, NV, USA.

Hermann & Puntoni (2024). Artificial intelligence and consumer behavior: From predictive to generative AI. Journal of Business Research, 180, 114720. https://doi.org/10.1016/j.jbusres.2024.114720

Hertzmann (2025). Generative models for the psychology of art and aesthetics. Empirical Studies of the Arts, 43(1), 23–43. https://doi.org/10.1177/02762374241288

Horton, C. B., Jr., White, M. W., & Iyengar, S. S. (2023). Bias against AI art can enhance perceptions of human creativity. Scientific Reports, 13(1), 1–27. https://doi.org/10.1038/s41598-023-45202-3

Jhaver, Zhang, A. Q., Chen, Q. Z., Natarajan, Wang, & Zhang, A. X. (2023). Personalizing content moderation on social media: User perspectives on moderation choices, interface design, and labor. Proceedings of the ACM on Human-Computer Interaction, 7(CSCW2), 1–33. https://doi.org/10.1145/3610080

Jiangsu Provincial Bureau of Statistics. (2024). Jiangsu statistical yearbook 2023. China Statistics Press.

Kohtala, Hyysalo, & Whalen (2020). A taxonomy of users’ active design engagement in the 21st century. Design Studies, 67, 27–54. https://doi.org/10.1016/j.destud.2019.11.008

Kumar & Singh, J. P. (2022). Deep neural networks for location reference identification from bilingual disaster-related tweets. IEEE Transactions on Computational Social Systems, 11, 880–891. https://doi.org/10.1109/TCSS.2022.3213702

Lamont & Lareau (1988). Cultural capital: Allusions, gaps and glissandos in recent theoretical developments. Sociological Theory, 6(2), 153–168. https://doi.org/10.2307/202113

Lamsal, Harwood, & Read, M. R. (2022, December 17–20). Where did you tweet from? Inferring the origin locations of tweets based on contextual information [Conference session]. 2022 IEEE International Conference on Big Data, pp. 3935–3944. IEEE, Osaka, Japan. https://doi.org/10.1109/BigData55660.2022.10020460

Meel & Vishwakarma, D. K. (2020). Fake news, rumor, information pollution in social media and web: A contemporary survey of state-of-the-arts, challenges and opportunities. Expert Systems with Applications, 153, 112986. https://doi.org/10.1016/j.eswa.2019.112986

Papineni, Roukos, Ward, & Zhu, W. J. (2002, July 6–12). Bleu: A method for automatic evaluation of machine translation [Paper presentation]. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 311–318. ACL, Philadelphia, PA, USA. https://doi.org/10.3115/1073083.1073135

Rogers (2021). Visual media analysis for Instagram and other online platforms. Big Data & Society, 8(1), 1–23. https://doi.org/10.1177/205395172110223

Vedantam, Lawrence Zitnick, & Parikh (2015, June 7–12). Cider: Consensus-based image description evaluation [Conference session]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4566–4575. IEEE, Boston, MA, USA. https://doi.org/10.1109/CVPR.2015.7299087

Villaespesa & Wowkowych (2020). Ephemeral storytelling with social media: Snapchat and Instagram stories at the Brooklyn Museum. Social Media+ Society, 6(1), 1–13. https://doi.org/10.1177/2056305119898776

Wan & Li (2024). Navigating the digital age: City branding in the era of social media and digital transformation. Journal of the Knowledge Economy, 15, 16666–16699. https://doi.org/10.1007/s13132-024-01795-2

Wang, Hale, Adelani, D. I., Grabowicz, Hartman, Flock, & Jurgens (2019). Demographic inference and representative population estimates from multilingual social media data [Conference session]. The World Wide Web Conference, pp. 2056–2067. ACM, San Francisco, CA, USA. https://doi.org/10.1145/3308558.3313684

Xia & Wang (2025). Multi-modal social media analytics: A sentiment perception-driven framework in Nanjing districts. IEEE Access, 13, 12603–12622. https://doi.org/10.1109/ACCESS.2025.3531769

Young, Chen, Huang, Zhang, G. W., Wang, G. Y., Zhu, J. C., Chen, J. Q., Chang, K. D., Liu, Yue, Yang, S. B., Yang, S. M., Xie, Huang, W. H., . . . Dai, Z. H. (2024). Yi: Open foundation models by 01. arXiv preprint arXiv:2403.04652.

Yu (2009). Media and cultural transformation in China. Routledge. https://doi.org/10.4324/9780203882016

Zhang, Fan, Yao, & Mostafavi (2019). Social media for intelligent public information and warning in disasters: An interdisciplinary review. International Journal of Information Management, 49, 190–207. https://doi.org/10.1016/j.ijinfomgt.2019.04.004

Zheng, Capra, Wolfson, & Yang (2014). Urban computing: Concepts, methodologies, and applications. ACM Transactions on Intelligent Systems and Technology (TIST), 5(38), 1–55. https://doi.org/10.1145/26295

Public Sentiment Towards Interactive Art on Multi-Modal Social Media: Insights from Jiangsu Province
