Abstract
Graph neural networks employed in existing visual relation detection frameworks overlook the directionality of interactive relationships. To meet the need of intelligent human-computer interaction systems to perceive users' visual interaction states, this paper proposes a visual information clustering model for interaction scenes. First, an Interactive Network Structure (INS) segments the visual information of the interactive interface. Then, visual characteristics are incorporated into the Fuzzy C-Means (FCM) algorithm to achieve soft clustering of fixation points and to compute the user's attentional value for each object in the scene. Experimental results demonstrate the model's strong ability to predict users' interaction intentions, achieving a classification accuracy of 85.2% across four visual interaction patterns and superior prediction for visual interaction states involving high-level cognitive activity. Moreover, the model explicitly enhances the feature representations of objects and of interactive relations in the modeled scene, enabling binocular interactive 3D reconstruction.
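The abstract's FCM step (soft clustering of fixation points, then summing membership mass to score attention per object) can be sketched as follows. This is a minimal illustration of standard Fuzzy C-Means, not the paper's exact formulation; the function name, parameters (fuzzifier `m`, iteration count), and the attention score are illustrative assumptions.

```python
import numpy as np

def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=100, seed=0):
    """Standard Fuzzy C-Means: soft-assigns each point to every cluster.

    points: (N, D) array, e.g. 2-D gaze fixation coordinates in pixels.
    m: fuzzifier (> 1); larger m gives softer memberships.
    Returns (memberships of shape (N, K), cluster centers of shape (K, D)).
    """
    rng = np.random.default_rng(seed)
    n = len(points)
    # Random initial membership matrix; each row sums to 1.
    u = rng.random((n, n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        um = u ** m
        # Centers: membership-weighted means of the points.
        centers = (um.T @ points) / um.sum(axis=0)[:, None]
        # Distance of every point to every center.
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1))
        inv = d ** (-2.0 / (m - 1))
        u = inv / inv.sum(axis=1, keepdims=True)
    return u, centers

# Toy fixation points scattered around two on-screen objects (hypothetical data).
pts = np.vstack([
    np.random.default_rng(1).normal([100, 100], 5, (20, 2)),
    np.random.default_rng(2).normal([400, 300], 5, (20, 2)),
])
u, centers = fuzzy_c_means(pts, n_clusters=2)
# A simple per-object attention score: total membership mass per cluster.
attention = u.sum(axis=0)
```

Because the clustering is soft, a fixation lying between two objects contributes fractionally to both, which is what allows a graded attentional value rather than a hard assignment.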