Abstract
Graph neural networks employed in existing visual relation detection frameworks overlook the directionality of interactive relationships. To meet the need of intelligent human-computer interaction systems to perceive users' visual interaction states, this paper proposes a visual information clustering model for interaction scenes. First, an Interactive Network Structure (INS) segments the visual information of the interactive interface. Then, visual characteristics are incorporated into the Fuzzy C-Means (FCM) algorithm to achieve soft clustering of fixation points and to compute the user's attentional value for each object in the scene. Experimental results demonstrate the model's strong ability to predict users' interaction intentions, achieving a classification accuracy of 85.2% across four visual interaction patterns and superior prediction for visual interaction states involving high-level cognitive activity. Moreover, the model explicitly enhances the feature representations of objects and of interactive relations in the modeled scene, enabling binocular interactive 3D reconstruction.
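The abstract's FCM step (soft clustering of fixation points, then summing membership mass to score attention per object) can be sketched as follows. This is a minimal illustration of standard Fuzzy C-Means, not the paper's exact formulation; the function name, parameters (fuzzifier `m`, iteration count), and the attention score are illustrative assumptions.

```python
import numpy as np

def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=100, seed=0):
    """Standard Fuzzy C-Means: soft-assigns each point to every cluster.

    points: (N, D) array, e.g. 2-D gaze fixation coordinates in pixels.
    m: fuzzifier (> 1); larger m gives softer memberships.
    Returns (memberships of shape (N, K), cluster centers of shape (K, D)).
    """
    rng = np.random.default_rng(seed)
    n = len(points)
    # Random initial membership matrix; each row sums to 1.
    u = rng.random((n, n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        um = u ** m
        # Centers: membership-weighted means of the points.
        centers = (um.T @ points) / um.sum(axis=0)[:, None]
        # Distance of every point to every center.
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1))
        inv = d ** (-2.0 / (m - 1))
        u = inv / inv.sum(axis=1, keepdims=True)
    return u, centers

# Toy fixation points scattered around two on-screen objects (hypothetical data).
pts = np.vstack([
    np.random.default_rng(1).normal([100, 100], 5, (20, 2)),
    np.random.default_rng(2).normal([400, 300], 5, (20, 2)),
])
u, centers = fuzzy_c_means(pts, n_clusters=2)
# A simple per-object attention score: total membership mass per cluster.
attention = u.sum(axis=0)
```

Because the clustering is soft, a fixation lying between two objects contributes fractionally to both, which is what allows a graded attentional value rather than a hard assignment.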