A two-stage deep semantic model for movie recommendation with long-term user profiling on Spark

Abstract

As the film market grows, users are increasingly overwhelmed by irrelevant information and struggle to choose movies that suit them. To address this, we propose an improved deep structured semantic model (DSSM) movie recommendation algorithm built on Spark. The algorithm employs two DSSMs: one extracts users’ long-term preferences, and the other produces the final movie recommendations. The preference model integrates explicit and implicit user interaction features, while the recommendation model draws on recent viewing history and search behavior. Experiments were conducted on the MovieLens 100K and MovieLens 1M datasets, using accuracy, recall, mean reciprocal rank (MRR), diversity, and normalized discounted cumulative gain (NDCG) as metrics. The improved algorithm outperforms the compared algorithms in recall, reaching 36.0% and 42.1% for the top 40 recommendations on MovieLens 100K and MovieLens 1M, respectively. Its NDCG scores for the top 10 recommendations were 0.525 and 0.554, also higher than those of the competing algorithms. The Spark-based system passes all performance tests, with an average response time under 600 ms. These results indicate that the proposed DSSM-based algorithm can provide accurate recommendations while mitigating data sparsity. Beyond offering an efficient and accurate solution for movie recommendation, the work illustrates the advantages of Spark for processing large-scale datasets, which suggests broad applicability.
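For readers unfamiliar with the ranking metrics named above, the following is a minimal sketch of how recall@K, MRR, and NDCG@K are commonly computed for a single user. It assumes binary relevance (a movie is either relevant or not); the abstract does not state the exact evaluation protocol, so the function names, the example movie IDs, and the binary-relevance choice are illustrative assumptions rather than the authors' implementation.

```python
import math

def recall_at_k(ranked, relevant, k):
    """Fraction of the user's relevant items that appear in the top-k list."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)

def mrr(ranked, relevant):
    """Reciprocal rank of the first relevant item (0.0 if none is found)."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: DCG of the top-k list divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    # The ideal ranking places all relevant items (up to k) at the top.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical data: these movie IDs are made up for illustration.
ranked = ["m3", "m7", "m1", "m9", "m4"]   # model output, best first
relevant = {"m7", "m4", "m8"}             # held-out movies the user liked

print(recall_at_k(ranked, relevant, 5))
print(mrr(ranked, relevant))
print(ndcg_at_k(ranked, relevant, 5))
```

In a full evaluation, per-user values like these would typically be averaged over all test users to obtain dataset-level figures such as the top-40 recall and top-10 NDCG reported above.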

Keywords

Spark technology; DSSM; movie recommendations; recommendation engine
