Abstract
The proliferation of online network services has led to a steady rise in gambling-related activities, posing serious challenges to cyber governance and public security. Despite significant efforts by law enforcement agencies to curb online gambling, its dynamic and covert nature continues to hinder effective regulation. In this study, we propose a novel identification framework that leverages multimodal signals—including webpage text, visual content, and embedded image text—to detect gambling websites with high precision. Our approach integrates these heterogeneous data sources into a unified model, achieving robust representation and classification across diverse website structures. Extensive experiments on a domain-specific dataset demonstrate that our method significantly outperforms traditional baselines, reaching an accuracy of 99.3%. This work contributes an effective and scalable technical solution to assist real-world gambling crime detection and opens new directions for multimodal modeling in security applications.
