Abstract
The rise of e-commerce has brought women's shirt images to the forefront, showcasing complex styles and diverse features, which makes it challenging for consumers to find their preferred styles. Deep learning, known for its fast and superior retrieval performance, offers a solution: multi-layer neural networks learn image features and extract high-level semantic information, making them highly effective for recognizing and classifying clothing styles. The ResNet model excels at extracting detailed pixel-level information but has two limitations: (1) the skip connections in its residual blocks add the input directly to the output, which can cause feature distortion, especially when the input and output sizes do not match; (2) despite its many layers, ResNet's effective receptive field is smaller than theoretically expected. To address these limitations, this paper introduces the Transformer into ResNet and proposes a new method for recognizing and classifying women's shirt styles based on local detail feature extraction. The Transformer's attention mechanism enhances the model's ability to focus on important features and suppress less relevant ones, improving the accuracy of local detail feature extraction. This study examines six typical women's shirt style features and applies the improved ResNet to their recognition and classification, yielding a highly accurate and reliable model. This theoretical and practical advance improves the recognition of detailed features in women's shirts and contributes significantly to the development of intelligent clothing design.
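The skip-connection behavior behind limitation (1) can be sketched in plain Python: a residual block computes F(x) + x, and when the residual branch F changes the feature dimension, the identity shortcut no longer lines up and the input must first be mapped through a learned projection (a 1×1 convolution in the original ResNet). The function names and toy weights below are illustrative assumptions, not code from the paper.

```python
def relu(v):
    """Element-wise ReLU on a plain list of floats."""
    return [max(0.0, x) for x in v]

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def residual_block(x, W, W_proj=None):
    """Toy residual block: y = ReLU(F(x) + shortcut(x)).

    Here F is a single linear layer with weights W. If F changes the
    dimension, a projection W_proj must be supplied so the shortcut
    matches the output size; otherwise the identity shortcut is used.
    """
    out = matvec(W, x)
    shortcut = x if W_proj is None else matvec(W_proj, x)
    if len(shortcut) != len(out):
        raise ValueError("shortcut/output size mismatch: projection needed")
    return relu([o + s for o, s in zip(out, shortcut)])

# Matching sizes: identity shortcut suffices.
residual_block([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]])          # → [2.0, 4.0]

# Mismatched sizes (2 → 3): a projection realigns the shortcut.
residual_block([1.0, 2.0],
               [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
               W_proj=[[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])   # → [2.0, 4.0, 3.0]
```

The mismatched case is where the distortion noted in limitation (1) can arise: the projection is learned alongside the rest of the network, so the shortcut is no longer a faithful copy of the input.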
