Abstract
Tamil is one of the world's oldest classical languages still in use, with a rich and extensive literary tradition dating back over 2,000 years. Its literature addresses many aspects of life, including love, war, social values, and religion. Classical Tamil literature encodes human emotions through dense metaphor, symbolism, and cultural convention, which poses significant challenges for automatic emotion analysis. This research investigates the classification of melancholic emotions in Kuruntokai, a Sangam-era Tamil poetic anthology, focusing on two dominant affective categories: Lamentation and Consolation. A manually annotated dataset of 401 poems, together with their explanatory prose (urai), is used to evaluate classical machine learning models, recurrent neural networks, and a fine-tuned multilingual BERT (mBERT) model. To address the linguistic complexity of classical Tamil, the framework incorporates morphological analysis, a word reformation algorithm tailored to poetic constructs, and subword-level tokenization. Experimental results show that Support Vector Machines perform best among the classical classifiers, while the fine-tuned mBERT model outperforms all other models, attaining an accuracy of 78% on urai-based classification. Quantitative analysis, supported by statistical significance tests and confidence intervals, demonstrates that the explanatory prose provides richer emotional cues than the original poems. Qualitative error analysis further reveals how metaphorical compression in the poetry leads to misclassification, which is resolved through the urai. The findings highlight the effectiveness of transformer-based models for emotion classification in classical Tamil literature and underscore the importance of explanatory prose for reliable affective modelling.
