Abstract
In text classification tasks involving complex models and high-stakes domains, the alignment between predictions and explanations tends to be weak because post-hoc explainability methods operate independently of model training. In this paper, we propose ATM-AM, a Gated Recurrent Unit (GRU)-based approach that combines Bahdanau attention with a training-time SHAP-guided alignment objective to provide real-time, context-aware interpretability without sacrificing predictive performance. The model is evaluated on three widely used sentiment analysis datasets, IMDb (https://huggingface.co/datasets/imdb), Amazon Reviews (https://www.kaggle.com/datasets/bittlingmayer/amazonreviews), and SST-2 (https://huggingface.co/datasets/glue/viewer/sst2), achieving accuracies of 91.8%, 89.5%, and 90.0%, with F1-scores of 0.899, 0.877, and 0.889, respectively. All results are averaged over three runs for statistical soundness. The additional training latency introduced by ATM-AM is modest (13–18%), and inference remains fast (3–4 ms per sample), making the model feasible for real-time deployment. A user-centered interpretability study with 30 participants yielded an average rating of 4.6/5, indicating that users trust the explanations produced by the proposed model. These findings position ATM-AM as a practical, interpretable solution for text classification in contexts where model behavior must be accountable and reliable.
