Abstract
Sociology has been described as a ‘third culture’ between science and literature. The distinctions between different orientations in sociological writing have been studied primarily through their non-textual manifestations (publication genres or venues, methodologies used, scientometric indicators, etc.). Our knowledge of how the science–literature boundary relates to the rhetorical composition of sociological texts therefore remains limited. We mixed a bespoke corpus of Czech sociological articles with a corpus of Czech short fiction to account directly for the relationship between sociology and literature. Unsupervised classification based on the distribution of the most frequent verbs yielded two clusters of sociological articles. Each cluster was significantly associated with non-textual variables. Articles less similar to literature were associated with higher rates of co-authorship, higher citation counts, and more women as first authors. The two clusters also displayed clear semantic differences. The signal from literary works increased variance in the textual feature space, and subsequent pseudo-experimental validation confirmed that it was indispensable for discovering the association between the rhetorical pattern of verb usage and the non-textual variables related to sociological articles.
