Abstract
This study evaluates 16,891 academic publications in the field of cinema published between 1980 and 2024 using bibliometric analysis and topic modeling methods. Based on data obtained from the Web of Science (WOS) and Scopus databases, bibliometric findings were obtained, including the distribution of publications by year, the annual number and rate of citations per article, the most productive authors in the field, the production status of authors over time, the countries of authors and the number of articles they published, and the journals with the highest number of publications. The same data were also used to identify prominent word groups and themes in the articles using text mining and Latent Dirichlet Allocation (LDA) topic modeling. As a result of the analysis, 12 main themes emerged based on word-text relationships and the weight of publications. The findings show that cinema studies have developed with increasing momentum over the years and that there has been a growing focus on certain topics. By examining the development of the cinema studies literature through descriptive content analysis and LDA topic modeling, the study reveals the structural and thematic transformation of academic production in the field and provides a theoretical and methodological basis for future research. It also makes a current and multidimensional contribution to the discipline by revealing the increasingly important digital trends, cultural representations, and interdisciplinary developments in cinema studies.
Keywords
Introduction
Conceptually, cinema is a branch of fine arts derived from the French word “cinematographe” and possesses unique characteristics. In this sense, cinema, which has been accepted as the seventh art form, drawing inspiration from existing art forms such as sculpture, painting, photography, and theater, has become a kind of integrator of all the art forms it has drawn from and a unique medium of expression. However, cinema should not be viewed solely as the result of an artistic approach, but also as a means of communication, a cultural representation, and a producer of political discourse. In this context, the perspectives presented by cinema theories explain the multifaceted and multi-layered structure of cinema. Roberto Rossellini defines cinema as the manipulation of reality through images and sound (Michelson, 1995), while Bazin (1967) describes it as the capture and reproduction of a segment of time, focusing on cinema’s relationship with reality rather than its artistic aspect.
Discussions on the nature of representation within cinema theories have intensified, especially after the 1970s. Christian Metz’s semiotic approach analyzes cinema as a language system, while Laura Mulvey’s feminist film criticism questions the male gaze on the cinema screen and patriarchal regimes of representation (Mulvey, 1975). As can be understood from these approaches, it is generally accepted that cinema does not merely produce content, but also possesses a sign system with ideological functions.
Cinema is Western-centric not only in its industrialized structure but also in its academic interpretation and theoretical organization, which are largely guided by Western-centric theoretical paradigms. This situation began to be questioned with the rise of postcolonial film criticism: the views that emerged after Said’s (1978) conceptualization of Orientalism brought to the fore the debate that cinema functioned as an instrument of cultural hegemony in the postcolonial period. The works of Bhabha (1994) and Spivak (1988) have attempted to explain how the representation of the other in cinema operates in the construction of identities.
Cinema has come to the fore as a fictional space that has continuously updated itself with new formats added over more than a century and strengthened its technical equipment to increase audience participation in the content. Cinema, which has developed in many ways, has made its most striking updates in the elements that are prioritized within hierarchical perception practices. From the first film screening in 1895 to the present day, cinema has become an innovation that the whole world cares about with its effective position on the global stage. This situation has made it necessary to re-examine and re-evaluate cinema in the context of technological transformations such as digitalization, media convergence, and artificial intelligence, beyond classical theories. In this context, the integration of technology-based developments into cinema has brought about changes in both the definition and functional processes of cinema and in the viewer experience. Manovich (2001) evaluates digital cinema in the context of software-based cultural production, while Hansen (2004) argues that digital cinema has transformed the viewer experience in terms of body politics.
Cinema is a multifaceted art form. It encompasses not only art but also economics, ideology, culture, communication, and all disciplines centered on humanity, and it therefore attracts the attention of all segments of society. It is only natural that many disciplines and academic circles engage in discussions and scientific research with cinema at the center. This multifaceted development has led to diversification not only in film production but also in the ways cinema is interpreted, and cinema has begun to take center stage in interdisciplinary studies. There is therefore a need for cinema studies to be evaluated in a more systematic, data-driven, and multidimensional manner, which motivates a bibliometric and topic modeling analysis of the field. The study systematically analyzes the development of literature in the field of cinema through descriptive content analysis and LDA topic modeling. In addition to being based on the current general findings, this study was inspired by previous bibliometric studies and aims to highlight important gaps for further research.
In recent years, artificial intelligence and natural language processing techniques have begun to be integrated into cultural fields such as cinema studies. Modeling techniques such as LDA (Latent Dirichlet Allocation) have become important tools for extracting topics from textual data. This study aims to reveal both thematic and structural trends by examining articles published in the field of cinema between 1980 and 2024 using bibliometric analysis and LDA-based topic modeling methods.
Main Problems of the Study
The main problem of this study is to determine which themes have come to the fore in academic publications in the field of cinema between 1980 and 2024 and to develop predictions about future research directions in line with these thematic trends. Cinema is not only an artistic form of expression; it is also a multidimensional field of communication that reflects cultural, ideological, and technological transformations. Therefore, identifying which topics are more prominent in cinema studies is important for understanding the dynamics of development in the field. However, systematic, data-driven, and large-scale analyses such as bibliometric analysis and topic modeling on cinema studies are quite limited. This situation highlights the need for a comprehensive assessment of the structural and content-related transformation of academic production in the field of cinema.
The findings obtained within the scope of the study have been interpreted in the context of theoretical inferences about the historical course of academic production and future orientations, within the framework of structuralist theory and cultural transformation theories. This theoretical grounding is based on the assumption that the field of cinema has been reshaped along the main axes of socio-cultural transformation, digitalization, political representation, and interdisciplinary expansion.
In this context, the study sought answers to the following questions:
What themes does scientific production in the field of cinema focus on?
How have these themes changed over the years?
What topics are cinema studies likely to focus on in the future?
Theoretical Foundation
The phenomena emerging within social organization are grounded in a set of structural and semantic relationships. Structuralist theory is concerned with these underlying structures. Structuralism, which has been frequently used in the analysis of language, culture, and society since the 20th century, does not seek to explain everything. Instead, it attempts to reveal the structure of the work by asking real questions (Furtana, 2014). In cinema research, works are not merely the sum of their contents but gain meaning within specific linguistic, cultural, ideological, and social structures. Therefore, themes derived from bibliometric distributions and topic modeling analyses are shaped by the influence of certain structural codes (e.g., gender, political representation, cultural identity). The findings obtained during the study reveal the historical and structural relationships of cinema studies. Uncovering the structural nature of dominant themes in the past and proposing ideas for the future is based not only on statistical but also on structural analysis-based approaches.
In the modeling and trend analysis process of the study, future trend predictions were also interpreted within the framework of cultural transformation theory. “Cultural transformation is a result of cultural interaction emerging in a globalizing world” (Özekici & Ünlüönen, 2019). Cultural transformation theory is primarily concerned with changes in cultural practices and representations. In particular, structural transformations such as technological developments, digitalization, and globalization have led to the reshaping of cultural practices and representations; this situation has necessitated the examination of these transformations within a broad academic context, and the phenomenon of cultural transformation has become a common area of interest that brings together many disciplines around the axis of structural change. Themes that are understood through modeling and gain momentum through trend analysis show that the field of cinema is undergoing not only technological but also ideological and cultural transformations.
Purpose and Significance of the Study
The main purpose of this study is to examine academic publications in the field of cinema between 1980 and 2024 using bibliometric analysis and topic modeling methods, to identify the dominant thematic orientations in the field, and to provide guiding insights for future research. The study also aims to examine how cinema-related research is approached in different disciplines, revealing interdisciplinary transitions and the diversity of theoretical approaches.
This research is significant in that it is conducted with a large data set of 16,891 articles. In bibliometric analysis, sample size is a fundamental element that enhances the reliability and generalizability of the results obtained. Additionally, the LDA topic modeling method used in the study identifies prominent themes in the field of cinema in an objective, data-driven manner, thereby contributing to the literature both methodologically and substantively.
Method
This study applies bibliometric analysis and topic modeling to the concept of cinema, which has been examined by a wide range of disciplines in the scientific literature. The process consists of data collection, inclusion-exclusion classification, text preprocessing, topic modeling, determination of the ideal number of topics, and trend analysis.
Academic articles were used as the unit of analysis in this study. Each article was evaluated at the level of title, abstract, and keywords, and topic modeling was performed based on these contents. Research themes represent semantic clusters determined through the LDA algorithm. In addition, publication year information was also used to enable the analysis of temporal trends. Thus, content, conceptual, and temporal analysis units were evaluated together.
Type of Research
This study aimed to reveal the bibliometric profiles of cinema-related studies and identify important topics as a result of topic modeling. In terms of type, the research falls under the category of descriptive content analysis. Descriptive content analysis, a method within content analysis, involves systematic studies that evaluate trends and research results in an explanatory dimension by including all published studies related to a predefined topic in the data set (Lin et al., 2014; Suri & Clarke, 2009; Ültay et al., 2021). The results obtained from this analysis method are expected to guide future studies on the targeted topics (Lune & Berg, 2017; Yıldırım & Şimşek, 2018). Due to the use of descriptive content analysis in different studies, it is necessary to present the literature in a systematic manner (Ültay et al., 2021).
Application Steps
The analysis process in this study was carried out through the following steps.
Data collection and scope determination
Data cleaning and application of inclusion criteria
Text preprocessing
LDA topic modeling and determining the ideal number of themes
Coding and thematic naming process
Determining temporal trends with trend analysis
Collection of Data
The data for the study were obtained from both the WOS and Scopus databases using key indexes. Because these two databases contain millions of records, some limitations were applied to the search criteria. In the data collection process, the concept of “cinema” was used as a keyword; book chapters, early-access articles, conference abstracts, and conference full texts were excluded, and only published articles were included. The WOS and Scopus databases also contain articles on the subject written in Spanish, French, and Portuguese in addition to English; articles written in languages other than English were excluded from the study. The most important reasons for this exclusion are the limited access to the literature on cinema studies in other languages and the technical limitations, in terms of linguistic diversity, of programs such as RStudio and Orange used in the analysis. Words and phrases in many languages that are difficult to understand globally could mislead both the analysis tools and the expert researchers in the interpretation phase of the study. In summary, the exclusion is due to the limited accuracy and consistency of machine learning-based methods such as text preprocessing and LDA modeling in languages other than English. However, it is believed that this shortcoming can be addressed in future studies by using multilingual text mining models (such as multilingual BERT and LDA2Vec).
In this study, articles related to cinema published in the WOS and Scopus databases were also subject to historical time constraints. In this context, articles published before January 1, 1980, and after December 31, 2024, were excluded from the scope.
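The inclusion criteria described above (published articles only, English only, publication date within the study window) can be expressed as a simple record filter. The following is a minimal sketch; the record layout and field names (`doc_type`, `language`, `published`) are hypothetical and do not reproduce the WOS or Scopus export schema.

```python
from datetime import date

# Study window stated in the text.
START, END = date(1980, 1, 1), date(2024, 12, 31)

# Hypothetical toy records; field names are illustrative only.
records = [
    {"title": "A", "doc_type": "Article", "language": "English", "published": date(1995, 5, 1)},
    {"title": "B", "doc_type": "Book Chapter", "language": "English", "published": date(2001, 3, 9)},
    {"title": "C", "doc_type": "Article", "language": "Spanish", "published": date(2010, 7, 20)},
    {"title": "D", "doc_type": "Article", "language": "English", "published": date(1979, 12, 31)},
    {"title": "E", "doc_type": "Article", "language": "English", "published": date(2024, 12, 31)},
]


def meets_inclusion_criteria(record):
    """Return True for English-language published articles inside the study window."""
    return (
        record["doc_type"] == "Article"
        and record["language"] == "English"
        and START <= record["published"] <= END
    )


included_titles = [r["title"] for r in records if meets_inclusion_criteria(r)]
print(included_titles)
```

In this toy example, only records A and E pass all three criteria; both boundary dates of the window are treated as inclusive.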
Figure 1 presents the PRISMA flow diagram showing the number of records obtained from the search using the key index. For the period between January 1, 1980, and December 31, 2024, a search using the keyword “cinema” in the WOS and Scopus databases yielded 22,061 articles in WOS and 31,358 articles in Scopus.

Figure 1. Collection of research data according to the PRISMA flow chart.
Since the study was limited to published articles, a total of 20,451 records were excluded from the study, with 3,282 from the WOS database and 17,169 from the Scopus database. Additionally, 16,077 articles were excluded from the data pool as they were present in both databases. To minimize bias, the data were independently reviewed by the researchers, and a consensus was reached on those that met the inclusion criteria. After all these processes, the analysis phase proceeded with 16,891 articles.
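The record counts reported for the PRISMA flow can be checked with simple arithmetic. The sketch below uses only the figures stated in the text; the variable names are illustrative.

```python
# Counts reported in the text for the PRISMA flow.
identified = {"WOS": 22_061, "Scopus": 31_358}
excluded_at_screening = {"WOS": 3_282, "Scopus": 17_169}
duplicates_across_databases = 16_077

total_identified = sum(identified.values())           # records found by the search
total_excluded = sum(excluded_at_screening.values())  # records removed at screening
after_screening = total_identified - total_excluded
final_corpus = after_screening - duplicates_across_databases

print(total_identified, total_excluded, after_screening, final_corpus)
```

The result agrees with the reported corpus size: 53,419 records identified, 20,451 excluded, 32,968 remaining after screening, and 16,891 after removing the 16,077 duplicates.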
Analysis of Data
Three different analyses were used in the study.
Bibliometric Analysis
Bibliometric analysis has become a popular method in recent years due to its benefits in processing large volumes of scientific data and creating a high research impact. In particular, bibliometric analyses are preferred in scientific studies for reasons such as revealing trends in article or journal performance, collaboration models, and research components (Donthu et al., 2021). This method reveals the changes and developments in the research topic over time with concrete data. The large and objective data included in bibliometric analysis can then be interpreted through subjective evaluation. Bibliometric analysis includes two basic approaches: performance analysis and science mapping. Performance analysis involves evaluating the impact of researchers, institutions, and countries using criteria such as total publications, author contributions, and citation indicators. Science mapping helps to map the structure and dynamics of research through citation analysis to identify the most influential publications, co-citation analysis to better understand the relationships between cited works, bibliographic coupling to connect related publications, co-word analysis to show relationships between topics, and co-authorship analysis to understand the social interactions between authors (Passas, 2024). In this study, bibliometric analysis was conducted using the “biblioshiny” application within the Bibliometrix package in RStudio, and the findings were supported with visualizations.
Some inclusion criteria were determined while creating the data set for this study. These criteria were that the document type should consist of articles, the publication languages should be limited to English, and the publications should be between 1980 and 2024. After this filtering process, 16,891 articles were downloaded in CSV format and analyzed using RStudio software. The tools and parameters used in the bibliometric analysis phase are as follows:
Bibliometrix and Biblioshiny packages running in the RStudio environment
Number of publications by year
Authors, journals, institutions, and countries with the most publications
Number of citations and H-index
Keyword analysis
The obtained data were supported by graphs and tables using R-based visualization tools (e.g., ggplot2, graph functions within Bibliometrix).
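To make the performance indicators listed above concrete, the following minimal sketch computes publications per year, mean citations per article, and an H-index for a small set of papers. It is written in Python for illustration only; the study itself used Bibliometrix in RStudio, and the toy data are hypothetical.

```python
from collections import Counter


def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    h = 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical toy data: one entry per paper.
papers = [
    {"year": 2019, "citations": 10},
    {"year": 2019, "citations": 8},
    {"year": 2020, "citations": 5},
    {"year": 2021, "citations": 4},
    {"year": 2021, "citations": 3},
    {"year": 2021, "citations": 0},
]

publications_per_year = Counter(p["year"] for p in papers)
mean_citations = sum(p["citations"] for p in papers) / len(papers)
toy_h = h_index([p["citations"] for p in papers])

print(dict(publications_per_year), mean_citations, toy_h)
```

For these six papers the H-index is 4, because four papers have at least four citations each while the fifth has only three.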
LDA Topic Modeling Analysis
In the topic modeling phase, which is another part of the data analysis phase, text mining was used. The primary objective of text mining is to classify, group, and label texts. As an extension of data mining based on textual data, text mining is a technology that seeks to derive meaningful results from unstructured textual data (He et al., 2013). Although text mining is performed on unstructured texts, making these texts suitable for qualitative and quantitative analysis is another important goal of text mining. In other words, extracting information and patterns from text documents is the basic function of text mining (Bach et al., 2019).
Text mining methods can be used for many purposes. Classification of expressions and topics, clustering, named entity recognition, sentiment analysis, face analysis, keyword extraction, parsing, topic detection, and topic modeling are just a few of them. Topic modeling is a widely used technique for topic discovery and semantic mining in natural language processing, applied to large, scattered, and unordered collections of documents (Celodar et al., 2018). Within topic modeling, LDA (Latent Dirichlet Allocation) is a frequently preferred method. LDA, which provides a productive approach to finding hidden structures in large volumes of information, is used as a heuristic approach to calculate the similarity between source files and obtain the distribution of each document over topics (Celodar et al., 2018). To perform LDA, the data must first be converted into a corpus, and the corpus must then go through the text preprocessing stage.
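For readers unfamiliar with the model, the generative assumptions of LDA can be summarized as follows. This is the commonly used smoothed formulation; the symbols (the Dirichlet parameters \alpha and \beta, the document-topic distribution \theta_d, the topic-word distribution \phi_k, the topic assignment z_{d,n}, and the observed word w_{d,n}) are notation introduced here for illustration and do not appear in the original text.

```latex
\begin{align}
\theta_d &\sim \mathrm{Dirichlet}(\alpha), &
\phi_k &\sim \mathrm{Dirichlet}(\beta), \\
z_{d,n} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d), &
w_{d,n} \mid z_{d,n}, \phi &\sim \mathrm{Categorical}(\phi_{z_{d,n}}).
\end{align}
```

Each document d is thus treated as a mixture of topics, and each topic k as a distribution over the vocabulary; fitting the model means inferring \theta_d and \phi_k from the observed words.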
Text Preprocessing Phase
To perform topic modeling, the data set must first be prepared in a manner suitable for modeling. Therefore, misleading, incomplete, and insufficient data must be removed from the data set to ensure accurate and reliable modeling. The comprehensive examination of the data set, that is, the text preprocessing process, consists of several stages.
Transformation: The first step in the data preprocessing stage is to ensure that all data is included in the same language set. In this context, accents are removed and capital letters in the text are converted to lowercase letters. In addition, HTML extensions in the data set are separated or removed at this stage.
Tokenization: In this stage, the necessary adjustments at the sentence and word level are determined. Punctuation marks and spaces between words are processed in this stage.
Normalization: In this stage, word variants are reduced to a common base form so that different surface forms of the same word are treated as a single term in the model.
Filtering: The available data set may not always be organized as the researcher desires. When modeling, both unwanted unnecessary letters and non-English characters can lead to misleading modeling results. At this stage, characters that are not wanted in the modeling, or characters that have no semantic weight, such as “a, an, the,” are removed. At the same time, numbers in the text are separated, and unnecessary punctuation marks and symbols that may hinder analysis are removed.
In the Transformation stage of the study, non-English characters in the text were first removed. Then, to prevent deviations that could occur at the letter level, all letters were converted to lowercase. Also in this stage, special characters and HTML tags with a high potential to be recognized as words by the model were removed from the text. In the second stage, Tokenization, separations and arrangements were made to handle anything that could mislead the model at both the word and sentence levels. Spaces between sentences were standardized, and structures classified as undefined words, apart from the special characters and HTML tags that could be tokens, were removed at this stage. In addition, the texts were parsed at the word level and then lemmatized (films → film, studies → study). In the Normalization stage, the words in the texts were reduced to their roots. For example, words such as “films,” “filmmaking,” and “filmmakers” were normalized to a single root, “film.” This process was applied to prevent word variations from disrupting thematic clustering. In the Filtering stage, the final stage of the text preprocessing process, a stopword list was prepared. This stopword list included letters and words that had no semantic weight but had the potential to be tokens. In this study, the words “a, an, the, is, with” were added to this stopword list along with all single letters. After all these processes, the texts were cleaned structurally and semantically and converted into a vector structure consisting of 1,961,116 tokens. Thus, a high-quality, noise-free text set was obtained for the LDA model.
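The four preprocessing stages can be sketched as a small pipeline. This is an illustrative sketch only: the study performed these steps in Orange, and the suffix rules in `STEM_RULES` are a hypothetical toy stand-in for a real lemmatizer or stemmer, chosen so that the examples from the text ("films", "filmmaking", "filmmakers", "studies") behave as described. The stopword set contains only the words named in the text plus single letters.

```python
import html
import re
import unicodedata

# Stopwords named in the text, plus all single letters.
STOPWORDS = {"a", "an", "the", "is", "with"} | set("abcdefghijklmnopqrstuvwxyz")

# Hypothetical toy suffix rules (suffix, replacement); not the study's normalizer.
STEM_RULES = (("makers", ""), ("making", ""), ("ies", "y"), ("s", ""))


def transform(text):
    """Transformation: decode entities, strip HTML tags, remove accents, lowercase."""
    text = html.unescape(text)
    text = re.sub(r"<[^>]+>", " ", text)
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    return text.lower()


def tokenize(text):
    """Tokenization: keep alphabetic runs, dropping digits, punctuation, and spaces."""
    return re.findall(r"[a-z]+", text)


def normalize(token):
    """Normalization: apply the first matching suffix rule, keeping at least 3 letters."""
    for suffix, replacement in STEM_RULES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: len(token) - len(suffix)] + replacement
    return token


def preprocess(text):
    """Run all stages; Filtering removes stopwords after normalization."""
    tokens = [normalize(t) for t in tokenize(transform(text))]
    return [t for t in tokens if t not in STOPWORDS]


raw = "<p>The <b>Films</b> of Pasolini: filmmaking, filmmakers &amp; 3 studies with café scenes.</p>"
tokens = preprocess(raw)
print(tokens)
```

Note that "of" survives in the output because the sketch uses only the stopwords named in the text; a production stopword list would be considerably longer.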
Topic Modeling and Determining the Ideal Topic Number
In this study, topic modeling was employed to objectively identify the thematic density within the field of cinema. After the text preprocessing stage, the qualitative data obtained were converted into quantitative data for use with the Latent Dirichlet Allocation (LDA) natural language processing method. In this conversion process, each word is modeled as a vector structure. As a result, a term-document matrix was obtained from the entire corpus.
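As an illustration of how preprocessed texts become a count matrix, the sketch below builds a small matrix with one row per document and one column per vocabulary term. The toy documents are hypothetical, and the row and column orientation is an assumption; the text does not state which orientation Orange uses internally.

```python
from collections import Counter

# Hypothetical preprocessed documents (lists of tokens).
docs = [
    ["film", "festival", "africa", "film"],
    ["gender", "film", "queer"],
    ["festival", "censorship"],
]

# Sorted vocabulary gives every term a stable column index.
vocabulary = sorted({token for doc in docs for token in doc})
column = {term: j for j, term in enumerate(vocabulary)}


def count_matrix(documents, column_index):
    """Return one row of term counts per document."""
    matrix = []
    for doc in documents:
        row = [0] * len(column_index)
        for term, count in Counter(doc).items():
            row[column_index[term]] = count
        matrix.append(row)
    return matrix


matrix = count_matrix(docs, column)
print(vocabulary)
for row in matrix:
    print(row)
```

Transposing this matrix gives the term-by-document view, in which each word is represented by the vector of its counts across documents.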
The topic modeling process was performed using the Orange Data Mining v3.34.1 program written in Python. Orange Data Mining is an open-source data mining and machine learning software developed in the Python programming language with a visual programming interface. It is widely used in the social sciences, particularly for text mining, classification, clustering, and topic modeling, thanks to its modules that support these processes. Through the Text Mining and Topic Modeling modules used in the study, vector representations of the texts were created, the distribution of topics was analyzed using the LDA algorithm, and the outputs were visualized.
The LDA model was applied through embedded libraries, and a certain number of articles under each topic were clustered according to their common keywords. In interpreting the model outputs, the representative keywords of each topic were revealed; the titles, abstracts, and keywords of the articles belonging to these topics were examined through content analysis, and topic names were assigned. In addition, the opinions of two expert academics in the fields of communication studies and cinema were consulted during the naming process, thereby increasing the scientific validity of the interpretations. In particular, content that did not correspond to high-representative keywords or served multiple themes was carefully reclassified during the qualitative analysis process. At this stage, the 16,891 articles in the dataset were evaluated not only based on automatic model outputs but also through the academic orientations, cultural frameworks, and research focuses they represented in the content context. Thus, the topic modeling process was enriched not only with algorithmic depth but also with interpretive and conceptual depth. This method strengthens the mixed methods nature of topic modeling, ensuring that qualitative and quantitative analysis results mutually support each other. The steps of this process are shown in Figure 2.

Figure 2. Steps of topic modeling analysis.
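Two of the interpretive steps described above, extracting representative keywords for a topic and flagging articles that serve more than one theme for qualitative review, can be sketched as follows. The topic and document weights are hypothetical toy values, and the `margin` threshold is an assumption introduced for illustration; the text does not report a numeric rule for reclassification.

```python
def top_keywords(topic_word_weights, n=3):
    """Return the n highest-weighted words of a topic (ties broken alphabetically)."""
    ranked = sorted(topic_word_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]


def dominant_topic(doc_topic_weights, margin=0.10):
    """Return (index of strongest topic, needs_review flag).

    A document is flagged when its two strongest topics differ by less than
    `margin`, i.e. it plausibly serves multiple themes. The margin value is
    a hypothetical choice, not one reported in the study.
    """
    ranked = sorted(enumerate(doc_topic_weights), key=lambda kv: -kv[1])
    (best, p1), (_, p2) = ranked[0], ranked[1]
    return best, (p1 - p2) < margin


# Hypothetical toy model output.
topics = [
    {"film": 0.30, "festival": 0.25, "african": 0.20, "censorship": 0.15, "india": 0.10},
    {"gender": 0.35, "women": 0.30, "queer": 0.20, "film": 0.15},
]
doc_topics = {"doc_a": [0.85, 0.15], "doc_b": [0.52, 0.48]}

keywords = [top_keywords(t) for t in topics]
assignments = {name: dominant_topic(w) for name, w in doc_topics.items()}
print(keywords)
print(assignments)
```

In this toy example, doc_b is assigned to the first topic but flagged, which corresponds to the kind of article that would be passed to the researchers and experts for manual reading.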
In the data set collected for the study, each document consists of a combination of multiple topics. In addition, the words used for these independent and different topics contribute to the emergence of certain terms. Based on these assumptions, the most critical element in topic modeling is determining the ideal number of topics to be extracted from this independent and mixed data. The expertise of the researcher who prepared the dataset is of great importance here (Bystrov et al., 2023), so determining the number of topics is partly a subjective process. To minimize this subjectivity, evaluation criteria exist for determining the ideal number of topics in LDA models. In this study, coherence and log perplexity were used as these evaluation measures. Candidate models with 1 to 30 topics were examined, and their coherence and log perplexity values were calculated and compared with each other before the final topic modeling analysis.
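To illustrate what the two evaluation measures capture, the sketch below computes perplexity from a total log-likelihood and a UMass-style coherence score from document co-occurrence counts. UMass is only one of several coherence definitions and is used here as an assumption; the text does not state which variant was reported by Orange. The toy documents are hypothetical.

```python
import math


def perplexity(total_log_likelihood, n_tokens):
    """exp of the negative average log-likelihood per token; lower is better."""
    return math.exp(-total_log_likelihood / n_tokens)


def umass_coherence(top_words, documents):
    """UMass-style coherence: sum of log((D(w_m, w_l) + 1) / D(w_l)) over word pairs.

    D counts the documents containing all given words. Every word in
    `top_words` must occur in at least one document. Higher is better.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(all(w in s for w in words) for s in doc_sets)

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            w_m, w_l = top_words[m], top_words[l]
            score += math.log((doc_freq(w_m, w_l) + 1) / doc_freq(w_l))
    return score


# Hypothetical preprocessed documents.
docs = [
    ["film", "festival", "africa"],
    ["film", "festival", "censorship"],
    ["film", "gender", "queer"],
    ["gender", "queer", "women"],
]

coherent_score = umass_coherence(["film", "festival"], docs)
mixed_score = umass_coherence(["film", "women"], docs)

# A model assigning probability 1/4 to each of 10 tokens has perplexity 4.
uniform_perplexity = perplexity(10 * math.log(0.25), 10)

print(coherent_score, mixed_score, uniform_perplexity)
```

Words that frequently appear in the same documents ("film", "festival") score higher than words that never co-occur ("film", "women"), which is the intuition behind preferring models with high coherence and low perplexity when choosing the number of topics.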
Table 1 shows the 30 topics examined to determine the ideal number of topics, ranked according to their Log Perplexity and Coherence values. The highest coherence and lowest perplexity values indicate that there could be as few as five topics. However, the large amount of data and the token count of 1,961,116 provided a valid reason to increase the number of topics, and after examining the word clusters, it was decided to set the number of topics to 12. The evaluation conducted in the range of 1 ≤ k ≤ 30 to determine the most appropriate number of topics also pointed to 12 topics based on two main reasons in addition to numerical indicators.
Reason 1: The large volume of the data set, which contains 16,891 articles and approximately 2 million tokens, requires detailed thematic separation.
Reason 2: In terms of interpretability, the 12-topic model is quite important in revealing the diversity of cinema studies. The 12-topic model provided a sufficient level of segmentation to reveal clearer, more consistent, and semantically distinct themes (Political Representations in South Asian and African Cinema: Film Festivals, Censorship, and Cultural Identity; Pasolini’s Ethnographic and Anthropological Narratives in Global Cinema).
The selection of the number of topics was not only determined by log-perplexity and topic coherence values, but also by the semantic consistency of the study, the interpretability of the topics, and the contextual objectives of the study. Although k > 12 appears to be numerically acceptable, it was found that the topics began to consist of only a few keywords. This situation leads to excessive fragmentation in content integrity. The 12-topic model provided both statistical balance (high coherence, low perplexity) and content clarity in this context.
Table 1. Log Perplexity and Coherence Values in the Process of Determining the Number of Topics.
Coding and Thematic Naming Process
To determine the naming of topics in topic modeling, we first examined the highly representative keywords of each theme emerging from LDA. The context of these keywords was analyzed along with the titles, abstracts, and keywords of the relevant articles. The resulting conceptual clusters were labeled by the study authors based on content similarities and academic usage. These themes were then reviewed by two expert researchers and interpreted based on their content similarities. The coding process for two sample themes is presented below. This process was supported by both qualitative content analysis and expert opinion to objectify the labeling process.
Example 1: Political Representations in South Asian and African Cinema: Film Festivals, Censorship, and Cultural Identity
Keywords: film, cinema, festival, african, indian, politica, india, south, censorship
Sample Article: Dovey, L. (2020). African film festivals in Africa: Curating “African Audiences” for “African Films.” Black Camera, 12(1), 13–47. https://doi.org/10.2979/blackcamera.12.1.03
Example 2: Cultural Codes and Gender Politics in Asian Cinema
Keywords: film, women, gender, cinema, chines, female, represent, cultur, queer
Sample Article: Li, J. (2017). In search of an alternative feminist cinema: Gender, crisis, and the cultural discourse of nation building in Chinese independent films. ASIANetwork Exchange: A Journal for Asian Studies in the Liberal Arts, 24(1), 86–111
Following this stage, trend analysis was conducted.
Trend Analysis
Within the scope of the study, after uncovering hidden structures and complex relationships using LDA topic modeling, the distribution of 12 identified topics over time, their volume over time (by year), their trend curves, their relative acceleration values, and the weighting ratios of topics in the large dataset were revealed using trend analysis.
In this study, trend analysis was conducted to examine the temporal trends of the themes identified through LDA (Latent Dirichlet Allocation) topic modeling. Throughout the analysis process, the volume, percentage share, and acceleration of the topics over time were systematically calculated in Excel. The twelve main topics identified using LDA were analyzed across nine five-year periods between 1980 and 2024 (1980–1984, 1985–1989, 1990–1994, 1995–1999, 2000–2004, 2005–2009, 2010–2014, 2015–2019, and 2020–2024). The absolute number of articles for each topic during these periods, their percentage share in total production, and changes in acceleration over time were calculated separately.
Graphs were created to assess the temporal distribution of topics:
Volumetric Change of Topics Over Time (number of productions for each period),
Percentage Change of Topics (share in total production),
Acceleration of Topics (percentage increase rates).
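The three measures listed above can be written compactly as follows. This is a sketch under stated assumptions: the study does not give its formulas, so the symbols $n_{t,p}$ (number of articles on topic $t$ in period $p$), $s_{t,p}$ (percentage share), and $g_{t,p}$ (percentage increase rate) are introduced here only for illustration.

$$
s_{t,p} = 100 \cdot \frac{n_{t,p}}{\sum_{t'=1}^{12} n_{t',p}},
\qquad
g_{t,p} = 100 \cdot \frac{n_{t,p} - n_{t,p-1}}{n_{t,p-1}} \quad (n_{t,p-1} > 0)
$$

The volume is simply $n_{t,p}$ itself. The growth rate is undefined when the previous period contains no articles on the topic.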
Findings and Comments of the Research
Within the scope of the study, bibliometric analysis, topic modeling analysis, and trend analysis were performed in sequence. Articles indexed on cinema were identified using the WOS and Scopus databases. The term “cinema” was used as a keyword in these databases, and after the necessary screening processes were performed, a total of 16,891 articles were identified. These articles were transferred to biblioshiny, an interface of the bibliometrix package in the R programming language, and the data were classified, defined, compared, and analyzed using biblioshiny.
As shown in Table 2, 16,891 articles were accessed from 4,051 sources within the dates specified in the study, and the data set was analyzed based on this number. According to the statistical information in Table 2, 18,476 authors have contributed to scientific production related to cinema. Of these authors, 8,682 contributed individually to scientific production, while 9,794 were involved in collaborative production. Parallel to this finding, the co-authorship rate of 1.56 and the international co-authorship rate of 4.938 in the articles indicate that collaboration in articles produced in the field of cinema is relatively low.
Basic Statistical Data on Cinema.
When looking at the growth rate related to cinema studies, it was found to be 12.8. This rate indicates that the subject is of high interest. When looking at the average age of articles published on the subject, it is seen to be 9.14. In this sense, we can say that the literature has gained momentum in recent years and that research on the subject has remained up-to-date. The annual citation rate per article is 4.504. This rate indicates the scientific value of articles published on cinema. Additionally, 18,476 authors contributing to cinema used 8,827 keywords and 37,157 keywords.
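One common way to express an annual growth rate of this kind is the compound annual growth rate. Whether the value of 12.8 reported above was computed exactly this way is an assumption; the symbols $N_{\text{first}}$ and $N_{\text{last}}$ (publications in the first and last year) and $Y$ (number of years covered) are introduced here for illustration.

$$
\mathrm{CAGR} = 100 \cdot \left[\left(\frac{N_{\text{last}}}{N_{\text{first}}}\right)^{1/(Y-1)} - 1\right]
$$

Read this way, the figure would mean that annual output grew by roughly 12.8% per year on average over the period studied.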
Table 3 shows the annual number of scientific publications related to cinema. In this context, it can be seen that scientific interest in the field is increasing steadily. The data show that the largest numerical increase occurred in 2023, while the largest proportional increase occurred in 2007 (49.6%). This significant increase in 2007 is closely related to the integration of digital technologies into cinema. In particular, the rise of social media platforms is one of the defining dynamics of this period. YouTube’s establishment in 2005, Twitter’s launch in 2006, and Facebook’s opening to the general public in 2006 laid the groundwork for the construction of digital culture in various forms. These developments have brought to the forefront topics such as new media narratives, user-based content production, digital aesthetics, interactive viewer experience, and online distribution strategies in cinema studies, which has led to a noticeable increase in the number of academic publications as of 2007.
Scientific Production Related to Cinema.
Note. “*” was used to highlight the significant increase in the number of publications in 2024.
Among the main reasons why 2023 is the most productive year for scientific output in cinema studies are the publication of academic studies that were postponed during the pandemic, the emergence of new research topics in the field of cinema due to artificial intelligence technologies, and the transformation of content production and distribution practices on a global scale by digital publishing platforms.
Table 4 shows the annual citation values of articles related to cinema. According to this, although the average number of citations per article is relatively low, it appears to be above average between 1990 and 2006. The highest number of citations was reached in 1997 (n = 21.57), while the lowest number of citations was observed in 2024. When looking at the annual citation rates of the articles, 1997 stands out again. The highest citation rate was reached in 1997 (n = 0.74), while the lowest citation rate was observed in 1987 (0.04). When examining the citation data for 2023, the year with the highest number of articles produced, it is seen that it lags behind the years 1990 to 2006 in terms of both numerical and proportional values. When Tables 3 and 4 are examined together, it is seen that there is an increase in production in terms of numbers. However, this increase in scientific production is not receiving sufficient citations. It would not be wrong to interpret this situation as a quantitative increase that does not bring about a qualitative increase. In particular, the increasing expectations of academics regarding the number of publications by universities and the increase in the number of open access journals are the most important reasons for the quantitative increase in scientific production. This quantitative increase does not bring about methodological depth or theoretical consistency.
Annual Citation Number and Rates of Cinema-Related Articles.
Note. “*” indicates a statistical or noteworthy increase.
Universities’ focus on the number of publications in their academic performance criteria has led to a quantitative increase in scientific output. The growing number of academics specializing in cinema has contributed to this increase, and the number of academic journals has grown in parallel. In this process, where the distinction between high-quality and low-quality publications has disappeared, it is observed that many academic journals outside the major indexing databases approach science in an industrial manner (Larivière et al., 2015).
According to Table 5, Alexander FEDEROV ranks first in terms of scientific contribution with 30 articles on the subject. In the above table, which lists the top 10 authors, it was found that the number of articles produced by the authors is very close to each other. The Articles Fractionalized value indicates the individual contribution percentage in multi-authored articles where authorship contributions are shared. According to this metric, it is seen that all authors are involved in joint projects. For example, Levitskaya’s fractionalized ratio is only 8.80 for 21 publications, indicating that the author is in a secondary or supporting position in most of her publications.
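The fractionalized count described above can be illustrated with a minimal sketch. It assumes the usual convention that each article contributes 1/n to each of its n authors; the function name `fractionalized_counts` and the author lists are hypothetical and are not taken from the study's data set.

```python
from collections import defaultdict


def fractionalized_counts(papers):
    """Return {author: (article_count, fractionalized_count)}.

    Each paper adds 1 to every listed author's article count and
    1 / (number of authors) to that author's fractionalized count.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for authors in papers:
        share = 1.0 / len(authors)
        for name in authors:
            totals[name][0] += 1
            totals[name][1] += share
    return {name: (n, frac) for name, (n, frac) in totals.items()}


# Hypothetical author lists, one list per paper (illustration only).
papers = [
    ["Author A"],
    ["Author A", "Author B"],
    ["Author A", "Author B", "Author C"],
    ["Author B", "Author C"],
]

counts = fractionalized_counts(papers)
for name, (n, frac) in sorted(counts.items()):
    print(f"{name}: {n} articles, fractionalized {frac:.2f}")
```

Under this convention the fractionalized values of all authors add up to the number of papers. A value of 8.80 across 21 publications corresponds to about 0.42 of an article per paper, which is what one would expect if most of those papers had two or three authors.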
Authors Producing the Most Articles on Cinema.
According to Figure 3, Alexander FEDEROV, who has published the most on cinema, began contributing to the field in 2012 and wrote his most recent article in 2024. His most productive year was 2017. When examining Figure 3, the first publication among the 10 authors who have made the most scientific contributions related to cinema was made by Keyan TOMASELLI in 1992. Another notable point is that Beti ELLERSON is the author who published the most articles in a single year compared to other authors. ELLERSON wrote 16 of her 23 articles in 2024 alone. This indicates that ELLERSON collaborated more with other authors than others.

Authors’ production over time.
Table 6 shows the number of articles and author collaborations for the 10 countries with the highest number of publications related to cinema. According to this, the country with the highest number of publications on cinema is the United States (f = 3,255). The United Kingdom follows (f = 2,366), while Germany is the country with the fewest articles among these 10 countries. The MCP (Multiple Country Publications) value in the table indicates that authors from more than one country collaborated on a publication, while the SCP (Single Country Publications) value indicates that authors from only one country were involved in a publication. The sum of a country’s SCP and MCP values indicates the total number of publications produced in that country. In this context, it is evident that the United Kingdom is the country that places the greatest emphasis on international collaboration in cinema-related articles. The countries with the lowest levels of international collaboration are India and Italy, as indicated in the table.
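The SCP and MCP distinction described above can be sketched as a simple tally. This is an illustration under stated assumptions: each publication is credited to the corresponding author's country, the function name `scp_mcp_by_country` is hypothetical, and the records are invented for the example rather than drawn from Table 6.

```python
from collections import Counter


def scp_mcp_by_country(records):
    """Tally single- and multiple-country publications per country.

    records: iterable of (corresponding_country, [countries of all authors]).
    A publication is MCP if its authors come from more than one country,
    otherwise SCP. It is credited to the corresponding author's country.
    """
    scp, mcp = Counter(), Counter()
    for corresponding, countries in records:
        if len(set(countries)) > 1:
            mcp[corresponding] += 1
        else:
            scp[corresponding] += 1
    result = {}
    for country in set(scp) | set(mcp):
        total = scp[country] + mcp[country]
        result[country] = {
            "SCP": scp[country],
            "MCP": mcp[country],
            "total": total,
            "MCP_ratio": mcp[country] / total,
        }
    return result


# Hypothetical records (illustration only).
records = [
    ("United States", ["United States"]),
    ("United States", ["United States", "United States"]),
    ("United States", ["United States", "United Kingdom"]),
    ("United Kingdom", ["United Kingdom", "India"]),
    ("United Kingdom", ["United Kingdom"]),
    ("India", ["India"]),
]

result = scp_mcp_by_country(records)
for country, row in sorted(result.items()):
    print(country, row)
```

The MCP ratio (MCP divided by SCP plus MCP) is the quantity that allows countries with very different output volumes to be compared on international collaboration.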
Countries of Affiliation of Corresponding Authors and Number of Articles.
Note. “**” SCP: Single Country Publication, “***” MCP: Multiple Country Publication.
Table 7 shows the 10 most relevant institutions and the number of articles on the subject. According to this, King’s College London, University of London, University of Warwick, and University of Cambridge in the United Kingdom have made the most scientific contributions to the field with a total of 694 articles. When examining the institutions and the number of articles, it is seen that institutions in Australia and Canada also made a significant contribution after the United Kingdom. The University of Melbourne and the University of Sydney (Australia) and the University of Toronto (Canada) produced a total of 407 articles. In addition to the United Kingdom, Australia, and Canada, institutions in Spain, the Netherlands, and Romania were also among those that contributed the most to the field.
Academic Institutions and Number of Articles on Cinema.
When examining Table 8, we see that “Studies in European Cinema” ranks first (f = 258) among the 10 journals that publish the most articles related to cinema. This journal, which has an impact factor of 0.2, is listed in the ESCI index. The journal “New Review of Film and Television Studies,” which ranks second with 208 articles, is a UK-based journal with an impact factor of 0.3 and is listed in the AHCI index. The journal “Studies in French Cinema” ranks third with 204 articles. This South African journal has an impact factor of 0.2 and is indexed in ESCI. The journal that publishes the most articles on cinema, but ranks last in the list, is the UK-based “Studies in European Cinema.” This journal has an impact factor of 0.3 and is indexed in ESCI. When the indexes of the journals listed in Table 8 are examined in detail, it is found that nine are indexed in ESCI and one in AHCI.
Journals Publishing the Most Articles on Cinema.
Number of Topics and Headings Revealed as a Result of Text Mining.
The most frequently used words in bibliometric analysis and topic modeling analysis related to cinema were also included in the study. The results obtained are presented in the word cloud map in Figure 4. According to this, the three most frequently used words other than cinema are “film, gender, documentary.”

Word cloud of the hundred most frequently used keywords in cinema related articles.
Topic modeling was the other analysis used in the study. It was used to determine the topics that stand out in the articles and the trends of these topics over time. When studies published in the field of cinema between 1980 and 2024 were analyzed using text mining methods, the Latent Dirichlet Allocation (LDA) model showed that the contents of these publications were concentrated in 12 main themes.
Table 9 shows the 12 topic headings that emerged from the word-text relationship (LDA) and on which the authors agreed, along with the percentage of the data set they cover. In creating these topics, the titles, abstracts, and keywords of the articles grouped under each topic were examined in detail by the researchers.
When examining the scope of the topics, certain topics appear to be marginal. For example, while the theme “Political Representations in South Asian and African Cinema: Film Festivals, Censorship, and Cultural Identity” has a dominant presence at 20.50%, the theme “The intersection of cinema and digital media, the impact of new technologies on film art, and the transformation of cinematic space” has a marginal presence at 1.60%. These results show that academic orientations and research interests within the field are concentrated on certain themes. In addition, themes with a marginal presence represent relatively new and developing areas such as digital media and metacinema. The low weight of such themes suggests that these areas have not yet been sufficiently explored academically or are more difficult to study methodologically. Therefore, this imbalance is directly related to the research maturity and orientations of the discipline rather than a deficiency.
According to the headings created during the topic modeling process and the topic weight ratios revealed, the topic “Political Representations in South Asian and African Cinema: Film Festivals, Censorship, and Cultural Identity” was found to have the highest weight. In addition, the topics “The Use of Networks and Control Mechanisms in Cinema” and “Pasolini’s Ethnographic and Anthropological Narratives in Global Cinema” also stand out from other topics with high weight ratios. The combined weight of these three topics is 51.24%. As such, it can be seen that more than half of the studies related to cinema focus on a national and narrow group. Additionally, it is evident from the emerging topics that the concept of culture is also associated with cinema. These three topics, which have significantly higher weight ratios compared to other topic headings, also highlight the relationship between cinema and technology. The LDA values for these emerging topics are detailed in Figure 5 below.

Topic and Keyword Distribution According to LDA Values.
The LDA word-text analysis figures are shown above. When examining the figures, the dominant keywords marked in red used in naming the 12 topics can be seen. These dominant keywords that stand out in word-text analysis demonstrate that cinema studies have a multi-layered thematic structure. This multifaceted structure reveals that cinema studies have evolved from classical formal analyses to post-structuralist, cultural studies-focused, and digital media-based approaches. Concept clusters such as “gender,” “digital,” “algorithm,” “colonial,” “festival,” “philosophy,” “body,” “covid,” and “ethnograph,” particularly visible in LDA graphs, demonstrate how closely intertwined cinema studies are with social, political, technological, and epistemological transformations.
Within the scope of the study, articles published between 1980 and 2024 were divided into nine periods and a topic modeling analysis was performed. Table 10 provides an overview of the distribution of the 12 topic headings across these nine periods. According to this, the number of articles under most topic headings has increased over time, although not uniformly. The period with the lowest number of articles is 1980 to 1984, while the period with the highest number of articles is 2020 to 2024. The topic with the highest number of articles is “Political Representations in South Asian and African Cinema: Film Festivals, Censorship, and Cultural Identity,” while the topic with the lowest number of articles is “Memory and Metacinema in the Italian and Korean Film Industries in Post-War Cinema.” This indicates that “Political Representations in South Asian and African Cinema: Film Festivals, Censorship, and Cultural Identity” is the most comprehensively researched topic. The topic with the least interest, “Memory and Metacinema in the Italian and Korean Film Industries in Post-War Cinema,” attracted little interest until 2010, but an increase in interest was observed after 2010. According to Table 10, the popularity of topics 3, 11, and 12 shows a more noticeable increase compared to other topics. The volumetric changes and trends over time in topics consisting of articles published on cinema are detailed in Figure 6 below.
Distribution of Subjects by Period.
| Topics | 1980–1984 | 1985–1989 | 1990–1994 | 1995–1999 | 2000–2004 | 2005–2009 | 2010–2014 | 2015–2019 | 2020–2024 | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| Postcolonial Influences in Soviet, Russian, Japanese and Turkish Black and White Films | 1 | 6 | 1 | 4 | 9 | 35 | 108 | 116 | 119 | 399 |
| Cultural Codes and Gender Politics in Asian Cinema | 1 | 0 | 3 | 2 | 14 | 60 | 126 | 198 | 258 | 662 |
| Network Use and Control Mechanisms in Cinema | 2 | 1 | 10 | 28 | 102 | 333 | 741 | 961 | 1099 | 3277 |
| Cinema Aesthetics and Environmental Consciousness in Andrei Tarkovsky’s Cinema | 5 | 14 | 18 | 30 | 63 | 138 | 194 | 232 | 196 | 890 |
| The Impact of the American Film Industry on National and Global Culture | 4 | 2 | 6 | 42 | 25 | 7 | 24 | 21 | 144 | 275 |
| The Reality of Body and Time in Cinema-Human Relationship | 1 | 2 | 1 | 4 | 12 | 59 | 140 | 190 | 317 | 726 |
| Research and Cultural Trends in Cinema: The Use of Social Media in Film Studies | 19 | 20 | 18 | 43 | 66 | 132 | 236 | 363 | 623 | 1520 |
| The Intersection of Cinema and Digital Media, the Impact of New Technologies on Film Art and the Transformation of Cinematic Space | 1 | 1 | 2 | 2 | 5 | 2 | 22 | 27 | 51 | 113 |
| Aesthetic Searches and Film Criticism in Cinema History | 3 | 6 | 11 | 35 | 75 | 166 | 417 | 576 | 833 | 2122 |
| Memory and Metacinema in the Italian and Korean Film Industry in Postwar Cinema | 0 | 1 | 1 | 4 | 2 | 6 | 19 | 16 | 29 | 78 |
| Political Representations in South Asian and African Cinema: Film Festivals, Censorship and Cultural Identity | 6 | 14 | 21 | 49 | 117 | 384 | 840 | 1055 | 1401 | 3887 |
| Ethnographic and Anthropological Narratives of Pasolini in Global Cinema | 2 | 6 | 9 | 14 | 62 | 242 | 629 | 877 | 1101 | 2942 |
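The counts in the table above are sufficient to recompute the volume, percentage share, and percentage increase rate of each topic. The following is a minimal sketch, not the authors' procedure (the study reports performing these calculations in Excel). The function names are hypothetical, topics are numbered in table row order, and the growth formula (change relative to the previous period) is an assumption.

```python
PERIODS = ["1980-1984", "1985-1989", "1990-1994", "1995-1999", "2000-2004",
           "2005-2009", "2010-2014", "2015-2019", "2020-2024"]

# Article counts per topic and period, copied from the table (row order).
COUNTS = {
    1: [1, 6, 1, 4, 9, 35, 108, 116, 119],
    2: [1, 0, 3, 2, 14, 60, 126, 198, 258],
    3: [2, 1, 10, 28, 102, 333, 741, 961, 1099],
    4: [5, 14, 18, 30, 63, 138, 194, 232, 196],
    5: [4, 2, 6, 42, 25, 7, 24, 21, 144],
    6: [1, 2, 1, 4, 12, 59, 140, 190, 317],
    7: [19, 20, 18, 43, 66, 132, 236, 363, 623],
    8: [1, 1, 2, 2, 5, 2, 22, 27, 51],
    9: [3, 6, 11, 35, 75, 166, 417, 576, 833],
    10: [0, 1, 1, 4, 2, 6, 19, 16, 29],
    11: [6, 14, 21, 49, 117, 384, 840, 1055, 1401],
    12: [2, 6, 9, 14, 62, 242, 629, 877, 1101],
}


def period_totals(counts):
    """Total number of articles in each period, summed over all topics."""
    return [sum(column) for column in zip(*counts.values())]


def shares(series, totals):
    """Percentage share of one topic in each period's total production."""
    return [100.0 * n / t for n, t in zip(series, totals)]


def growth_rates(series):
    """Percentage change from each period to the next (None if undefined)."""
    rates = []
    for previous, current in zip(series, series[1:]):
        if previous == 0:
            rates.append(None)  # growth from zero articles is undefined
        else:
            rates.append(100.0 * (current - previous) / previous)
    return rates


totals = period_totals(COUNTS)
grand_total = sum(totals)
print("Articles in all periods:", grand_total)

for topic in (3, 11, 12):
    share_last = shares(COUNTS[topic], totals)[-1]
    growth_last = growth_rates(COUNTS[topic])[-1]
    print(f"Topic {topic}: total {sum(COUNTS[topic])}, "
          f"share in {PERIODS[-1]} {share_last:.1f}%, "
          f"growth vs previous period {growth_last:.1f}%")
```

Summing every cell gives 16,891, which matches the size of the data set reported in the study and serves as a quick consistency check on the table.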



