Abstract
With the urgent call for sustainability and the rapid advancement of information technology, especially machine learning and artificial intelligence, the food industry faces unparalleled opportunities and challenges in optimizing resources and enhancing performance. In response, we foresee a major change, especially in the way research has traditionally been conducted: we hypothesize a surge of research and increasing adoption of innovative tools, such as deep learning (DL) and optimization techniques, by the research community and the industry to address critical challenges in food systems. This study investigates the intersection of the commercial, sustainability, and technology domains through advanced bibliometric analysis enhanced by deep document clustering. It aims to uncover the multifaceted nature of these interconnected fields, to address how emerging technologies and sustainability practices shape commercial strategies, and to give readers a panoramic perspective on the dynamic development of the field. Using a multi-tiered literature search strategy and deep document clustering techniques, the analysis reveals distinct thematic clusters, ranging from macroeconomic demand challenges to technical advancements in AI-driven agricultural practices and sustainable supply chain management, that underscore the deep connections among the commercial, sustainability, and technological domains. The results not only validate the hypothesis of increasing convergence among these fields but also highlight the critical role of advanced methodologies in uncovering latent patterns and facilitating strategic and operational improvements. Moreover, the findings demonstrate that deep document clustering is a powerful tool for bibliometric analysis, offering nuanced insights that can be further refined in future studies.
Overall, this research contributes to a holistic understanding of the field, emphasizing the need for continued interdisciplinary collaboration and innovation to drive sustainable development in a global business environment.
Introduction
This paper aims to comprehensively map the evolving landscape of food systems, 1 specifically highlighting the impacts of various external pressures. We attempt to achieve this by leveraging advanced bibliometric techniques and a multi-tiered literature search strategy2,3 to uncover key trends and relationships.
The surging call for sustainability and ESG
The intersection of food systems, sustainability, and technological innovation has long been a critical focus for both researchers and practitioners. Food systems encompass the entire spectrum of activities involved in producing, processing, distributing, and consuming food, thereby influencing environmental, economic, and social domains. Historically, the study of food systems centered on production efficiency, as exemplified by the Green Revolution, which introduced high-yield crops and advanced agricultural techniques to meet the demands of growing populations, 4 often at the expense of environmental and social sustainability.5–7 However, as global challenges such as climate change, resource depletion, and population growth emerged, the concept of sustainable food systems gained prominence. The foundational principles outlined in the 1987 Brundtland Report emphasized the need to meet current demands without compromising future generations, 8 spurring research into life cycle assessments, circular economies, and ESG principles to mitigate the negative impacts of food systems.9,10 For example, these initiatives have led to the establishment of programs and guidelines for life cycle assessment as a method for evaluating environmental impact in the fast-growing food processing field. 11 The circular economy has also gained momentum, growing from a proposed solution into feasibility and integration studies across the food system, covering academic, supply chain, economic, production, and business model perspectives.12–16 However, the urgency of climate change has made the call for sustainability within food systems evolve from general encouragement into a matter of serious regulatory concern.
Initially framed within broad discussions of environmental responsibility and voluntary corporate social responsibility (CSR), sustainability is now increasingly embedded in food policy frameworks and legal mandates across various nations, with enforcement expected in the future. For instance, the European Union’s Farm to Fork Strategy mandates sustainable practices from production to consumption, including the EU Code of Conduct for responsible business and marketing practices and the related corporate governance framework and circular business models, 17 while the United States introduced the Food Loss and Waste 2030 Champions initiative to engage industries in measurable waste reduction. 18 Similarly, countries such as China, Germany, France, and Japan, as well as developing countries, have imposed carbon accounting or food waste reporting requirements on large food companies.19,20 These regulatory shifts signal a growing governmental commitment to enforceable sustainability in food systems.
The rise of new information processing technology
Another significant change comes from the rise of new technology, especially artificial intelligence (AI), machine learning (ML), and optimization techniques, which represents a fundamental paradigm shift. These technologies are not merely tools but transformative infrastructures that permeate virtually every layer of the economic system. Their integration into the food system is particularly noteworthy because it enables sophisticated modeling and decision-making processes. Deep learning, for example, has been instrumental in improving food quality control, production and supply chain optimization, and precision agriculture, while both constrained and unconstrained optimization algorithms have enhanced resource allocation and waste reduction.21–26
Impact on food production
Under these rapid, high-impact external pressures, the operational priorities of food systems have shifted dramatically. Modern food industries now integrate concerns for sustainability and ESG into their traditional production and efficiency models, resulting in a diversified, cross-disciplinary approach. This dual pressure, from sustainability demands and technological disruption, marks a structural inflection point in the evolution of food systems. The convergence of commercial food systems, sustainability, and technological innovation has thus created fertile ground for interdisciplinary research. Modern food system operations are no longer confined to conventional production-centric metrics; they are now deeply intertwined with these external factors, drawing upon diverse domain knowledge from engineering, economics, environmental science, and data analytics. A deeper scholarly engagement is urgently needed to understand how these forces interact, conflict, or synergize to shape the future of global food production, distribution, and consumption. To our knowledge, no such report is yet available, which motivates this paper.
Purpose of the study
Using a multi-tiered search strategy, this study systematically examines the degree of intersection among these fields by analyzing research productivity, thematic developments, and emerging trends. We hypothesize that the urgent need for sustainability, combined with rapid advances in information technology, will drive the application of transformative technological solutions to optimize resource utilization and overall performance within food systems. We carried out a bibliometric analysis with modern bibliometric tools to test this hypothesis. Beyond the conventional bibliometric procedure, we also incorporated the recently developed deep language model representation for document clustering with BERT and SBERT 27 into the analysis, in order to further uncover emerging trends and latent themes, and we compared its results with those of conventional methods. Overall, our research offers a comprehensive, panoramic perspective on the current research landscape and provides actionable insights for future innovation and interdisciplinary collaboration.
Bibliometric analysis
Bibliometric analysis emerged in the 1960s as a powerful, data-driven approach to mapping scientific knowledge, uncovering research trends, and evaluating academic influence. As defined by Pritchard, 28 bibliometrics refers to the application of quantitative analysis and statistics to publications such as journal articles and their accompanying citation data. Unlike narrative or meta-analytical reviews, bibliometrics offers objectivity and reproducibility by leveraging structured databases (e.g., Scopus, Web of Science) to uncover co-authorship, co-citation, and keyword co-occurrence patterns across large datasets. These patterns are derived through correlation-based network analysis of structured metadata such as author names, references, and keywords, which form the foundation of bibliometric mapping and allow the visualization of relationships and thematic structures within a research domain.29–32 Its advantages include transparency, scalability, and the ability to visualize complex intellectual structures and knowledge evolution. Its main shortcoming is that the outcomes are purely statistical relationships derived from metadata, so the semantic context of the documents is missing.
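To make the idea of keyword co-occurrence concrete, the following minimal sketch counts how often pairs of keywords appear in the same document. The three records are hypothetical and are not taken from the dataset analyzed in this study; the function name `cooccurrence_counts` is likewise introduced only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: each set holds the keywords of one document.
records = [
    {"food supply", "machine learning", "crops"},
    {"food supply", "machine learning", "remote sensing"},
    {"food supply", "sustainable development"},
]


def cooccurrence_counts(keyword_sets):
    """Count how many documents mention each unordered keyword pair."""
    counts = Counter()
    for keywords in keyword_sets:
        # sorted() gives every pair a canonical order, so (a, b) and (b, a)
        # are counted as the same pair.
        for pair in combinations(sorted(keywords), 2):
            counts[pair] += 1
    return counts


pair_counts = cooccurrence_counts(records)
top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)
```

The resulting pair counts are the edge weights of a co-occurrence network. Because they are computed from metadata alone, two documents that discuss the same idea in different words contribute nothing to each other's counts, which is the limitation noted above.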
Current tools: Bibliometrix and biblioshiny
One of the most widely adopted tools in modern bibliometric research is Bibliometrix, an R-based open-source software package. 33 It facilitates various analyses, including performance metrics, science mapping, and thematic evolution, through an accessible script-based interface. To make this functionality more approachable, Biblioshiny was introduced as its interactive web interface, enabling non-technical users to perform robust bibliometric analyses with minimal coding. These tools have gained strong adoption in management, sustainability, and engineering fields for their ease of use and analytical depth.34,35 While these tools digitalize traditional bibliometric analysis, they inherently carry over its existing limitations.
Natural language processing (NLP) and its incorporation into bibliometric analysis
Natural Language Processing (NLP) is a subfield of artificial intelligence (AI) and linguistics focused on enabling machines to understand, interpret, generate, and respond to human language in a meaningful way. It integrates computer science, linguistics, and machine learning to bridge the gap between human communication and machine comprehension.36,37 While traditional bibliometric techniques focus on citation-based metadata, NLP enables analysis at the full-text or abstract level, offering deeper semantic insights. NLP techniques such as topic modeling (e.g., LDA), named entity recognition, and sentiment analysis allow researchers to detect nuanced patterns within the content itself—beyond author keywords or citation links. This synergy of bibliometrics and NLP helps overcome limitations like synonymy, polysemy, and outdated keyword taxonomies in classical bibliometrics. Applications include automatic classification of documents, content-based trend analysis, and identification of research gaps, yielding new insights from many fields.38–49
BERT and SBERT: Transforming text mining in bibliometrics
The introduction of transformer-based models like BERT (Bidirectional Encoder Representations from Transformers) has revolutionized NLP by enabling deep semantic understanding through contextual embeddings. BERT is pre-trained on massive amounts of unlabeled text using tasks such as “Masked Language Modeling” (predicting masked words) and “Next Sentence Prediction,” which allow it to learn deep contextualized representations of words and sentences. It is then fine-tuned on specific tasks, excelling in areas such as question answering, named entity recognition, and document classification. 50 Unlike conventional NLP methods that rely on rule-based systems or shallow statistical models, BERT uses transformer-based deep neural networks to capture the full context of a word based on its surroundings, both left and right. This contrasts with earlier models like Word2Vec or GloVe, which generate static embeddings and represent a word identically regardless of context.50–52
BERT, while excellent for understanding word context, is not directly optimized for sentence similarity. SBERT (Sentence-BERT) is a modification of the pre-trained BERT model (and similar Transformer models), specifically designed for generating sentence embeddings that accurately reflect semantic similarity, making it more suitable for clustering tasks. SBERT achieves this by fine-tuning BERT using a siamese or triplet network architecture, enabling it to produce embeddings where the distance between them corresponds to semantic similarity, making it a more direct and efficient approach.53,54 The introduction of models like BERT and SBERT has massively improved document clustering and semantic analysis, leading to significantly better quality and handling capabilities. These advancements allow us to understand and organize large text datasets with unprecedented accuracy and efficiency.53,55–61
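The notion that distance between embeddings corresponds to semantic similarity can be illustrated with cosine similarity, a common measure for comparing sentence embeddings. The sketch below is a minimal illustration under stated assumptions: the three-dimensional vectors are hypothetical stand-ins (real SBERT embeddings have hundreds of dimensions and would come from a trained model), and the document labels are invented.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy "embeddings"; not produced by any real model.
embeddings = {
    "doc_a": [1.0, 0.0, 1.0],
    "doc_b": [2.0, 0.0, 2.0],  # same direction as doc_a
    "doc_c": [0.0, 1.0, 0.0],  # orthogonal to doc_a
}

sim_ab = cosine_similarity(embeddings["doc_a"], embeddings["doc_b"])
sim_ac = cosine_similarity(embeddings["doc_a"], embeddings["doc_c"])
print(f"a vs b: {sim_ab:.2f}, a vs c: {sim_ac:.2f}")
```

Documents whose vectors point in the same direction score close to 1 and would tend to fall in the same cluster, while orthogonal vectors score 0. A clustering algorithm applied to such embeddings groups documents by content rather than by shared keywords or citations.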
With the approaches and scope defined above, the study is guided by the following Research Objectives (ROs).
In alignment with these objectives, the study addresses the following Research Questions (RQs).
To answer these questions, the paper is structured as follows: Section 2 explains the methodology and design of the research. The results are then presented in Section 3. Section 4 is a discussion of the results. Section 5 concludes with the main findings, final remarks, future research, and limitations.
This study makes a major contribution by capturing the landscape of today’s food systems, particularly how sustainability challenges intersect with rapid technological growth, through a multi-tiered literature search strategy.2,3 Unlike traditional literature reviews, the strategy enables us to quantify publications that are normally masked by a traditional single-objective approach. Prior studies have focused on individual objectives, such as high-level conceptual or managerial perspectives (for example, green supply chain management frameworks62,63), LCA of the environmental sustainability of food processing, 11 or the socio-political context.64,65 The majority address a particular application of new technology to the system, such as IoT, deep learning, and blockchain in SCM and planning,24,66,67 the application of machine learning to LCA, 68 and agricultural technology adoption.69,70 In contrast, this research takes a more balanced yet distinctive interdisciplinary perspective, with a broader keyword range covering popular and key concepts that represent each context well, to uncover patterns across commercial, technological, and sustainability-driven research. By answering our research questions, we create a complete, multi-faceted sketch of this segment.
Applying deep document clustering to bibliometric analysis and comparing the findings with those from conventional analytical approaches reveals that how data is processed can significantly change the story it tells, showing both the capabilities and the drawbacks of different methods.
Overall, our work improves upon existing knowledge by thoroughly examining the less-studied conditions within the sustainability, technology, and food systems nexus, thereby connecting previously separate research efforts.
Methodology
Design scope for the study
This study focuses on three contexts of the food system: the commercial (or economic) context, the sustainability context, and the technological context. A multi-tiered literature search strategy2,3 is employed, focusing on specific key terms within each area. The terms were brainstormed and filtered internally by team members, then cross-checked against the number of Scopus publications retrieved by each term alone (data not shown). The terms were selected to meet the following criteria:
1. Each term must be general enough, yet specific enough, to represent the relevance of a document's content to the context definition.
2. The selected terms should collectively capture as much of the defined context as possible.
3. Terms should be general and plain enough that a typical author is highly likely to choose them to represent the concept of the context.
• The Commercial Context: This encompasses the demand and supply of food in the food system. The terms shortlisted are “food supply,” “food product,” “food process,” “food system*”, and “food management.” This scope encompasses the entire food value chain, from upstream farming to midstream manufacturing and downstream distribution.
• The Sustainability Context: This focuses on sustainability concepts, including terms such as “sustain*”, “ESG,” “life cycle assessment,” “management,” “eco*”, and “circular.”
• The Technological Context: This focuses on rapidly emerging information technology in AI, ML, and optimization techniques. The terms shortlisted are “deep learning,” “unconstrained optimization,” “constrained optimization,” “natural language processing,” “artificial intelligence,” “large language model,” “objective optimization,” and “machine learning.”
The context, keywords, definitions and references of the keywords used.
Research design and configurations for multi-tiered literature search strategy
The primary database for this review is Scopus, chosen for its extensive coverage of peer-reviewed literature across multiple disciplines. To ensure comprehensive retrieval of relevant studies, Boolean operators (AND, OR) are used to construct multiple search strings that integrate the commercial, sustainability, and technological contexts, generating all relevant keyword combinations as a Cartesian product. The search is conducted within the title, abstract, and keywords fields, covering articles published up to 2023.
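The Cartesian-product construction can be sketched as follows. This is one possible reading of the procedure, not the exact strings used in the study: it assumes one term from each context is joined with AND and wrapped in the Scopus `TITLE-ABS-KEY` field code, and the helper name `build_queries` is introduced here for illustration only.

```python
from itertools import product

# Key terms for each context, as listed in the design scope.
commercial = ['"food supply"', '"food product"', '"food process"',
              '"food system*"', '"food management"']
sustainability = ['"sustain*"', '"ESG"', '"life cycle assessment"',
                  '"management"', '"eco*"', '"circular"']
technology = ['"deep learning"', '"unconstrained optimization"',
              '"constrained optimization"', '"natural language processing"',
              '"artificial intelligence"', '"large language model"',
              '"objective optimization"', '"machine learning"']


def build_queries(*contexts):
    """Return one AND-joined search string per combination of terms.

    Assumption: each string takes exactly one term from every context.
    """
    return [
        "TITLE-ABS-KEY(" + " AND ".join(terms) + ")"
        for terms in product(*contexts)
    ]


queries = build_queries(commercial, sustainability, technology)
print(len(queries))  # 5 * 6 * 8 combinations
print(queries[0])
```

Passing only two of the three lists to `build_queries` yields the pairwise combinations, which is one way the lower tiers of a multi-tiered search could be generated under the same assumption.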
Keywords for Context, Boolean Configurations for Combining Search string and search results.

Research Design and Scheme of analysis.
The output is carried forward for bibliometric analysis using Bibliometrix through Biblioshiny, its web-interface app, 33 and for deep language document clustering following the procedure described by Makan et al. 27
Findings
Publications productivity
The bibliometric analysis of publication trends from 1987 to 2023 reveals a significant growth in research output over time (Figure 2). Initially, the number of articles published was extremely low, with only a few sporadic publications between 1987 and 2007. A notable increase began in 2008, when 11 articles were published, followed by moderate but consistent growth in subsequent years. The research output experienced a sharp rise starting in 2017 (21 articles), followed by a rapid acceleration in 2020 (79 articles). The most significant growth occurred between 2021 and 2023, with the number of publications rising from 148 in 2021 to 276 in 2023, marking the peak of research activity in the analyzed period.
Annual distribution of publications from 1987 to 2023.
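As a worked illustration of the 2021 to 2023 growth, the short sketch below derives the total relative increase and the compound annual growth rate from the two counts quoted above. The derived rates are simple arithmetic on those counts and are not figures reported by the bibliometric tool.

```python
# Publication counts quoted in the text for 2021 and 2023.
pubs_2021 = 148
pubs_2023 = 276

# Total relative increase over the two-year span.
total_growth = (pubs_2023 - pubs_2021) / pubs_2021

# Compound annual growth rate over the two yearly intervals.
years = 2023 - 2021
cagr = (pubs_2023 / pubs_2021) ** (1 / years) - 1

print(f"total growth: {total_growth:.1%}")
print(f"compound annual growth: {cagr:.1%}")
```

Under this calculation, output grew by roughly 86% over the two years, or about 37% per year when compounded.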
Most Influential Sources
Most influential sources.
Publications by country
Publications by country.
Top cited original articles
Top cited original articles.
Govindan (2014) ranks second with 503 citations for research on sustainable supply chain logistics, specifically the two-echelon multiple-vehicle location–routing problem for perishable food distribution. Similarly, Validi (2014) contributed another high-impact study (225 citations) on sustainable food supply chain distribution using a multi-objective approach.
Sharma (2020) ranks third with 371 citations for a systematic literature review on machine learning applications in sustainable agriculture supply chains. Lee (2020) also contributed (201 citations) to the AI-driven plant disease characterization field, reinforcing the relevance of artificial intelligence in agriculture.
Klerkx (2020) (297 citations) discusses the impact of Agriculture 4.0, focusing on technological transitions in food systems. Saurabh (2021) (265 citations) explores blockchain applications in agri-food supply chains, highlighting its role in enhancing traceability and sustainability.
Wei (2010) (281 citations) applies machine learning techniques to predict seafloor biomass distributions. Bottero (2011) (192 citations) examines decision-making for wastewater treatment using the Analytic Hierarchy Process and Analytic Network Process.
Zhang (2008) (211 citations) presents a study on automatic keyword extraction using conditional random fields, reflecting an important contribution to natural language processing (NLP) applications.
Most prominent topics
The bibliometric analysis of top keywords by frequency of occurrence reveals the most prominent topics in sustainability, technological advancements, management systems, and commercial applications in the agricultural and food supply chain sectors (Figure 3, Table 6).
Word cloud from the analysis.
Top keywords by frequency of occurrence.
Sustainability Context: “Food supply” (404 occurrences) is the most frequently mentioned keyword, reflecting the significance of ensuring sustainable and efficient food distribution. Other high-frequency sustainability terms include “crops” (184), “agriculture” (160), “food security” (146), “sustainable development” (121), “climate change” (110), and “water supply” (78). These indicate a strong research focus on balancing agricultural productivity with environmental and social concerns. Topics such as “land use” (63), “water management” (63), “forestry” (52), and “environmental impact” (42) highlight broader ecological considerations.
Technological Context: “Machine learning” (284) and “deep learning” (187) are prominent keywords, showing the increasing role of AI-driven analytics in food and agriculture research. Other related terms such as “artificial intelligence” (166), “remote sensing” (136), “learning algorithms” (72), “decision trees” (66), and “neural networks” (43) emphasize the growing reliance on computational techniques for predictive analytics and automation. “Optimization” (72), “multi-objective optimization” (83), and “algorithm” (62) highlight the mathematical and engineering methodologies applied to agricultural and food production systems.
Commercial Context: “Food production” (70), “crop yield” (65), and “crop production” (54) emphasize research directed at improving agricultural efficiency and maximizing output. “Supply chains” (46), “supply chain management” (42), and “food processing” (40) illustrate the critical role of logistics and production in ensuring food sustainability.
It is noteworthy that the frequent appearance of “decision making” (111) and “decision support systems” (81) suggests that many studies focus on improving efficiency through structured analytical frameworks. “Learning systems” (118) and “forecasting” (116) are also recurrent terms, indicating a research interest in predictive modeling and adaptive management strategies. The findings highlight the prominent role of management systems and decision support systems.
Other notable keywords, including “China” (94 occurrences), appear frequently, suggesting a significant research contribution from Chinese institutions. Terms like “economic and social effects” (47), “economics” (40), and “human” (58) underscore the socio-economic dimensions of agricultural sustainability.
Thematic Map Network: The co-occurrence analysis reveals two major thematic clusters that capture the multifaceted research landscape at the intersection of food supply, sustainability, and technology (Figure 4, Figure 5).
Thematic clusters from thematic analysis.
Thematic Network from thematic analysis.

Cluster 1: Food Supply. This cluster is dominated by high-frequency keywords related to agriculture and technological applications within food systems, serving as the basic themes in the field (Figure 4, Figure 6). The term “food supply” appears 387 times and holds a high PageRank centrality (0.04398), indicating its pivotal role in the network. Other prominent keywords include “machine learning” (194 occurrences) and “deep learning” (129), which underscore the integration of advanced AI techniques into agricultural practices. Additional terms such as “crops” (182), “remote sensing” (98), “learning systems” (118), and “forecasting” (103) further highlight the focus on precision agriculture, predictive analytics, and resource management. Notably, high betweenness centrality scores for keywords such as “deep learning” (51.92) and “remote sensing” (47.90) reveal that these technologies serve as crucial bridges that connect various subfields within this cluster. Other terms like “cultivation,” “decision trees,” and “crop yield” emphasize operational aspects and process optimization, which are essential for enhancing food security and sustainability.
Cluster 2: Artificial Intelligence. This cluster centers on the broader application of AI in addressing sustainability challenges, acting as a niche theme at the time this study was carried out. High-frequency keywords include “artificial intelligence” (135 occurrences), accompanied by terms such as “agriculture” (114), “food security” (119), “sustainable development” (114), and “decision making” (98). These terms suggest that AI is increasingly applied to optimize resource allocation, support decision-making, and drive sustainable practices across food systems. The betweenness centrality values, particularly for “decision making” (114.48) and “sustainable development” (84.88), indicate that these concepts are key connectors linking various interdisciplinary topics. Moreover, methodological terms such as “multiobjective optimization” (77) and “decision support systems” (81) further emphasize the analytical approaches used to address complex, multi-criteria sustainability challenges.
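For readers unfamiliar with PageRank centrality, the sketch below shows how such a score can be computed by power iteration on a small keyword network. It is an illustrative simplification, not the computation performed by Biblioshiny: the four-node graph is hypothetical, edges are unweighted and undirected, and the damping factor of 0.85 is the conventional default rather than a value taken from this study.

```python
def pagerank(adjacency, damping=0.85, iterations=100):
    """Power-iteration PageRank on an undirected, unweighted graph.

    `adjacency` maps each node to the list of its neighbours; every node
    is assumed to have at least one neighbour.
    """
    nodes = list(adjacency)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {}
        for node in nodes:
            # Each neighbour splits its current rank equally among its links.
            incoming = sum(rank[nb] / len(adjacency[nb])
                           for nb in adjacency[node])
            new_rank[node] = (1.0 - damping) / n + damping * incoming
        rank = new_rank
    return rank


# Hypothetical co-occurrence network (symmetric adjacency lists).
graph = {
    "food supply": ["machine learning", "crops", "remote sensing"],
    "machine learning": ["food supply", "crops"],
    "crops": ["food supply", "machine learning"],
    "remote sensing": ["food supply"],
}

scores = pagerank(graph)
for keyword, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{keyword}: {score:.4f}")
```

In this toy network the best-connected keyword receives the highest score, which mirrors how a hub term such as “food supply” ranks highly in the thematic network. Betweenness centrality answers a different question, namely how often a keyword lies on the shortest paths between other keywords.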
Collaboration Network
The bibliometric analysis reveals key authors contributing to the research domain, classified into multiple clusters (Figure 6). Cluster 1 has the most dominant presence, with authors such as Wang X, Wang Y, and Li Y showing high betweenness and PageRank values, indicating their strong influence and connectivity in the research network. Cluster 2 and Cluster 3 contain additional notable contributors like Zhang W, Zhang X, and Sun Y, demonstrating moderate influence within their respective groups. Other clusters, such as Clusters 4, 5, and 6, highlight a more dispersed network of researchers, while Cluster 7 exhibits high PageRank scores, suggesting a focused but impactful presence. Clusters 9 and 10 include authors with unique characteristics, such as Azzaro-Pantel C and Aguilar-Lasserre AA, showing closeness values of 1, indicating their specialized but potentially impactful research contributions. Collaboration Network.
Clusters analyzed from deep document clustering technique
The document clustering analysis, conducted across the domains of commercial, sustainability, and technology, yielded six distinct thematic clusters, each characterized by unique underlying themes and focus areas (Figure 7, Table 7). The first cluster, designated as Label 0 (Global Systems and Demand Challenges), adopts a macroeconomic perspective that emphasizes global and regional demand, system models, and systemic challenges, addressing large-scale economic and logistical issues such as demand forecasting and resource distribution. In contrast, Label 1 (Methodological and Problem-Solving Approaches) is highly technical, focusing on problem-solving, optimization, and the application of fuzzy logic, and is distinguished by its emphasis on analytical tools and decision-making methodologies critical for navigating complex, uncertain environments in both commercial and technological contexts. The third cluster, Label 2 (Environmental Practices and Supply Chain Sustainability), centers on sustainability within supply chains by highlighting environmental practices, carbon footprint management, and supplier relationships, thereby reflecting the growing importance of integrating sustainability into procurement and supply chain operations. Similarly, Label 3 (Sustainable Logistics and Distribution) addresses operational challenges, particularly in logistics and supply chain solutions, with an emphasis on sustainable transportation and distribution systems. Label 4 (Precision Agriculture and Data-Driven Approaches) shifts the focus toward agriculture, emphasizing precision farming, data analytics, and sustainable land use, which are essential for improving agricultural productivity and addressing food security challenges.
Finally, Label 5 (Energy and Industrial Technology) explores industrial applications and energy systems, with a focus on energy efficiency, technological innovation, and decision-making in industrial settings, particularly emphasizing renewable energy and sustainable industrial practices. Clustering result from document clustering. Clusters analyzed from deep document clustering technique.
Discussion
The findings below provide sufficient information to answer RQ1.
Scientific Publications: The observed trend in publication output indicates a delayed but exponential growth in research activity within the field. The near absence of publications before 2008 suggests that either the topic was not yet a major research focus or the technological and methodological advancements required to support significant academic contributions had not yet been developed. The increase in publications from 2008 onward likely reflects growing interest in the field, driven by technological advancements, increased awareness of sustainability issues, or policy shifts encouraging research in related domains.
A major turning point occurred after 2017, when publication numbers grew sharply, indicating a phase of accelerated research and knowledge dissemination. At the 21st Conference of the Parties (COP21) in Paris in 2015, Parties to the UNFCCC reached a landmark agreement to combat climate change and to accelerate and intensify the actions and investments needed for a sustainable low-carbon future, which marked significant progress. The exponential rise that followed from 2020 onward suggests that recent global challenges, such as climate change, digital transformation, the outbreak of COVID-19, and the ensuing supply chain disruptions, may have stimulated research efforts. The surge in 2021-2023 could also be attributed to the increasing adoption of advanced methodologies, such as artificial intelligence, data analytics, and automation, especially around 2023, when OpenAI began to soar in popularity in the AI field. Overall, the results confirm the hypothesis for RQ1. However, it is worth noting that the 2023 peak of 276 publications is still low compared with hot topics, which can easily exceed 20,000 publications annually in the Scopus database alone (data not shown). This indicates that the segment remained underexplored at the time this study was carried out.
Most Influential Sources: The dominance of Remote Sensing as the most influential journal suggests a strong research focus on geospatial technologies, remote sensing applications, and environmental monitoring. The presence of Computers and Electronics in Agriculture in the second position reflects the increasing role of digitalization, artificial intelligence, and automation in modern agricultural practices.
The prevalence of Elsevier and MDPI as key publishers indicates the continuing and growing importance of open-access and high-impact publishing platforms in the field. Switzerland, as the home of MDPI, and the Netherlands, as the base of Elsevier, reinforce the European influence as a hub for scholarly research and publication, in contrast to the publications-by-country results, where research is led by China, the USA, and India.
Journals such as Journal of Cleaner Production and Science of the Total Environment emphasize sustainability and environmental science, demonstrating a shift towards integrating green technologies and sustainable practices across industries. The inclusion of multidisciplinary journals such as IEEE Access and Scientific Reports highlights the interdisciplinary nature of current research, bridging engineering, environmental science, and computational approaches.
Overall, the findings indicate a well-balanced distribution of research across remote sensing, environmental science, agriculture, and digital transformation, reflecting the growing integration of technology in addressing sustainability challenges, which aligns with the hypothesis.
Publications by Country: The dominance of China in research publications aligns with its increasing investment in technological advancements and scientific research. The USA, maintaining its position as a global research leader, follows closely, demonstrating a strong research ecosystem driven by top-tier universities and institutions. India’s ranking in third place highlights its growing presence in academic research, particularly in fields such as agriculture, environmental science, and sustainable development.
European nations, including the UK, Italy, and Germany, maintain a substantial contribution, reflecting their strong research infrastructure and international collaborations. Iran’s presence in the top 10 suggests significant advancements in scientific research, despite geopolitical and economic challenges.
The contributions from Malaysia, Mexico, and South Korea indicate emerging research efforts from developing economies, demonstrating a shift towards global knowledge production. Meanwhile, Middle Eastern countries such as Saudi Arabia and Iran reflect regional efforts to strengthen academic and scientific capabilities.
Overall, we found that while traditional research powerhouses such as China and the USA continue to lead, emerging economies are increasingly contributing to the global scientific landscape. The results also indicate that, while Western countries remain the leading hub for publications, China and India are leading many emerging countries in catching up.
The findings below provide sufficient information to answer RQ2.
Top Cited Original Articles: The analysis of the most cited articles provides valuable insights into research trends and the growing influence of artificial intelligence, sustainability, and supply chain optimization in modern scientific studies.
The top two most cited papers highlight the intersection of AI with agriculture and supply chain management. The success of Wang (2017) and Sharma (2020) underscores the increasing reliance on deep learning and machine learning for real-time decision-making and efficiency improvements in food production and logistics. AI-driven applications in plant disease detection (Wang, 2017; Lee, 2020) are particularly impactful in advancing precision agriculture and minimizing losses.
Govindan (2014) and Validi (2014) demonstrate the role of advanced logistics models in sustainable food distribution, particularly for perishable goods. This aligns with the rising global emphasis on reducing food waste and enhancing efficiency within supply chains. Blockchain technology (Saurabh, 2021) further adds to this discourse, offering decentralized solutions for improved traceability and transparency in agri-food systems.
The discussion by Klerkx (2020) on Agriculture 4.0 highlights the increasing complexity of managing diverse emerging technologies in food systems. His work stresses the need for strategic governance in transitioning towards more sustainable and responsible agricultural practices.
The contributions by Wei (2010) and Bottero (2011) indicate the importance of machine learning in ecological predictions and environmental decision-making. With increasing global concerns over resource management, these studies reinforce the role of AI-driven analytics in sustainability efforts.
The work by Zhang (2008) demonstrates early advancements in NLP for automatic keyword extraction, which is foundational in modern AI-driven data processing and information retrieval systems.
The highly cited articles collectively reflect the evolution of AI, supply chain optimization, and sustainability research. The increasing citation counts of studies in these areas indicate their growing relevance in tackling real-world challenges related to agriculture, food supply chains, and environmental management. Moving forward, integrating emerging technologies such as AI, blockchain, and decision-making frameworks will be crucial in shaping the future of sustainable food production and logistics. The results also reflect the diversity of research interests within these areas. The visible focus on agricultural studies implies that other research areas remain underexplored, with ample opportunity for further work.
Word Cloud: The keyword analysis provides valuable insights into the dominant research areas, showcasing the interplay between sustainability, technological innovations, decision-making frameworks, and commercial applications.
Sustainability as a core theme
The prominence of sustainability-related keywords highlights the ongoing global emphasis on food security, resource conservation, and climate change adaptation. The frequent occurrence of “food supply” (404) and “food security” (146) suggests that researchers are prioritizing ways to ensure stable and sustainable food distribution. Additionally, concerns over “water management” (63) and “land use” (63) indicate the necessity of balancing agricultural expansion with environmental preservation.
The rise of AI and automation in agriculture
The strong presence of machine learning (284), deep learning (187), and artificial intelligence (166) demonstrates the rapid adoption of computational techniques in agricultural research. AI-driven methodologies are being leveraged for tasks such as remote sensing (136), predictive modeling (learning algorithms: 72), and automated decision-making (decision trees: 66). The application of optimization techniques (72) further emphasizes the push for efficiency in food production and supply chain logistics.
The importance of decision support systems
The presence of keywords like “decision making” (111) and “decision support systems” (81) suggests that many studies focus on enhancing strategic planning and operational efficiency. Forecasting (116) is also a key topic, indicating that predictive models play a crucial role in managing uncertainties in agriculture and food supply chains.
Economic and commercial considerations
The frequent mention of “food production” (70), “crop yield” (65), and “supply chains” (46) suggests a strong commercial orientation in agricultural research. The focus on supply chain management (42) highlights the importance of logistics and efficiency in ensuring food availability. Additionally, keywords like “economic and social effects” (47) and “economics” (40) reflect broader considerations of financial sustainability and social impact.
Regional and contextual relevance
The high occurrence of “China” (94) suggests that a substantial portion of research originates from or focuses on China, reinforcing the country’s role as a leading contributor to agricultural and technological studies. The mention of “human” (58) and “humans” (43) suggests that studies are also exploring the human dimensions of sustainability, particularly in terms of labor, policy-making, and socio-economic impacts.
The bibliometric keyword analysis indicates that sustainability, technological advancements, and efficient decision-making frameworks are the dominant themes in agricultural and food supply chain research. The increasing role of AI and machine learning suggests a paradigm shift towards automation and predictive analytics, while the focus on supply chain management underscores the importance of logistics and efficiency in food production. Moving forward, research in this area is likely to continue integrating advanced technologies with sustainability strategies to address global food security challenges.
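As an illustration of how keyword counts of the kind reported above can be obtained, the following minimal sketch tallies how many records contain each keyword. It is not the procedure used in this study: the `records` list and the `keyword_frequencies` helper are hypothetical, and the toy keywords are placeholders rather than the Scopus data.

```python
from collections import Counter

# Hypothetical records: each inner list holds the index keywords of one paper.
records = [
    ["Food Supply", "machine learning", "food security"],
    ["food supply", "Deep Learning", "remote sensing"],
    ["Machine Learning", "food supply", "crop yield"],
]


def keyword_frequencies(records):
    """Count in how many records each normalised keyword appears."""
    counts = Counter()
    for keywords in records:
        # Lower-case and de-duplicate so a keyword counts once per paper.
        counts.update({kw.strip().lower() for kw in keywords})
    return counts


freq = keyword_frequencies(records)
for keyword, n in freq.most_common(3):
    print(f"{keyword}: {n}")
```

Note that this sketch only normalises case and whitespace. Without further merging of variants, singular and plural forms such as “human” and “humans” would be counted as separate keywords, as they are in the counts above.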
Co-occurrence Analysis: On the other hand, co‐occurrence analysis provides a robust understanding of how research in food systems, sustainability, and technology is structured into distinct yet interconnected thematic areas.
In Cluster 1, the prominence of terms like “food supply,” “machine learning,” and “deep learning” illustrates the critical role of technology in modern agriculture. These keywords reveal that recent research has increasingly focused on integrating AI techniques into agricultural operations to improve crop management, optimize resource use, and enhance overall food security. The high betweenness centrality of deep learning and remote sensing underscores their role as bridges, facilitating the flow of information between traditional agricultural practices and innovative technological methods. This cluster clearly reflects an operational shift, where traditional production metrics are now complemented by advanced, data-driven approaches to meet the evolving challenges of food security in a sustainable manner.
In Cluster 2, the focus shifts towards the strategic and systemic application of AI to sustainability challenges. Here, keywords such as “artificial intelligence,” “sustainable development,” and “decision making” illustrate how AI is not merely a technical tool but a strategic enabler that helps address complex, multidisciplinary issues. The integration of multiobjective optimization and decision support systems indicates that researchers are developing sophisticated frameworks to balance economic, environmental, and social objectives. This cluster highlights the transition from conventional production-centric approaches to more holistic, integrated strategies that are essential for sustainable development. The high connectivity of these terms suggests that AI-driven methodologies are crucial for designing systems that can navigate trade-offs and uncertainties inherent in sustainability challenges.
Interestingly, commercial aspects like “forecasting,” “food processing,” and “supply chain management” suggest a focus on optimizing food production and distribution. The presence of “multiobjective optimization” and “economic and social effects” indicates efforts to balance sustainability with economic viability, ensuring that food supply strategies remain cost-effective while minimizing environmental impact.
Overall, the findings from the co‐occurrence network analysis underscore the dual role of technology in food systems. On one hand, technological innovations—particularly in AI, machine learning, and remote sensing—are driving operational improvements in agriculture, as evidenced in Cluster 1. On the other hand, these technologies are being leveraged at a strategic level to address broader sustainability challenges, as shown in Cluster 2. This multifaceted approach reflects the evolving nature of research in this area, where interdisciplinary integration is key. The results not only validate the growing influence of AI in optimizing food production and resource management but also emphasize the need for continued collaboration across disciplines to further refine and implement these technologies for sustainable development.
These insights have significant implications for both academic research and practical applications. For researchers, the analysis provides a roadmap for exploring underdeveloped areas, such as advanced AI methodologies for sustainability optimization. For practitioners and policymakers, the findings highlight the potential of integrating data-driven approaches to improve decision-making and operational efficiency within food systems. This implies that future research should develop more comprehensive models that can better predict, manage, and optimize the complex interactions between technology and sustainability in the global food industry.
These sections provide a good sketch for RQ3.
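To make the notion of a keyword acting as a “bridge” concrete, the sketch below builds a small co-occurrence graph and computes unnormalised betweenness centrality with Brandes' algorithm. It is an illustrative assumption rather than the tool chain used here: the toy `records`, the function names, and the choice of an unweighted graph are all hypothetical.

```python
from collections import defaultdict, deque
from itertools import combinations


def cooccurrence_graph(records):
    """Link two keywords whenever they appear in the same record."""
    graph = defaultdict(set)
    for keywords in records:
        for a, b in combinations(sorted(set(keywords)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def betweenness(graph):
    """Unnormalised betweenness centrality (Brandes, unweighted, undirected)."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        preds = defaultdict(list)
        sigma = dict.fromkeys(graph, 0)
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes inward.
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected path was counted once from each endpoint.
    return {node: value / 2 for node, value in score.items()}


# Toy records forming a chain of keywords (placeholder data).
records = [
    ["food supply", "deep learning"],
    ["deep learning", "remote sensing"],
    ["remote sensing", "crop yield"],
]
centrality = betweenness(cooccurrence_graph(records))
```

In this toy chain the two interior keywords, “deep learning” and “remote sensing”, each score 2.0 while the end keywords score 0.0, which mirrors the bridging role described for Cluster 1.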
Collaboration Network Analysis: The results suggest that certain authors, particularly in Cluster 1, play a central role in shaping the research field, likely acting as key knowledge hubs. Their high betweenness values indicate their importance in connecting various research groups. Meanwhile, authors in Clusters 2 and 3 exhibit moderate influence, possibly contributing to subdomains within the broader research field. Clusters 4 to 7 show a more distributed network, with some authors having higher PageRank scores, signifying their citation importance despite lower betweenness values. This suggests a mix of emerging researchers and established authors with niche expertise. The presence of authors in Clusters 9 and 10 with unique closeness characteristics implies that their research may be highly specialized, potentially contributing significant insights within specific subfields rather than acting as intermediaries.
These findings indicate a well-structured research landscape with both dominant contributors and emerging researchers playing essential roles in advancing the field.
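The PageRank scores discussed above can be illustrated with a short power-iteration sketch on a toy co-authorship graph. This is a simplified assumption, not the computation performed in this study: the author names are placeholders, the graph is undirected and unweighted, and every node is assumed to have at least one co-author.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank on an undirected graph of adjacency sets.

    Assumes every node has at least one neighbour (no isolated authors).
    """
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbour shares its rank equally among its own links.
            incoming = sum(rank[m] / len(graph[m]) for m in graph[n])
            new_rank[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


# Hypothetical co-authorship graph: author_a has written with everyone else.
coauthors = {
    "author_a": {"author_b", "author_c", "author_d"},
    "author_b": {"author_a"},
    "author_c": {"author_a"},
    "author_d": {"author_a"},
}
ranks = pagerank(coauthors)
top_author = max(ranks, key=ranks.get)
```

Here `top_author` is the hub, `author_a`. In a real network, PageRank and betweenness can diverge, which is the pattern noted for Clusters 4 to 7.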
RQ4 aims to further uncover emerging trends and latent themes by applying deep document clustering techniques for advanced literature analysis. The clustering results reveal significant differences in scope, focus, and domain emphasis, underscoring the multidisciplinary nature of the intersection among commercial, sustainability, and technology domains. The macroeconomic orientation of Label 0 contrasts sharply with the technical and methodological focus of Label 1, while Labels 2 and 3 reflect a convergence of environmental and operational concerns in supply chain sustainability and logistics, respectively. Labels 4 and 5 further illustrate the importance of industry-specific applications, with Label 4 emphasizing the role of precision agriculture and data-driven approaches in enhancing food production and security, and Label 5 highlighting the critical function of energy-efficient and innovative technologies in industrial settings. These findings underscore the necessity for tailored, interdisciplinary approaches that integrate sustainability into commercial and technological practices in order to capture the entire landscape. They also suggest that future research should address the unique challenges and opportunities within each thematic area, thereby fostering innovation and supporting strategic and operational improvements in organizations operating within a contemporary global business environment. The increasing complexity of global food supply chains, climate change, and resource management necessitates an integrated, multidisciplinary approach that bridges engineering, business, and management disciplines. This study, based on a thematic map analysis, identifies two major clusters, Food Supply (Label 1) and Artificial Intelligence (Label 2), both of which highlight the transformative potential of AI-driven technologies in enhancing sustainability, optimizing operations, and driving strategic advancements in the global business environment.
The findings underscore the critical role of AI in decision-making, operational efficiency, and policy integration, aligning with the journal’s objective of promoting holistic, sustainable technology management in business.
Distinct clustering result from BERT
The application of deep document clustering has yielded a markedly different and more distinct thematic structure compared to traditional bibliometric methods such as those implemented through Biblioshiny. While Biblioshiny primarily relies on top keyword frequencies and co-word networks (as seen in Table 6 and Figures 4 and 5), these methods often result in overlapping clusters with more ambiguous thematic boundaries. For example, the three most frequent keywords of the two Biblioshiny clusters (Cluster 1: food supply, machine learning, and crops; Cluster 2: artificial intelligence, food security, and agriculture) are highly similar and overlapping, and therefore do not differentiate the themes from each other. In contrast, deep NLP-based clustering techniques extracted six thematically rich and well-separated clusters, each characterized by unique and context-sensitive keyword sets. This indicates a higher granularity and interpretive clarity than the two broad and intersecting clusters derived from Biblioshiny.
This discrepancy highlights the power of NLP in semantic text analysis, which does not merely count keywords but interprets their contextual relationships, co-dependencies, and syntactic structures. NLP-based clustering, particularly when leveraging transformer models such as BERT or SBERT, captures subtle semantic distinctions that traditional term-frequency methods often overlook. Similar effects are observed in other studies, with NLP showcasing its strong semantic analytical potential within and beyond the field of bibliometric analysis.38,39,42,48,55,88,89
Given the growing scale and complexity of scholarly corpora, traditional bibliometric tools may no longer suffice in capturing the full thematic richness and interdisciplinary evolution. NLP-based document clustering is poised to become not just a complementary approach, but a core analytical method in future bibliometric analyses. Its ability to semantically segment large textual datasets into non-overlapping, coherent themes provides significant advantages in mapping intellectual structures, detecting emerging research areas, and supporting strategic decision-making for funding and policy development.
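The general idea of clustering documents by embedding similarity rather than keyword counts can be sketched as follows. This is not the pipeline used in this study: the three-dimensional vectors are hand-made stand-ins for the sentence embeddings a model such as SBERT would produce, no embedding model is called, and the `kmeans_cosine` helper with its simple deterministic initialisation is a hypothetical illustration.

```python
import math


def normalise(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def kmeans_cosine(vectors, k, iterations=20):
    """Spherical k-means: assign each vector to its most similar centroid."""
    points = [normalise(v) for v in vectors]
    centroids = points[:k]  # deterministic start: the first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            max(range(k), key=lambda c: sum(a * b for a, b in zip(p, centroids[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = normalise(mean)
    return labels


# Placeholder "embeddings" for four abstracts (assumed, not model output).
embeddings = [
    [0.9, 0.1, 0.0],  # e.g. a crop-monitoring abstract
    [0.1, 0.9, 0.0],  # e.g. a supply-chain abstract
    [0.8, 0.2, 0.1],
    [0.2, 0.8, 0.1],
]
labels = kmeans_cosine(embeddings, k=2)
```

Because every document receives exactly one label, the resulting groups do not overlap, unlike keyword-based clusters in which the same term can appear in several groups. A practical analysis would use real model embeddings and a more careful initialisation and choice of k.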
How do emerging technologies and sustainability practices shape food systems?
The outcomes of the deep document clustering and bibliometric analysis strongly align with the central objective of this study—to explore the landscape of thematic evolution and interconnections between sustainability, technological advancement, and commercial food systems. The resulting clusters reveal an increased concentration of research at the intersections of these high-impact domains, indicating that this space is gaining traction. However, the relatively low volume of output within some intersectional clusters suggests that while interest is growing, the research landscape remains underexplored, leaving significant room for scholarly contribution. This is particularly relevant in interdisciplinary zones where sustainable development and digital transformation converge with operational and policy dimensions of food systems.
In addition, despite the global relevance of food systems, the bibliometric network reveals that a large portion of scholarly outputs and collaborations are concentrated in three countries: China, the United States, and India. These nations dominate the field in terms of publication volume and author centrality, indicating that they have strategically aligned national research agendas and policy frameworks to reinforce research and innovation in agri-tech, food security, and sustainable development. For example, China’s 14th Five-Year Plan, the U.S. Farm Bill 2023, and India’s National Mission on Sustainable Agriculture have placed innovation at the forefront of food system transformation. This geographical consolidation, while productive, also suggests potential risks related to regional research bias and a lack of representation from Global South or low-income countries, where food system vulnerabilities are most pronounced. 90
Furthermore, although the document clustering analysis revealed a diverse range of research interests, a significant portion of publications focus primarily on agriculture—particularly topics such as crop performance, agroecology, and environmental input efficiency. While agriculture is indeed foundational, this narrow focus may overshadow other critical components of the food system, including post-harvest processing, cold chain logistics, food retail innovation, operations strategies, and food waste management. It is well recognized that sustainability challenges are increasingly emerging not just at the farm level, but across the entire food value chain, from input sourcing to consumer behavior. 91 This underscores a need for broader and more balanced research that explores system-wide sustainability transformations and techno-commercial innovations across diverse operational contexts.
In sum, the current research landscape reflects both progress and fragmentation: while advances in NLP-based bibliometric techniques are uncovering clearer thematic boundaries and untapped intersections, the field still suffers from thematic and geographical concentration. There is a critical need for more integrative, system-wide, and globally representative research that explores how commercial imperatives and sustainability goals can be aligned across the entire food system—not just in agricultural production but across logistics, policy, retail, and consumption.
Conclusions
This study advances our understanding of how sustainability imperatives and rapid technological developments are reshaping the commercial food systems landscape. By employing advanced bibliometric techniques, especially natural language processing (NLP), deep document clustering, and BERT-based models, this research reveals clearer thematic distinctions and a more granular knowledge structure than traditional methods like Biblioshiny alone. The clustering analysis confirmed that while research interest in the intersection of sustainability, technology, and food systems is growing, significant thematic and geographic gaps remain.
The dominance of contributions from countries like China, the United States, and India suggests strategic policy-driven investment in sustainable food system innovations. However, the thematic concentration on agriculture—while important—indicates a pressing need for further exploration into broader areas of the food value chain, including logistics, retail, consumer behavior, and circular practices. Moreover, the integration of sustainability in commercial settings remains a complex challenge, especially where economic feasibility and policy alignment are still maturing.
Overall, this research demonstrates the critical potential of NLP-driven bibliometric analysis to uncover hidden patterns, emerging themes, and knowledge gaps. It provides both theoretical implications for future academic inquiry and practical insights for policy makers and industry stakeholders seeking to navigate the complex interface of sustainability and technology in food systems. As sustainability becomes not just a regulatory expectation but a competitive differentiator, aligning research, policy, and industry strategy will be pivotal in building resilient, responsible, and future-ready food systems.
Limitations and future work
Despite offering valuable insights, this study has several limitations that should be acknowledged. First, the bibliometric analysis is based solely on the Scopus database, which, while comprehensive, may have excluded relevant publications indexed in Web of Science, PubMed, IEEE Xplore, or Google Scholar. This single-source reliance may introduce publication bias and limit the generalizability of the findings across broader disciplines and geographies.
Second, the search string was developed through internal brainstorming rather than systematic keyword harvesting or an expert recommendation procedure. As a result, potentially relevant documents may have been omitted due to variations in terminology, especially in an interdisciplinary field that spans sustainability, food systems, and technological innovation.
Third, the temporal scope and citation-based metrics used in the clustering methods may underrepresent emerging research, particularly recently published work that has not yet accumulated citations.
Furthermore, this study primarily focused on co-authorship, co-citation, and keyword co-occurrence patterns. Future research could benefit from integrating altmetric indicators, patent data, or policy documents to offer a more holistic view of influence and practical application. Also, integrating geospatial analysis and institutional affiliations could shed light on how knowledge flows across global innovation systems.
Future studies are encouraged to adopt multi-database strategies, use more systematic keyword expansion techniques, and apply other clustering models.92–94 Such advancements could yield richer, more accurate representations of knowledge landscapes, particularly in fast-evolving and multi-sectoral domains like food systems.
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
