Abstract
Self-Driving Manufacturing Labs (SDMLs) are emerging as a transformative approach to experimental manufacturing research, offering the ability to automate and optimize complex workflows with minimal human intervention. This paper defines a novel conceptual framework for SDMLs, systematically distinguishing between automation—the coordinated execution of experimental tasks through integrated hardware and software—and autonomy, the system’s ability to make data-driven decisions using machine learning and optimization algorithms. We decompose automation into four core components: materials design or manufacturing, property characterization, materials handling, and inter-machine communication. Autonomy is structured around data collection, surrogate modeling, and Bayesian optimization, enabling systems to adaptively choose optimal experimental conditions. The primary contribution of this work is the structured definition of this framework illustrated by examples, which is shown to be generalizable across different manufacturing domains, providing a modular blueprint for the design and implementation of next-generation self-driving laboratories. The paper concludes with a discussion of future directions for advancing automation, autonomy, and scaling SDMLs across broader applications in intelligent manufacturing.
Introduction
The traditional approach to materials design and manufacturing process optimization is often time-consuming, labor-intensive, and relies heavily on human intuition and trial-and-error. The complexity of developing new materials, such as high-entropy alloys or advanced ceramics, involves navigating vast compositional and processing spaces, making manual experimentation costly and resource-intensive.1,2 For example, it is estimated that nearly 200 million unique multi-principal element alloy combinations exist when considering systems with three to six constituent elements. Yet, from 2004 to 2017, only 122 high-entropy alloy (HEA) systems were experimentally reported, highlighting the substantial limitations of conventional experimental approaches in efficiently exploring such vast compositional landscapes. 3
The concept of fully automated, autonomous research systems, often termed Self-Driving Laboratories (SDLs), is a paradigm shift intended to accelerate the pace of scientific discovery and engineering innovation. 4 To specifically address the application of this concept to the closed-loop design, optimization, and production of solid-state materials, we introduce the term Self-Driving Manufacturing Labs (SDMLs) in this paper. 5 These systems integrate automation hardware, analytical instruments, and sophisticated control software driven by machine learning algorithms to execute a closed-loop scientific process without continuous human oversight.6,7 Artificial Intelligence (AI) and Machine Learning (ML) are integral to SDMLs: they analyze vast datasets from experiments and simulations, identifying patterns and optimizing parameters with unprecedented speed and accuracy. 8 For instance, Myung et al.9,10 demonstrated an autonomous material extrusion system using Multi-Objective Bayesian Optimization (MOBO) to optimize four printing parameters. The system reached a high objective score (0.94) in under 100 iterations, compared with the ∼10,000 required in a traditional design. Operating at 30–60 iterations per hour, it showed how SDMLs can drastically accelerate and streamline manufacturing optimization.
Automation in manufacturing and experimental science has evolved through distinct phases, generally classified by their flexibility and decision-making capacity. These phases range from Fixed Automation (dedicated equipment for high-volume, single-product manufacturing) and Programmable Automation (machines controlled by coded programs for batch production of different products), to Flexible Automation (systems easily reconfigured to produce various products with minimal downtime).11,12 An SDML represents the highest and most advanced level: Intelligent or Adaptive Automation.13,14 The key distinction of SDMLs is the incorporation of a closed-loop intelligence layer that dynamically modifies the execution program based on real-time data and optimization algorithms. In essence, an SDML not only executes tasks but also designs the next experiment itself, moving beyond mere execution of a programmed sequence. This distinction is fundamental to our conceptual framework. 15
The SDML exemplifies the evolution toward intelligent and cognitive manufacturing systems. Cognitive manufacturing systems are characterized by their ability to perceive, learn, and adapt in real time, enabling autonomous decision-making and continuous process optimization. This paradigm shift is facilitated by technologies like cyber-physical systems, the Internet of Things (IoT), and advanced data analytics, which collectively enhance the responsiveness and efficiency of manufacturing processes. The integration of SDMLs into this framework underscores the move toward more intelligent, interconnected, and adaptive manufacturing environments, especially in the design stage.16–19
While the concept of closed-loop experimentation is not new, a standardized, generalized, and modular conceptual framework is essential for widespread adoption. The primary research gap addressed by this work is the need for a formally structured, domain-agnostic framework to guide the robust design, implementation, and educational adoption of autonomous systems for material design and process optimization. The core idea of an SDML is the integration of two distinct, yet interdependent, capabilities: Automation (the physical execution of tasks) and Autonomy (the intelligent optimization loop). The main contribution of this paper is the introduction of a comprehensive framework that systematically separates and defines the necessary elements for both the automation and the autonomy of an SDML across diverse manufacturing domains, particularly for solid-state materials.
Specifically, this work offers the following novel contributions:
(1) A structured, four-component decomposition of the Automation layer (Materials Manufacturing, Property Characterization, Materials Handling, and Inter-Machine Communication) that covers the entire physical experimental loop.
(2) A clear, three-component definition of the Autonomy layer (Data Collection, Surrogate Modeling, and Bayesian Optimization) which enables adaptive, data-driven decision making.
(3) A critical discussion on the future directions of SDMLs, including the necessary architectures for distributed and decentralized control.
The remainder of this paper is structured as follows: Section 2 presents a conceptual and generalizable framework for SDMLs. Sections 3 and 4 present this framework in detail, defining the components of automation and autonomy. In these sections, we review the literature in a selective and illustrative manner to establish the background, necessary components, and state-of-the-art implementations of current SDMLs. Finally, Section 5 addresses research gaps and future directions, discussing critical topics such as the generalizability of the framework and distributed and decentralized control.
Conceptual framework for self-driving manufacturing labs
The core idea of an SDML is the integration of two distinct, yet interdependent, capabilities: Automation and Autonomy.
The proposed SDML framework is formally structured as a closed-loop system that unifies an Automation Layer and an Autonomy Layer.
Automation layer: The structured execution cycle
The Automation Layer comprises four components:
(1) Materials Manufacturing/Design
(2) Materials Handling
(3) Property Characterization
(4) Inter-Machine Communication
The output of the Execution Layer, the pair of process/design parameters and measured data, is passed to the Autonomy Layer.
Autonomy layer: The optimization methodology
The Autonomy Layer comprises three components:
(1) Data Collection
(2) Surrogate Modeling
(3) Bayesian Optimization
The core methodological advance is the rigid structure of the closed loop: the Execution (Automation) Layer executes the experiment, and the Optimization (Autonomy) Layer systematically designs it. This formal decomposition provides a structured design methodology that can be rigorously implemented and validated. Figure 1 illustrates the formal, structured flow of the SDML framework. The cycle begins in the Automation Layer with Component 1: Materials Manufacturing/Design.

Figure 1. SDML conceptual framework: a closed loop in which the automation components feed into the autonomy components, which in turn feed back to manufacturing.
Automation: The execution layer
Automation refers to the physical and software-based infrastructure required to execute a manufacturing or experimental cycle reliably and repeatedly. This layer corresponds to the implementation of Flexible/Programmable Automation principles to achieve robust and repeatable execution. 20 We propose that the automation layer must be composed of four essential, interoperable components:
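(1) Materials Manufacturing/Design
(2) Property Characterization
(3) Materials Handling
(4) Inter-Machine Communication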
These elements collectively enable a single iteration encompassing design, manufacturing, and property measurement, all executed in a fully automated manner. It is important to note that while this automation framework efficiently handles the execution of tasks, the decision-making process for subsequent experiments is addressed separately in the autonomy layer.
Materials design and manufacturing process
Manufacturing systems in SDMLs exhibit diverse levels of automation and control, shaped by the nature of the processed materials and the capabilities of the hardware platforms employed. While some setups rely on manual or locally controlled operations, others achieve higher degrees of autonomy through integration with sensors, API-based communication protocols, and software-controlled feedback loops. This section selectively reviews the types of solid-state materials fabricated in SDMLs, the manufacturing techniques applied, and the degree of automation implemented in each case.
A variety of materials have been manufactured in SDMLs using additive and subtractive techniques. For example, Xue et al. 21 employed a Carbon™ M2 SLA 3D printer to process Liquid Rigid Polyurethane (RPU) resin and Silicone double networks (SilDn), albeit through manual or local control. Similarly, Graphene-enhanced Acrylonitrile Butadiene Styrene (ABS) was fabricated using the Hyrel System 30M 3D printer via Fused Filament Fabrication (FFF), although this system lacked automated control mechanisms. 22
In metal-based systems, Laser Powder Bed Fusion (LPBF) was used by Zhang et al. 23 to produce Ti-6Al-4V titanium alloy components, a process commonly equipped with automated monitoring capabilities. Other studies utilized Vacuum Arc Melting (VAM) to synthesize high-entropy alloys such as Ti-V-Nb-Mo-Hf-Ta-W, though the reports lacked details on automation or communication protocols. 3 Maraging steel, a class of ultra-high-strength, ultra-low-carbon, iron alloys, was also fabricated via Powder Bed Fusion, but similarly, no system-level automation was specified. 24
Liquid-phase processes have also been leveraged to fabricate solid-state materials. Colored dyes, metal nanoparticles, and perovskite semiconductors were deposited using liquid dispensing techniques that solidify through evaporation or annealing. In one example, these materials were printed using a Monoprice Select Mini V2 3D printer controlled by an Arduino-based microcontroller, enabling some level of programmable automation. 25
Automation has been particularly notable in polymer-based systems. Thermoplastics such as TPU-1, TPU-2, TPU-3, TPE, PLA, PETG, and Nylon were processed using the MakerGear M3 3D printer, which achieved a high level of automation, operating with minimal human intervention. 5 Another study used a Creality Ender-3 FDM 3D printer equipped with a direct drive print head for thermoplastic extrusion, although details on its automation capabilities were not provided. 26
One of the most advanced and autonomous systems reviewed utilized the Creality CR-20 Pro FFF 3D printer integrated with OctoPrint via a Raspberry Pi, enabling real-time monitoring, remote access, and automated print control. 27 Such systems highlight the transition from conventional 3D printing toward fully autonomous manufacturing cells.
Software tools played a crucial role in enabling this automation. For instance, OctoRest provided remote API access to 3D printing hardware, while PySerial was used for establishing serial communication with microcontrollers, demonstrating seamless API-driven integration in SDML environments.28–30 Slic3r software was also utilized to generate G-code for 3D printing workflows. 31
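As a rough illustration of API-driven printer control, the sketch below builds (but does not send) an HTTP request for OctoPrint's printer command endpoint using only the Python standard library. It deliberately does not use OctoRest, and it is not taken from any of the cited studies. The host name, API key, and G-code commands are hypothetical placeholders.

```python
import json
import urllib.request
from typing import Sequence


def build_gcode_request(base_url: str, api_key: str,
                        commands: Sequence[str]) -> urllib.request.Request:
    """Build a POST request that asks OctoPrint to run a list of G-code commands.

    The request is only constructed here; pass it to urllib.request.urlopen()
    to actually send it to a running OctoPrint server.
    """
    payload = json.dumps({"commands": list(commands)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/printer/command",
        data=payload,
        headers={"Content-Type": "application/json", "X-Api-Key": api_key},
        method="POST",
    )


# Hypothetical host, key, and commands (home all axes, set nozzle to 200 C).
request = build_gcode_request("http://octopi.local", "YOUR_API_KEY",
                              ["G28", "M104 S200"])
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the communication step easy to test without a printer attached, which is one reasonable way to structure the inter-machine communication component.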
Overall, these studies reflect a spectrum of automation levels across manufacturing platforms in SDMLs—from manual or semi-automated setups to fully autonomous, closed-loop systems. The choice of materials, hardware capabilities, and software interfaces collectively determine the extent of automation achievable, underscoring the need for cohesive integration of manufacturing hardware with smart control systems in the pursuit of autonomous materials discovery. Table 1 provides a concise overview of the diverse processes employed in self-driving manufacturing literature. It summarizes the key materials design and manufacturing methods discussed in this section, categorizing them by materials/process, technique, and automation level.
Table 1. Summary of material-process combinations and associated automation levels reported in recent studies.
Materials characterization and property measurement
In SDMLs, the characterization and property measurement stage serves as a critical feedback mechanism, enabling the evaluation of material performance and informing subsequent experimental iterations. This step provides the values for objective functions that guide Bayesian optimization loops, ultimately allowing the system to adjust manufacturing parameters in real time to optimize desired material properties. A variety of mechanical, structural, thermal, and optical properties have been assessed in prior SDML studies, using both traditional laboratory instruments and automated sensing platforms.
Mechanical Property Evaluation: Standard mechanical properties such as Young’s modulus, shear modulus, and Poisson’s ratio were evaluated to determine the elastic response of materials. 21 More advanced assessments included tensile testing for ultimate tensile strength and yield strength, conducted on an Instron testing system following ASTM D638. 22 Fatigue testing, specifically rotating bending fatigue tests, was used to characterize long-term durability, also facilitated by standardized mechanical testing platforms. 24 Uniaxial compression testing was employed to measure material response under load, including parameters such as energy absorption (K) and specific energy absorption (SEA). 5 Nanoindentation using a Nanomechanics iMicro2 system with a Berkovich indenter enabled high-precision evaluation of mechanical modulus, normalized by material density. Vickers microhardness was measured with a LECO LM-100 system to assess localized strength and resistance to deformation. 3 Additional metrics such as compressive modulus and printing time were evaluated using the WDW-4204 microcomputer-controlled electronic testing machine. 32 In other studies, computational models were used to simulate strain-hardening behavior, an important metric for ductility and formability. 33
Surface and Microstructural Characterization: Surface quality was assessed using digital microscopy (Opti-Tek Scope OT-HD, 200× magnification) to measure average surface roughness. 22 Additional structural insights were gained using surface profilometry and scanning electron microscopy (SEM), which revealed details about corrosion resistance and oxide layer development. 34 Surface profilometry was also applied to analyze droplet morphology, including shape and thickness during liquid-phase material deposition. While API usage was not always reported, hardware such as the Resonon Pika L hyperspectral camera and profilometers is known to support digital data streaming, suggesting strong potential for automation. 25
Thermal Properties: Thermogravimetric Analysis (TGA) was used to verify graphene content in polymer composites, offering insights into filler dispersion and thermal stability. 22 For metal additive manufacturing, thermal performance was measured through metrics such as average displacement (AD) due to cooling distortion, and melt indicators (MI), which quantified the fraction of time laser temperatures exceeded 1600 K during Laser Powder Bed Fusion (LPBF) processing. 35
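The melt indicator described above can be sketched as a simple fraction, assuming the temperature trace is sampled at equal time intervals so that the fraction of samples equals the fraction of time. The function name and the example trace are hypothetical; the 1600 K threshold is the one quoted in the text.

```python
def melt_indicator(temperatures_k, threshold_k=1600.0):
    """Fraction of equally spaced samples whose temperature exceeds the threshold."""
    if not temperatures_k:
        raise ValueError("temperature trace must not be empty")
    above = sum(1 for t in temperatures_k if t > threshold_k)
    return above / len(temperatures_k)


# Hypothetical trace in kelvin: two of the four samples exceed 1600 K.
trace = [1500.0, 1650.0, 1700.0, 1400.0]
print(melt_indicator(trace))
```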
Process Control: Some SDMLs employed computer vision techniques to monitor process quality. For example, warp severity, an indicator combining bounding box area, detection count, volume, confidence score, and aspect ratio, was evaluated using a USB camera (JL Corporate 2K Optic Webcam) to analyze in-process deformations. These real-time image processing techniques enabled closed-loop control based on visual feedback. 27
Automation and Communication Infrastructure: Several systems integrated API-driven communication for real-time data acquisition and automated characterization. In some instances, software such as Instron Bluehill, MATLAB, and AutoCAD were used to operate or analyze results from characterization systems, although explicit API integration was not always documented. 5
These studies collectively illustrate the diversity of characterization methodologies used in SDMLs—from conventional mechanical testing and microscopy to cutting-edge computational modeling and machine vision. The degree of automation and integration varies across systems, with some platforms operating under full API control and others relying on manual or semi-automated workflows. Nevertheless, each contributes essential feedback for property optimization and closed-loop materials design. Table 2 provides a structured, illustrative summary of various methods and tools used across SDMLs to characterize mechanical, surface, thermal, and process parameters, as elaborated in the preceding paragraphs.
Table 2. Summary of techniques and automation strategies for material property evaluation in SDMLs.
Materials handling
Materials handling is a critical component in SDML systems, responsible for transferring parts between different stages of the workflow—most notably from manufacturing equipment to characterization and property measurement instruments. The degree of automation in materials handling varies widely across SDML implementations, reflecting the diversity of technological integration and system maturity.
In several studies, material handling remained largely manual. For instance, some systems automated the 3D printing process but required manual intervention for transferring samples to testing equipment or for further processing. 21 Similarly, Liu et al. 22 described a twin-screw extrusion system where material feeding and handling were predominantly manual, highlighting the limited scope of automation in certain experimental workflows. In many cases, the lack of explicit documentation on materials handling suggests manual transfer of components between fabrication and evaluation stages. These systems typically fall under low-autonomy categories, where human operators are essential for coordinating workflows and performing post-processing tasks.
Conversely, higher levels of automation have also been demonstrated. One notable example integrated a robotic handling system that used a webcam for real-time visual tracking. This system communicated through URScript with MATLAB as the central control platform, enabling fully automated, closed-loop material transport with minimal human intervention. 5 Such configurations represent a high level of SDML automation, in which robotic components handle part movement seamlessly within the experimental cycle.
Separately, several simulation-based studies did not involve any physical material manufacturing or handling, as the entire optimization loop was conducted in silico.23,36 Other systems implemented only low levels of automation, relying on human operators for slicing, parameter selection, and data collection. 37
Innovative approaches to automation have emerged, even in systems without full robotic integration. One study developed an in-house Automated Part Remover, which repurposed the gantry motors of a 3D printer to remove completed prints and deposit them into a collection bin. Although focused on post-processing, this mechanism demonstrated a creative application of embedded automation to reduce manual handling demands. 27
Other hybrid systems, such as the Archerfish platform, 25 employed partial automation. While the printing process was automated with continuous fluid mixing and dispensing, post-printing steps still required manual handling due to the absence of robotic sample collection or a fully closed-loop feedback system.
Collectively, these studies reveal a broad spectrum of material handling strategies in SDML research, ranging from fully manual processes to advanced robotic systems. The extent of automation often correlates with the system’s autonomy level, and where implemented, sensor-based monitoring and API-enabled communication enhance real-time integration and operational efficiency. Future SDML developments may benefit from incorporating self-directed materials handling as a fundamental component of autonomous experimental workflows.
Autonomy: The decision-making layer
In contrast to automation—which focuses on executing predefined tasks—autonomy in SDMLs emphasizes the system’s ability to make and act on decisions without human intervention. Autonomy enables an SDML to analyze data, understand complex input-output relationships, and determine optimal experimental or processing conditions using advanced algorithms. Through this approach, the lab transitions from being a reactive executor to a proactive decision-maker.38,39 These capabilities are central to the emerging paradigm of cognitive manufacturing, which emphasizes the integration of artificial intelligence to create self-aware, goal-directed systems capable of dynamic decision-making. 40 We identify three core components that define autonomy in SDMLs:
(1) Data Collection
(2) Surrogate Modeling
(3) Bayesian Optimization
Together, these elements form the decision-making engine of an SDML. By leveraging the automation infrastructure described earlier, the system can close the experimental loop—design, execute, measure, and decide—entirely on its own.
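The design, execute, measure, and decide cycle can be sketched as a short loop. Everything in the sketch is a hypothetical stand-in: `manufacture` and `characterize` replace real hardware with a synthetic objective that peaks at 0.6, and `decide` is a simple perturb-the-best placeholder rather than the surrogate-based Bayesian optimization discussed in this section.

```python
import random


def manufacture(params):
    """Hypothetical stand-in for the manufacturing step."""
    return {"params": params}


def characterize(sample):
    """Hypothetical measurement: synthetic objective with its maximum at 0.6."""
    x = sample["params"]
    return -(x - 0.6) ** 2


def decide(history, rng):
    """Placeholder decision rule: perturb the best setting seen so far."""
    best_x, _ = max(history, key=lambda record: record[1])
    return min(1.0, max(0.0, best_x + rng.gauss(0.0, 0.1)))


def run_closed_loop(n_iterations=30, seed=0):
    """Run the loop and return the list of (parameter, measurement) pairs."""
    rng = random.Random(seed)
    x = rng.random()                 # initial design chosen at random
    history = []
    for _ in range(n_iterations):
        sample = manufacture(x)      # execute
        y = characterize(sample)     # measure
        history.append((x, y))       # collect data
        x = decide(history, rng)     # decide the next experiment
    return history


history = run_closed_loop()
best_x, best_y = max(history, key=lambda record: record[1])
print(f"best parameter {best_x:.3f} with objective {best_y:.4f}")
```

In a real SDML, the first two functions would wrap the automation layer, and `decide` would be replaced by a surrogate model plus an acquisition function.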
Data
In SDMLs, data serves as the foundational layer for building surrogate models that drive AI-powered decision-making. Data used in SDMLs generally originates from three primary sources: physical experiments, computational simulations, and open-source datasets.
Experimental Data: Experimental data collection remains central to many SDML implementations, offering direct insight into material and process behavior under real-world conditions. In smaller-scale studies, data was collected through carefully designed experimental campaigns. For instance, one investigation employed 18 physical experiments based on a non-regular fractional factorial design to evaluate material performance. 22 Another study focused on alloy characterization, collecting 10 indents per sample across 24 alloys per iteration, resulting in 480 unique data points. 3 Additionally, Latin hypercube sampling can be used to generate sampling points across the design space, particularly when the space is large or high-dimensional. In this approach, each input dimension is divided into intervals equal to the number of required samples, and values are then randomly combined across dimensions. This ensures that each interval in every dimension is sampled exactly once, resulting in an efficient and well-distributed coverage of the input space.41,42
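A minimal sketch of Latin hypercube sampling as described above, using only the Python standard library. The function name, the two-parameter design space, and its bounds are illustrative assumptions, not values from the cited studies.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Draw n_samples points so that each of the n_samples equal-width
    intervals in every dimension contains exactly one point."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        width = (high - low) / n_samples
        # One uniformly random value inside each interval of this dimension.
        values = [low + (i + rng.random()) * width for i in range(n_samples)]
        # Shuffle so intervals are combined randomly across dimensions.
        rng.shuffle(values)
        columns.append(values)
    return [tuple(column[i] for column in columns) for i in range(n_samples)]


# Hypothetical design space: nozzle temperature (C) and print speed (mm/s).
design_bounds = [(180.0, 240.0), (20.0, 80.0)]
samples = latin_hypercube(5, design_bounds, seed=1)
for point in samples:
    print(point)
```

Production code would more likely rely on an established implementation, but the sketch makes the one-sample-per-interval property explicit.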
More advanced SDML setups have achieved higher throughput. The Archerfish platform demonstrated real-time, high-throughput experimentation by screening up to 250 unique compositions per minute. 25 Another notable system autonomously executed and repeated 25,387 experiments within a self-optimized loop, showcasing the potential scale of experimental data generation in SDMLs. 5 In other cases, data was generated incrementally, such as in a study involving 20 parameter sets and their corresponding compressive modulus and print time values, iteratively expanded through multi-objective Bayesian optimization (MOBO). 43
Incorporation of imaging technologies further enhanced experimental data richness. 44 One study collected 10,154 images for material analysis, including 1414 warped samples and 1976 labeled bounding box samples for computer vision applications. 27 Real-time visual monitoring through webcams was also integrated into robotic material handling systems, enabling dynamic tracking of process stages. 5
Simulation Data: To reduce the cost and time associated with physical testing, several SDML studies utilized simulation-based data. Finite element analysis (FEA) was a common tool, as seen in a study that used Autodesk Netfabb Local Simulation to generate 1000 thermal simulations for predictive modeling. 35 Another example involved generating 15 unique datasets through a finite element-based Representative Volume Element (RVE) simulation, paired with micromechanical models like Isowork, Isostress, and Isostrain. 33 Other research efforts adopted artificial datasets of RVEs created with Gaussian Random Fields (GRFs), which were processed using PyTorch and FEniCS—though without explicit mention of API integration. 21
Open-Source Datasets: Open-source materials databases have increasingly become vital resources for accelerating SDML autonomy. For example, one study utilized 58 novel material entries from the Materials Project for initial screening and surrogate model training. 45 Another leveraged high-dimensional datasets including a 6D Poisson’s ratio dataset with 146,000 materials, a 6D thermoelectric figure-of-merit dataset comprising 1000 materials, and 174 analytical stress-test datasets, all employed to evaluate the ZoMBI optimization algorithm. 46
Collectively, these studies illustrate the diversity and complexity of data sources utilized in SDMLs. Whether derived from physical experiments, simulations, or open databases, these data form the backbone of surrogate modeling and Bayesian optimization frameworks. As SDMLs continue to evolve, seamless integration of these data streams will be essential for advancing fully autonomous materials design and manufacturing workflows.
Surrogate model
Surrogate models are integral to autonomous decision-making in SDMLs, as they provide computationally efficient approximations of complex, often expensive-to-evaluate functions. These models facilitate optimization and prediction in manufacturing and materials science by learning the input–output relationships of physical systems, thus significantly reducing the need for exhaustive experiments or simulations.
Among the various surrogate modeling techniques, Gaussian Processes (GPs) have emerged as a dominant choice due to their probabilistic framework and built-in uncertainty quantification. GP models have been employed in solving mixed-constraint optimization problems 47 and in modeling intricate relationships such as the effects of carbon content and annealing temperature on normalized strain-hardening rate, which allowed researchers to minimize costly Representative Volume Element (RVE) simulations. 33 In the ZoMBI framework, GP surrogates were strategically trained within the zoomed-in regions of the design space to improve optimization efficiency. 46 Other studies also deployed GP models across various machine learning-driven manufacturing processes.21,37,48–50
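To make the uncertainty quantification concrete, the following is a minimal one-dimensional GP regression sketch in pure Python. It assumes a zero-mean prior, a squared-exponential kernel with fixed hyperparameters, and a small noise term for numerical stability; the training data are invented, and none of these choices are taken from the cited studies.

```python
import math


def rbf(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(A):
    """Lower-triangular L with L L^T = A for a symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def solve_upper(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / L[i][i]
    return x


def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at a single query point."""
    n = len(x_train)
    K = [[rbf(x_train[i], x_train[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper(L, solve_lower(L, y_train))     # (K + noise I)^-1 y
    k_star = [rbf(x_query, xi) for xi in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    variance = rbf(x_query, x_query) - sum(vi * vi for vi in v)
    return mean, max(variance, 0.0)


# Invented training data: three settings and their measured responses.
x_train, y_train = [0.0, 1.0, 2.0], [0.0, 0.8, 0.9]
print(gp_posterior(x_train, y_train, 1.5))    # between observed points
print(gp_posterior(x_train, y_train, 10.0))   # far away: variance near the prior
```

The posterior variance collapses near observed points and returns to the prior variance far from them, which is the property acquisition functions exploit.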
In multi-objective optimization contexts, both Gaussian Process Regression (GPR) and Gaussian Process Classifier (GPC) models have been utilized. In particular, GPCs have been integrated into Bayesian classification loops, showcasing the adaptability of GP-based models for discrete and categorical decision-making tasks.51,52
To overcome the limitations of standard GP models, Hierarchical Gaussian Process Regression (HGPR) has been introduced. This approach incorporates physics-informed priors derived from analytical models, thereby improving predictive capabilities in multi-objective material optimization scenarios. 53 Additionally, Deep Gaussian Processes (DGPs)—an extension of traditional GPs with multi-layered architectures—have been adopted in Bayesian Optimization (BO) for modeling non-stationary and complex input–output relationships. These DGP models utilized Stochastic Imputation (SI) for efficient inference and a bagging procedure to enhance robustness, making them well-suited for high-dimensional optimization tasks. 35
Beyond GP-based models, alternative machine learning techniques have also been explored as surrogates. For example, Random Forest (RF) and Multi-Layer Perceptron (MLP) models have been implemented within BO frameworks, particularly when computational speed or scalability favored these methods over traditional GPs. Comparative analyses have shown that while GP models generally excel in capturing global trends and uncertainty, alternative models like RF and MLP can offer advantages in terms of speed and scalability in specific applications.54–57
Collectively, these studies underscore the centrality of Gaussian Process-based models in SDML autonomy, while also pointing toward a growing interest in hybrid and deep learning-based surrogate modeling strategies. These approaches offer improved flexibility, scalability, and accuracy—critical attributes for enabling intelligent decision-making in autonomous materials discovery and manufacturing systems.
Bayesian optimization
Bayesian Optimization (BO) is a core component of autonomous experimentation in SDMLs, offering an efficient strategy for optimizing expensive and complex black-box functions. BO operates by constructing a surrogate model of the objective function and employing an acquisition function to guide the selection of new sampling points. These acquisition functions determine where to sample next by balancing exploration (sampling in regions of high uncertainty) and exploitation (sampling where predicted performance is high), a trade-off critical to the efficiency and success of the optimization process.
One of the most widely adopted acquisition functions is Expected Improvement (EI), which selects new points based on their expected improvement over the best-known solution. EI has demonstrated effectiveness in several studies, especially for optimizing process parameters and materials compositions. 58 For instance, it was used in the composition design of Refractory High-Entropy Alloys (RHEAs), where it prioritized candidates with high potential for yield strength enhancement. 3
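For a Gaussian surrogate prediction with mean mu and standard deviation sigma, EI has a standard closed form. The sketch below implements it for a maximization problem; the function names and the optional exploration margin `xi` are illustrative choices, and the cited studies may use different variants.

```python
import math


def normal_pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best, xi=0.0):
    """Closed-form EI for maximization, given a Gaussian prediction.

    mu, sigma: surrogate mean and standard deviation at the candidate.
    best: best objective value observed so far.
    xi: optional margin that encourages exploration (assumed default 0).
    """
    improvement = mu - best - xi
    if sigma <= 0.0:
        return max(improvement, 0.0)    # no uncertainty: plain improvement
    z = improvement / sigma
    return improvement * normal_cdf(z) + sigma * normal_pdf(z)


# Same predicted mean, different uncertainty: the uncertain candidate scores higher.
print(expected_improvement(mu=0.9, sigma=0.05, best=1.0))
print(expected_improvement(mu=0.9, sigma=0.30, best=1.0))
```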
Another frequently employed strategy is the Upper Confidence Bound (UCB) acquisition function, which incorporates both the predicted mean and uncertainty of the surrogate model. UCB is particularly advantageous in high-throughput BO applications where efficient coverage of the design space is required. It has been used successfully for material screening and process parameter tuning in several studies.59–61 Conversely, the Lower Confidence Bound (LCB) function was applied in constrained optimization settings, where penalizing potentially infeasible regions helped enforce design or physical constraints during exploration.62–64
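The confidence-bound functions reduce to one line each once the surrogate supplies a mean and a standard deviation. In the sketch below, the weight `kappa` and the candidate values are hypothetical; how `kappa` is set or scheduled differs between studies.

```python
def upper_confidence_bound(mu, sigma, kappa=2.0):
    """Optimistic score: predicted mean plus kappa standard deviations."""
    return mu + kappa * sigma


def lower_confidence_bound(mu, sigma, kappa=2.0):
    """Pessimistic score: predicted mean minus kappa standard deviations."""
    return mu - kappa * sigma


# Hypothetical candidates mapped to (predicted mean, predicted std).
candidates = {"A": (0.80, 0.05), "B": (0.70, 0.20), "C": (0.60, 0.10)}

explore = max(candidates, key=lambda k: upper_confidence_bound(*candidates[k]))
exploit = max(candidates, key=lambda k: upper_confidence_bound(*candidates[k], kappa=0.0))
cautious = max(candidates, key=lambda k: lower_confidence_bound(*candidates[k]))
print(explore, exploit, cautious)
```

A larger `kappa` shifts the selection toward uncertain candidates, while `kappa = 0` recovers pure exploitation of the predicted mean.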
To directly address optimization problems with feasibility constraints, more specialized acquisition functions have been proposed. Expected Feasible Improvement (EFI) and Constrained Expected Improvement (CEI) extend the standard EI by incorporating probabilistic models of constraint satisfaction, enabling more targeted sampling within feasible subregions of the design space. Additional methods such as Stepwise Uncertainty Reduction (SUR) and Augmented Lagrangian (AL) have been employed to better manage constraint satisfaction and uncertainty propagation in Bayesian search processes. 49
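One common way to build a constrained acquisition function is to multiply EI by the probability that the constraint is satisfied. The sketch below assumes a single constraint c(x) <= limit modeled by its own Gaussian prediction, independent of the objective; this formulation is an assumption for illustration and is not confirmed to be the exact form used in the cited work.

```python
import math


def _pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best):
    """Closed-form EI for maximization under a Gaussian prediction."""
    improvement = mu - best
    if sigma <= 0.0:
        return max(improvement, 0.0)
    z = improvement / sigma
    return improvement * _cdf(z) + sigma * _pdf(z)


def probability_of_feasibility(c_mu, c_sigma, limit=0.0):
    """P(c(x) <= limit) when the constraint prediction is Gaussian."""
    if c_sigma <= 0.0:
        return 1.0 if c_mu <= limit else 0.0
    return _cdf((limit - c_mu) / c_sigma)


def constrained_ei(mu, sigma, best, c_mu, c_sigma, limit=0.0):
    """EI weighted by the probability that the constraint holds."""
    return (expected_improvement(mu, sigma, best)
            * probability_of_feasibility(c_mu, c_sigma, limit))


# Same objective prediction, two constraint predictions (hypothetical values).
print(constrained_ei(1.0, 0.5, 0.8, c_mu=-2.0, c_sigma=0.5))  # likely feasible
print(constrained_ei(1.0, 0.5, 0.8, c_mu=2.0, c_sigma=0.5))   # likely infeasible
```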
For multi-objective optimization problems, where trade-offs between competing objectives must be navigated, the Expected Hypervolume Improvement (EHVI) function is commonly used. EHVI enables the identification of Pareto-optimal solutions by maximizing the expected gain in the hypervolume of the objective space dominated by the Pareto front. This approach has proven effective in optimizing multiple material properties—such as hardness and elastic modulus in alloy design—by significantly reducing the number of required experiments.9,65,66 In one study, a two-loop BO framework integrating EHVI reduced optimization time by 95%, identifying Pareto-optimal candidates in just 13 iterations compared to a baseline of 280. 3
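The hypervolume quantity behind EHVI can be made concrete for two objectives. The sketch below computes the dominated area relative to a reference point, the improvement a new point would bring, and a simple Monte Carlo estimate of the expected improvement. It assumes both objectives are maximized and that the posterior for each objective is an independent Gaussian; the names and example points are hypothetical, and practical implementations typically use analytic or more efficient estimators.

```python
import random
from typing import Sequence, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: Sequence[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded below by `ref` (maximization)."""
    inside = [p for p in points if p[0] > ref[0] and p[1] > ref[1]]
    area, best_y = 0.0, ref[1]
    # Sweep from the largest first objective downward, adding new strips.
    for x, y in sorted(inside, key=lambda p: p[0], reverse=True):
        if y > best_y:  # this point extends the dominated region upward
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area


def hypervolume_improvement(front: Sequence[Point], candidate: Point, ref: Point) -> float:
    """Gain in dominated area if `candidate` were added to `front`."""
    return hypervolume_2d(list(front) + [candidate], ref) - hypervolume_2d(front, ref)


def ehvi_monte_carlo(front: Sequence[Point], ref: Point, mu: Point, sigma: Point,
                     n_samples: int = 1000, seed: int = 0) -> float:
    """Monte Carlo EHVI estimate under independent Gaussian posteriors (assumed)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        sample = (rng.gauss(mu[0], sigma[0]), rng.gauss(mu[1], sigma[1]))
        total += hypervolume_improvement(front, sample, ref)
    return total / n_samples


if __name__ == "__main__":
    # Hypothetical Pareto front for two properties, e.g. scaled hardness and modulus.
    front = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0)]
    ref = (0.0, 0.0)
    print(hypervolume_2d(front, ref))
    print(hypervolume_improvement(front, (2.5, 2.5), ref))
    print(ehvi_monte_carlo(front, ref, mu=(2.5, 2.5), sigma=(0.3, 0.3)))
```

A candidate that is dominated by the current front contributes zero improvement, which is why EHVI directs sampling toward points expected to push the front outward.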
Recent advances in BO have introduced multi-scale acquisition functions to tackle challenges associated with high-dimensional or complex design spaces. Techniques such as Multi-Scale Multi-Resolution Expected Improvement (MSMR-EI) and Multi-Scale Multi-Resolution Upper Confidence Bound (MSMR-UCB) extend traditional EI and UCB functions by incorporating hierarchical search resolutions. These methods improve search efficiency by adaptively allocating sampling efforts across different scales and regions of the parameter space, making them particularly suited for large, nonlinear optimization problems.67–69
In summary, the choice of acquisition function is pivotal in determining the efficiency, accuracy, and reliability of Bayesian Optimization in SDMLs. While EI and UCB remain the most commonly used due to their simplicity and general applicability, specialized functions such as EHVI, EFI, and CEI enhance BO performance in multi-objective and constrained optimization scenarios. Emerging multi-scale approaches further expand the potential of BO, enabling SDMLs to address increasingly complex optimization tasks with reduced computational and experimental costs.
Research gaps and future directions
The evolution of SDMLs signifies a transformative shift in materials science and manufacturing, aiming to integrate automation and autonomy for accelerated discovery and optimization. Despite notable advancements, several critical challenges persist across both automation and autonomy domains. Addressing these challenges is essential for realizing fully integrated, efficient, and scalable SDML systems.
Automation: Enhancing integration and flexibility
Broadening Automation Across Diverse Manufacturing Processes: While SDMLs have demonstrated success in specific areas, such as polymer synthesis and solution-based chemistry, extending automation to encompass a wider array of manufacturing processes, notably metal additive manufacturing (AM), remains a significant hurdle. Metal AM processes, including Laser Powder Bed Fusion (LPBF) and Directed Energy Deposition (DED), present complexities in terms of process control, monitoring, and post-processing, which are not yet fully addressed by existing SDML frameworks. The variability in microstructures and mechanical properties inherent to metal AM necessitates advanced automation strategies for consistent quality and reliability. 70
Standardizing APIs and Developing a Unified Software Ecosystem: The integration of heterogeneous equipment from multiple vendors poses challenges due to the lack of standardized communication protocols. Initiatives like the Standardization in Lab Automation (SiLA) consortium have made strides in developing device and data interface standards, facilitating rapid integration of lab automation hardware and data management systems. 71 However, broader adoption and further development of such standards are crucial for seamless interoperability within SDMLs.
Advancing Material Handling Systems: Current SDML implementations often utilize robotic systems primarily for part transportation. Expanding robotic capabilities to include material pre-processing (e.g. powder handling, mixing) and post-processing (e.g. heat treatment, surface finishing) is essential for fully autonomous operations. Innovations in robotic manipulation and sensing technologies are needed to handle diverse materials with varying properties safely and efficiently.
Autonomy: Enhancing data, modeling, and optimization
Standardizing Data Generation and Uncertainty Quantification: The reliability of surrogate models and optimization algorithms heavily depends on the quality and consistency of input data. Establishing standardized protocols for data generation from experiments and simulations is imperative.54,72 Moreover, incorporating uncertainty quantification methods, such as Bayesian inference and ensemble modeling, can provide insights into data reliability and model confidence, thereby improving decision-making processes.73–75
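As a minimal illustration of ensemble-based uncertainty quantification, the spread of predictions across several independently fitted models can serve as a proxy for model confidence. The member models, the tolerance, and the helper names below are hypothetical; in practice the members would be surrogates trained on resampled or differently initialized data.

```python
from statistics import mean, stdev
from typing import Callable, Sequence, Tuple

Model = Callable[[float], float]


def ensemble_predict(models: Sequence[Model], x: float) -> Tuple[float, float]:
    """Return the ensemble mean and sample standard deviation at input x.

    Requires at least two member models, since `stdev` needs two values.
    """
    predictions = [model(x) for model in models]
    return mean(predictions), stdev(predictions)


def needs_more_data(models: Sequence[Model], x: float, tolerance: float) -> bool:
    """Flag inputs where ensemble members disagree by more than `tolerance`."""
    _, spread = ensemble_predict(models, x)
    return spread > tolerance


if __name__ == "__main__":
    # Three hypothetical members that agree near x = 0 and diverge as x grows.
    members = [
        lambda x: 2.0 * x,
        lambda x: 2.0 * x + 0.1 * x * x,
        lambda x: 2.0 * x - 0.1 * x * x,
    ]
    for x in (0.0, 1.0, 2.0):
        print(x, ensemble_predict(members, x), needs_more_data(members, x, tolerance=0.2))
```

Regions flagged by such a check are natural targets for additional experiments or for replication before the data are used in optimization.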
Developing Digital Twins for Materials Design: Digital twins, which are virtual replicas of physical systems, offer the potential to simulate and optimize materials and processes before physical experimentation. Integrating physics-based models with data-driven approaches can enhance the predictive capabilities of digital twins. For instance, combining finite element analysis with machine learning algorithms has been shown to effectively predict the properties of overmolded thermoplastic composites, facilitating process optimization.76,77
Advancing Surrogate Modeling Techniques: Traditional surrogate models, like Gaussian Processes (GPs), face challenges in handling the high-dimensional, nonlinear, and spatiotemporal data inherent in materials science. Emerging approaches, such as Multi-Fidelity Hierarchical Neural Processes (MF-HNP), offer scalable solutions by integrating data from multiple fidelity levels and capturing complex relationships. These models can improve predictive accuracy while managing computational costs. 78 Alternative surrogate models such as Random Forests and Bayesian Neural Networks (BNNs) have also been explored to address the limitations of GPs. Random Forests offer advantages in handling high-dimensional optimization problems, making them particularly suitable for complex AM applications.54,79 BNNs, meanwhile, provide a flexible modeling approach, capturing nonlinear relationships while incorporating uncertainty quantification.55,80 Employing ensemble modeling techniques, which integrate multiple surrogate models, could further enhance robustness and adaptability, providing a more effective framework for optimizing AM processes.32,54,81,82
Innovating Acquisition Functions for Bayesian Optimization: The effectiveness of Bayesian Optimization (BO) in SDMLs is influenced by the choice of acquisition functions. While functions like Expected Improvement (EI) and Upper Confidence Bound (UCB) are commonly used, they may not suffice for complex, constrained, or multi-objective problems. Advanced acquisition strategies, such as Expected Hypervolume Improvement (EHVI) for multi-objective optimization and Constrained Expected Improvement (CEI) for problems with feasibility constraints, have demonstrated improved performance in navigating complex design spaces.
Autonomous Hypothesis Generation and Novelty Discovery: The traditional structure of self-driving labs relies on a human domain expert to define the experimental objective and generate the initial hypothesis, thereby limiting the system to the optimization of a predefined target. The next critical evolution of the Autonomy Layer is the delegation of hypothesis generation and discovery to AI agents, transforming the SDML from a goal-directed optimizer into a more versatile AI scientist. Advanced systems like Kosmos 83 demonstrate this capability by performing iterative, coherent cycles of parallel data analysis, literature search, and hypothesis generation, using a structured world model to maintain research focus and ultimately produce novel scientific contributions. Simultaneously, the concept of Human-AI Collaborative (HAIC) 84 workflows is emerging, where Large Language Models (LLMs) act as “co-scientists” to generate testable hypotheses and refine experimental plans by engaging with human expertise between autonomous batches. Beyond simple optimization, AI can also be directed toward novelty discovery by integrating novelty scoring systems with strategic sampling mechanisms, enabling the system to explore under-sampled regions and enhance the likelihood of discovering previously unobserved physical phenomena in materials. 4
Distributed and decentralized SDMLs
As SDMLs become more complex, the underlying architecture must evolve. A critical distinction exists between distributed and decentralized control architectures in large-scale manufacturing and autonomous research laboratories. Distributed control systems typically rely on a central supervisory authority coordinating multiple task-specific nodes, whereas decentralized systems eliminate the central controller and instead rely on peer-to-peer coordination among autonomous agents. Prior studies have shown that decentralized control architectures offer superior robustness, scalability, and fault tolerance, making them particularly well-suited for intelligent and adaptive automation frameworks.38,85–88 Future development of large-scale SDMLs, especially in manufacturing environments that require parallel processing and high availability, will therefore necessitate a transition from the simpler distributed model to a decentralized architecture that can manage complexity, redundancy, and efficiency effectively.
In fully decentralized SDMLs, laboratories across different geographical locations collaborate, presenting an opportunity to leverage diverse resources and expertise.89–91 Such networks can enhance efficiency, enable rapid replication of experimental findings, and democratize the discovery process.92,93
Frameworks like LabLinking propose interconnecting experimental environments across institutions, allowing for time-synchronous execution of experiments and continuous exchange between scientists. 94 Similarly, the MULTITASK framework demonstrates how multi-agent laboratory control can facilitate collaborations across large facilities. 95
Implementing decentralized SDMLs requires robust cyberinfrastructure, standardized protocols, and secure data sharing mechanisms. Drawing parallels from initiatives like the Network for Earthquake Engineering Simulation (NEES), which connects laboratories via a centralized data repository and interactive simulation tools, can provide valuable insights. 96
Managerial and strategic implications
The adoption of the SDML framework has profound strategic and managerial implications that extend beyond technical performance. For laboratory directors and R&D managers, implementing this framework requires a shift in operational philosophy.
In summary, while SDMLs hold immense promise for revolutionizing materials design and manufacturing, addressing the outlined challenges in automation, autonomy, and workforce development is essential. Collaborative efforts across academia, industry, and government agencies will be pivotal in overcoming these hurdles and realizing the full potential of SDMLs.
Conclusion
This work presents a comprehensive framework for understanding and implementing Self-Driving Manufacturing Labs by delineating their two foundational pillars: automation and autonomy.
Looking forward, the evolution of SDMLs into truly cognitive systems will require concurrent advancements across all layers. In Automation, the focus must shift to standardizing communication protocols (APIs) for heterogeneous equipment, expanding robotic capabilities to encompass complex material pre- and post-processing, and integrating metal additive manufacturing into autonomous loops. On the Autonomy front, the critical directions include developing robust Digital Twins, creating advanced surrogate models capable of handling high-dimensional, nonlinear, and spatiotemporal data, and refining acquisition functions for multi-objective and constrained problems. Critically, future SDMLs will transition from human-guided optimization to autonomous hypothesis generation and novelty discovery. Furthermore, they will increasingly rely on decentralized architectures to facilitate large-scale, fault-tolerant collaboration across labs. Addressing the associated managerial challenges—from high initial investment to workforce upskilling—is essential for realizing the profound strategic benefits of accelerated discovery, optimal resource allocation, and advanced intellectual property generation.
Footnotes
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the Maine Economic Improvement Fund (award number 6250296) and the Maine Space Grant Consortium (award number 6410247). The authors gratefully acknowledge these supports.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
