Abstract
One of the challenges in applying automated chemistry workstations to problems of reaction optimization entails choosing an appropriate optimization algorithm. In the study described herein, 10 different algorithms have been examined for efficacy in searching reaction spaces using scenarios that explore effects of workstation parallelism and search space size. The algorithms differ in scheduling (serial vs. parallel), adaptive features (open loop vs. closed loop), and methods for stepping through the search space. Several two-tiered algorithms enable a breadth-first survey followed by an in-depth optimization. For a workstation with modest parallel capacity, a parallel but nonadaptive algorithm is most effective in small or coarse-grained search spaces, whereas parallel adaptive algorithms are superior for examining large or fine-grained search spaces. The parallel adaptive algorithms become increasingly effective as the size of the search space increases. A serial algorithm is most attractive with a serial workstation, or when chemical resources are limited regardless of workstation or search space. The breadth-first survey of the two-tiered algorithms significantly improves the efficiency of the subsequent in-depth optimization. The results obtained provide guidance in choosing optimization algorithms, designing more sophisticated algorithms, and developing workstations with parallel and/or adaptive features that use such algorithms.
Introduction
Reaction optimization is an integral part of experimental chemistry. Chemists who carry out reaction optimizations typically perform a lengthy search concerning the interplay of numerous variables in a given reaction (temperature, reagents, catalysts, concentration, solvent, time, etc.) in pursuit of conditions that afford increased yields. Efforts to mitigate the labor-intensive aspects of reaction optimization began in earnest more than 25 years ago with the development of automated single-batch reactors (principally at Roussel Uclaf, 1 Smith Kline and French, 2–5 Takeda Chemical Industries, 6–9 and Conservatoire National des Arts et Métiers 10,11), an effort that continues to this day. 12–14 The advent of laboratory robots some 20 years ago led to multireactor workstations, which engendered greater flexibility in sample handling and opened the possibility of extensive parallelism in carrying out chemical reactions. 15–19
Several early designs of both batch reactors and robotic systems included provisions for adaptive experimentation, in which the results of one operation or experiment are used as feedback to guide the course of subsequent experimentation. 2,6,16–18,20,21 Use of the well-known Simplex algorithm, 22 which provides for a sequential hill-climbing process, enables one reaction to be performed after another under successively modified conditions in pursuit of a higher reaction yield. Although attractive for adaptive experimentation, the inherently serial nature of the Simplex algorithm precluded exploitation of the parallelism available with automated multireactor systems. This early work in reaction optimization largely predated combinatorial chemistry and the demands of parallel synthesis. 23,24
The automated chemistry workstations that have been constructed over the past decade 25–31 have chiefly been designed for applications in high-throughput screening and combinatorial chemistry rather than reaction optimization. 32 The former systems place a premium on processing large numbers of samples, and the type of data-driven feedback that underlies adaptive experimentation apparently is either not needed or not readily obtainable. In many respects, the ability to gain information from multiple experiments at the same time (and undoubtedly the greater ease of building open-loop versus closed-loop automated systems) 23 has overshadowed the attributes of adaptive experimentation. Consequently, most current automated chemistry workstations include provisions for parallelism and few, if any, capabilities for adaptive experimentation.
Our work in automated chemistry has focused on (1) development of microscale multireactor automated chemistry workstations for use in fundamental studies and optimization of chemical reactions in domains of relatively clean chemistry 33–35 and (2) development of a software environment for diverse experimentation with such workstations. 36–45 A suite of experiment-planning modules enables scheduling of experiments in parallel as well as automated decision making concerning the data obtained from prior or ongoing experiments. The decision-making features support adaptive experimentation, either in a serial mode (e.g., the Simplex algorithm) or, more interestingly, in a parallel mode. Approaches that blend parallel and adaptive experimentation capabilities enable reaction optimization to be pursued aggressively. Altogether, 10 different algorithms were developed that could be applied to problems of reaction optimization.
During the course of developing approaches for blending parallel and adaptive experimentation, it became apparent that more information was needed concerning the efficacy of diverse optimization algorithms upon implementation in automated chemistry workstations. In this article, we provide a brief overview of the hardware and software architecture of the automated chemistry workstation. The features of each optimization module are summarized. The performance of the algorithms on fictive workstations is evaluated in scenarios that explore differences in the size of the search space and the parallel capabilities of the workstations. This work provides insight into the design and selection of appropriate algorithms for use in unconstrained searches, such as occur in studies of reaction optimization.
Results and Discussion
Automated Chemistry Workstation
The automated chemistry workstation hardware consists of a multivessel reaction station, a robot for moving samples and chemicals, analytical instruments, and other components. The software package includes features for composing experimental plans, managing resources (chemicals and containers), scheduling experiments, and evaluating data. 35–45 Individual experimental plans are composed by drawing on a set of commands that describe operations of the robotic arm and other hardware. A timing table automatically determines the duration of each command.
A schedule of one experimental plan lists the duration required for each of the actions of the robotic arm. Multiple experimental plans can be implemented in parallel through the use of a scheduler that offsets the start time of intact experimental plans and interleaves (in a comblike manner) the individual commands of the respective plans. In this manner, the total duration of the set of parallel experimental plans is generally compressed by up to 10-fold compared with that for serial implementation. 36
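The interleaving idea can be illustrated with a minimal Python sketch, assuming each plan is simply a list of (command, duration) pairs; the fixed offset, the command names, and the omission of conflict checking between plans are all simplifications and not features of the actual scheduler:

```python
# Minimal sketch of comb-like interleaving of intact experimental plans.
# Each plan is assumed to be a list of (command_name, duration_s) tuples;
# command names and the fixed offset are hypothetical, and no check is made
# for conflicting use of the robot at overlapping times.

def interleave(plans, offset_s=60):
    """Shift the start of each intact plan and merge commands by start time."""
    timeline = []
    for plan_index, plan in enumerate(plans):
        t = plan_index * offset_s       # offset the start time of the whole plan
        for command, duration in plan:
            timeline.append((t, plan_index, command, duration))
            t += duration               # relative times within a plan are unchanged
    return sorted(timeline)             # composite schedule ordered by start time

plan_a = [("add_reagent", 30), ("incubate", 600), ("sample", 45)]
plan_b = [("add_reagent", 30), ("incubate", 600), ("sample", 45)]
for start, idx, cmd, dur in interleave([plan_a, plan_b]):
    print(f"t={start:>4} s  plan {idx}  {cmd} ({dur} s)")
```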
An outline of the flow of information in the workstation is provided in Fig. 1. When a set of experimental plans is to be implemented, the plans are first examined for resource availability, including chemicals (solvents, reactants, reagents) and containers (reaction vessels, storage vials). The resource manager divides the experiments into those for which resources are available (resource sufficient; ready to be implemented) and those that must await resources (resource insufficient; waiting). The experiments for which resources are sufficient are then passed to the scheduler for parallel implementation to the greatest extent possible. As additional resources are made available, waiting experiments can be placed in queue for scheduling.

Flow diagram illustrating the throughput of experiments in an automated chemistry workstation. (A) The scientist works with the experiment planner to compose an experimental plan. (B) Each experimental plan consists of a list of commands, including simple directives or conditionals that depend on experimental data. (C) A resource manager tabulates the total resource demands, including chemicals (reagents, reaction solvents, solvents used by instruments) and containers (reaction vessels, sample vials, etc.). The schedule is separated into executable experiments and experiments awaiting resources. (D) The scheduler renders the executable experiments in parallel to the extent possible. (E) The scheduled experiments are passed to the automated chemistry workstation. (F) Data are generated from analytical instruments. (G) In open-loop experimentation, in which no decisions are made automatically about ongoing or planned experiments, the data are combined in an output file. (H) In closed-loop experimentation, in which decisions are made without user intervention about ongoing or planned experiments, the data are passed to the evaluation unit of the experiment-planning module. (I) The data are evaluated in the context of the scientific objective as stated in the experimental plan. (J) Depending on the results of the evaluation, ongoing experiments can be terminated or altered, pending experiments can be expunged from the queue, or new experiments can be spawned. (K) The results from open-loop and closed-loop experiments are available for review by the scientist.
The workstation is equipped for open-loop experimentation as well as closed-loop experimentation. In open-loop experimentation, no decisions are made on the basis of data from previous or ongoing experiments. By contrast, in closed-loop experimentation, data are evaluated automatically, whereupon ongoing experiments can be altered and future experiments can be pruned, altered, or spawned in accord with the scientific objective stated at the outset by the user. 37,38 Closed-loop capabilities are essential for adaptive experimentation. Although decision making can involve the local course of individual experiments and/or the global course of sets of experiments, for the studies herein, the individual experiments are allowed to proceed as initially planned, and only the global course of experimentation is altered.
Optimization Algorithms
Ten distinct algorithms are available for reaction optimization. Three of the algorithms employ two-tiered searches (vide infra). The seven single-tiered algorithms are illustrated in Fig. 2 and are described as follows:

Schematic illustration of seven optimization algorithms. (A) Factorial design/grid search (FD/GS). All 25 points are examined in the 5 × 5 grid shown. (B) Successive focused grid search (SFGS). The example provided starts with a 5 × 5 grid in the first cycle. After evaluation of all 25 experiments, the search continues with a second cycle that is half the size of the search space and focused on the optimum region. (C) Composite modified Simplex (CMS). The simplexes illustrated originate from initiation (1), reflection (2, 4, and 6) or expansion (3, 5). (D) Multidirectional search with mandatory points only (MDS-mo). Four cycles of experimentation are shown, illustrating initiation (1), reflection (2), expansion (3), and reflection (4). (E) Multidirectional search (MDS). One cycle is shown that consists of 3 mandatory points and 12 exploratory points; the latter are formed by reflection and expansion. (F) Parallel Simplex search (PSS). Two cycles are shown of a singly partitioned search space, wherein four CMS searches are performed in parallel. (G) Parallel multidirectional search (PMDS'). Two cycles are shown of a singly partitioned search space, wherein four MDS-mo searches are performed in parallel.
The factorial design or grid search (FD/GS) module enables a comprehensive investigation of all levels of each variable or factor. 39 The experiments examine the points in a regular grid and can be scheduled in parallel. The result of a single experiment (i.e., examination of one grid point) has no effect on the other experiments. Termination of the search depends only on the complete evaluation of all points that form the grid.
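As a simple illustration of the nonadaptive character of FD/GS, the following sketch enumerates a regular grid and evaluates every point independently; the run_experiment callable stands in for a workstation run, and the response function is the surface of Fig. 4:

```python
import itertools

# Minimal sketch of a factorial design/grid search (FD/GS): every grid point
# is an independent experiment, so all points can be scheduled up front and
# no result influences any other experiment.

def fdgs(levels_per_factor, run_experiment):
    points = list(itertools.product(*levels_per_factor))   # e.g., 5 x 5 = 25 points
    responses = {p: run_experiment(p) for p in points}      # no feedback between points
    best = max(responses, key=responses.get)
    return best, responses

# Example: 5 levels of x and 5 levels of y on the unit square, evaluated
# against the response surface of Fig. 4.
levels = [[i / 4 for i in range(5)], [i / 4 for i in range(5)]]
surface = lambda p: 100 * (1 - (p[0] - 0.75) ** 2 - (p[1] - 0.6) ** 2)
print(fdgs(levels, surface)[0])
```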
The successive focused grid search (SFGS) examines a regular grid similar to FD/GS and then examines a second grid of smaller size that is situated around the optimal region identified in the prior grid. This process is repeated until a termination criterion is satisfied. 41
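A minimal sketch of the focusing step follows; the halving factor, grid density, and termination width are illustrative values rather than the settings used in the simulations:

```python
# Minimal sketch of the successive focused grid search (SFGS): after each grid
# is evaluated, a smaller grid is centered on the best point found so far.
# The shrink factor and stopping width are illustrative values.

def sfgs(objective, center=(0.5, 0.5), width=0.75, points_per_dim=5,
         shrink=0.5, min_width=0.05):
    best_point, best_response = None, float("-inf")
    while width >= min_width:
        step = width / (points_per_dim - 1)
        grid = [(center[0] - width / 2 + i * step,      # grid centered on the
                 center[1] - width / 2 + j * step)      # current focus region
                for i in range(points_per_dim) for j in range(points_per_dim)]
        for point in grid:
            response = objective(point)
            if response > best_response:
                best_point, best_response = point, response
        center, width = best_point, width * shrink      # focus on the optimum region
    return best_point, best_response

surface = lambda p: 100 * (1 - (p[0] - 0.75) ** 2 - (p[1] - 0.6) ** 2)
print(sfgs(surface))
```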
The composite modified Simplex (CMS) 46 uses the Simplex method, accompanied by a host of modifications (enabling expansion, contraction, fit to boundary, etc.). 37 A simplex contains n + 1 points for the initial cycle, where n is the number of dimensions of the search space. After evaluation of all points, the one worst point of the simplex is discarded, and a new simplex is created by reflection away from the worst point. Each cycle after the first cycle entails performing a single experiment; thus, the Simplex search is inherently serial. The search continues until a termination criterion is satisfied.
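The core Simplex move can be sketched as reflection of the worst vertex through the centroid of the remaining vertices; the expansion, contraction, and boundary-handling modifications of CMS are omitted here:

```python
# Minimal sketch of one Simplex move: discard the worst vertex and reflect it
# through the centroid of the remaining vertices. The CMS module adds
# expansion, contraction, and fit-to-boundary steps that are not shown here.

def reflect_worst(simplex, responses, alpha=1.0):
    """simplex: list of n+1 points; responses: corresponding yields."""
    worst = min(range(len(simplex)), key=lambda i: responses[i])
    kept = [p for i, p in enumerate(simplex) if i != worst]
    centroid = tuple(sum(coords) / len(kept) for coords in zip(*kept))
    reflected = tuple(c + alpha * (c - w)
                      for c, w in zip(centroid, simplex[worst]))
    return kept + [reflected], reflected        # only one new experiment per cycle

simplex = [(0.2, 0.2), (0.4, 0.2), (0.3, 0.4)]
yields = [40.0, 55.0, 62.0]
print(reflect_worst(simplex, yields)[1])        # the single new point to run
```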
The multidirectional search with mandatory points only (MDS-mo) module, based on the work of Torczon 47 and Dennis and Torczon, 48 enables directed evolutionary searches in parallel. 40 A simplex is first created (mandatory points). Upon evaluation of the initial simplex, all points but the one best point are discarded, and a reflection is performed in the direction away from the poor responses, which gives the next simplex. Subsequent movement (reflection, expansion, contraction, and so on) proceeds through the search space until a termination criterion is satisfied.
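In contrast with the single reflected point of the Simplex, one MDS-mo step generates several new points at once, all of which can be run in parallel; a minimal sketch (omitting expansion and contraction scaling) is:

```python
# Minimal sketch of one multidirectional search step (MDS-mo): keep only the
# best vertex and reflect every other vertex through it, so that all of the
# new points can be evaluated in parallel. Expansion and contraction are omitted.

def mds_step(simplex, responses):
    best = max(range(len(simplex)), key=lambda i: responses[i])
    pivot = simplex[best]
    reflected = [tuple(2 * b - x for b, x in zip(pivot, point))
                 for i, point in enumerate(simplex) if i != best]
    return [pivot] + reflected                  # next simplex; new points run in parallel

simplex = [(0.2, 0.2), (0.4, 0.2), (0.3, 0.4)]
yields = [40.0, 55.0, 62.0]
print(mds_step(simplex, yields))
```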
The multidirectional search (MDS) module projects both mandatory points and exploratory points. The initial simplex constitutes the mandatory points. In anticipation of subsequent possible movements, exploratory points are projected along the axes of the initial simplex. Additional exploratory points can be projected to the extent that resources and workstation parallelism permit. Upon evaluation of the projected points, the search is refocused with a new set of mandatory and exploratory points and continues in this manner until a termination criterion is satisfied. 40
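One way to realize the projection of exploratory points is sketched below; the scheme shown (pre-projecting the reflections that would follow from each vertex turning out to be best, capped by workstation capacity) is illustrative and simpler than the reflection-and-expansion projections shown in Fig. 2E:

```python
# Minimal sketch of projecting mandatory and exploratory points for MDS.
# Exploratory points are projected before any responses are known, here by
# anticipating each vertex of the simplex turning out to be the best one;
# the number kept is capped by the parallel capacity of the workstation.

def project_points(simplex, capacity):
    mandatory = list(simplex)
    exploratory = []
    for b, pivot in enumerate(simplex):
        for i, point in enumerate(simplex):
            if i != b:
                exploratory.append(tuple(2 * c - x for c, x in zip(pivot, point)))
    return mandatory, exploratory[:max(0, capacity - len(mandatory))]

simplex = [(0.2, 0.2), (0.4, 0.2), (0.3, 0.4)]
mandatory, exploratory = project_points(simplex, capacity=10)
print(len(mandatory), len(exploratory))         # 3 mandatory, 6 exploratory here
```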
The parallel Simplex search (PSS) module enables multiple CMS searches to be performed in parallel. 42 The search terminates when one simplex converges (or all simplexes converge) on an optimum.
The parallel multidirectional search (PMDS') module enables multiple MDS-mo searches to be performed in parallel. PMDS' is to MDS-mo what PSS is to CMS. 49 The search continues until one simplex converges (or all simplexes converge) on an optimum.
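The partitioned-search idea common to PSS and PMDS' can be sketched as follows, with the unit square split once in each dimension and an independent local search seeded in each subregion; the local_search placeholder stands in for a CMS or MDS-mo search:

```python
# Minimal sketch of a singly partitioned parallel search (as in PSS/PMDS'):
# the unit square is split into four subregions and one simplex-based search
# is started in each. 'local_search' is a placeholder for CMS or MDS-mo.

def partition_unit_square():
    return [((x0, y0), (x0 + 0.5, y0 + 0.5))
            for x0 in (0.0, 0.5) for y0 in (0.0, 0.5)]

def parallel_search(local_search, objective):
    results = []
    for (x0, y0), (x1, y1) in partition_unit_square():
        seed = ((x0 + x1) / 2, (y0 + y1) / 2)   # seed each search at the subregion center
        results.append(local_search(objective, seed))
    return max(results, key=lambda r: r[1])     # best (point, response) over all searches

surface = lambda p: 100 * (1 - (p[0] - 0.75) ** 2 - (p[1] - 0.6) ** 2)
dummy_search = lambda f, seed: (seed, f(seed))  # placeholder: evaluates the seed only
print(parallel_search(dummy_search, surface))
```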
To focus searches more effectively, we added features for two-tiered searching as illustrated in Fig. 3. 43,44 In this mode, a broad survey is first performed (tier 1) to identify the best region of the search space. In tier 2, an in-depth optimization is performed. The latter is performed with the CMS, MDS, or PSS algorithms.

Two-tiered experimentation. A breadth-first survey identifies the appropriate starting point for a subsequent in-depth optimization with the CMS, MDS, or PSS algorithm.
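A minimal sketch of the two-tiered flow is given below: a coarse survey grid in tier 1 followed by an in-depth search seeded at the best survey point in tier 2. The survey density is illustrative, and in_depth_search stands in for the CMS, MDS, or PSS module:

```python
# Minimal sketch of two-tiered experimentation: a breadth-first survey (tier 1)
# identifies the most promising starting point, and an in-depth optimization
# (tier 2) is launched from that point. 'in_depth_search' stands in for the
# CMS, MDS, or PSS module; the survey density is illustrative.

def two_tiered(objective, in_depth_search, survey_points_per_dim=4):
    step = 1.0 / (survey_points_per_dim - 1)
    survey = [(i * step, j * step)
              for i in range(survey_points_per_dim)
              for j in range(survey_points_per_dim)]
    best_start = max(survey, key=objective)         # tier 1: breadth-first survey
    return in_depth_search(objective, best_start)   # tier 2: in-depth optimization

surface = lambda p: 100 * (1 - (p[0] - 0.75) ** 2 - (p[1] - 0.6) ** 2)
dummy_in_depth = lambda f, seed: (seed, f(seed))    # placeholder for CMS/MDS/PSS
print(two_tiered(surface, dummy_in_depth))
```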
The features of the 10 experiment-planning modules are listed in Table 1. The CMS module performs serial, adaptive experiments (though multiple experiments are performed in parallel in the first cycle). The FD/GS module affords parallel, nonadaptive experimentation. All other algorithms afford parallel adaptive experimentation. Each adaptive algorithm constitutes a direct search method (i.e., nonderivative) for optimization. 50 We now turn to evaluate the performance of the various algorithms.
Features of the experiment-planning modules
Parallel refers to more than one experiment proceeding at the same time. Serial refers to only one experiment proceeding at a given time; thus, multiple experiments must be implemented in succession.
An adaptive process is one in which the search process is altered on the basis of data collected during experimentation.
Planning refers to the extent to which experiments can be dictated in advance of experimentation.
This module can be used in a two-tiered optimization in which a broad survey is performed first to identify the most promising starting point in the same search space.
The n + 1 experiments in the first cycle (corresponding to the first simplex) can be run in parallel; all subsequent experiments are performed serially.
Algorithm Evaluation
Three scenarios were employed to examine the performance of the optimization modules as a function of the increasing parallel capabilities of the workstation and as a function of increasingly large search spaces. The effect of increasing search space size was examined by altering the grain size of the search space. The key performance metrics are the total number of experiments performed, the total elapsed time, and the robot utilization. In all but the final simulation, the search space for the optimizations consists of two dimensions. The response surface is shown in Fig. 4.

Response surface for a two-dimensional search space generated by the equation R = [1 − (x − 0.75) × (x − 0.75) − (y − 0.6) × (y − 0.6)] × 100.
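Written as a function, the response surface of Fig. 4 has its maximum response of 100 at x = 0.75, y = 0.6:

```python
# Response surface of Fig. 4 for the two-dimensional search space.
def response(x, y):
    return (1 - (x - 0.75) ** 2 - (y - 0.6) ** 2) * 100

print(response(0.75, 0.6))   # 100.0 (the optimum)
print(response(0.0, 0.0))    # 7.75 (a point far from the optimum)
```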
Experimental template a
For uniformity across the various simulations, the duration of each addition and transfer operation was fixed to the respective value indicated above regardless of the volumes required at different locations in the search space.
Time in hours:minutes:seconds.
The maximum number of experiments that can be implemented in parallel depends both on the experimental template employed and the capabilities of the workstation. With the template shown in Table 2 and our current workstation, 35 a maximum of 44 experiments can be implemented in parallel. To explore the effects of changes in parallel capabilities of fictive workstations, we have adopted the timing and mechanical features of our current workstation 35 but artificially adjusted one of the resource parameters (e.g., number of reaction vessels) to a specific value to limit the capacity (i.e., number of parallel experiments). The value ranged from 1 to 100 for fictive workstations, with capacity ranging from serial (1 experiment at a time) to highly parallel (up to 100 experiments possible). Note that all fictive workstations with capacity set at values >44 were then constrained in this case by the experimental template, which limited the maximum number of parallel experiments to 44.
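The capacity constraint can be expressed in a small sketch: the effective batch size of a fictive workstation is the smaller of its capacity setting and the maximum imposed by the experimental template (44 with the template of Table 2). The function below is purely illustrative:

```python
# Illustrative sketch: the effective parallelism of a fictive workstation is
# limited both by its capacity setting and by the experimental template,
# which here caps parallel experiments at 44.

TEMPLATE_MAX_PARALLEL = 44

def batches(experiments, capacity):
    effective = min(capacity, TEMPLATE_MAX_PARALLEL)
    return [experiments[i:i + effective]
            for i in range(0, len(experiments), effective)]

print(len(batches(list(range(100)), capacity=1)))    # 100 batches (serial)
print(len(batches(list(range(100)), capacity=10)))   # 10 batches of 10
print(len(batches(list(range(100)), capacity=100)))  # capped at 44 -> 3 batches
```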

Gantt chart showing scheduled experiments in scenario 1. The marks indicate robot activity. The relative times of implementation of robotic motions for a given experiment are not changed but the start times are shifted to accomplish interleaving of commands in distinct experimental plans. The composite schedule of all individual experiments is displayed at the top of the Gantt chart. (A) Serial implementation. (B) A workstation capable of 10 experiments in parallel. (C) A workstation capable of 44 experiments in parallel (the maximum achievable with the template employed).
In SFGS, the initial grid size was set at 75% of the search space. The results are summarized in Table 3.
Results from Scenario 1: performance of optimization modules with workstations of different capacity
The convergence criterion is based on the difference of responses satisfying a minimum tolerance value (0.1% for each simplex-based module; 0.5% for SFGS).
The number of cycles of experimentation in each search.
The total number of experiments in all cycles.
A workstation restricted to serial searches.
A workstation equipped for 10 experiments in parallel.
A workstation equipped for 44 experiments in parallel.
Three experiments were not complete (nc) when the search was stopped.
The notable results are as follows:
For a serial workstation, the fastest search is provided by CMS, which (after the first cycle) performs only one experiment per cycle and thus projects the fewest experiments. The slowest searches result from those algorithms that afford the least amount of adaptive experimentation, namely FD/GS and SFGS. The algorithms that afford a blend of parallel and adaptive experimentation give searches of intermediate speed.
As the workstation is equipped with increasingly parallel capabilities (to 10 or 44 experiments in parallel), the open-loop FD/GS (and related SFGS) algorithm affords a dramatic increase (≈20-fold) in search speed. Essentially no change is observed with the inherently serial CMS algorithm. The parallel adaptive algorithms afford speed improvements of approximately two- to fivefold.
Considering the total number of experiments performed during the searches, SFGS was most profligate (200 experiments), whereas CMS was most parsimonious (22 experiments). The PSS, MDS-mo, and PMDS' algorithms each performed 40–60 experiments, regardless of available workstation processing power. The number of experiments performed with the MDS module increased, as expected, with the availability of increasingly parallel capabilities: from 42 for the serial workstation to 115 for the workstation capable of 44 experiments in parallel.
Other modes of comparison are possible, such as (1) robot utilization or (2) the number of validations of the optimum response that a search provides. Although an experimentalist may be concerned only with a given search, searches performed with lower levels of robot utilization leave the workstation available for other searches. Some of the searches provided two (SFGS) or three (PSS) validations, whereas the others generally provided only one validation (data not shown). In summary, the suitability of a particular algorithm depends on the parallel capabilities of the workstation, the overall scientific objectives, and the real-world constraints encountered by the experimentalist. The latter include the urgency of the search and the available chemical resources.
The results for the 10 optimization algorithms are listed in Table 4. The notable results are as follows:
Results from Scenario 2: performance of optimization modules on increasingly fine-grained search spaces with a modestly parallel workstation (10 experiments in parallel) a
All search features are as described in Table 3.
The indicated number of experiments were not complete (nc) when the search was stopped.
For the coarsest grain size, the fastest search was provided by the FD/GS module. The slowest search was observed with the algorithm having the least parallelism, namely CMS. The difference between FD/GS and CMS was approximately sixfold (4 h vs. 23 h), although the number of experiments was essentially identical (25 vs. 24). The parallel adaptive algorithms provided searches of intermediate speed (≈10–16 h) and generally required approximately twice as many experiments. However, each of the simplex-based algorithms (CMS, PSS, MDS, MDS-mo, PMDS') has the ability to contract, whereupon a more fine-grained search results. The focusing ability of the latter algorithms enables identification of the optimum region of the response surface more readily than the fixed, coarse-grained search provided by FD/GS.
For the finest grain size, the fastest search among the single-tiered algorithms was provided by MDS (≈10.5 h). The slowest search was observed with FD/GS or SFGS, each of which required 1300 h. Indeed, each of the simplex-based searching modules gave vastly superior speeds (<50 h) compared with that of the FD/GS or SFGS searches. Moreover, the number of experiments performed by the simplex-based searching modules was <100, with a minimum achieved by MDS-mo (38 experiments), to be compared with 10,000 experiments for FD/GS. The increased speed and fewer experiments provided by the simplex-based algorithms stem from their evolutive nature as well as the ability to expand the simplex size (and therefore move more rapidly) as the search evolves.
Regardless of search-space grain size, the inclusion of a breadth-first survey to identify the most promising region of the search space followed by in-depth optimization (CMS, MDS-mo, MDS) resulted in a faster search in every case. Indeed, the fastest search overall in all search spaces except the most coarse-grained was provided by the two-tiered search with MDS as the second-tier optimization algorithm.
Consider the performance of a given single-tiered algorithm as the search space becomes more fine-grained. The duration of the FD/GS search increased by 325-fold upon the 20-fold change in grain size in the two-dimensional space. On the other hand, the duration of the CMS search increased by twofold, whereas that of the searches provided by the PSS and MDS algorithms remained almost unchanged. In terms of number of experiments performed, the search provided by the FD/GS module resulted in an increase of 400-fold, totaling 9975 additional experiments. On the other hand, the number of experiments with the simplex-based searches increased by less than twofold for a given algorithm, ranging from 24–59 total experiments at the 20% grain size to 38–91 at the 1% grain size. Indeed, the search provided by the PSS algorithm was completed in essentially the same duration and required only one additional experiment at the 1% versus 20% grain size.
An analogous search using the simplex-based modules was performed on a three-dimensional surface. The search space grain size was 4.54% (22 points per dimension), which would correspond to 10,648 points in the three-dimensional space if every point in the regularly gridded space were examined. The results are shown in Table 5. On the basis of search duration, the rank order of the single-tier modules was identical to that in the corresponding two-dimensional search, as was the rank order of the two-tier modules. These quantitative data provide the basis for evaluation of the performance of diverse search algorithms in challenging search spaces.
Results from Scenario 3: performance of optimization modules on challenging search spaces with a relatively efficient workstation (44 experiments in parallel) a
All search features are as described in Table 3.
The grain size was 1.0% (100 points per dimension).
The grain size was 4.54% (22 points per dimension). The equation for the three-dimensional surface was R = [1 − (x − 0.75) × (x − 0.75) − (y − 0.6) × (y − 0.6) − (z − 0.7) × (z − 0.7)] × 100. The experimental template included seven commands and a delay of 1:26:00 after reagent additions until analysis.
Conclusions
Automated chemistry workstations suitably designed for investigation of chemical reactions can enable rapid optimization of reaction conditions. The question of which optimization algorithm to employ depends on the availability of chemical resources, the urgency in obtaining results, and the scientific objectives of the search. Ten optimization algorithms were examined. With a coarse-grained (or small) search space and a workstation with modest parallel capabilities, the parallel but nonadaptive FD/GS module provides the most rapid search. With a serial workstation, the traditional serial, adaptive CMS algorithm provides the most rapid search. With a more fine-grained (or large) search space, the parallel, adaptive algorithms (MDS, PMDS', PSS) provide the most rapid searches. The two-tiered algorithms appear generally beneficial regardless of search size or parallel capabilities. The findings presented herein should be valuable in choosing or modifying algorithms for examination of search spaces and in guiding the designs of software and hardware features in workstations for automated experimentation.
Acknowledgment
The authors thank Sumitomo Chemical Co., Ltd. (Osaka, Japan) for support of this work.
