Abstract
We study the effect of latency arbitrage on allocative efficiency and liquidity in fragmented financial markets. We employ a simple model of latency arbitrage in which a single security is traded on two exchanges, with price quotes available to regular traders only after some delay. An infinitely fast arbitrageur reaps profits when the two markets diverge due to this latency in cross-market communication. Using an agent-based approach, we simulate interactions between high-frequency and zero-intelligence trading agents. From simulation data over a large space of strategy combinations, we estimate game models and compute strategic equilibria in a variety of market environments. We then evaluate allocative efficiency and market liquidity in equilibrium, and we find that market fragmentation and the presence of a latency arbitrageur reduce total surplus and negatively impact liquidity. By replacing continuous-time markets with periodic call markets, we eliminate latency arbitrage opportunities and achieve further efficiency gains through the aggregation of orders over short time periods.
Introduction
The predominantly electronic infrastructure of the U.S. stock market has come under intense scrutiny in recent years, during which several major technology-related disruptions have roiled the markets. In August 2013, for example, an overflow of market quotes caused a three-hour halt in trading at Nasdaq (De La Merced, 2013) and, in a separate incident, Goldman Sachs unintentionally flooded U.S. exchanges with a large number of erroneous stock-option orders (Gammeltoft and Griffin, 2013). Nasdaq’s computer systems were similarly overwhelmed during the Facebook IPO on May 18, 2012, when a surge in order cancellations and updates delayed the opening of the shares for trading (Mehta, 2012). Even more disruptive were the massive losses incurred by Knight Capital due to software misconfiguration in August 2012 (Securities and Exchange Commission, 2013) and the so-called “Flash Crash” of May 6, 2010, during which the Dow Jones Industrial Average exhibited its largest single-day decline (approximately 1,000 points) (Bowley, 2010).
These episodes of market turbulence are symptomatic of today’s trading landscape, a fragmented and complex system of interconnected electronic markets that compete with each other for order flow. There are over 40 trading venues for stocks in the U.S. alone (O’Hara and Ye, 2011). The majority of activity on these markets comes from
In trading,
Trading on these latency advantages has been estimated to account for $21 billion in profit per year (Schneider, 2012). High-frequency traders gain latency advantages through various means. One method is co-location, in which HFT firms pay a premium to place their computers in the same data center that houses an exchange’s servers. Many HFT firms also pay for direct data feeds in order to receive market data and market-moving information faster than non-HF investors. However, firms may spend millions of dollars to build a new, faster communication line only to be made obsolete by technology improvements that shave off additional milliseconds. One example of this rapid antiquation is Spread Networks’ fiber optic cable, which was deprecated less than two years after its completion by the introduction of a network reliant on microwave beams through air (Adler, 2012). According to estimates by the Tabb Group, firms spent approximately $1.5 billion in 2013 on technology to reduce latency (Patterson, 2014).
The HFT strategy we examine here is
We illustrate this process and the potential for latency arbitrage in Fig. 1. Given order information from exchanges, the SIP takes some finite time, say

Exploitation of latency differential. Rapid processing of the order stream enables private computation of the NBBO before it is reflected in the public quote from the SIP.
The latency arms race as sketched above is fundamentally an outgrowth of
More broadly, we seek to understand not only the effects of latency arbitrage on market efficiency and liquidity, but also the interplay between fragmentation, clearing mechanisms, and latency arbitrage strategies in producing this performance. Such questions about HFT implications are inherently computational, as the very speed of operation renders details of internal market operations, especially the structure of communication channels, systematically relevant to market performance. In particular, the latencies between market events (transactions, price updates, order submissions) and when market participants observe these activities become pivotal, as even the smallest latency differential can significantly affect trading outcomes.
Previous work on the effects of high-frequency trading and market structure has relied primarily on either analytical models or examination of historical order and transaction data. Historical market data alone is insufficient as it cannot be used to answer counterfactual questions about the impact of modifying strategies or market rules. Analytical models, on the other hand, can capture essential aspects of market structure, but would require stifling complexity to specify the interactions between multiple entities or the precise timing of event occurrences (such as the propagation of information between markets and participants), at which point a closed-form solution or any other reasoning would be rendered infeasible or otherwise unhelpful. Lacking suitable data to study these questions empirically, we pursue a simulation approach.
We present a simple model that captures the effect of latency across two markets with a single security. Our model captures the interplay of latency and fragmentation as well as the regulatory environment responsible for current equity market structure, and we quantify the effect of latency arbitrage on surplus allocation as a function of latency and market rules. Using an agent-based approach, we implement our two-market model in a discrete-event simulation system that explicitly models the communication patterns between background investors, exchanges, and the SIP operating in current U.S. equity markets. We simulate the interactions between high-frequency and background traders, and we employ empirical game-theoretic analysis to identify equilibria under different market conditions.
We are primarily interested in the impact on efficiency of three different market features: presence of latency arbitrage, market fragmentation, and switching to discrete-time market clearing. Therefore, we compare allocative efficiency in equilibrium in our two-market model with equilibrium welfare in other models of market structure, including a consolidated continuous double auction market and a frequent call market. Our main finding is that in most of the model settings studied, latency arbitrage not only reduces profits of the background investors, but also diminishes surplus overall—even when the profits of LA are counted. Perhaps surprisingly, market fragmentation per se does not harm efficiency; in fact some degree of fragmentation mitigates the inefficient trades that are often executed by a continuous mechanism. The discrete-time frequent call market eliminates latency arbitrage by construction and, by virtue of temporal aggregation, yet more effectively matches orders, producing significantly greater surplus.
This study extends and supersedes our previous work (Wah and Wellman, 2013). In that study, traders employed a fixed strategy for all market configurations and latency settings. The analysis presented here employs empirical game-theoretic methods to perform strategy selection for traders. The qualitative conclusions presented in the original study still hold; the results we report here serve to confirm those main points in a more extensive and strategically robust evaluation.
This paper is structured as follows. In Section 2, we discuss related work on agent-based financial markets and models of HFT and market structure. We present the general framework for our agent-based financial market models in Section 3. We describe our two-market model in Section 4, the computational approach we employ in Section 5, and our experimental setup in Section 6. In Section 7 we present our results, and we conclude in Section 8.
Agent-based financial markets
There is a substantial literature on agent-based modeling (ABM) of financial markets (Buchanan, 2009; Chen et al., 2012; Farmer and Foley, 2009; LeBaron, 2006), much of it geared to reproduce and thereby explain stylized facts from empirical studies of market behavior. For example, simulated markets have been constructed to reproduce phenomena observed in real stock markets, such as bubbles and crashes (LeBaron et al., 1999; Lee et al., 2011). Because agent behavior is shaped by the market environment, which includes interactions with other agents over time, such models can support causal reasoning (as in the study by Thurner et al. (2012) establishing the effect of leverage on price volatility). One prominent example of an agent-based financial market is the Santa Fe artificial stock market (Palmer et al., 1994; LeBaron, 2004). ABM has also been used to model financial markets for applications such as portfolio selection (Jacobs et al., 2004) and determining the distributions of order and trading waiting times in a limit order book (Raberto and Cincotti, 2005).
High-frequency trading models
Much of the current literature on the effects of HFT relies on the evaluation of historical order data. Hasbrouck and Saar (2013) use Nasdaq order data to construct sequences of linked messages describing trading strategies. They find that this low-latency activity improves short-term volatility, spreads, and market depth. Brogaard (2010) analyzes a 120-stock Nasdaq dataset that distinguishes HFT from non-HFT activity in order to assess the impact of high-frequency trading on liquidity, price discovery, and volatility. Prior work suggests that algorithmic trading improves liquidity (Hendershott et al., 2011); Angel et al. (2011) reach similar conclusions, finding that the emergence of automated trading and HFT has improved various market measures such as execution speed and spreads. Additional work suggests a link between HFT and increased volatility (Arnuk and Saluzzi, 2012). Foucault et al. (2015) examine latency arbitrage opportunities in currency markets, and provide evidence of a tradeoff between pricing efficiency and liquidity. In another study, Baron et al. (2012) find that some kinds of HFT activities directly harm ordinary investors.
Others rely on theoretical analysis to determine the optimal behavior of high-frequency traders. Avellaneda and Stoikov (2008) derive an optimal limit order submission strategy for a single high-frequency trader acting as a liquidity provider, running numerical simulations to assess the agent’s performance under varying strategies. Cohen and Szpruch (2012) propose a single-market model of latency arbitrage with one limit order book and two investors operating at different speeds. The fast trader employs a strategy that determines in advance the quantity the slow investor intends to trade, using this information to generate a risk-free profit. Jarrow and Protter (2012) develop a model of traders with differentials in speed and access to information, showing that HFT transactions can degrade price discovery, exacerbate volatility and increase mispricings—which HF arbitrageurs can then exploit.
In a rare application of ABM to HFT, Hanson (2012) finds that market liquidity and total surplus vary directly with the number of HF traders.
Modeling market structure and clearing rules
Several prior works seek to identify the effects of market fragmentation and clearing rules, mainly via anecdotal evidence elicited from historical data. On the theoretical side, Mendelson (1987) investigates the effect of consolidation versus fragmentation of periodic call markets, without consideration of arbitrage between the submarkets. O’Hara and Ye (2011) use historical quote data and execution metrics to demonstrate that market fragmentation does not appear to harm measures such as spreads, execution speed, and efficiency. Bennett and Wei (2006) compare the execution costs of stocks that have switched from Nasdaq to the more consolidated NYSE, finding evidence that execution costs decline with order flow consolidation. Amihud et al. (2003) examine the response of equities on the Tel Aviv Stock Exchange to the exercise of corporate warrants, concluding that consolidation improves liquidity.
However, few prior studies attempt to directly model the communication latencies arising from market fragmentation and the resultant arbitrage opportunities, with the exception of Ding et al. (2014), who analyze NBBO latencies and the ability of HFTs to generate a synthetic NBBO. They conclude that price dislocations between the official and synthetic NBBOs can be exploited by HFTs for profit.
Switching to a discrete-time clearing mechanism, as in a frequent call market, has already been proposed as a means to eliminate the exploitation of latency differentials across multiple exchanges (Wellman, 2009; Schwartz and Peng, 2013; Sparrow, 2012). Budish et al. (2013) analyze a theoretical model of a continuous limit order book, showing that HFT profits in equilibrium come from investors via wider spreads and that frequent batch auctions reduce the value of very small speed advantages. Others have proposed variants on the frequent call market with randomized clearing intervals (Sellberg, 2010; Industry Super Network, 2013), or randomized batching in conjunction with pro rata trade allocation rules, which may promote more equitable allocation of trades among investors (Farmer and Skouras, 2012; McPartland, 2013).
A number of other studies have focused not on the role of call markets in mitigating the harmful effects of HFT, but on the differences in market quality offered in a discrete-time versus a continuous market (Pancs, 2013; Pellizzari and Dal Forno, 2007) or an alternative market rule such as selective delay, in which cancellation orders are processed immediately but all other order types have a small delay (Baldauf and Mollner, 2014).
Empirical work on the effects of switching to periodic clearing is limited and again relies largely on the analysis of historical events (Webb et al., 2007; Kalay et al., 2002). For example, Amihud et al. (1997) find that switching from a daily call auction to a combination of discrete and continuous trading in the Tel Aviv Stock Exchange is associated with improvements in liquidity.
Agent-based financial market models
In this section, we present our general framework for constructing computational agent-based financial market models. We focus on two types of markets in this study. The
Our market models are populated by
Market clearing mechanisms
The continuous double auction is a simple and standard two-sided market that forms the basis for most financial and commodities markets (Friedman, 1993). Agents submit bids, or
An alternative to continuous trading is a
A frequent call market effectively eliminates the latency advantages of HFTs by hiding all submitted orders within each clearing interval, as in a sealed-bid auction. The removal of time priority within each batch period helps ensure that standing offers cannot be readily picked off by incoming orders, thereby transforming the competition on speed into a competition on price. This ensures that there is no significant advantage to receiving and responding to information faster than other traders, because all orders within a clearing interval are processed and matched at the same time. Periodic clears every second or so would be imperceptible to most investors but would prevent the exploitation of small speed advantages, thus curbing HFT participation in the latency arms race.
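To make the batching idea concrete, the following is a minimal sketch of a single clear in a call market. It assumes single-unit limit orders and a midpoint pricing rule within the range of market-clearing prices; both the function name `clear_call_market` and the midpoint rule are illustrative assumptions, not a description of the simulator used in this study.

```python
def clear_call_market(buys, sells):
    """Clear one batch of single-unit limit orders at a uniform price.

    buys, sells: limit prices collected during one clearing interval.
    Returns (number_of_trades, clearing_price), with price None if no trade.
    Arrival order within the batch is deliberately ignored.
    """
    bids = sorted(buys, reverse=True)  # highest bids first
    asks = sorted(sells)               # lowest asks first

    # Count how many bid/ask pairs can trade when matched in price order.
    k = 0
    while k < min(len(bids), len(asks)) and bids[k] >= asks[k]:
        k += 1
    if k == 0:
        return 0, None

    # Any price in [lower, upper] is acceptable to all matched orders
    # and to none of the unmatched ones.
    lower, upper = asks[k - 1], bids[k - 1]
    if k < len(bids):
        lower = max(lower, bids[k])
    if k < len(asks):
        upper = min(upper, asks[k])

    # Assumed pricing rule: midpoint of the clearing range.
    return k, (lower + upper) / 2


trades, price = clear_call_market(buys=[9, 15], sells=[8, 11])
```

Because the orders are sorted by price before matching, the result is the same whichever order they arrived in during the interval, which is the property that removes the advantage of speed.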
In our implementation of these market models, prices are fine-grained but discrete, taking values at integer time points. Agents arrive at designated times, and submit limit orders to their associated market(s). Each market continually publishes a price quote consisting of two parts, the
Valuation model
Each background trader has a valuation for the security in question, comprised of private and common components. The common component is defined as follows. We denote by
Parameter
The private component for agent
Element is the incremental private benefit obtained from selling one unit of the security given current position
We generate
Background trader
For a single-quantity limit order transacting at time
Since the price and fundamental terms cancel out in exchange, the total surplus achieved when agent
Background-trader strategies
There is an extensive literature on autonomous bidding strategies for CDAs (Das et al., 2001; Friedman, 1993; Wellman, 2011). In this study, we consider trading strategies in the so-called
The background traders arrive at the market according to a Poisson process with rate
Recall that each background trader has an individual valuation for the security comprised of private and common components, as described in the previous section. Based on this valuation, each background trader obtains a payoff at the end of the simulation period. This payoff is computed as the sum of the private value of the trader’s holdings, the net cash flow from trading, and the liquidation proceeds of any accumulated inventory at the end-time fundamental value
A ZI trader assesses its valuation
The ZI agent then submits a bid shaded from this estimate by a random offset—the degree of surplus it demands from the trade. The amount of shading is drawn uniformly from range [
We extend ZI by including a threshold parameter
Two-market model
We present a simple model for latency arbitrage across two markets populated by a single high-frequency trader and multiple background traders. We describe the specifics of this model in Section 4.1. The valuation model and class of strategies employed by the background investors are as described in Sections 3.2 and 3.3, respectively. In Section 4.2, we discuss the behavior of the latency arbitrageur. We present an example of how a latency arbitrage opportunity can arise in this two-market model in Section 4.3.
Model description
Our model of latency arbitrage consists of one security traded on two markets, each employing a continuous double auction mechanism (Section 3.1). The two markets are linked by a public NBBO signal (see Fig. 2). Limit orders lodged in either market are forwarded to the SIP, which calculates and reports an NBBO—based on the quotes from the two markets—with some finite delay

Two-market model with one infinitely fast latency arbitrageur and multiple background investors. A single security is traded on the two markets. Each background investor is associated primarily with one of the two markets, and its order is routed to its alternate market if and only if the NBBO quote indicates an immediate execution. The latency arbitrageur has undelayed access to both markets, so it can immediately detect arbitrage opportunities arising from the delay in NBBO calculation.
Retail and institutional investors generate limit orders according to an evolving fundamental (driven by news) and other private factors. Each non-HF investor is primarily associated with one of the two markets. An order is sent to the trader’s primary market unless the NBBO indicates that it could be executed in the alternate market at a price better than that available on the primary market.
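The routing rule in the preceding paragraph can be sketched as follows. This is an illustrative reading under stated assumptions: the names `Quote` and `route_order` are hypothetical, orders are for a single unit, and with only two markets an NBBO price strictly better than the primary market's quote is taken to come from the alternate market.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Quote:
    bid: Optional[float]  # best price offered to buy, None if no bids
    ask: Optional[float]  # best price offered to sell, None if no asks


def route_order(side, limit_price, primary, nbbo):
    """Return 'alternate' if the (possibly stale) NBBO suggests the order
    would execute immediately at a better price away from the primary
    market; otherwise return 'primary'."""
    if side == "buy":
        if nbbo.ask is None:
            return "primary"
        better = primary.ask is None or nbbo.ask < primary.ask
        executable = limit_price >= nbbo.ask
    else:
        if nbbo.bid is None:
            return "primary"
        better = primary.bid is None or nbbo.bid > primary.bid
        executable = limit_price <= nbbo.bid
    return "alternate" if (better and executable) else "primary"


primary = Quote(bid=99, ask=102)
nbbo = Quote(bid=99, ask=101)  # hypothetical delayed public quote
decision = route_order("buy", 101, primary, nbbo)
```

Since the investor acts on the delayed NBBO rather than the true state of the alternate market, the routing decision can be wrong in exactly the window that the latency arbitrageur exploits.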
More precisely, let
The latency arbitrageur in this model can determine the best prices in each market before the NBBO updates, due to its ability to receive and process order streams faster than background investors. It can thus directly detect an arbitrage situation, which occurs whenever
The latency arbitrageur (LA) in the two-market model operates as follows. LA first obtains current price quotes in both markets, then checks whether an arbitrage situation exists. We denote the best price available to sell at by
Example
Figure 3 illustrates how a latency arbitrage opportunity may arise in our two-market model. At time

Emergence of a latency arbitrage opportunity over two time steps in the two-market model.
All orders are for single-unit quantities.
A red, bolded price highlights a discrepancy between the actual market state and the NBBO, represented in the diagram as (
To answer questions regarding the interplay between trader behavior and market structure, we employ a computational approach that combines agent-based modeling (ABM), simulation, and equilibrium computation. In ABM, autonomous agents interact dynamically based on algorithmic rules. These rules govern each agent’s actions and responses, but do not explicitly define or specify aggregate outcomes; instead, system-level phenomena are a consequence of collective agent behavior. We simulate interactions between agents in a variety of market environments to study the effect of market structure and trader strategies on market performance. We present our simulation system in Section 5.1.
Using trader performance assessed from simulation runs, we employ game-theoretic analysis to evaluate traders’ strategic interactions with each other under a variety of market settings. We focus on trader behavior in equilibrium, when all market participants are best responding to each other’s strategies in order to optimize their own gains from trade. Equilibrium outcomes offer a basis for predicting agents’ actions taking account of their strategic decision making. We explore various market scenarios and environments in order to characterize trader behavior in equilibrium under different market conditions. We describe the methodology we employ to identify equilibria in Section 5.2.
In order to mitigate the stochasticity in our simulations and reduce sampling error, we collect large numbers of observations for each environment setting and trader population of interest. We utilize the EGTAOnline infrastructure (Cassell and Wellman, 2013) to conduct and manage our experiments, and we run our simulations on the high-performance computing cluster at the University of Michigan.
Discrete-event simulation
The financial markets we study are stochastic, dynamic systems with discrete states that change in response to communication events. These events occur at high frequency, even on the order of microseconds. To faithfully model such systems in simulation, ensuring the unambiguous timing of agent and market interactions is paramount. This necessitates fine-grained modeling at the level of communication. We therefore design our system based on principles of
Our financial market simulation system, based on that described in detail by Wah and Wellman (2013), affords sufficient versatility to model a wide range of market environments, including variform populations of market participants, as well as different market structures (e.g., varying in the number of markets or types of market mechanisms employed). The simulator has been extended by other members of the Strategic Reasoning Group at the University of Michigan and employed in several other studies.
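As a rough illustration of the event-driven style described in this section, the sketch below implements a minimal event queue in which every action carries an explicit integer timestamp. It is not the simulator used in this study; the class `EventQueue`, the constant `NBBO_DELAY`, and the three example events are hypothetical.

```python
import heapq
import itertools


class EventQueue:
    """Minimal discrete-event scheduler with deterministic tie-breaking."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # breaks ties by insertion order
        self.now = 0

    def schedule(self, time, action):
        if time < self.now:
            raise ValueError("cannot schedule an event in the past")
        heapq.heappush(self._heap, (time, next(self._counter), action))

    def run(self):
        while self._heap:
            time, _, action = heapq.heappop(self._heap)
            self.now = time
            action(self)


NBBO_DELAY = 3  # hypothetical delay between a market update and the public quote
log = []


def order_arrives(queue):
    log.append((queue.now, "order reaches market"))
    # The public quote reflects this order only after the delay.
    queue.schedule(queue.now + NBBO_DELAY, nbbo_update)


def nbbo_update(queue):
    log.append((queue.now, "public NBBO updated"))


def arbitrageur_acts(queue):
    log.append((queue.now, "arbitrageur reads market directly"))


q = EventQueue()
q.schedule(5, order_arrives)
q.schedule(6, arbitrageur_acts)
q.run()
```

In this toy run the arbitrageur's event at time 6 falls between the order's arrival and the delayed public update, which is the kind of timing detail that a communication-level model keeps unambiguous.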
Empirical game-theoretic analysis
We model the strategic situation of background traders as a
The simulation system discussed in the previous section takes as input a strategy profile and generates as output a sample outcome. Through a process of
In this study, we model the market as a
EGTA process
To analyze a game, we apply EGTA in an iterative manner, interleaving exploration of the profile space with analysis of the
Observed payoffs from simulation runs of a given profile are added incrementally to the empirical game’s payoff matrix. For this reason, the game is incomplete at any point during the EGTA process, as some profiles have been empirically evaluated whereas others have not. Each update to the empirical game’s payoff matrix generates an intermediate game model. As payoffs from simulation are incorporated into the empirical game, we analyze each successive intermediate game model by computing (mixed) equilibria for each
We simulate additional profiles for a game until we have confirmed at least one symmetric NE, evaluated every pure-strategy symmetric profile, and pursued with some degree of diligence every equilibrium candidate encountered. More specifically, we continue to refine the empirical game with additional simulations until the following conditions are met: at least one equilibrium is confirmed, all non-confirmed candidates are refuted (up to a threshold support size), and for all refuted candidates (up to the threshold support size), we have explored subgames formed by adding the best response to the candidate’s support.
When this process reaches quiescence, we consider the search to have satisfied the diligence requirement.
The procedure described above seeks to either confirm or refute the equilibrium candidates detected in our exploration of the strategy space. As we are not able to exhaustively search the entire profile space, however, additional qualitatively distinct equilibria are always possible. In addition, the equilibria we find are subject to refutation by other strategies outside the specified set. Our search process described above attempts to evaluate all promising equilibrium candidates (e.g., by exploring subgames extending the support of a refuted candidate with the best response), but identifying these is not guaranteed.
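The distinction between confirmed, refuted, and still-open candidates in an incomplete empirical game can be sketched as follows. For brevity the sketch handles only pure-strategy symmetric candidates; mixed candidates would require expected payoffs over opponent profiles. The function name, the payoff-table layout, and the strategy labels and numbers are hypothetical.

```python
def classify_candidate(candidate, strategies, payoffs, tol=0.0):
    """Classify a pure symmetric candidate in an incomplete empirical game.

    payoffs maps (own_strategy, others_strategy) to the estimated mean
    payoff of one player using own_strategy while every other player
    uses others_strategy. A missing key means the profile has not been
    simulated yet.
    """
    base = payoffs.get((candidate, candidate))
    if base is None:
        return "unevaluated"
    missing = False
    for s in strategies:
        if s == candidate:
            continue
        deviation = payoffs.get((s, candidate))
        if deviation is None:
            missing = True            # cannot rule this deviation out yet
        elif deviation > base + tol:
            return "refuted"          # a beneficial unilateral deviation exists
    return "unconfirmed" if missing else "confirmed"


strategies = ["ZI_a", "ZI_b", "ZI_c"]  # hypothetical strategy labels
payoffs = {
    ("ZI_a", "ZI_a"): 10.0, ("ZI_b", "ZI_a"): 9.0, ("ZI_c", "ZI_a"): 8.5,
    ("ZI_b", "ZI_b"): 7.0, ("ZI_a", "ZI_b"): 9.5,
    ("ZI_c", "ZI_c"): 6.0, ("ZI_a", "ZI_c"): 5.0,
}
status = {s: classify_candidate(s, strategies, payoffs) for s in strategies}
```

An "unconfirmed" result indicates which deviation profiles still need to be simulated before the candidate can be settled either way.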
Game reduction
Even with a moderate number of players, the
DPR preserves the payoffs from single-player, unilateral deviations, and maintains in the reduced game the same proportion of opponents playing each strategy as in the full game. In a deviation-preserving reduced game, each player views itself as controlling one full-game agent and views the other-agent profile in the reduced game as an aggregation of all other players in the full game. Although the equilibrium approximations obtained via DPR are not guaranteed estimates, DPR has been shown to produce good approximations in other games (Wiedenbeck and Wellman, 2012).
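The proportional scaling described above can be sketched as a mapping from a reduced-game profile to the full-game profile whose payoff is used for one player. The function name and the example player counts (a hypothetical 25-player full game reduced to 4 players) are assumptions for illustration; the sketch also assumes that the number of other players in the full game is an exact multiple of the number of other players in the reduced game.

```python
def dpr_full_profile(own, others_counts, n_full, n_reduced):
    """Map a reduced-game view to the corresponding full-game profile.

    own: strategy of the player whose payoff is being defined.
    others_counts: dict strategy -> number of other reduced-game players
        using it (must sum to n_reduced - 1).
    Returns dict strategy -> number of full-game players using it.
    """
    if (n_full - 1) % (n_reduced - 1) != 0:
        raise ValueError("other-player counts must divide evenly")
    if sum(others_counts.values()) != n_reduced - 1:
        raise ValueError("others_counts must cover n_reduced - 1 players")

    # Each other reduced-game player stands in for `scale` full-game players.
    scale = (n_full - 1) // (n_reduced - 1)
    full = {s: c * scale for s, c in others_counts.items() if c > 0}

    # The player itself controls exactly one full-game agent.
    full[own] = full.get(own, 0) + 1
    return full


profile = dpr_full_profile("A", {"A": 1, "B": 2}, n_full=25, n_reduced=4)
```

The reduced-game payoff to the player using strategy `own` is then taken to be the full-game payoff to a player using `own` in the returned profile, so only a small fraction of full-game profiles ever needs to be simulated.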
DPR defines reduced-game payoffs in terms of payoffs in the full game as follows. Consider first an
Experiments
To isolate the ramifications of market fragmentation, we consider two consolidated market configurations in addition to the two-market model: a CDA and a frequent call market. Recall that in contrast to a continuous-time market, clearing in a frequent call market takes place at designated intervals (Section 3.1). A frequent call market eliminates latency arbitrage opportunities, as the periodic clearing mechanism makes it impossible to gain or exploit informational advantages over other market participants within the clearing interval.
In exploring the relationship between trader behavior and market structure, we are interested in the following performance characteristics:
Our experiments (Table 1) evaluate a number of market features, defined by different combinations of market configurations:
Experimental design for evaluating different market features
Each row of the table describes the market configurations included (as indicated by the plus symbol) in evaluating a given market feature. The four market configurations are the two-market model (2M) both with and without LA, the consolidated CDA, and the frequent call market.
We evaluate and compare the performance of the four market structure configurations (two-market model with and without LA, CDA, and frequent call market) within three distinct environments. For the fragmented cases, an equal proportion of background traders is assigned primary affiliation with each market in a model. In the consolidated call market, orders transact at a uniform price each time the market clears; this price is computed to best match supply and demand (Section 3.1).
In defining our environments, we selected environment parameters that generate sufficient arbitrage opportunities and also replicate the original findings for fixed-strategy, non-equilibrium comparisons from our previous study (Wah and Wellman, 2013). To do so, we explored a number of environments, varying the number of traders, trading horizon length, degree of mean reversion, and variance in both the fundamental and private values. In these runs, all traders employed a fixed strategy with
The threshold
ZI strategy combinations included in empirical game-theoretic analysis
The environments differ in number of background traders (
Parameter settings for the three market environments
We examine 23 empirical games within environment 1, which cover the four market configurations across 8 settings of latency
Empirical games across the three market environments
The empirical games for each environment include a consolidated CDA, which is independent of latency, and one game at each latency setting for the other three market configurations.
At latency 0, the fragmented models are equivalent to the CDA, as there are no arbitrage opportunities. For the frequent call market, latency refers to the length of the clearing interval.
We compare the equilibria found in the empirical games according to the experimental design described in Table 1 to evaluate the effect of latency arbitrage, market fragmentation, and batching on market efficiency and liquidity.
The strategic situations for each market structure are modeled as symmetric games (Section 5.2). We apply deviation-preserving reduction (Section 5.4) to generate an approximation of the full game with fewer players. Specifically, we estimate 4-player reduced games from full games with
We find in these settings that the presence of a latency arbitrageur reduces total surplus (Section 7.1) and has a mixed effect on market liquidity (Section 7.3). Eliminating fragmentation can improve surplus (Section 7.2) and execution metrics (Section 7.3). Replacing continuous markets with frequent call markets eliminates latency arbitrage opportunities and achieves substantial efficiency gains in all three environments (Section 7.4). Our results are summarized in Table 5.
Overview of experimental results
We identified 1–3 equilibria for each of the 23 games in environment 1 (Tables 6 and 7), the 8 games in environment 2 (Tables 8 and 9), and the 14 games in environment 3 (Tables 10 and 11). For each equilibrium, we estimated background-trader surplus, as well as LA profit if applicable, by sampling 500 profiles according to the equilibrium mixture, and running 100 simulations per sampled profile (50,000 full-game simulations in total).
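The sampling step can be sketched as follows: each player's strategy is drawn independently from the symmetric equilibrium mixture, and the resulting counts form one full-game profile. The function name, the example mixture, and the player count of 25 are hypothetical; only the figures of 500 sampled profiles and 100 simulations per profile come from the text.

```python
import random
from collections import Counter


def sample_profiles(mixture, n_players, n_profiles, seed=0):
    """Draw full-game profiles from a symmetric mixed strategy.

    mixture: dict strategy -> probability (should sum to 1).
    Returns a list of Counters mapping strategy -> number of players.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    strategies = list(mixture)
    weights = [mixture[s] for s in strategies]
    profiles = []
    for _ in range(n_profiles):
        draws = rng.choices(strategies, weights=weights, k=n_players)
        profiles.append(Counter(draws))
    return profiles


mixture = {"ZI_a": 0.7, "ZI_b": 0.3}  # hypothetical equilibrium mixture
profiles = sample_profiles(mixture, n_players=25, n_profiles=500)
SIMS_PER_PROFILE = 100
total_runs = len(profiles) * SIMS_PER_PROFILE
```

Averaging surplus over all runs then estimates the expected surplus under the equilibrium mixture rather than under any single pure profile.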
Symmetric equilibria for environment 1,
Symmetric equilibria for empirical games for environment 1, one per latency (or clearing interval) setting per market configuration,
Complete specifications of symmetric equilibria for environment 1,
Symmetric equilibria for empirical games for environment 1,
Symmetric equilibria for environment 2,
Symmetric equilibria for empirical games for environment 2, one per latency (or clearing interval) setting per market configuration,
Complete specifications of symmetric equilibria for environment 2,
Symmetric equilibria for empirical games for environment 2,
Symmetric equilibria for environment 3,
Symmetric equilibria for empirical games for environment 3, one per latency (or clearing interval) setting per market configuration,
Complete specifications of symmetric equilibria for environment 3,
Symmetric equilibria for empirical games for environment 3,
Figure 4 shows the total surplus, for the consolidated CDA and the two-market model with and without a latency arbitrageur, over multiple latency settings in the three environments. The total surplus of the two-market model without LA, as well as that of the single CDA market (an unfragmented continuous-time market), generally exceeds that of the two-market model with LA, whether or not the profits of LA are counted. This holds across the three environments. In other words, the latency arbitrageur takes surplus away from the background investors, and the amount it deducts exceeds the gross trading profit it accrues.

Total surplus in the two-market (2M) model, both with and without a latency arbitrageur, and in the consolidated CDA market, for the three environments. In the two-market model with LA, both the total surplus (ZI + LA) and background-trader surplus (ZI only) are plotted. Each point reflects the average over 50,000 simulation runs of the maximum-welfare equilibrium for each market configuration and latency setting.
The intuition behind this result lies in differences in the orders selected to trade. Figure 5 demonstrates how changes in the order arrival sequence may lead to different levels of surplus. A continuous market will match two orders to trade immediately, regardless of whether this transaction improves allocative efficiency. The LA, by matching orders across the two fragmented markets, is prone to facilitate some inefficient trades that would not execute in the two-market model without latency arbitrage.
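To make the cross-market matching concrete, the following sketch detects an arbitrage opportunity from the best quotes in each market. It assumes, for illustration only, that the arbitrageur buys at the lower ask and sells at the higher bid; the function name, the market labels, and the quote values are hypothetical and are not taken from the simulation system.

```python
def find_arbitrage(quotes):
    """Return (buy_market, sell_market, profit_per_unit) for the best crossing, or None.

    quotes maps a market name to (best_bid, best_ask); None marks an empty side.
    """
    best = None
    for sell_mkt, (bid, _) in quotes.items():
        for buy_mkt, (_, ask) in quotes.items():
            if sell_mkt == buy_mkt or bid is None or ask is None:
                continue
            # An opportunity exists when one market's bid exceeds the other's ask.
            if bid > ask and (best is None or bid - ask > best[2]):
                best = (buy_mkt, sell_mkt, bid - ask)
    return best

# Hypothetical quotes: market A's bid (12) exceeds market B's ask (11).
print(find_arbitrage({"A": (12, 14), "B": (9, 11)}))   # ('B', 'A', 1)
print(find_arbitrage({"A": (9, 11), "B": (10, 12)}))   # None
```

In the first case the two standing orders trade through the arbitrageur even though their owners never meet in a single order book, which is the mechanism by which inefficient matches can be carried across markets.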

Welfare differences that arise from changes in order sequencing in continuous markets. The order book initially has two sell orders. Two buy orders arrive, with different sequencing, over the course of two time steps. In the top scenario, the buy order at price 9 arrives before the buy order at 15, resulting in total surplus of 5 from two trades (assuming traders submit orders priced at their valuations). In the bottom scenario, the buy order at price 15 arrives first, which results in a more efficient transaction (with a higher surplus of 7) than the alternate scenario. Each pair of green circles indicates orders that have matched and traded at a given moment in time.
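The two scenarios can be reproduced with a small continuous-matching sketch. The caption does not state the sell prices; the values 8 and 11 used here are an assumption, chosen because they yield the reported surpluses of 5 and 7. Orders are taken to be priced at the traders' valuations, so the surplus of a trade is the buy price minus the sell price.

```python
def run_cda(resting_sells, incoming_buys):
    """Match each arriving buy immediately against the lowest resting sell if prices cross."""
    sells = sorted(resting_sells)
    surplus = 0
    trades = []
    for buy in incoming_buys:
        if sells and buy >= sells[0]:
            sell = sells.pop(0)
            surplus += buy - sell       # gain from trade, with orders at valuations
            trades.append((buy, sell))
        # Otherwise the buy rests unmatched; no further sells arrive in this example.
    return surplus, trades

sells = [8, 11]                     # assumed sell prices, not given in the caption
print(run_cda(sells, [9, 15]))      # (5, [(9, 8), (15, 11)])
print(run_cda(sells, [15, 9]))      # (7, [(15, 8)])
```

When the buy at 9 arrives first, it consumes the cheapest sell and forces the buy at 15 to trade with the more expensive one; the second of those trades adds volume but lowers total surplus relative to the single trade in the other sequence.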
In environment 1, LA significantly degrades efficiency in the two-market model, and total LA profit accounts for half of aggregate surplus once nonzero latency is introduced. Environments 2 and 3, however, have reduced mean reversion, which increases background traders’ risk of adverse selection and having the LA pick off their standing orders. As a result, background traders in these two environments shade more in response to the LA. This can be seen by higher
Note that when latency is zero, the two fragmented models and the CDA market in Fig. 4 are effectively identical. The NBBO is always correct if there is no delay, so it is not possible for any latency arbitrage opportunities to emerge. It follows that the various market configurations at zero latency produce similar total surplus in equilibrium. Some differences between the consolidated and fragmented models, however, may arise due to strategies with
Consolidating the markets in a single CDA generally yields higher surplus than the fragmented market with LA in environments 1 and 2. This effect is muted in thinner markets with fewer trading opportunities, such as environment 3. As for the case without latency arbitrage, it may seem counterintuitive that welfare in the two-market model without LA is higher than in the consolidated CDA in some environments. Fragmentation can in fact provide a benefit for continuous markets. The separated markets are less likely to admit inefficient trades (i.e., where both traders’ values fall on the same side of the longer-term equilibrium price) that arise due to the vagaries of arrival sequences (Wah and Wellman, 2013), as illustrated in Fig. 5. LA can defeat this benefit by ensuring that any orders that would match in the central CDA also trade in the fragmented case, albeit with LA rather than with a counterpart investor. This primarily applies when there are sufficient trading opportunities, as in environments 1 and 2. In a thicker market such as environment 2, fragmentation does not always boost surplus in the two-market model without LA, as there are many traders in each market who can act as counterparties for trade.
Effect of LA and fragmentation on liquidity
We also evaluate the effect of latency arbitrage on market liquidity, as measured via execution times and

Execution time in the two-market (2M) model, both with and without a latency arbitrageur, and in the consolidated CDA market, for the three environments. Execution time is the difference between bid submission and transaction times. Each point reflects the average over 50,000 simulation runs of the maximum-welfare equilibrium for the market configuration and latency setting.
Traders in the other two environments, however, do not shade more in equilibrium in the two-market model without LA. In these cases, the fastest execution is achieved in the consolidated CDA, which makes sense given the absence of both communication latencies and thinness induced by fragmentation.
Spreads can also be viewed as a measure of liquidity, with tighter spreads corresponding to greater market liquidity. The widest spreads are generally in the two-market model with LA (Fig. 7). LA also slightly exacerbates NBBO spreads, which are generally narrower than spreads of individual markets. The increase in spread could reflect an implicit transaction cost responsible for part of the surplus reduction observed above.
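The relationship between individual-market spreads and the NBBO spread can be illustrated with a short sketch. It takes the spread to be the best ask minus the best bid and the NBBO to be the highest bid and lowest ask across markets; the quote values are hypothetical.

```python
def spread(bid, ask):
    """Quoted spread: the amount by which the best ask exceeds the best bid."""
    return ask - bid

def nbbo(quotes):
    """National best bid and offer: highest bid and lowest ask across all markets."""
    best_bid = max(bid for bid, _ in quotes)
    best_ask = min(ask for _, ask in quotes)
    return best_bid, best_ask

# Hypothetical quotes (bid, ask) for two markets.
quotes = [(98, 103), (99, 104)]
market_spreads = [spread(b, a) for b, a in quotes]
nbbo_spread = spread(*nbbo(quotes))
print(market_spreads, nbbo_spread)  # [5, 5] 4
```

Because the NBBO takes the best price on each side, its spread can never exceed the narrowest individual-market spread, which is consistent with the NBBO spreads being generally narrower in the results above.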

Median spread and NBBO spread in the two-market (2M) model, both with and without a latency arbitrageur, and in the consolidated CDA market, for the three environments. Spread is the amount by which
Lastly, we evaluate the effect of switching to a discrete-time frequent call market. In our frequent call market configuration, the latency setting dictates the clearing period. Figure 8 shows that the total surplus in the consolidated call market far exceeds that of the two-market model with LA, and the call market surplus is higher for all latency settings

Total surplus for the consolidated frequent call market and the two-market (2M) model with LA, for the three environments. Each point reflects the average over 50,000 simulation runs of the maximum-welfare equilibrium for each market configuration and latency setting.
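The efficiency gain from batching can be illustrated with a minimal clearing sketch. It is not the clearing rule implemented in the simulator; it assumes that all orders in one clearing interval are matched highest buy against lowest sell while prices cross, and that orders are priced at valuations. The buy prices 9 and 15 follow the Fig. 5 caption, while the sell prices 8 and 11 are assumed.

```python
def clear_call_market(buys, sells):
    """Batch-clear one interval: pair highest buys with lowest sells while prices cross."""
    ranked_buys = sorted(buys, reverse=True)
    ranked_sells = sorted(sells)
    matched = []
    for buy, sell in zip(ranked_buys, ranked_sells):
        if buy < sell:
            break                   # remaining pairs cannot trade profitably
        matched.append((buy, sell))
    surplus = sum(buy - sell for buy, sell in matched)
    return surplus, matched

# Arrival order within the interval does not affect the outcome.
print(clear_call_market([9, 15], [8, 11]))   # (7, [(15, 8)])
print(clear_call_market([15, 9], [8, 11]))   # (7, [(15, 8)])
```

Under these assumptions the batch clear always selects the single high-surplus trade, whereas a continuous market reaches that outcome only for a favorable arrival sequence.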
As shown in Fig. 9, the mean execution time in the consolidated call market is much higher than that of the two-market model with LA. Unsurprisingly, execution time in the call market is also higher than that observed in the other market configurations. Because the call market clears less frequently, it takes longer for a bid to match and be removed from the order book. In environment 1, execution time in the frequent call market plateaus at approximately 20 time steps, which is equivalent to the average time between trader reentries. In the other two environments, the execution time in the call market increases monotonically with the length of the clearing interval, since traders reenter less frequently than the market clears. This tradeoff between discrete clearing and execution speed may not significantly affect investors if the frequent call market matches orders frequently, such as once every second.

Execution time for the consolidated frequent call market and the two-market (2M) model with LA, for the three environments. Each point reflects the average over 50,000 simulation runs of the maximum-welfare equilibrium for each market configuration and latency setting.
In Fig. 10, we observe that the tightest spread is realized in the consolidated call market, for all three environments. Spreads in the frequent call market are measured at the end of each market clear. They represent the market liquidity after orders have traded in each interval. Since the call market generally matches orders to trade more efficiently than the CDA, its spreads tend to be tighter. The median spread decreases to some degree with latency due to the accumulation of bids in the order book, which is indicative of greater liquidity in the market. The temporal aggregation in the consolidated call market is also responsible for similarly tight NBBO spreads.

Median spread and NBBO spread for the consolidated frequent call market and the two-market (2M) model with LA, for the three environments. Each point reflects the average over 50,000 simulation runs of the maximum-welfare equilibrium for each market configuration and latency setting.
Figure 11 shows the total number of transactions in each market configuration, for the three environments, averaged over all observations at a given latency. In all three environments, the total numbers of transactions in the consolidated CDA and the two-market model without LA are generally comparable, though slightly lower in the latter. This is consistent with our observations of surplus patterns in Fig. 4. The two-market model without LA results in higher surplus despite a reduction in the number of transactions, indicating that each transaction in the fragmented model is associated with more surplus on average than in the consolidated CDA. The number of LA transactions does not increase with latency, although the number of arbitrage opportunities grows as the NBBO update delay increases. Since the background traders strategically respond to the presence of the LA by submitting executable orders rather than limit orders, they are less likely to be picked off by the LA.

Total number of transactions in each of the four market configurations for the three environments. In the two consolidated markets (call and CDA) and the two-market (2M) model without LA, there is no latency arbitrage so transactions only occur between ZI traders. The rightmost bar in each group of four shows the total number of transactions in the two-market model with LA, with the top portion of the stacked bar representing the number of LA transactions and the bottom portion representing the ZI transactions. Each bar reflects the average over 50,000 simulation runs of the maximum-welfare equilibrium for each market configuration and latency setting.
In addition, the highest number of trades for a market configuration at a given latency setting in environment 1 is generally (although not always) observed in the call market. This is a result of the reduced
Our two-market model captures fragmentation in its simplest form, enabling our investigation of an important phenomenon in high-frequency trading: latency arbitrage. We implemented this model in a system combining agent-based modeling and discrete-event simulation. We employed empirical game-theoretic analysis to compute equilibria in games with variations in market structure and within three parametrically distinct environments, and we compared equilibrium outcomes in order to evaluate the interplay of latency arbitrage, market fragmentation, and market design, as well as their consequences for market performance.
Our results demonstrate that market efficiency in equilibrium is negatively affected by the actions of a latency arbitrageur, with no countervailing benefit in liquidity or any other measured market performance characteristic. Taking into consideration the substantial operational costs of the latency arms race would only amplify our conclusions about the harmful implications of this practice.
Somewhat counterintuitively, welfare in some environments of the fragmented model without LA is higher than in the consolidated CDA. It turns out fragmentation can provide a benefit in continuous markets, as the separation of markets mitigates the inefficient transactions that result from continuous trading. We find that the effect of fragmentation on liquidity varies depending on how traders strategically respond to the presence of LA.
Virtually all modern financial markets employ continuous trading, which enables speed-advantaged traders to exploit price differentials over fragmented markets. A frequent call market prevents high-frequency traders from gaining a meaningful latency advantage, thereby eliminating latency arbitrage opportunities and increasing surplus for background traders. Aggregating orders over small, regular time intervals provides efficiency gains over fragmented and continuous markets, and in fact these benefits appear to overshadow the gains attributable specifically to neutralizing latency arbitrage.
As with any simulation model, our results are valid only to the extent our assumptions capture the essence of real-world markets. Additional avenues for further study include examining the effect of more sophisticated HFT and background-trader strategies (such as those using historical information or responding to LA price signals), introducing other types of traders such as market makers, and further quantifying the impact of price discovery on efficiency.
Footnotes
Profit figures are considerably more uncertain than volume estimates. Kearns et al. (2010) present an interesting approach to derive an upper bound on HFT profits. Presumably the billions HFT firms invest annually in technology and infrastructure (Adler, 2012) represent a lower bound on gross trading profit.
Order activity at the temporal granularity of interest here is generally unavailable for public research, and it is unclear whether data on communication latencies and the end-to-end routing of orders among brokers and exchanges is available from any source. What high-frequency trading data does exist commercially is prohibitively expensive. Moreover, even full details on conceivably observable trading activity could not directly resolve counterfactual questions, such as the response of financial markets to possible shocks or the effects of alternative market rules and regulations.
With the exception of environment 1, the number of players
for the complete definition of the number of reduced-game players when divisibility does not hold.
Acknowledgments
We are grateful for constructive suggestions from Jacob Abernethy, Michael Barr, Uday Rajan, and members of the Strategic Reasoning Group at the University of Michigan, along with an anonymous reviewer, leading to improvements in the substance and presentation of this work. This research was supported in part by grant IIS-1421391 from the US National Science Foundation.
