Abstract
Detector data can be used to construct cumulative flow curves, which in turn can be used to estimate the traffic state. However, this approach is subject to the cumulative error problem. Multiple studies propose to mitigate the cumulative error problem using probe trajectory data. These studies often assume “no overtaking” and thus that the change in cumulative flow is zero along probe trajectories. However, in multi-lane traffic this assumption is often violated. Therefore, we present an approach to estimate the change in cumulative flow along probe trajectories between detectors based on disaggregated detector data. The approach is tested with empirical data and in microsimulation. The results show that the approach is a clear improvement over assuming “no overtaking” in free-flow conditions. However, the benefits are not clear in varying traffic conditions. The approach can be applied in practice to mitigate the cumulative error problem and estimate the traffic state based on the resulting cumulative flow curves. As the performance of the approach depends on the changes in traffic conditions, it is suggested to use the probe speed observations between detectors to assign an uncertainty to the estimates of the change in cumulative flow. Furthermore, a potential option for future work is to use more elaborate schemes to estimate the probe relative flow between detectors, which may, for instance, combine probe speeds with estimates of the macroscopic states along the probe trajectory. If these macroscopic estimates are based on the cumulative flow curves at the detector locations, this would result in an iterative approach.
Traffic state estimation (TSE) is important in dynamic traffic management (DTM) applications ( 1 ). TSE aims to infer the traffic state (which may be described using different variables) from incomplete and inaccurate information, for example, partially observed and noisy traffic-sensing data and traffic-flow models. The traffic state estimates can be used as input for different types of DTM applications, for example, local ramp-metering ( 2 ) or network-wide traffic management ( 3 ).
Throughout this study, road segments without discontinuities (which are denoted as links) are considered. In these links, the conservation-of-vehicles condition holds. Vehicles enter (flow in) the link at the upstream boundary and leave (flow out) the link at the downstream boundary.
Traffic can be described using three dimensions: space
Multiple methodologies have been proposed to estimate the traffic state in or via the cumulative flow plane. For instance, Newell’s (three-detector) method ( 5 , 6 ), Claudel’s method ( 7 , 8 ), Sun’s method ( 9 ) and Van Erp’s principles ( 10 ) all apply variational theory ( 11 , 12 ) to estimate the cumulative flow over space and time, that is,
To obtain the cumulative flow curves, we rely on traffic-sensing data. Stationary observers such as loop-detectors observe the flow with respect to a fixed position and can thus be used to construct the cumulative flow curves at these locations. However, two issues must be addressed. First, the curves should be initialized; that is, we want to know the number of vehicles that are between the detectors at the initial time. Second, we need to address the cumulative error problem: over time, errors in the flow observations accumulate, causing a drift in the cumulative flow estimation error. If the problem is not mitigated, the traffic state estimates that are based on the cumulative curves will become highly inaccurate. Therefore, multiple studies propose to use other data to periodically correct the cumulative error ( 9 , 13 , 14 , 16 ).
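The drift described above can be illustrated with a minimal sketch. The counts, the interval length, and the 2% miss rate at the downstream detector are hypothetical values chosen for illustration; they are not taken from the data used in this study.

```python
import numpy as np

def cumulative_curve(counts, initial=0.0):
    """Cumulative flow at a detector from per-interval vehicle counts."""
    return initial + np.cumsum(counts)

# Hypothetical example: 60 one-minute intervals, 30 vehicles per interval.
true_counts = np.full(60, 30.0)
# Assumed systematic miss rate of 2% at the downstream detector only.
observed_down = true_counts * 0.98

n_up = cumulative_curve(true_counts)
n_down_true = cumulative_curve(true_counts)
n_down_obs = cumulative_curve(observed_down)

# Vehicles between the detectors = upstream minus downstream cumulative flow
# (an empty link at the initial time is assumed for simplicity).
true_vehicles = n_up - n_down_true
est_vehicles = n_up - n_down_obs

# The estimation error grows with every interval instead of averaging out.
drift = est_vehicles - true_vehicles
```

In this sketch the error in the estimated number of vehicles on the link grows linearly with time, which is why a periodic correction from another data source is needed.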
Bhaskar et al. ( 16 ), Van Lint and Hoogendoorn ( 14 ), and Sun et al. ( 9 ) use probe trajectory or vehicle re-identification data to mitigate the cumulative error problem. In these studies, it is assumed that there is no overtaking, that is, the cumulative flow value is constant over the probe trajectory (
To study the possibility to estimate the change in cumulative flow (
The main contribution of this paper is the design and evaluation of a methodology to estimate the change in cumulative flow along probe trajectories based on detector data. This methodology estimates the probe relative flow at the detector locations and uses these relative flows to estimate the relative flow over the full probe trajectory between detector locations. Evaluation using real and simulated data shows that in most cases estimation of relative flow using detector data is an improvement over the assumption that the relative flow is zero. However, changes in traffic conditions (e.g., when a probe encounters a traffic jam) negatively affect the estimation performance. The methodology and the insight that its estimates are more accurate when the probe does not encounter large changes in traffic conditions are both valuable to construct cumulative flow curves. These curves can, for instance, be constructed based on detector and probe trajectory data using a Bayesian approach (e.g., using a Kalman Filter). In such an approach, the proposed methodology can be used to obtain prior estimates, while the expected accuracy of this approach can be used to assign the error characteristics to the prior estimates.
This article is structured as follows: First, the theoretical foundations that are relevant for this study are explained. Next, a methodology to estimate the change in cumulative flow along probe trajectories between detector locations is presented. The performance of the methodology is tested using simulated and real data. After explaining how these data are used to assess the estimation performance of the methodology, we present the results. Finally, the conclusions and insights of this study are presented.
Theoretical Foundations
As explained in the introduction, this study aims to estimate the change in cumulative flow along probe trajectories between two detector locations. This describes the number of vehicles that have overtaken the probe vehicle minus the number of vehicles that are overtaken by the probe vehicle. This section provides the theoretical foundations that are relevant in this study. First, we explain that the change in cumulative flow along a probe trajectory depends on the individual probe speed and macroscopic traffic-flow variables. Second, we explain how the change in cumulative flow along two probe trajectories relates to detector passing observations. The former is important to design the methodology that is proposed in the next section, whereas the latter is important to evaluate that methodology in an empirical case study.
Change in Cumulative Flow along a Probe Trajectory
The position of vehicle

Figure 1. Visualizations related to the theoretical foundations. (a) Probe trajectory
The cumulative flow
A positive value indicates that
The change in cumulative flow
where
Figure 1b visually shows how Equation 2 can be interpreted. The probe relative flow
Equations 1 and 2 state that the change in cumulative flow along a probe trajectory depends on the probe speed and the macroscopic variables along this trajectory. In this study, probe trajectory data are considered that do not contain observations of the relative flow. However, the relations provided in this section show that other data related to the macroscopic variables (i.e., detector data) can be used to estimate the probe relative flow. Therefore, in the next section, a methodology is proposed to estimate the change in cumulative flow along probe trajectories using detector data.
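For reference, the two relations discussed in this section can be written compactly in a commonly used form. The notation is assumed here for illustration and may differ from that of Equations 1 and 2: $N$ is the cumulative flow, $q$ the flow, $k$ the density, $x_p(t)$ and $v_p(t)$ the position and speed of probe $p$, $q^{\mathrm{rel}}_p$ the probe relative flow, and $t_u$ and $t_d$ the times at which the probe passes the upstream and downstream detectors. The first line follows from the total derivative of $N$ along the trajectory, using $\partial N / \partial t = q$ and $\partial N / \partial x = -k$.

```latex
\begin{align}
q^{\mathrm{rel}}_p(t)
  &= \frac{\mathrm{d}N\bigl(x_p(t),t\bigr)}{\mathrm{d}t}
   = \frac{\partial N}{\partial t} + \frac{\partial N}{\partial x}\,v_p(t)
   = q\bigl(x_p(t),t\bigr) - k\bigl(x_p(t),t\bigr)\,v_p(t), \\
\Delta N_p
  &= \int_{t_u}^{t_d} q^{\mathrm{rel}}_p(t)\,\mathrm{d}t .
\end{align}
```

Under this convention, a probe that drives slower than the surrounding traffic has a positive relative flow (more vehicles overtake it than it overtakes), and "no overtaking" corresponds to $\Delta N_p = 0$.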
Differences in the Change in Cumulative Flow between Probe Trajectories
Equation 2 shows that it is possible to evaluate the accuracy of
Let us consider a combination of two (consecutive) detectors and two (consecutive) probe vehicles; see Figure 1c. The thick solid black lines in this figure show the observation paths of the detectors, and the thick dashed blue lines show two probe trajectories. In case we combine detector and probe trajectory data, the change in cumulative flow
where the elements in the left part (
In Equation 3,
The other parts of Equation 3, that is,
Therefore, we can describe the sum of the (net) detector observation errors (
This relation is used in the empirical case study to evaluate the accuracy of estimates related to the change in cumulative flow along probe trajectories (i.e.,
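The two-probe relation can be sketched as a simple counting exercise. The sketch assumes error-free detector passing times and the sign convention that the result is the change in cumulative flow of the second probe minus that of the first probe; the function names and example values are hypothetical. With real detector data, the counted difference also contains the net detector observation errors discussed above.

```python
from bisect import bisect_right

def passings_between(passing_times, t_start, t_end):
    """Number of detector passings with t_start < t <= t_end (sorted input)."""
    return bisect_right(passing_times, t_end) - bisect_right(passing_times, t_start)

def delta_n_difference(up_times, down_times, probe_a, probe_b):
    """Change in cumulative flow of probe B minus that of probe A.

    probe_a, probe_b: (t_upstream, t_downstream) passing times; B follows A.
    up_times, down_times: sorted passing times of all vehicles at each detector.
    """
    n_up = passings_between(up_times, probe_a[0], probe_b[0])
    n_down = passings_between(down_times, probe_a[1], probe_b[1])
    return n_down - n_up

# Hypothetical example: one vehicle that was behind probe B at the upstream
# detector passes the downstream detector ahead of it.
up = [1.0, 2.0, 3.0, 4.0, 5.0]          # A at 1.0, B at 4.0
down = [61.0, 62.0, 63.5, 64.0, 66.0]   # A at 61.0, B at 66.0
difference = delta_n_difference(up, down, (1.0, 61.0), (4.0, 66.0))
```

Here three vehicles (including probe B) pass the upstream detector after probe A, and four pass the downstream detector, so the difference is one vehicle.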
Methodology to Estimate the Change in Cumulative Flow between Detectors
To estimate the change in cumulative flow over probe trajectories between detectors, we will rely on disaggregated detector data. Disaggregated lane-specific detector data are collected using double loop-detectors. These data describe each individual passing
The disaggregated data can be used to calculate the macroscopic traffic states. Within a defined period
Equation 1 shows how the probe relative flow relates to the macroscopic traffic-flow variables and the individual probe speed. This equation is applied to estimate the probe relative flow at the times that it passes the detector locations. For this purpose, a time-window of length
The detector data provide probe relative flow
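A minimal sketch of this step is given below. It assumes that flow and density are computed from the individual passings in a time window centered on the probe passing time, with density taken as the sum of inverse speeds divided by the window length, and that all lanes are pooled. These choices, the function name, and the 60 s default (one of the window lengths tested in the case study) are illustrative assumptions rather than the exact procedure of this study.

```python
def probe_relative_flow(passings, t_probe, v_probe, window=60.0):
    """Estimate the probe relative flow (veh/s) at a detector location.

    passings: iterable of (time_s, speed_m_per_s) for all lanes.
    t_probe, v_probe: time (s) and speed (m/s) at which the probe passes.
    window: window length (s), centered on t_probe (an assumption).
    """
    half = window / 2.0
    speeds = [v for t, v in passings if abs(t - t_probe) <= half and v > 0]
    flow = len(speeds) / window                       # q in veh/s
    density = sum(1.0 / v for v in speeds) / window   # k in veh/m
    # Relative flow with respect to an observer moving at the probe speed.
    return flow - density * v_probe

# Hypothetical passings: two vehicles at 30 m/s and one at 20 m/s.
example = [(0.0, 30.0), (10.0, 30.0), (20.0, 20.0)]
q_rel = probe_relative_flow(example, t_probe=10.0, v_probe=30.0)
```

In the example the probe drives as fast as two of the vehicles and faster than the third, so the estimated relative flow is negative: the probe overtakes more vehicles than overtake it.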
Depending on probe vehicle driving behavior and the traffic conditions that are encountered by the probe vehicle, the probe relative flow can change along its trajectory. The probe trajectory data contain information on the probe speed
As the probe speed does not provide sufficient information to estimate
where
If we solely want to estimate the change in cumulative flow along the probe vehicle between the detector locations, that is,
In this study, we consider this simple (linear) scheme to estimate how the probe-specific relative flow changes between detector locations. This scheme solely uses probe data that describe the times and speeds at which the probe passes the detectors and detector data around these times. If the traffic conditions change significantly along the probe trajectory, this scheme may be too simplistic. In our experiment, we will evaluate the effect of the traffic conditions on the accuracy of this scheme. Furthermore, it is important to note that more extensive schemes may be used that incorporate information related to the full probe trajectory and potentially also estimates of the traffic state between the detectors. The former would require that we have high-frequency probe data, whereas for the considered scheme it suffices that probe vehicles share the time and speed at which they pass the detector locations. For the latter (i.e., estimating the traffic state between detectors), different methodologies to estimate the traffic state may be used, for example, the ASM-filter ( 19 ).
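The linear scheme can be sketched as follows. The sketch assumes that the relative flow is interpolated linearly in time between the two detector passings, in which case the integral reduces to the mean of the two relative flows multiplied by the travel time. Whether the interpolation is taken over time or over space is an assumption of this sketch, and the function name and example values are hypothetical.

```python
def delta_n_linear(q_rel_up, q_rel_down, t_up, t_down):
    """Change in cumulative flow along a probe trajectory between detectors.

    q_rel_up, q_rel_down: probe relative flow (veh/s) estimated at the
        upstream and downstream detector.
    t_up, t_down: times (s) at which the probe passes the two detectors.
    Assumes the relative flow varies linearly in time between the detectors.
    """
    travel_time = t_down - t_up
    if travel_time <= 0:
        raise ValueError("downstream passing must be later than upstream passing")
    return 0.5 * (q_rel_up + q_rel_down) * travel_time

# Hypothetical probe: relative flows of 0.02 and 0.04 veh/s, 100 s travel time.
delta_n = delta_n_linear(0.02, 0.04, t_up=100.0, t_down=200.0)
```

For this hypothetical probe the estimate is a net three vehicles overtaking the probe between the detectors, whereas the "no overtaking" assumption would give zero.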
Case Study
In the case study, both real and simulated traffic data are used. Real data have the advantage of real traffic behavior and real observation errors. However, we lack a ground truth for real data. Therefore, we also use microscopic simulation to construct traffic-sensing data with similar characteristics, while having access to a ground truth.
Below, we first explain which traffic-sensing data are collected for the two studies and which traffic conditions occur in the study period and road stretch. Next, we explain which experiments will be conducted and which insights these experiments should provide.
Traffic-Sensing Data and Traffic Conditions
Figure 2 shows the road layouts and traffic conditions for the two case studies. For both studies, we will discuss which data are available and why certain study periods are selected.

Figure 2. Road layouts and traffic conditions for the two case studies. (a) Road layout, simulation study, (b) individual vehicle speeds (m/s) (every 10th vehicle), simulation study, (c) road layout, empirical study, and (d) individual probe vehicle and detector passing speeds (km/h), empirical study.
Simulation Study
In the simulation study, the microscopic simulation program FOSIM ( 18 ) is used. The model used in this program is validated for Dutch freeway traffic ( 20 ). Figure 2a shows that we consider a three-lane road segment with an on-ramp that is located at
Empirical Study
Real disaggregated detector and probe trajectory data are available for a test section on the A9 in The Netherlands on June 18th, 2019. These data are respectively made available by the Dutch road authority (RWS) and BeMobile as part of a project that aims to evaluate the value of fusing these two data types to gain more accurate traffic state estimates and potentially reduce the required road-side sensing equipment. BeMobile provides high-frequency (1 s) probe GPS-data, which are map-matched by Modelit. This yields probe trajectory data that describe the position over time (i.e., trajectory) of a subset of the vehicles (i.e., the probe vehicles).
The layout (including detector locations) of the road segment that is considered in the empirical study is shown in Figure 2c. An off-ramp is located directly downstream of the considered segment. Two 1-hour peak-periods are selected, that is, 07:30–08:30 h and 16:00–17:00 h. These periods are selected to study the effect of changing traffic conditions on the ability to correctly estimate the change in cumulative flow over probe trajectories between detector locations. In the first period, some probes experience congested conditions; see Figure 2d. This figure shows that a stop-and-go wave propagates upstream. The cause of this jam lies downstream of the considered segment. In the second period, solely free-flow conditions are observed.
Experimental Set-Up
Multiple experiments are performed that provide insight into the estimation accuracy. The aim of the experiments is to evaluate the accuracy of estimating the change in cumulative flow over probe trajectories between two detector locations based on disaggregated detector data.
Selecting the period is a trade-off between capturing the local and current traffic conditions (which may be missed if we consider a very long period) and observing extreme flow values (which may happen if we consider a very short period). We tested with periods of 30 s, 60 s, and 120 s. Although small changes are observed in the results, the overall findings and conclusions remain the same. Therefore, we solely present the estimates resulting from using
In both the empirical and simulation studies, we compare the estimates for different detector spacings. In the simulation study, the considered detector spacings are 1,000, 2,000, 3,000, and 4,000 m. In the empirical study, the considered detector spacings are 920, 1,500, and 2,030 m (which includes all detectors installed on the test section).
The availability of the ground truth in the simulation study allows us to visualize and quantify the estimation errors. As this provides a detailed insight into the estimation performance, we will first perform the simulation study. The resulting insights help to analyze the results of the empirical study.
In the simulation study, two steps are taken. First, we compare the estimated and true changes in cumulative flow (
Because of the absence of the ground truth, it is not possible to directly compare the estimated and true changes in cumulative flow over probe trajectories for the empirical study. Therefore, an alternative comparison is considered in the empirical study; we analyze the difference in
Results
This section presents the results from the simulation study and the empirical study. The simulation study is discussed first because it yields insights that are valuable in analyzing the empirical study results.
Simulation Study
Figure 3 shows the true and estimated change in cumulative flow over vehicle trajectories between two detector locations. In these figures the vehicle travel time between detectors is indicated by the color. Furthermore, Figure 4 shows time-series of the true and estimated change in cumulative flow together with the individual speed for four vehicles. These time-series are used to provide more detailed explanations on the features that are observed in Figure 3.

Figure 3. Simulation study: True and estimated change in cumulative flow over vehicle trajectories between two detector locations. The color of the dots indicates the vehicle travel time between the considered detector locations. (a) Between detectors located at

Figure 4. Simulation study: True and estimated changes in the cumulative flow over probe trajectories together with the probe speeds. (a) Slow vehicle solely in free-flow, (b) slow vehicle solely in free-flow, (c) fast vehicle solely in free-flow, (d) fast vehicle solely in free-flow, (e) fast vehicle shortly experiencing congestion at downstream detector, (f) fast vehicle shortly experiencing congestion at downstream detector, (g) slow vehicle shortly experiencing negative relative flow, and (h) slow vehicle shortly experiencing negative relative flow.
In free-flow conditions, estimating the change in cumulative flow over vehicle trajectories is a clear improvement over assuming that there is no overtaking. As the congestion does not spill back upstream of
Table 1. Error Statistics for the Four Detector Spacings.
Note: A travel time threshold is used to distinguish vehicles that solely experienced free-flow (FF) conditions and those that experienced congested (CG) conditions. RMSE = root mean squared error; na = not applicable.
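The error statistics can be computed along the following lines. This is a minimal sketch assuming that each vehicle is described by its true change in cumulative flow, its estimated change, and its travel time between the detectors; the function names, the record layout, and the example values are hypothetical. The "no overtaking" baseline is evaluated by taking the estimate as zero.

```python
import math

def rmse(errors):
    """Root mean squared error; None for an empty class (reported as 'na')."""
    errors = list(errors)
    if not errors:
        return None
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def error_statistics(records, travel_time_threshold):
    """RMSE per traffic-condition class.

    records: list of (true_dn, estimated_dn, travel_time_s) per vehicle.
    Vehicles with a travel time below the threshold are labeled free-flow
    (FF), all others congested (CG).
    """
    stats = {}
    for label, keep in (("FF", lambda tt: tt < travel_time_threshold),
                        ("CG", lambda tt: tt >= travel_time_threshold)):
        subset = [(tr, est) for tr, est, tt in records if keep(tt)]
        stats[label] = {
            "estimate": rmse(est - tr for tr, est in subset),
            # "No overtaking": the change in cumulative flow is taken as zero.
            "no_overtaking": rmse(0.0 - tr for tr, est in subset),
        }
    return stats

# Hypothetical records: two free-flow vehicles and one congested vehicle.
records = [(3.0, 2.0, 100.0), (-4.0, -2.0, 110.0), (10.0, 4.0, 300.0)]
stats = error_statistics(records, travel_time_threshold=180.0)
```

The threshold of 180 s in the usage example corresponds to the value mentioned in the text for the 4,000 m detector spacing; other spacings would need their own threshold.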
If vehicles experience congested conditions, the estimated changes in cumulative flow are less accurate than for vehicles solely experiencing free-flow conditions, that is, the black diagonal line is a better fit for Figure 3, a and b, than for Figure 3, c and d; however, also in these cases it is still more accurate to estimate the change in cumulative flow than to assume “no overtaking,” see Table 1. In this table for a detector spacing of 4,000 m, the RMSE of vehicles solely experiencing free-flow conditions is relatively large with respect to the other spacings, that is, it jumps from 7.92 vehicles to 17.85 vehicles for detector spacings of 3,000 m and 4,000 m, respectively. In Figure 3d there are some observations that show a travel time below the threshold (i.e., smaller than 180 s) and for which the change in cumulative flow is highly underestimated (i.e., estimates around
Empirical Study
Figure 5 shows the results of the empirical case study. The axes are the same for all subfigures, that is, Estimated

Figure 5. Empirical study: The difference in the number of vehicles that overtake two consecutive probes between the detector locations based on detector passing observations (y-axis) and probe relative flow estimates (x-axis). The color of the dots indicates the mean travel time between the considered detector locations of the two consecutive probes. (a) Between detectors at
In line with the simulation study, the empirical results indicate that estimation of the change in cumulative flow between detectors is relatively accurate in free-flow conditions, see Figure 5, a, c and e. Also here, the estimates are better than assuming that
In congestion, the computed relative flow is not very accurate in estimating the real number of vehicles passed; see Figure 5, b, d and f. These figures do not show a good relation between the two axes. The largest differences are observed for probes that have a high mean travel time, which means that these probes are affected by the stop-and-go wave. The figures indicate that the probe relative flows estimated at the detectors are not representative for the full probe trajectory between detectors. The simulation study also showed that the estimation performance decreased when probes experience congestion; however, the estimation performance in congestion for the simulation study seems to be better than for the empirical study. This can partially be explained by the different features that we compare. Figure 5 uses the estimates related to two probes. If these errors have the same sign (+ or –), the total absolute error increases, which leads to large positive or negative values of
Conclusion and Insights
Probe trajectory or vehicle re-identification data can be used for initialization and error correction of cumulative flow curves constructed using stationary detectors. Studies that use these data for this purpose often assume that there is no overtaking, which would mean that the cumulative flow value is constant along a probe trajectory. However, in multi-lane traffic, this assumption is often violated. This study investigates the option to estimate the change in cumulative flow along probe trajectories based on disaggregated detector data, and in this way improve on the “no overtaking” assumption.
In this study, both simulated and real data are used to investigate the changes in cumulative flow along probe trajectories and the ability to expose these changes using detector data. By means of a case study we show that the probe relative flow estimated at two detector locations is representative for the full trajectory between these locations in free-flow conditions. Therefore, in these conditions it is a clear improvement to describe the change in cumulative flow based on detector data instead of assuming “no overtaking.” If probe vehicles experience congestion between detectors, the probe relative flows estimated at the detector locations are less representative for the rest of the trajectory. In the simulation study (where the estimation accuracy can be quantified), using disaggregated detector data still yields more accurate estimates than assuming “no overtaking.” However, in the empirical study, these benefits are not observed. Changing traffic conditions along the probe trajectory (which are related to the ability to estimate the change in cumulative flow based on detector data) can be observed using the probe speed. This means that the probe speeds observed between detectors could and should be used to assign an uncertainty to the estimates of the change in cumulative flow along probe trajectories.
More complex schemes may be used that include information such as the probe speed and the traffic states along the trajectory between detector locations. Probe trajectory data provide information on the probe speeds between detector locations; however, the data do not contain exact information on the traffic state between these locations. Estimating the traffic states between detector locations is the intended application and is the reason for estimating the relative flows along probe trajectories. The circular relation between estimating the probe-specific relative flows and estimating the traffic state indicates that an (iterative) optimization approach to estimate both features is potentially interesting. However, in this study, we focused on the first step and evaluated how accurately the probe-specific relative flows can be estimated without estimating the macroscopic traffic states along the full probe trajectory.
Footnotes
Acknowledgements
We would like to extend our appreciation to all parties that provided the funding and data for this study.
Author Contributions
P. B. C. van Erp took the lead in conceptualizing the idea, data analysis, and writing the article; V. L. Knoop provided regular feedback during all phases of the project; E. Smits was involved in the data analysis and provided feedback on the manuscript; C. Tampère was involved in conceptualizing the idea; and S. P. Hoogendoorn provided feedback during all phases of the project.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by the Netherlands Organization for Scientific Research (NWO), grant-number: 022.005.030. The data used in this study were provided by the Dutch road authority Rijkswaterstaat, BeMobile and Modelit.
