Abstract
A credible method for evaluating the performance of network timing service is proposed, based on NIMDO, a time standard disciplined via GNSS time transfer. The performance of the NTP timing service was characterized over multiple baselines on both the Internet and the Intranet, and the network delay and the timing offset were analyzed in detail. The results show that on the Internet the average timing offsets via the NTP servers range from roughly hundreds of μs to several ms, and on the Intranet the average timing offsets via the NTP servers are at about the 20 μs level. NTP timing service was mainly affected by white phase noise and flicker phase noise at the different sites. The maximum change of the timing offset was one-half of the maximum change of
Introduction
Network timing service, especially the widely used Network Time Protocol (NTP) 1 timing service, is an important means for many national metrology institutes (NMIs) to disseminate standard time and for many industrial and commercial users to trace standard time. 2–7 In actual NTP timing service, the client clock still deviates from the reference time of the NTP server after NTP time synchronization; the primary limiting factors are network asymmetry, server instability, and client instability. 8
To ensure the reliability of NTP timing service and give users real performance characteristics for reference, its performance must be evaluated effectively, including the timing offset (or timing accuracy) and the uncertainty. In 2008, in Europe, an NTP timing accuracy measurement and calibration device based on GPS disciplined oscillators (GPSDOs) was established, test experiments were conducted in collaboration with five European timekeeping laboratories, and the network influence on the timing offset of NTP timing service was analyzed. 9 However, the round-trip delay, which involves network asymmetry, was not included in the analysis. Moreover, according to the definition of traceability in the International Vocabulary of Metrology (VIM), metrological traceability must be obtained through a documented unbroken chain of calibrations, each contributing to the measurement uncertainty. 10 For a standalone GPSDO disciplined by GPS, achieving metrological traceability requires an unbroken calibration link with UTC, as described in Defraigne et al., 11 which is the work of a task group of the Consultative Committee for Time and Frequency (CCTF) Working Group (WG) on Global Navigation Satellite Systems (GNSS). Thus, traceability cannot be claimed for a measurement and evaluation of NTP timing service that is referenced to a GPSDO. In 2014, the National Institute of Standards and Technology (NIST) in Colorado, USA, designed a local system to measure the timing accuracy of NTP timing service via NTP servers located in North, Central, and South America. The system compared the timestamps received from the NTP servers with UTC(NIST) in Boulder, Colorado, and monitored the working status of multiple NTP servers from the same place. 12 The uncertainty, including network asymmetry from the round-trip delay on the Internet, was evaluated at the millisecond level. However, since the measurement system was always at NIST and the NTP servers were at remote sites, an evaluation on the Intranet, which mainly reflects server instability, was not carried out.
To make important time and frequency quantities traceable to UTC(NIM), then to UTC, and finally to the SI (Système International d’Unités), a transfer hierarchy for time and frequency quantity values is being constructed. One of the most significant time transfer links is NTP time transfer via NTP servers, and more and more remote NTP servers have been deployed. The performance of the NTP timing services of these servers has to be evaluated. In this paper, a traceable method for remote and local evaluation of network timing service is proposed using the UTC(NIM) disciplined oscillator (NIMDO), so that the influence of network asymmetry and server instability on NTP timing service can be evaluated. UTC(NIM) is the National Metrology Primary Standard for Atomic Time Scale in China and is legally and directly traced to UTC through cooperation on TAI, which is the only key comparison for time and frequency, named CCTF-K001.UTC. NIMDO is traced to UTC(NIM) through GNSS time transfer, with a time offset at the level of several nanoseconds. Thus, with NIMDO, the performance of NTP timing service can be evaluated and characterized with traceability, mainly including the timing offset and the time stability of the client after NTP time synchronization and the timing uncertainty. Both the NTP server located at NIM and those at the remote test sites were involved in the evaluation.
Principles for NTP timing service
The typical working mode for NTP is client/server mode. As shown in Figure 1, the client first sends an NTP request packet to the NTP server, which is stamped with the time epoch

NTP timing diagram.
Assuming that the transmission delays for the NTP request packet and the NTP reply packet are equal (
Principles for performance characterization
In terms of the principles of NTP timing service, when the round-trip delay is asymmetric (
Network asymmetry introduces NTP timing offset as follows.
It can also be concluded from formula (7) that the standard deviation of the NTP timing offset is equal to half of
We could get the relationship between the maximum change
We could get the absolute value for the maximum change of the measured offsets to evaluate the uncertainty of NTP timing service
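The effect of network asymmetry on the measured offset can be illustrated numerically. The following is a minimal sketch that assumes the standard NTP on-wire calculation with four timestamps; the names `t1` to `t4`, the function names, and the sign convention are illustrative assumptions and may differ from the symbols used in the formulas of this paper. The helper `simulate_exchange` is hypothetical and only builds timestamps for chosen one-way delays.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation from four timestamps (seconds).

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # estimated server-minus-client time
    delay = (t4 - t1) - (t3 - t2)            # round-trip delay, server hold time removed
    return offset, delay


def simulate_exchange(true_offset, d_request, d_reply, hold=0.0, t1=0.0):
    """Hypothetical helper: build the four timestamps for known one-way delays."""
    t2 = t1 + d_request + true_offset        # server clock = client clock + true_offset
    t3 = t2 + hold                           # server processing time
    t4 = t1 + d_request + hold + d_reply     # read on the client clock
    return t1, t2, t3, t4


if __name__ == "__main__":
    true_offset = 0.005                      # assumed value, for illustration only
    stamps = simulate_exchange(true_offset, d_request=0.030, d_reply=0.010)
    offset, delay = ntp_offset_and_delay(*stamps)
    error = offset - true_offset
    print(f"measured offset = {offset * 1e3:.3f} ms, delay = {delay * 1e3:.3f} ms")
    print(f"offset error    = {error * 1e3:.3f} ms")
```

Under these assumptions, the error of the measured offset equals half the difference between the request and reply delays, so its magnitude cannot exceed half the round-trip delay; with equal one-way delays the error vanishes.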
Methodology and experiments
Taking NTP timing service as an example, we evaluated its performance both through the NTP server (type Symmetricom SyncServer S250) located at NIM on the Internet and through the servers at the remote test sites on the Intranet. To acquire traceability to UTC(NIM) and then to UTC, we used NIMDO as the link to UTC(NIM). It served as the time reference for measuring the timing offset of the NTP testing equipment (type Titan TimeAcc 007) through NTP timing service. In addition, a standard test suite comprising one NTP server (type Symmetricom SyncServer S350) and one self-developed pulse distribution amplifier was used at the remote sites for the evaluation of NTP timing service on the Intranet.
Eight sites took part in the experiments: Yantai Chijiu Clock Co., Ltd. (Chijiu), Qingdao Institute of Measurement Technology (QIMT), Hunan Institute of Metrology and Test (HNIMT), Xinjiang Uygur Autonomous Region Research Institute of Measurement and Testing (XJMT), Liaoning Institute of Measurement (LIM), Institute for Metrology and Calibration of Guizhou (IMGZ), Guangdong Institute of Metrology (GIM, i.e. South China National Centre of Metrology (SCM)), and Chongqing Academy of Metrology and Quality Inspection (CMQ). We set up one NIMDO at each remote site and used it as the reference to evaluate the performance, including the timing offset and the corresponding time stability. The geographical distribution and test periods for all the sites are shown in Figure 2.

Test sites map.
GNSS time transfer and NIMDO
The basic principle for GNSS time transfer is shown in Figure 3.

GNSS time and frequency transfer.
The GNSS time and frequency transfer receivers
NIMDO is a significant extension of GNSS time transfer, as shown in Figure 4. 13 If the remote station can acquire the GNSS time transfer data of the reference station through some nearly real-time communication method, then, together with its own GNSS time transfer data, the time difference between the two stations can be calculated in real time. The clock difference and clock rate of the steerable oscillator at the remote station can then be predicted for the present moment, and the oscillator can be steered to the reference at the reference station, such as UTC(NIM). When the oscillator is operated continuously on these principles, it effectively becomes a time scale disciplined by and traced to UTC(NIM).
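The differencing and prediction steps described above can be sketched as follows. This is only an illustrative sketch, not the NIMDO implementation: it assumes common-view style differencing over satellites tracked by both stations and a simple least-squares straight-line fit for the prediction, and all function names, satellite labels, and numbers are hypothetical.

```python
def common_view_difference(remote, reference):
    """Mean (remote - reference) clock difference for one epoch, in ns.

    remote, reference: dict mapping satellite id -> (local clock - GNSS time) in ns.
    Only satellites observed by both stations are used (an assumption of this sketch).
    """
    common = sorted(set(remote) & set(reference))
    if not common:
        raise ValueError("no common satellites in this epoch")
    return sum(remote[s] - reference[s] for s in common) / len(common)


def fit_line(times, values):
    """Least-squares straight line; returns (intercept, slope)."""
    n = len(times)
    if n < 2:
        raise ValueError("need at least two points")
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    return v_mean - slope * t_mean, slope


def predict_difference(times, diffs, t_now):
    """Predict the clock difference (ns) and clock rate (ns/s) at time t_now."""
    intercept, rate = fit_line(times, diffs)
    return intercept + rate * t_now, rate


if __name__ == "__main__":
    # Hypothetical single-epoch data from the two stations (ns).
    remote = {"G01": 105.0, "G02": 98.0, "G07": 50.0}
    reference = {"G01": 100.0, "G02": 95.0, "G09": 1.0}
    print("epoch difference (ns):", common_view_difference(remote, reference))

    # Hypothetical history of differences, extrapolated to the present.
    predicted, rate = predict_difference([0.0, 960.0, 1920.0], [10.0, 12.0, 14.0], 2880.0)
    print("predicted difference (ns):", predicted, "rate (ns/s):", rate)
```

In such a scheme, the predicted difference and rate would be the inputs to the steering of the oscillator; how the steering itself is applied is outside the scope of this sketch.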

Implementation of NIMDO.
The performance of NIMDO referenced to UTC(NIM) has been verified and demonstrated, and NIMDO has been used at different sites and in several industries; 14 it is much more accurate and precise than a GPS disciplined oscillator (GPSDO). During the tests, the time difference between the NIMDO at each site and UTC(NIM) was within ±10 ns for around 95% of the time.
Experiment 1: NTP timing service on the Internet
The NTP server at the Changping campus of NIM in Beijing mostly transfers its time to clients on the Internet. The experiments outlined in Figure 5 were performed to directly verify its time synchronization performance using NTP test equipment (type Titan TimeAcc 007) at the remote sites.

Experiment schematic of timing performance evaluation for the NTP server on the Internet.
The test equipment at the test site is used as the client and is connected to the 1PPS and 10 MHz signals of the NIMDO as its time and frequency reference. It sends a request packet to the NTP server and obtains a reply packet from it via the Internet, and then its time difference
Experiment 2: NTP timing service on the Intranet
To evaluate the performance of time transfer via the local NTP server, the experiment shown in Figure 6 was designed. The standard test suite was employed, except that during the test at IMGZ an NTP server of type THC-200AB was used instead of the Symmetricom SyncServer S350. TimeAcc synchronizes its time via the NTP server on the Intranet.

Experiment schematic of timing performance evaluation for the NTP server on the Intranet.
As the NTP server and TimeAcc were both referenced to the NIMDO, the ideal time difference
Numerical results
For the two experiments, the timing offset and the round-trip delay over the network were acquired and their statistics were computed. The stability of the timing offset was also analyzed.
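For readers who wish to reproduce a stability analysis of this kind, the following minimal sketch computes the Modified Allan Deviation (MDEV) and the Time Deviation (TDEV) from equally spaced timing-offset samples, using the standard definitions for phase data. The function name `mdev_tdev` and the sample data are hypothetical; this is not the analysis software used in this work.

```python
import math


def mdev_tdev(x, tau0, m):
    """MDEV and TDEV at averaging time tau = m * tau0.

    x:    timing offsets (phase data) in seconds, equally spaced by tau0 seconds
    m:    averaging factor (positive integer)
    Returns (mdev, tdev); mdev is dimensionless, tdev is in seconds.
    """
    n = len(x)
    terms = n - 3 * m + 1
    if m < 1 or terms < 1:
        raise ValueError("need at least 3*m samples")
    tau = m * tau0
    total = 0.0
    for j in range(terms):
        # Sum of second differences over a window of m samples.
        inner = sum(x[i + 2 * m] - 2.0 * x[i + m] + x[i] for i in range(j, j + m))
        total += inner * inner
    mvar = total / (2.0 * m ** 2 * tau ** 2 * terms)
    mdev = math.sqrt(mvar)
    tdev = tau / math.sqrt(3.0) * mdev
    return mdev, tdev


if __name__ == "__main__":
    # Hypothetical offsets sampled every 10 s (values in seconds).
    offsets = [1.2e-3, 0.9e-3, 1.4e-3, 1.1e-3, 0.8e-3, 1.3e-3, 1.0e-3, 1.2e-3, 0.9e-3]
    for m in (1, 2, 3):
        mdev, tdev = mdev_tdev(offsets, tau0=10.0, m=m)
        print(f"tau = {m * 10:3d} s  MDEV = {mdev:.3e}  TDEV = {tdev:.3e} s")
```

On a log-log plot of MDEV against averaging time, white phase noise appears with a slope of about −3/2, flicker phase noise with a slope of about −1, and white frequency noise with a slope of about −1/2, which is the basis for identifying the noise type from the slope of the curves.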
Experiment 1: Evaluation of NTP timing service on the Internet
The experiments were performed at each site for a few days, and data were recorded every 10 s. To eliminate abnormal network data and accurately evaluate NTP timing service performance, the experimental data of each group are sorted in ascending order according to
Statistics of the timing offset and the round-trip delay for experiment 1 (in milliseconds).
For experiment 1, Table 1 summarizes the statistics of the timing offset and the round-trip delay shown in Figures 7 and 8 (shifted for a better view). In general, the longer the distance from NIM, the greater the delay. Chijiu, at 547 km, is the closest to NIM of all the test sites, yet its delay is the largest, which suggests that its network environment was the worst and influenced the delay more than the distance did. The results of experiment 2 underline the effect of background network traffic, since experiment 2 can be regarded as a measurement without background network traffic. In general, the larger the standard deviation of the delay, the larger the standard deviation of the offset.
As shown in Figure 8, the round-trip delay data points of each station have dense parts and sparse parts. In this paper, a dense part of the data points is called a basic area. The basic areas of the same site may jump because of changes in the network path during NTP timing. For example, as shown in Figure 9, there are seven basic areas (in the red boxes) in the LIM site test, with jumps between them. The results over 1 week show that the timing offset and the delay have similar large periodic changes. This reflects that the network is more crowded during the day than at night, which increases the delay and changes the offset. The periods with the largest offsets correspond to the periods with the largest round-trip delays, which occur during the busy hours for Internet use, roughly 7:00 to 23:00 (local time zone, Beijing time). The basic areas of the round-trip delay and the timing offset measured at HNIMT, XJMT, LIM, QIMT, IMGZ, SCM, and CMQ all show large but different jumps; the jumps are evidently due to the network rather than the NTP servers.
From the above results, it can be found that network asymmetry is an important factor determining the NTP timing performance. Figure 10 reflects the numerical relationship of formula (9) in the corresponding time period for the same basic area.

Timing offset for experiment 1.

Round-trip delay for experiment 1.

Round-trip delay for experiment 1 of LIM.

Timing offset versus round-trip delay on the Internet.
Figure 11 shows the time stability (Time Deviation, TDEV) and the frequency stability (Modified Allan Deviation, MDEV) of the NTP timing offsets at all the sites; both are lower at XJMT than at the other sites for all the averaging intervals. Except at the Chijiu site, the time stability of the offset is less than 1 ms at all the sites for all the averaging intervals; the frequency stability of the offset at all the sites is at the level of 1e−8 after an averaging time of approximately 1 day. Network asymmetry is the main factor that affects the NTP timing offset. By analyzing the slope of the frequency stability curves at the different sites in Figure 11, we found that NTP timing service on the Internet is mainly affected by white phase noise, flicker phase noise, and white frequency noise, as shown in Table 2, and the noise parameters are estimated with

Time and frequency stability for experiment 1.
Noise parameters of NTP timing service on the Internet.
Experiment 2: Evaluation of NTP timing service on the Intranet
The experiment was performed at each site for 1 week, and data were recorded every 10 s. The data selection and processing principles are exactly the same as those in section 4.1. The experimental results are shown in Table 3 and Figures 12 and 13 (shifted for a better view).
Statistics of the timing offset and the round-trip delay for experiment 2 (in microseconds).

Timing offset for experiment 2.

Round-trip delay for experiment 2.
Table 3 summarizes the statistics of the timing offset and the round-trip delay. The standard deviation for
From Table 3 and Figures 12 and 13, in the case of one or two hops on the Intranet, the offsets and the delays at the different sites are at about the 20 μs and 100 μs levels, respectively.
Figure 14 shows the time stability and the frequency stability of the timing offsets for all the sites. The time stability is less than 1 μs after an averaging time of approximately 1 day; the frequency stability of the offset for all the sites is approximately 1e−11 after an averaging time of approximately 1 day. The NTP timing offset can be attributed mainly to factors such as server instability and client instability. By analyzing the slope of the frequency stability curves in Figure 14, we found that NTP timing service on the Intranet is also mainly affected by white phase noise and flicker phase noise at the different sites, as shown in Table 4, which are estimated with

Time and frequency stability for experiment 2.
Noise parameters of NTP timing service on the Intranet.
Uncertainty evaluation
Following section 5.2, the uncertainty of NTP timing service can be evaluated according to formula (10), as Figure 15 shows. In the case of one or two hops on the Intranet, the timing offsets for NTP timing service at the different sites are about 20 μs on average, and the uncertainty of NTP timing service is evaluated as less than 50 μs. On the Internet, the timing offsets for NTP timing service at the different sites are at about the millisecond level on average, and the uncertainty is evaluated as less than 35 ms.

Uncertainty evaluation for different sites.
Summary
With a device such as NIMDO, the evaluation of network time synchronization can be legally traced to the national time scale and then to UTC.
In our experiments with NIMDO on the Internet, the average timing offsets for NTP timing service at the different sites are roughly at the level of hundreds of microseconds to milliseconds, and the average round-trip delays are at the level of tens of milliseconds. Both broadly increase with the baseline from NIM, that is, sites farther away tend to show larger delays and absolute offsets. The time stability of the offset at the different sites is almost always less than 1 ms for all the averaging times, and the frequency stability of the offset at the different sites is less than 1e−8 after an averaging time of approximately 7 days. Network asymmetry is the main factor that determines the NTP timing offset.
On the Intranet, in the case of one or two hops, the average timing offsets and round-trip delays for NTP timing service at the different sites are at about the 20 μs and 100 μs levels, respectively. The time stability of the offsets at the different sites is less than 1 μs after an averaging time of approximately 7 days, and the frequency stability of the offsets at the different sites is approximately 1e−11 after an averaging time of approximately 7 days. The NTP timing offset can be attributed to factors such as server instability and client instability.
NTP timing service on the Internet and the Intranet is mainly affected by white phase noise and flicker phase noise at the different sites. The noise parameters of NTP timing service on the Intranet are about 100 times smaller than those on the Internet, which is roughly consistent with what is observed for the timing offsets and the round-trip delays.
The maximum change of the timing offset is one-half of the maximum change of
Acknowledgements
This work was carried out with the support of colleagues from Yantai Chijiu Clock Co., Ltd. (Chijiu), Qingdao Institute of Measurement Technology (QIMT), Hunan Institute of Metrology and Test (HNIMT), Xinjiang Uygur Autonomous Region Research Institute of Measurement and Testing (XJMT), Liaoning Institute of Measurement (LIM), Institute for Metrology and Calibration of Guizhou (IMGZ), Guangdong Institute of Metrology (GIM, i.e. South China National Centre of Metrology (SCM)), and Chongqing Academy of Metrology and Quality Inspection (CMQ).
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Key R&D Program of China with grant no. 2021YFB3900704.
