Improving Synchronization With 1588 Transparent Clocks

The IEEE 1588 Precision Time Protocol enables precise time synchronization over packet-based Ethernet networks so that the time on a slave clock at one end of the network agrees with the master clock at the other end. But how precise?

On a LAN, clocks using IEEE 1588 can agree within 100 ns of each other even if the network is highly congested. That compares with network time protocol (NTP), the Internet’s ubiquitous method of sync distribution, which typically can achieve 0.5 ms to 2 ms precision over a LAN.

Using Ethernet to distribute synchronization is attractive because of the economies of packet switching vs. dedicated point-to-point timing-signal distribution. If packets can reliably carry precise sync information, they provide a very economical way to synchronize time across networked devices. Furthermore, there is no need to build a separate time-distribution infrastructure anywhere that devices or computers needing precise sync already have network access.

IEEE 1588 defines special packets for carrying timing information. It also specifies a protocol for exchanging those packets between devices designed to process them: grandmaster clocks, slave clocks, boundary clocks, and transparent clocks. How these devices, in particular transparent clocks, are implemented and deployed has a big effect on the level of precision achieved. But before getting into those details, it makes sense to describe the special challenges of syncing clocks on a packet network.

Packet Sync Challenges

Unfortunately for network designers, clocks drift and need to be reset periodically to stay in sync. Two clocks are considered synchronized when the difference between them, called the offset, is below some threshold.

For example, a GPS receiver clock may be specified at 50-ns rms offset to coordinated universal time (UTC) with a 100-ns maximum. This means that most of the time the clock's offset from UTC is 50 ns or less, but occasionally it can be off by as much as 100 ns.

Clocks get reset in a packet network via a timestamp exchange between a master clock and one or more slave clocks. The time a packet requires to traverse the network is not zero and is called packet delay. More delay means a greater difference between when a timestamp arrives and the time at which it actually was stamped. Unless accounted for, this timestamp offset will contribute to the master-slave offset.

Packet delay results from two key factors: the distance the packet travels (path delay) and the time spent inside network devices like switches and routers (queuing delay). For example, a network containing switches or routers can add delay in the hundreds of microseconds. The result is a slave that can synchronize to the master only to within the low tens to hundreds of microseconds.

Queuing delay is caused by packets from multiple sources waiting to exit a switch on the same port. After packets come into a switch, they queue up at its output port waiting to exit. The more packets, the longer the wait. The longer packets wait in each switch, and the more switches they traverse, the longer they take to reach the slave and the greater the timestamp offset.
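To get a feel for the scale involved, consider the serialization delay alone; the minimal Python sketch below estimates it for a 100Base-T link. The frame size and queue depths are illustrative assumptions, not measurements from the demonstration:

# Back-of-the-envelope queuing delay on a 100-Mb/s link: a timing
# packet waits behind N full-size Ethernet frames already queued at a
# switch output port.

LINK_RATE_BPS = 100e6    # 100Base-T line rate
FRAME_BITS = 1518 * 8    # maximum-size Ethernet frame

def serialization_delay_us(frame_bits, rate_bps):
    # Time to clock one frame onto the wire, in microseconds.
    return frame_bits / rate_bps * 1e6

per_frame = serialization_delay_us(FRAME_BITS, LINK_RATE_BPS)
for queued in (1, 5, 20):
    print(f"{queued:2d} queued frames -> ~{queued * per_frame:.0f} us added delay")

One full-size frame takes roughly 121 µs to serialize at 100 Mb/s, so even a handful of queued frames pushes a timing packet's delay into the hundreds of microseconds, consistent with the switch delays described above.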

Precise master-slave synchronization is all about enabling the slave to reset its clock despite whatever errors may be caused by packet delay. Making that time adjustment would be simpler if packet delay did not vary so the slave could just apply a constant. However, packet delay does vary because the factors that cause packet delay also vary, such as the number of packets competing for switch exit ports.

This variation is called packet delay variation (PDV). PDV is one-sided, so the packet delay contribution to the master-slave offset is always positive: packets arrive either in the minimum time or later, never ahead of time. The key consequence of PDV is that slaves need to receive multiple timestamps and apply sophisticated algorithms to compute the precise offset correction for any timestamp.

For example, one solution to PDV is to exchange multiple timing packets and use the lucky packets that arrive with the minimum amount of delay. For microsecond and submicrosecond time-of-day synchronization, this approach requires increased packet exchange rates, precise hardware-based timestamping, and very intelligent algorithms to filter the packets, compute the offsets, and adjust the slave clock. Not all of these techniques are widely available in commercial off-the-shelf products.
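As an illustration of that lucky-packet idea, the minimal Python sketch below keeps only the exchanges whose measured delay falls within a small window of the observed minimum. The function name, the acceptance window, and the sample values are hypothetical, not from the article:

# Minimal "lucky packet" filter: keep only the timing exchanges whose
# apparent delay is close to the observed minimum, since those packets
# suffered the least queuing and carry the least PDV error.

def lucky_packets(delays_ns, window_ns=100):
    # Return indices of exchanges within window_ns of the minimum delay.
    floor = min(delays_ns)
    return [i for i, d in enumerate(delays_ns) if d - floor <= window_ns]

# Hypothetical delay samples: most packets were queued, a few were lucky.
samples = [50_120, 50_090, 312_400, 50_105, 780_250, 50_250]
print(lucky_packets(samples))   # -> [0, 1, 3], the near-minimum packets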

How the Basics of IEEE 1588 Work

IEEE 1588 specifies a method for the slave to correct for the time offset and time packet delays using a series of message exchanges between master and slave. The key is to allow for follow-up messages to update the timestamp received in earlier messages.

The standard does not define any implementation details with respect to hardware or software. However, timestamping packets in FPGA hardware, a common technique, eliminates operating-system stack delays and significantly reduces overall PDV. By default, the IEEE 1588-2008 standard sets the timing message exchange rate at once per second, which is at least 16 times more frequent than NTP.

To mitigate PDV across the network, IEEE 1588 introduces two devices: boundary clocks and transparent clocks. A boundary clock is a switch-clock combination that terminates the timing packets coming into the switch and issues fresh, newly timestamped timing packets as they exit, so time effectively jumps the queue. This is useful for distributing the timestamping load of a master clock serving thousands of slaves.

A transparent clock is a switch with two key IEEE 1588 capabilities: it measures individual timing packet queuing delays inside itself, and it modifies specific timestamps passing through it to reflect those delays. By updating timestamps based on its own induced delays, the switch looks like it’s not even there from a timing standpoint; hence, the name transparent.

IEEE 1588-2008 also specifies two timing packet exchange techniques: the one-step clock and the two-step clock, both generally implemented in FPGA hardware to eliminate operating-system stack delays. Both techniques apply offset correction but are sensitive to asymmetrical path delays caused by PDV. That is, PDV's effect in the downstream direction (master to slave) differs from its effect in the upstream direction (slave to master), ultimately requiring offset corrections that account for the delay differences.

One-Step Clock Technique

The one-step clock technique exchanges three packet-sized timing messages between master and slave: Sync, Delay_Req, and Delay_Resp (Figure 1). The master stamps the time (t1) on the Sync message that leaves the master. The slave receives the Sync message at time t2 and sends a Delay_Req message at time t3 to the master, which the master receives at t4. The master then sends a Delay_Resp packet containing t4 to the slave. The slave now has all the information needed to calculate the slave offset, assuming equal path delay in each direction:

offset = [(t2 - t1) - (t4 - t3)]/2

Figure 1. One-Step Clock Technique for Exchanging Timing Packets

With this technique, the master cannot truly know exactly when the Sync message leaves until after it already has left. For that reason, masters are engineered to advance the timestamp value to account for the difference between when the time is stamped and when the Sync packet enters the network.
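A minimal Python sketch of that calculation follows. The timestamp values are hypothetical, and, like the formula above, it assumes equal upstream and downstream path delays:

# One-step clock arithmetic: given the four timestamps of a
# Sync/Delay_Req/Delay_Resp exchange, recover the slave's offset and
# the mean one-way path delay.

def offset_and_delay(t1, t2, t3, t4):
    # t1: Sync sent (master), t2: Sync received (slave),
    # t3: Delay_Req sent (slave), t4: Delay_Req received (master).
    offset = ((t2 - t1) - (t4 - t3)) / 2   # slave clock minus master clock
    delay = ((t2 - t1) + (t4 - t3)) / 2    # mean one-way path delay
    return offset, delay

# Hypothetical nanosecond timestamps: the slave runs 500 ns fast and
# the true one-way delay is 2,000 ns.
t1 = 1_000_000
t2 = t1 + 2_000 + 500       # path delay plus slave offset
t3 = t2 + 10_000            # slave waits before sending Delay_Req
t4 = t3 + 2_000 - 500       # path delay minus slave offset (master time)
print(offset_and_delay(t1, t2, t3, t4))   # -> (500.0, 2000.0)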

Two-Step Clock Technique and Transparency

The two-step clock technique measures exactly when the Sync packet leaves the master clock and places this timestamp in a Follow_Up message after it sends the Sync message (Figure 2). The Follow_Up message contains the true t1 value based on when the Sync message was observed to actually leave.

Figure 2. Two-Step Clock Technique for Exchanging Timestamps
Transparent clocks (not shown) measure Sync and Delay_Req delays and update t1 and t4 timestamps in intercepted Follow_Up and Delay_Resp messages sent to the slave.

The availability of the Follow_Up message on the heels of the Sync message is very useful. As the Sync packet traverses a transparent clock, the switch measures the packet's residence time. These values then are added to the appropriate Follow_Up packet for eventual use by the slave clock in accounting for queuing delays. Every transparent clock along the path performs this timestamp adjustment.

Similarly, the transparent clocks measure the residence times of the Delay_Req messages and modify the associated timestamps in the Delay_Resp packets. The arrival time (t4) of the Delay_Req at the master also is corrected by the slave using the delay values measured by the transparent clocks, so the slave can adjust its clock without seeing errors caused by upstream queuing delays:

offset = [(t2 - t1 - dsync) - (t4 - t3 - dreq)]/2

where dsync and dreq are the switch residence times accumulated by the Sync and Delay_Req messages, respectively.

The transparent clocks in the upstream and downstream paths perform these queuing delay corrections to packet timestamps going through the switch. They know how long the Sync message was delayed by queuing and add correction values in the Follow_Up packet. Transparent clocks also know how long the Delay_Req was delayed going from slave to master so they add correction values in the Delay_Resp packet going back to the slave. Because of the queuing delay correction values the transparent clocks have measured and added, the slave has an accurate picture of path delays in both directions and can interpret the master’s clock timestamps correctly.
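The following Python sketch extends the earlier one-step arithmetic with those corrections. The residence-time values are hypothetical; the point is that subtracting the accumulated residence times from the apparent delays recovers the same offset the slave would see on an unloaded network:

# Two-step arithmetic with transparent-clock corrections: the residence
# times the switches accumulate (reported via the Follow_Up and
# Delay_Resp messages) are subtracted from the apparent delays before
# the usual offset calculation.

def corrected_offset(t1, t2, t3, t4, c_sync, c_req):
    # c_sync: summed switch residence time of the Sync message (ns).
    # c_req: summed switch residence time of the Delay_Req message (ns).
    down = (t2 - t1) - c_sync   # queuing-free master-to-slave delay
    up = (t4 - t3) - c_req      # queuing-free slave-to-master delay
    return (down - up) / 2

# Same 500-ns slave offset and 2,000-ns path as before, but now the
# Sync message sat 300 us in queues and the Delay_Req sat 40 us.
t1 = 1_000_000
t2 = t1 + 2_000 + 300_000 + 500
t3 = t2 + 10_000
t4 = t3 + 2_000 + 40_000 - 500
print(corrected_offset(t1, t2, t3, t4, 300_000, 40_000))   # -> 500.0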

Measuring Transparent Clocks’ Sync Benefits

By making those queuing delays invisible, transparent clocks have a big impact on helping slaves stay in sync with masters. That effect is easily demonstrated in side-by-side comparisons of five networking cases where the master and the slave are connected through:
1. Ethernet crossover cable: no switches involved.
2. Two standard switches: timing traffic only, no data traffic.
3. Two standard switches: timing traffic + data traffic.
4. Two transparent clocks: timing traffic only, no data traffic.
5. Two transparent clocks: timing traffic + data traffic.

In each case, the most convenient way to compare the time on the slave to the time on the master is by comparing the 1 pulse-per-second (1pps) outputs of each clock. A counter, an oscilloscope, or a time-interval measurement instrument can measure the difference between the on-time master 1pps and the slave 1pps. This establishes a best-case time-transfer accuracy baseline for all five scenarios. Figure 3 shows a diagram of the test setups.

Figure 3. Configurations of Different Test Scenarios

Table 1 is a summary of the test results. The left side presents results when there is only timing traffic; that is, the traffic generators are off (cases 1, 2, and 4). The right side shows when the timing packets share the network with data packets; that is, the traffic generators are on (cases 3 and 5). Offsets are recorded in nanoseconds, showing the mean and standard deviation of the offset between master and slave. Also measured is the peak-to-peak (pk-pk) dispersion of sampled values around the mean.
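Reducing a series of raw 1pps interval samples to those three statistics is straightforward; here is a minimal Python sketch with made-up sample values:

# Reduce 1pps time-interval measurements (slave minus master, in
# nanoseconds) to the statistics reported in Table 1: mean offset,
# standard deviation, and peak-to-peak dispersion.

import statistics

def offset_stats(samples_ns):
    mean = statistics.fmean(samples_ns)
    stdev = statistics.stdev(samples_ns)
    pk_pk = max(samples_ns) - min(samples_ns)
    return mean, stdev, pk_pk

# Hypothetical counter readings, one per second:
samples = [62, 58, 71, 55, 64, 60, 57, 66]
mean, stdev, pk_pk = offset_stats(samples)
print(f"mean {mean:.0f} ns, stdev {stdev:.0f} ns, pk-pk {pk_pk} ns")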

Table 1. Summary of Demonstration Test Results

The crossover cable connection with no data traffic represents the best-case scenario against which to compare the other tests; that is, 60-ns offset, 7-ns standard deviation, and 85-ns pk-pk dispersion. That makes sense given that there is no competition for bandwidth between timing packets and data packets, and there are no switching delays.

Now note what happens as switches are introduced, first without data traffic (case 2). Queues form when at least two paths converge to one, so introducing two switches connected by a single cable creates a queue, and thus delay, in each direction of the network traffic. In this example, two identical standard enterprise switches are used. Actual timing results are similar to those of the crossover cable, although dispersion rises to as much as 3 µs.

Adding data traffic (case 3) significantly degrades sync performance. In fact, the results are similar to what you would expect with NTP. Here, Internet control message protocol (ICMP) ping packets with payloads are exchanged between two traffic generators, raising the load through the switches to 4% of the 100Base-T bandwidth as measured at the traffic generators.

Considering that traffic on a switched Ethernet network can run upward of 100% bandwidth utilization, that is a light load. Yet the mean master-slave offset now is almost 25,000 ns with a standard deviation of >82,000 ns and a pk-pk spread of >1 ms.

Replacing the switches with transparent clocks with no data traffic (case 4) provides results very similar to the crossover cable. The master-slave mean offset is 76 ns with a standard deviation of 10 ns and pk-pk range of 126 ns. The two transparent clocks have added only 15 ns of mean timing error.

In case 5, data traffic is raised to 97% of the 100Base-T bandwidth to create the largest possible queuing delay/PDV environment. The slave synchronized to the master with a mean offset of 76 ns, standard deviation of 9 ns, and pk-pk range of 85 ns. Even at this very heavy traffic level, the slave can synchronize the same as if there were no traffic at all.

Key Takeaways

The key point of Table 1 is that transparent clocks enable precise synchronization in networks with extremely high packet traffic and queuing delays. In this demonstration, the statistical performance of slave-synchronization accuracy in the network with transparent clocks was nearly identical to that of a crossover cable except for a 15-ns shift. In summary:
• Packet networks can be a cost-efficient way to sync master and slave clocks across a LAN.
• Traffic on packet networks causes variable queuing delays in switches.
• PDV in the arrival of IEEE 1588 timing packets challenges the slaves' capability to account for queuing delays when interpreting the master's timestamps.
• Transparent clocks mitigate PDV by measuring switch delays and allowing slaves to compensate for them.
• Transparent clocks can enable slave synchronization accuracy to the master similar to that of a crossover cable between the master and slave.

How slaves perform absent queuing delays or data traffic does not accurately reflect how they will perform in the real world when those conditions are present. Best-case, good-as-crossover synchronization may indeed be possible in a packet-switched network, and one way to achieve it is to use transparent clocks rather than regular switches.

About the Author

Paul Skoog is a product marketing manager managing the bus-level timing and enterprise network time server product lines at Symmetricom. Before joining the company in 1997, he was product manager at Trimble and held application engineering and product management positions in the dynamic signal analysis software market. Mr. Skoog received a B.S. in mechanical engineering from California Polytechnic University and an M.B.A. from Santa Clara University Graduate School of Business. Symmetricom, 3750 Westwind Blvd., Santa Rosa, CA 95403. 707-528-1230, e-mail: [email protected]

January 2010
