Choose The Optimum Clock Source For PCI Express Applications

Getting the best performance from the PCI Express interface requires careful selection of the clocking method.

May 23, 2012

PCI Express is an interface standard developed by the Peripheral Component Interconnect Special Interest Group (PCI-SIG). Originally designed for desktop personal computers, it’s used across a range of applications, including blade servers, storage, embedded computing, networking, and communications.

A broad selection of commercially available devices supports PCI Express. It’s also available in FPGAs and systems-on-a-chip (SoCs), providing designers with flexible solutions for transferring data within their systems. Two of the key advantages of PCI Express are its scalable bandwidth and flexible clocking. This article describes some of the standard clocking architectures for PCI Express and discusses their advantages and disadvantages in typical system applications.

The PCI Express Link

The PCI Express data link consists of one or more lanes, each of which provides a transmit (Tx) and receive (Rx) differential pair. Figure 1 shows two devices connected through a PCI Express interface. A major advantage of PCI Express is its bandwidth scalability. Up to 32 lanes can be configured in a single link.

1. A PCI Express Link between two chips or devices consists of separate differential pairs to transmit and receive on each lane. Up to 32 lanes of transmit/receive pairs can be incorporated for high-speed data exchange.

Table 1 lists the specifications of PCI Express’s current versions. With the introduction of PCI Express 3.0, each lane can accommodate 8 Gbits/s per direction, for a maximum raw throughput of 256 Gbits/s per direction on a 32-lane link. The bandwidth can be tailored to your needs simply by configuring the interface for the appropriate number of lanes.

The previous PCI Express 1.1 and 2.1 standards offered 2.5 Gbits/s and 5.0 Gbits/s per lane, respectively. Choosing a PCI Express standard with a higher data rate can reduce the number of lanes needed, but it places tighter requirements on clocking performance. We’ll examine these requirements in the following sections.
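The trade-off between data rate and lane count can be sketched with a few lines of Python. This is a rough illustration only: the per-lane rates come from the text above, while the line-encoding efficiencies (8b/10b for 1.1 and 2.1, 128b/130b for 3.0) and the list of link widths are details not covered in this article, and the function names are hypothetical.

```python
# Per-lane raw rates quoted in the article (Gbits/s per direction).
LANE_RATE_GBPS = {"1.1": 2.5, "2.1": 5.0, "3.0": 8.0}

# Line-encoding efficiency (not discussed in the article):
# 8b/10b for 1.1 and 2.1, 128b/130b for 3.0.
ENCODING_EFFICIENCY = {"1.1": 8 / 10, "2.1": 8 / 10, "3.0": 128 / 130}

# Link widths assumed for this sketch.
LINK_WIDTHS = (1, 2, 4, 8, 12, 16, 32)


def link_throughput_gbps(version, lanes, after_encoding=False):
    """Throughput per direction, raw or after line-encoding overhead."""
    rate = LANE_RATE_GBPS[version] * lanes
    if after_encoding:
        rate *= ENCODING_EFFICIENCY[version]
    return rate


def lanes_needed(version, target_gbps):
    """Smallest link width whose post-encoding rate meets the target."""
    for width in LINK_WIDTHS:
        if link_throughput_gbps(version, width, after_encoding=True) >= target_gbps:
            return width
    return None  # target exceeds what a 32-lane link can carry


if __name__ == "__main__":
    for version in ("1.1", "2.1", "3.0"):
        print(version, "needs x%s for 15 Gbits/s" % lanes_needed(version, 15))
```

Under these assumptions, a 15-Gbit/s requirement maps to an x8 link at PCI Express 1.1 rates, x4 at 2.1, and x2 at 3.0, which is the lane reduction the higher data rates buy in exchange for tighter clocking.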

PCI Express Applications

PCI Express’s popularity has resulted in growing numbers of application-specific devices (e.g., ASICs and SoCs) adopting the PCI Express interface as a common interconnect with other devices. FPGAs offer built-in PCI Express protocol stacks and physical-layer interfaces to simplify system-level design. Figure 2 shows examples of system-level solutions using PCI Express interconnects.

2. There are three basic types of PCI Express interconnects for system applications: data transfer between two devices on the same PCB (a); data transfer between the main board and an add-in board (b); and data transfer between multiple boards over a backplane (c).

PCI Express Clocking Architectures

Reliable data transmission requires a stable clock reference. The PCI Express standard specifies a 100-MHz reference clock (Refclk) with a frequency stability of ±300 ppm or better at both the transmitting and receiving devices. It also specifies three clocking architectures: Common Refclk, Separate Refclk, and Data Clocked Refclk (Fig. 3).

3. There are three key PCI Express clocking architectures: common reference clock (Refclk) (a), separate reference clock (b), and data clocked (c).

Common Refclk is the most widely supported architecture among commercially available devices. (The examples in Figure 2 use Common Refclk.) It supports spread-spectrum clocking (SSC), which is useful in reducing electromagnetic interference (EMI). However, the same clock source must be distributed to every PCI Express device while keeping the clock-to-clock skew to less than 12 ns between devices. This can be a problem with large circuit boards or when crossing a backplane connector to another circuit board.
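A first-pass skew check for a Common Refclk layout might look like the sketch below. The 12-ns limit is the figure quoted above; the propagation delay of 6.7 ps/mm is only an assumed ballpark for a typical PCB trace, and the function names and the optional buffer-skew term are hypothetical. Connector and cable delays are ignored.

```python
# Assumed PCB propagation delay (ps per mm). This is a rough ballpark,
# not a value from the article; use the figure for your own stackup.
PROP_DELAY_PS_PER_MM = 6.7

# Clock-to-clock skew limit quoted in the article for Common Refclk.
MAX_SKEW_NS = 12.0


def refclk_skew_ns(len_a_mm, len_b_mm, buffer_skew_ns=0.0):
    """Estimate Refclk skew between two devices from trace-length mismatch."""
    trace_skew_ns = abs(len_a_mm - len_b_mm) * PROP_DELAY_PS_PER_MM / 1000.0
    return trace_skew_ns + buffer_skew_ns


def skew_ok(len_a_mm, len_b_mm, buffer_skew_ns=0.0):
    """True if the estimated skew stays under the 12-ns limit."""
    return refclk_skew_ns(len_a_mm, len_b_mm, buffer_skew_ns) < MAX_SKEW_NS


if __name__ == "__main__":
    print(round(refclk_skew_ns(100, 400), 2), "ns of skew for a 300-mm mismatch")
```

With this assumed delay, the budget is only exhausted by a mismatch on the order of meters, which is why the limit tends to matter for backplane and cabled systems rather than for a single board.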

If a low-skew configuration isn’t workable, the Separate Refclk architecture, with independent clocks at each end, can be used. The clocks don’t have to be more accurate than ±300 ppm, because the PCI Express standard allows for a total frequency deviation of 600 ppm between transmitter and receiver. However, this tolerance leaves no margin for SSC.
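The Separate Refclk budget reduces to simple arithmetic. The sketch below uses the 100-MHz nominal frequency and the ±300-ppm and 600-ppm limits from the text; the function names and the example frequencies are made up for illustration.

```python
NOMINAL_HZ = 100e6          # Refclk nominal frequency from the article
PPM_LIMIT_PER_CLOCK = 300   # each clock must stay within +/-300 ppm
PPM_LIMIT_TOTAL = 600       # allowed difference between the two ends


def ppm_offset(measured_hz, nominal_hz=NOMINAL_HZ):
    """Frequency error of a clock relative to nominal, in ppm."""
    return (measured_hz - nominal_hz) / nominal_hz * 1e6


def separate_refclk_ok(tx_hz, rx_hz):
    """Check two independent clocks against the per-clock and total limits."""
    tx_ppm = ppm_offset(tx_hz)
    rx_ppm = ppm_offset(rx_hz)
    each_ok = (abs(tx_ppm) <= PPM_LIMIT_PER_CLOCK
               and abs(rx_ppm) <= PPM_LIMIT_PER_CLOCK)
    return each_ok and abs(tx_ppm - rx_ppm) <= PPM_LIMIT_TOTAL


if __name__ == "__main__":
    # Hypothetical oscillators running 200 ppm fast and 200 ppm slow.
    print(separate_refclk_ok(100_020_000, 99_980_000))
```

Two clocks at +200 ppm and –200 ppm differ by 400 ppm and pass, while a single clock at +400 ppm fails on its own, regardless of what the other end does.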

The Data Clocked Refclk architecture is the simplest, as it requires only one clock source, at the transmitter. The receiver extracts and syncs to the clock embedded in the transmitted data. Data-clocked architecture was introduced when the PCI Express 2.0 standard was released in 2007, so it’s still a relatively new clocking scheme with fewer commercially available devices supporting it.

Refclk Frequency And Jitter Requirements

The industry-standard reference clock frequency for devices supporting PCI Express 1.1, 2.1, and 3.0 is 100 MHz (±300 ppm), generated using the host-clock signal level (HCSL) format. Devices such as embedded processors, system controllers, and SoC-based designs use this reference clock.

In FPGA applications, though, PCI Express reference-clock requirements can differ from the standard 100-MHz HCSL format. Other frequencies and formats include 125 MHz, 200 MHz, or 250 MHz in low-voltage CMOS (LVCMOS), low-voltage differential signaling (LVDS), or low-voltage positive ECL (LVPECL).

A typical example is an FPGA that supports both PCI Express and Ethernet functions. Using a common 125-MHz clock for both functions reduces clocking domains or “timing islands” in the FPGA. The FPGA internally multiplies this reference to the required PCI Express lane rate (e.g., 125 MHz x64 for PCI Express 3.0). Depending on the mix of ICs in a design, the PCI Express clocking scheme can vary from generating multiple 100-MHz HCSL clocks to generating a mix of different frequencies and output formats.
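The multiplication factor for any reference frequency follows directly from the lane rate. The sketch below only computes the overall ratio; how a given FPGA transceiver actually realizes it (PLL dividers, intermediate clocks) is device-specific and not modeled here, and the function name is hypothetical.

```python
from fractions import Fraction

# Lane rates from the article, expressed in MHz-equivalent (Mbits/s).
LINE_RATE_MHZ = {"1.1": 2500, "2.1": 5000, "3.0": 8000}


def refclk_multiplier(refclk_mhz, version):
    """Overall ratio between the lane rate and the reference clock.

    refclk_mhz must be an integer number of MHz. The result is an exact
    Fraction, so non-integer ratios are visible rather than rounded.
    """
    return Fraction(LINE_RATE_MHZ[version], refclk_mhz)


if __name__ == "__main__":
    for ref in (100, 125, 200, 250):
        ratios = {v: str(refclk_multiplier(ref, v)) for v in LINE_RATE_MHZ}
        print(ref, "MHz:", ratios)
```

For example, a 125-MHz reference gives x20, x40, and x64 for the three standards, while a 200-MHz reference gives a non-integer ratio (25/2) at the PCI Express 1.1 rate.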

The reference clock’s jitter is an important consideration. In exchange for PCI Express 3.0’s higher throughput and reduced interconnect wiring, lower jitter is required. Clocks that meet the jitter requirements of PCI Express 1.1 and 2.1 might not meet the needs of a PCI Express 3.0 device. Table 2 summarizes jitter requirements for all PCI Express standards. For a detailed discussion of clock source or Refclk jitter requirements for each of the PCI Express standards, refer to application note AN562, available at www.silabs.com/timing.

Spread Spectrum Clocking (SSC)

SSC is desirable because it reduces the level of radiated EMI. Spread-spectrum clocks apply a low-frequency modulation to the clock frequency, which spreads the radiated clock energy across a broader range of frequencies. Because the transmitted data is timed from the same clock, the modulation similarly broadens the spectrum of data-generated EMI.

PCI Express devices are specified to transmit data reliably when using a Refclk with a spread-spectrum modulation rate of 30 to 33 kHz and a deviation of 0% to –0.5%. Each PCI Express device must keep its bit rate within ±300 ppm of nominal, so the two ends of a link can differ by at most 600 ppm. A 0.5% spread amounts to 5000 ppm, far beyond that limit, so the same Refclk must be supplied to both devices if SSC is enabled.

The Separate Refclk architecture is therefore not recommended when SSC is required, unless both clocks are synchronized to a common source. System-level shielding may be required to meet EMI compliance standards when SSC is disabled.
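The mismatch between the SSC deviation and the 600-ppm budget can be illustrated numerically. In the sketch below, the 0.5% down-spread, the 30-to-33-kHz modulation range, and the ±300-ppm tolerance come from the text; the triangular modulation profile and the 31.5-kHz default are assumptions made for illustration, and the function names are hypothetical.

```python
def ssc_offset_ppm(t_s, mod_rate_hz=31.5e3, spread_percent=0.5):
    """Instantaneous down-spread frequency offset in ppm at time t_s.

    Assumes a triangular profile that starts at nominal, ramps down to
    -spread_percent at mid-period, and returns to nominal.
    """
    phase = (t_s * mod_rate_hz) % 1.0    # position within one period, 0..1
    tri = 1.0 - abs(2.0 * phase - 1.0)   # triangle: 0 -> 1 -> 0
    return -spread_percent * 1e4 * tri   # 1% equals 10,000 ppm


def worst_case_mismatch_ppm(spread_percent=0.5, clock_tolerance_ppm=300):
    """Worst-case offset between two unsynchronized spread clocks.

    One clock sits at +tolerance with no spread applied at that instant;
    the other sits at -tolerance at the bottom of its down-spread.
    """
    return 2 * clock_tolerance_ppm + spread_percent * 1e4


if __name__ == "__main__":
    half_period_s = 0.5 / 31.5e3
    print(round(ssc_offset_ppm(half_period_s)), "ppm at mid-period")
    print(worst_case_mismatch_ppm(), "ppm worst case vs. a 600-ppm budget")
```

Under these assumptions, two independently spread clocks can drift as far as 5600 ppm apart, roughly nine times the allowed difference, which is why SSC calls for a shared or synchronized source.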

Choosing The Optimum PCI Express Clock Source

As we’ve seen, several factors influence the choice of the best PCI Express clock source. Ideally, the clock should deliver jitter performance sufficient for all PCI Express data rates, support SSC, and provide adjustable output-to-output skew.

Off-the-shelf PCI Express clock devices with 100-MHz HCSL outputs can handle most PCI Express applications. Single-clock oscillators and buffers are adequate for simple PCI Express clocking applications, but frequency- and format-flexible clock generators are increasingly used in complex timing applications that require reference clocks of different frequencies and formats (LVDS, LVPECL, etc.).

Figure 4 shows two PCI Express clocking solutions using typical clock ICs. The first example uses Silicon Labs’s Si52144 clock generator as a single-chip PCI Express clock tree solution. This generator is ideal for off-the-shelf PCI Express devices that require standard 100-MHz HCSL clocks. The clock generator also provides pin-controlled spread-spectrum operation to engage or disengage SSC during electromagnetic interference (EMI) compliance testing.

4. Here are two PCI Express clock generation solutions using off-the-shelf Silicon Laboratories clock ICs: a pre-configured fixed frequency solution using the Si52144 (a); and a flexible clock solution using the Web-configured Si5335 (b).

The second example uses Silicon Labs’s Web-customizable Si5335 clock generator. It offers the increased flexibility often required with FPGA or custom ASIC and SoC designs. In addition to generating clocks other than the standard 100-MHz HCSL format, it provides output-to-output skew adjustments to compensate for large differences in PCB trace lengths. It also supports spread spectrum on a per-output basis, allowing developers to target EMI reduction in areas of the board that need it the most.

Providing these features in a single clock generator device simplifies system design and reduces the component count and bill-of-materials cost. This flexibility also lets developers make changes when prototyping, increasing the likelihood of a successful first-pass design. Visit www.silabs.com/products/clocksoscillators/clock-generators-and-buffers/Pages/ClockBuilder.aspx for an online video that shows how Web customization works.

Summary

To ensure stability and rapid data transfer, systems using the PCI Express interface require careful attention to system architecture and timing. Designers must decide which of the PCI Express reference clock architectures—common, separate, or data clocked—meets their application’s functional and performance goals.

Due to the diversity of ICs such as FPGAs, processors, and switches that integrate PCI Express cores, developers may want to use timing solutions that support multiple I/O voltages and formats, as well as SSC technology.