The Ryzen family now has a full complement of chips to meet different processing needs, with AMD adding members ranging from the Ryzen 3 up to the high-end Threadripper. The top-end, 3.4-GHz, 16-core 1950X is now available for only $999. It supports a 4-GHz boost mode, 64 PCI Express lanes, four-channel DDR4 memory, and unlocked overclocking. The latter extends throughout the Ryzen family. At the other end of the spectrum is the $109, quad-core, 3.1-GHz Ryzen 3 1200.
All processors in the family are built around the Zen core (Fig. 1). Each core runs a pair of threads; as a result, the top-end Threadripper runs 32 threads simultaneously. In general, the Ryzen family lines up against Intel’s Core i3 through i9 family.
1. AMD’s Ryzen family is built on the Zen core. The Threadripper delivers 16 cores that run 32 threads.
Sixteen cores may seem like a lot, but even larger numbers can be found on AMD and Intel server chips. Intel’s latest Xeon Platinum can host up to 28 cores running up to 56 threads. Server chips typically support multiple-chip configurations. The Xeon Platinum can be part of an eight-chip system that supports up to 448 threads.
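The thread counts quoted above follow from a simple product of cores per chip, threads per core, and chips per system. The short sketch below just reproduces that arithmetic; the function and variable names are illustrative, not part of any vendor tool.

```python
def hardware_threads(cores_per_chip: int, threads_per_core: int = 2, chips: int = 1) -> int:
    """Total simultaneous hardware threads in a system.

    threads_per_core defaults to 2 because the article describes
    two threads per core for the chips it discusses.
    """
    return cores_per_chip * threads_per_core * chips


# Figures taken from the article text
threadripper_threads = hardware_threads(16)          # 16-core Threadripper
xeon_single_threads = hardware_threads(28)           # one 28-core Xeon
xeon_eight_chip_threads = hardware_threads(28, chips=8)

print(threadripper_threads)     # 32
print(xeon_single_threads)      # 56
print(xeon_eight_chip_threads)  # 448
```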
The Ryzen chips are built around a four-core CPU complex (Fig. 2). The four cores share a 16-way, set-associative L3 cache that’s mostly exclusive of the L2 caches. The L3 cache is built around four slices interleaved by the low-order portion of the address.
2. The Ryzen chips are built around a four-core CPU complex.
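To illustrate what interleaving four slices by low-order address bits means in practice, here is a minimal sketch. The article does not say which bits select the slice, so the 64-byte line size and the use of the two lowest line-address bits are assumptions made purely for illustration, and `l3_slice` is a hypothetical helper.

```python
LINE_BYTES = 64   # assumption: 64-byte cache lines (not stated in the article)
NUM_SLICES = 4    # from the article: the L3 is built from four slices


def l3_slice(address: int) -> int:
    """Pick a slice from the low-order bits of the cache-line address.

    Hypothetical mapping: drop the byte-within-line bits, then use the
    next two bits as the slice number.
    """
    line_index = address // LINE_BYTES
    return line_index % NUM_SLICES


# Consecutive cache lines rotate through the four slices,
# which spreads sequential traffic evenly across them.
slices = [l3_slice(0x1000 + i * LINE_BYTES) for i in range(8)]
print(slices)  # [0, 1, 2, 3, 0, 1, 2, 3]
```

The point of this style of interleaving is that a streaming access pattern touches every slice in turn instead of hammering one of them.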
The Zen architecture provides a number of improvements, including better branch prediction that uses neural-network techniques along with two branches in each branch table entry. There is a larger opcode cache, and the micro-op dispatch width is 6 (up from 4), making it 50% wider than in previous designs. The integer scheduler now holds 84 entries and the floating-point scheduler holds 96. The floating-point unit can handle four instructions at a time, and the Load/Store support handles up to 72 out-of-order loads (Fig. 3).
3. The Load/Store and L2 cache interface can handle up to 72 out-of-order loads. There is a 32-kB, 8-way L1 data cache that supports two 128-bit accesses.
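The L1 data cache figures in the caption can be turned into a set count and a peak access width with a little arithmetic. The sketch below assumes a 64-byte cache line, which the article does not state, and the helper name `cache_sets` is illustrative.

```python
def cache_sets(size_bytes: int, ways: int, line_bytes: int = 64) -> int:
    """Number of sets in a set-associative cache.

    line_bytes=64 is an assumption; the article gives only size and ways.
    """
    return size_bytes // (ways * line_bytes)


# 32-KB, 8-way L1 data cache from the caption
l1d_sets = cache_sets(32 * 1024, ways=8)

# Two 128-bit accesses, expressed in bytes
l1d_access_bytes = 2 * 128 // 8

print(l1d_sets)          # 64
print(l1d_access_bytes)  # 32
```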
Significant work was done on the caching system, which now includes faster L2 and L3 caches (Fig. 4). The L1 cache now operates in write-back mode, and L1 and L2 data prefetch performance has been enhanced. Total bandwidth of the L3 cache has improved by a factor of five.
Aggressive clock gating with multi-level regions is designed to reduce power requirements even with increases in overall performance. Some features that help to reduce power include the L1 write-back cache support, larger opcode cache, an improved stack engine and move elimination support.
4. Each core has its own L1 and L2 cache. All cores share a global L3 cache.
The primary target for AMD’s Ryzen and Intel’s Core families is the desktop and laptop arena, but quite a few of these chips find their way into embedded applications. Ryzen is a CPU only, while Intel’s Core and AMD’s APU families pair a GPU with the CPU. This has little effect at the high end, where the CPU tends to be paired with one or more high-end, external GPU chips. At the low end, integration is often the norm for embedded applications because of power and space considerations. AMD’s APUs tend to work better for these applications, but the current versions don’t use the newer Zen core.
The size of the Threadripper socket will impact the size of motherboards capable of supporting the chip, such as the ASUS Zenith Extreme (Fig. 5). The 16-core platform uses a new TR4 socket, a land-grid-array design with 4094 contacts. TR4 targets high-end PCs, while AMD EPYC servers use the physically similar SP3 socket.
5. The ASUS Zenith Extreme can handle the massive, 16-core Ryzen Threadripper chip.
AMD EPYC 7000 series processors target the server and enterprise space, where they compete with Intel’s Xeons. The EPYC 7000 supports up to 32 Zen-based cores running 64 threads, 128 lanes of PCI Express Gen 3, and eight memory channels that handle up to 2 TB of memory. Could a 32-core Ryzen be in the future?
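For a rough sense of scale, the lane and memory figures above can be converted into aggregate numbers. The sketch below uses the PCI Express Gen 3 signaling rate of 8 GT/s per lane with 128b/130b encoding and ignores packet and protocol overhead, so the results are upper bounds and not measured throughput. The function names are illustrative.

```python
PCIE3_GT_PER_S = 8.0        # PCIe Gen 3 raw signaling rate per lane, in GT/s
PCIE3_ENCODING = 128 / 130  # 128b/130b line encoding efficiency


def pcie3_bandwidth_gb_s(lanes: int) -> float:
    """Approximate one-direction bandwidth in GB/s, ignoring protocol overhead."""
    return lanes * PCIE3_GT_PER_S * PCIE3_ENCODING / 8  # 8 bits per byte


def memory_per_channel_gb(total_tb: float, channels: int) -> float:
    """Maximum memory per channel in GB, taking 1 TB as 1024 GB."""
    return total_tb * 1024 / channels


epyc_pcie = pcie3_bandwidth_gb_s(128)   # roughly 126 GB/s per direction
epyc_mem = memory_per_channel_gb(2, 8)  # 256 GB per channel

print(round(epyc_pcie, 1))
print(epyc_mem)
```

The same helper applied to Threadripper’s 64 lanes gives about half the EPYC figure.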
In general, Threadripper will wind up in high-end gaming PCs. Embedded systems that require a large number of cores will move to the EPYC family, which also sports larger caches and other features that enhance server-level support.
The rest of the Ryzen family utilizes the more compact AM4 socket. It has only 1331 pins (it’s also known as PGA 1331), and it supports AMD’s newer APUs as well as the Ryzen chips. The socket can handle dual-channel DDR4 interfaces. At this point, eight-core Ryzen 7 chips will be at the top end for this form factor. That’s still 16 threads, which should suffice for many high-end embedded applications. There are 65- and 95-W versions of the Ryzen 7. Ryzen 3 chips have a 65-W TDP.
High-end processors haven’t been making significant performance strides in single-core operation. Most of the benefits are coming from adding more cores and providing more power-efficient cores. Ryzen delivers more cores as well as providing a more efficient platform. The wide range of family options allows embedded developers to choose a chip that closely matches their design requirements.