This article is part of our 2023 Electronic Design Technology Forecast issue.
What you’ll learn:
- Why CXL is needed to handle growing amounts of data.
- Why memory disaggregation will help data centers scale.
- How focusing on performance can help the computing industry unlock its full potential.
Today’s computing demands are more complex and diverse than ever before. The explosion of applications in high-performance computing (HPC), artificial intelligence (AI), machine learning (ML), and data analytics is producing staggering amounts of data.
However, the memory these applications depend on to function is reaching an innovation impasse. Although the number of processor cores packed into data centers is rising, modern system designs have reached their next weakest link: memory capacity and memory bandwidth.
The solution is to move beyond the limitations of physical memory and prioritize the optimization of memory instead of focusing on the processor itself. Compute Express Link (CXL) has become increasingly necessary to solve the issue, as it facilitates on-demand access to memory across servers.
Not only is the CXL approach viable to solve our most pressing needs in the near future, but it’s also poised to unlock potential that will permanently change computing in the long term.
Why We Need CXL
To scale and handle growing amounts of information, data-center operators are adding more physical memory to their servers, creating what is at best a patchwork of resources.
Microsoft’s Azure is a prime example of a large, experienced organization contending with underutilization of costly physical memory. With up to 25% of memory stranded, left inaccessible once all of a server’s CPU cores are rented out, Azure pays a taxing price, considering that roughly 50% of server expenses are dedicated to DRAM.
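A back-of-the-envelope calculation shows why stranding hurts. The sketch below uses the percentages cited above; the total server cost is a hypothetical round number for illustration, not an Azure figure:

```python
# Illustrative estimate of the money wasted on stranded memory,
# using the fractions cited in the text (assumed representative).
server_cost = 100_000          # hypothetical total server cost, in dollars
dram_share = 0.50              # ~50% of server expense goes to DRAM
stranded_fraction = 0.25       # ~25% of memory capacity ends up stranded

dram_cost = server_cost * dram_share
stranded_cost = dram_cost * stranded_fraction
print(f"DRAM cost: ${dram_cost:,.0f}, stranded: ${stranded_cost:,.0f}")
```

Under these assumptions, about 12.5% of total server spending buys memory that sits idle, which is exactly the waste that pooling memory across servers aims to recover.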
To embrace scaling, data centers not only need to invest in promising new technologies, but also maximize the performance of the hardware they already have on hand. Memory must be moved outside the server if data centers are to scale effectively. Yet, current options that include block storage and cloud services aren’t viable solutions by themselves.
CXL helps ease pressure on DRAM while increasing computing efficiency and performance. The technology is unlocking a new age of disaggregated memory, mirroring the revolution that came with disaggregated storage.
Focusing on Performance to Unlock Potential
Workloads at the heart of everything from HPC to AI have significant memory requirements. But designers struggle to make use of the additional cores available in modern CPUs.
The leap forward in the number of CPU cores is mismatched with stagnant memory bandwidth. And the gap continues to widen: there’s limited physical space to incorporate more memory, and, until now, no practical way to reach additional memory beyond the motherboard.
Concerns about latency compromises are surfacing, too. Because CXL piggybacks on the PCI Express (PCIe) physical layer, one would ordinarily expect a penalty on a critical metric: latency. In general, the farther memory sits from the CPU, the higher the latency and the poorer the performance.
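To put that penalty in perspective, here is a rough comparison using assumed, typical-order latency figures; real numbers vary considerably by platform and device:

```python
# Rough latency comparison (assumed, typical-order figures):
# local DRAM vs. CXL-attached memory over a PCIe 5.0 link.
local_dram_ns = 100            # typical local DRAM load-to-use latency (assumed)
cxl_added_ns = 70              # extra transit through the CXL/PCIe link (assumed)

cxl_ns = local_dram_ns + cxl_added_ns
print(f"CXL-attached latency ~{cxl_ns} ns, "
      f"~{cxl_ns / local_dram_ns:.1f}x local DRAM")
```

Under these assumptions, CXL-attached memory lands in the range of a cross-socket NUMA hop, nowhere near SSD-class storage, which is why it remains usable as genuine memory rather than as a storage tier.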
CXL takes a system-wide approach to close the gap between CPU power and performance, and it’s evolving to address issues from memory and bandwidth to scaling and latency. Memory must not stand alone if the computing industry hopes to handle the increasing demands for speed.
Last year, the world’s leading supercomputing conference, SC22, turned CXL’s hypothetical potential into a conversation about the here and now. At the event was the first CXL 3.0 prototype, a “smart memory node” designed by UnifabriX. It’s capable of tapping into both local and remote memory to unlock additional performance and beat the bandwidth limits of DRAM. The system’s performance was measured with the industry-standard High Performance Conjugate Gradients (HPCG) benchmark.
The performance boost paves the way for additional uses for CXL, such as advanced memory pooling that can future-proof the backbone necessary for rapidly evolving workloads.
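The pooling idea can be sketched as a toy model: instead of each server over-provisioning its own DRAM, hosts borrow capacity from a shared pool and return it when done. Everything below (class name, capacities, host names) is illustrative, not any vendor’s API:

```python
# Toy model of CXL-style memory pooling: hosts borrow capacity from a
# shared pool on demand instead of stranding local DRAM. Illustrative only.
class MemoryPool:
    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.allocations = {}          # host -> GB currently borrowed

    def free_gb(self):
        return self.capacity_gb - sum(self.allocations.values())

    def allocate(self, host, gb):
        if gb > self.free_gb():
            raise MemoryError("pool exhausted")
        self.allocations[host] = self.allocations.get(host, 0) + gb

    def release(self, host, gb):
        self.allocations[host] -= gb

pool = MemoryPool(capacity_gb=1024)
pool.allocate("host-a", 256)   # host-a bursts beyond its local DRAM
pool.allocate("host-b", 128)
print(pool.free_gb())          # remaining capacity any host can claim
```

The design point the model captures: capacity no one is using stays claimable by any host, which is precisely the stranded memory that per-server DRAM cannot share.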
Closing the computing-memory gap is CXL’s primary goal, as it significantly impacts performance. CXL improves the total cost of ownership (TCO) of a server. Furthermore, it increases data-center utilization by freeing up CPU cores for high-value tasks and by expanding memory bandwidth beyond what native memory channels can provide, letting systems smoothly handle growing workloads.
Adopting CXL is the only sure way for the computing industry to shoulder growing demands and improve performance.
Read more articles in our 2023 Electronic Design Technology Forecast issue.