HBM3e-Equipped Grace Hopper Superchip Targets Generative AI
This article is part of the TechXchange: Generating AI.
According to NVIDIA, its new GH200 Grace Hopper Superchip platform speeds up accelerated computing and generative AI applications by incorporating the world’s first HBM3e processor. The added high-bandwidth memory lets the Superchip run larger AI models.
Optimized for AI inference, the GH200 can run complex generative AI workloads, vector databases, and recommender systems. Systems built on the platform are expected in the second quarter of next year.
In its dual configuration, the Grace Hopper platform comprises a single server with 282 GB of HBM3e memory, 144 Arm Neoverse cores, and 8 petaFLOPS (PFLOPS) of AI performance.
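For a rough sense of what that memory capacity means, consider a back-of-envelope sketch (not an NVIDIA figure): at 16-bit precision, each model parameter occupies two bytes, so 282 GB can hold the weights of a model on the order of 100 billion-plus parameters. The overhead factor below is an illustrative assumption, not a specification:

```cpp
#include <cstdio>

// Back-of-envelope estimate: how many FP16 parameters fit in the
// dual-configuration GH200's 282 GB of HBM3e. Only the 282-GB capacity
// comes from NVIDIA; the bytes-per-parameter and overhead figures are
// assumptions for illustration.
int main() {
    const double hbm3e_bytes = 282e9;   // 282 GB of HBM3e (dual config)
    const double bytes_per_param = 2.0; // FP16/BF16 weight storage
    const double overhead = 0.8;        // assume ~20% reserved for
                                        // activations, KV cache, runtime
    double params = hbm3e_bytes * overhead / bytes_per_param;
    printf("~%.0f billion parameters in weights alone\n", params / 1e9);
    return 0;
}
```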
“To meet surging demand for generative AI, data centers require accelerated computing platforms with specialized needs,” said Jensen Huang, founder and CEO of NVIDIA. “The new GH200 Grace Hopper Superchip platform delivers this with exceptional memory technology and bandwidth to improve throughput, the ability to connect GPUs to aggregate performance without compromise, and a server design that can be easily deployed across the entire data center.”
The Grace Hopper Superchip can connect to additional Superchips via NVIDIA NVLink, letting them work together to run the massive models behind generative AI. The high-speed, coherent interconnect also gives the GPU direct access to the CPU’s memory, expanding the platform’s fast memory to 1.2 TB in the dual configuration.
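A minimal CUDA sketch of what that coherent CPU-GPU connection enables, assuming a Grace Hopper-class system whose shared address space lets kernels dereference ordinary host allocations (on a conventional PCIe-attached GPU, you would reach for cudaMallocManaged instead):

```cpp
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Sketch only: the kernel reads and writes memory from a plain host
// malloc(), with no cudaMalloc and no explicit copies. This relies on
// the platform's hardware coherence/shared address space, which is the
// capability the coherent CPU-GPU link provides on Grace Hopper.
__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;  // touches CPU-attached memory
}

int main() {
    const int n = 1 << 20;
    // Ordinary host allocation -- the GPU sees the same pointer.
    float* data = static_cast<float*>(malloc(n * sizeof(float)));
    for (int i = 0; i < n; ++i) data[i] = 1.0f;

    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f);
    cudaDeviceSynchronize();

    printf("data[0] = %.1f (expect 2.0)\n", data[0]);
    free(data);
    return 0;
}
```

In practice, this is how a model too large for the GPU’s HBM can spill into the CPU’s memory while remaining addressable from GPU code.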
The HBM3e memory delivers 10 TB/s of combined bandwidth, giving the GH200 platform 3X more memory bandwidth than the current generation and letting it run models 3.5X larger. NVIDIA also said the new HBM3e memory technology is 50% faster than current HBM3.
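Bandwidth is the headline number for inference because generating each token of a large language model requires streaming roughly all of the model’s weights through the GPU once. A hedged sketch of the resulting throughput ceiling, where the 140-billion-parameter FP16 model is a hypothetical example, not an NVIDIA benchmark:

```cpp
#include <cstdio>

// Why memory bandwidth bounds inference throughput: at batch size 1,
// tokens/sec can be no better than bandwidth / bytes-of-weights-streamed.
// The 10-TB/s combined figure is NVIDIA's; the model size is assumed.
int main() {
    const double combined_bw = 10e12;      // 10 TB/s, dual configuration
    const double model_bytes = 140e9 * 2;  // hypothetical 140B-param FP16 model
    double tokens_per_sec = combined_bw / model_bytes;
    printf("bandwidth-bound ceiling: ~%.0f tokens/s\n", tokens_per_sec);
    return 0;
}
```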
The HBM3e-equipped Grace Hopper Superchip platform is compatible with NVIDIA’s MGX server specification, enabling system manufacturers to add Grace Hopper to more than 100 server variations.
Read more articles in the TechXchange: Generating AI.