A new flagship family of graphics chips developed by NVIDIA is powered by its next-generation Ada Lovelace architecture, which leverages artificial intelligence (AI) to create more realistic images in games.
At the company’s GTC 2022 conference, NVIDIA CEO Jensen Huang said the Lovelace architecture underpins its latest GeForce RTX 40 GPUs. The architecture is named after the 19th-century mathematician regarded as an early pioneer in computer science.
The top-of-the-line GPU in the gaming family, the RTX 4090, delivers 2X the performance of the previous generation based on the circa-2020 Ampere GPU architecture and represents a major step up in power efficiency, NVIDIA said.
The Lovelace GPU is packed with 76.3 billion transistors, around 2.7X as many as in its Ampere predecessor and close to the count of its Hopper GPU for data centers, along with more than 16,000 CUDA cores.
These features make it one of the most advanced graphics chips on the market at a time when NVIDIA is feeling increasing pressure from AMD (with its impending RDNA 3 GPU architecture) and Intel (with its high-performance Arc GPUs).
Lovelace Architecture
The chips will be manufactured by TSMC on a custom "4N" technology node. That represents a major step up from NVIDIA's last generation of graphics chips for gaming, built by Samsung Electronics on the 8-nm node.
The company said the use of a newer process technology, plus improvements in the underlying architecture, gives Lovelace-based graphics processors double the power efficiency of the previous Ampere-based generation.
The Lovelace architecture stands out from NVIDIA’s Hopper architecture announced at the start of the year. While Hopper will power the H100 GPU for high-performance computing and AI workloads, the Lovelace architecture is ideal for general-purpose, graphics-heavy workloads—everything from creating physically accurate lighting and objects in games to building digital twins with NVIDIA’s Omniverse software platform.
Digital twins are large-scale simulations—for instance, of factory floors or cars—that give you a way to test and validate designs or processes in the safety of a virtual world before rolling them out into the real world.
NVIDIA said the RTX 40 series of GPUs introduces a host of innovations across the board. For instance, a new generation of streaming multiprocessors is 3X as fast as the previous generation, according to the company. The units can supply up to 90 TFLOPS of performance to shaders, which work out the correct levels of light, darkness, and color during the rendering of a scene and are used in every modern game.
One of the highlights of the Lovelace architecture is what NVIDIA terms “shader execution reordering.” This increases execution efficiency by rescheduling shading workloads on the fly, in a way that apparently resembles out-of-order execution in a central processing unit (CPU). NVIDIA said the Lovelace architecture uses it to improve ray-tracing performance by up to 3X and frame rates by up to 25%.
The chips also contain a new generation of ray-tracing (RT) cores that provide up to 200 TFLOPS to create more accurate reproductions of light, rendering more realistic shadows and reflections in a scene in real time.
Graphics chips based on the Lovelace architecture also feature new video encoders with support for the AV1 codec.
AI-Powered Graphics
The Lovelace architecture also brings NVIDIA’s fourth-generation tensor cores into the fold. These units are purpose-built to carry out the "matrix multiply and accumulate" operations at the core of machine learning.
NVIDIA said the next-generation tensor cores are up to 5X faster than the previous generation, supplying up to 1,400 TFLOPS, or 1.4 quadrillion operations per second, for the company’s FP8 format for AI workloads.
The new-and-improved inference processing cores belong to the same generation as those used in the Hopper GPU. As a result, they are equipped with the same “transformer engine” as the Hopper GPUs.
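As a concrete, heavily simplified illustration of what a matrix multiply-and-accumulate operation looks like to a programmer, the CUDA sketch below uses the long-standing WMMA API, in which a single warp multiplies a pair of 16 × 16 FP16 tiles and accumulates the result in FP32. It is not NVIDIA’s DLSS code, the kernel and its name are illustrative, and it does not use the new FP8 path, which is exposed through higher-level libraries rather than this interface; the tile sizes and precisions are simply the classic WMMA defaults.

```cuda
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

// One warp computes a 16x16 tile of D = A * B + C on the tensor cores.
// A and B hold FP16 values, the accumulator is FP32, and every tile uses a
// leading dimension of 16.
__global__ void mma_16x16x16(const half *A, const half *B,
                             const float *C, float *D)
{
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc_frag;

    wmma::load_matrix_sync(a_frag, A, 16);                          // load A tile
    wmma::load_matrix_sync(b_frag, B, 16);                          // load B tile
    wmma::load_matrix_sync(acc_frag, C, 16, wmma::mem_row_major);   // load C tile

    // One warp-wide multiply-and-accumulate on the tensor cores.
    wmma::mma_sync(acc_frag, a_frag, b_frag, acc_frag);

    wmma::store_matrix_sync(D, acc_frag, 16, wmma::mem_row_major);  // write D
}
```

Launched with a single warp (for instance, mma_16x16x16<<<1, 32>>>(A, B, C, D)) and compiled for a tensor-core-capable GPU, the entire 16 × 16 × 16 product-and-add runs as warp-wide tensor-core instructions rather than thousands of scalar multiplies.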
A new hardware engine called the optical flow accelerator supplements the tensor cores. It uses machine learning to compare pairs of high-resolution frames and predict the movement of objects rendered in a 3D scene. This gives Lovelace the ability to render everything in the frame, from particles and reflections to shadows and lighting, ahead of time, increasing frame rate without impacting the sharpness of the image.
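To make the role of those motion vectors concrete, here is a deliberately stripped-down CUDA sketch of the underlying idea: given the previous rendered frame and a per-pixel flow field, it synthesizes an approximate in-between frame by gathering each output pixel from half a motion vector back. DLSS 3 itself feeds the hardware flow field, along with game-engine data, into a neural network rather than doing a plain warp, so the kernel, its buffer layout, and its half-step assumption are illustrative only.

```cuda
#include <cuda_runtime.h>

// Simplified frame synthesis by motion-vector warping (illustrative only).
// prev: previous rendered frame (RGBA8)
// flow: per-pixel motion, in pixels, between the previous and current frames
// out:  the synthesized in-between frame
__global__ void warp_half_step(const uchar4 *prev, const float2 *flow,
                               uchar4 *out, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    int idx = y * width + x;

    // Assume the content now at (x, y) has covered half of its flow vector by
    // the midpoint in time, so gather its color from half a motion vector back.
    // Sampling the flow at the output pixel and skipping filtering and occlusion
    // handling keep the sketch short, but both are approximations.
    int sx = min(max(__float2int_rn(x - 0.5f * flow[idx].x), 0), width  - 1);
    int sy = min(max(__float2int_rn(y - 0.5f * flow[idx].y), 0), height - 1);

    out[idx] = prev[sy * width + sx];
}
```

Even this naive warp shows why a dedicated hardware unit matters: the hard part is producing an accurate motion vector for every pixel, which is exactly what the optical flow accelerator delivers.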
The new tensor cores and hardware accelerators are what make one of the most advanced graphics technologies in Lovelace GPUs possible: a third generation of NVIDIA’s Deep Learning Super Sampling (DLSS).
Rendering every pixel in vast virtual worlds or in games with accurate physics, vivid lighting, and realistic materials requires a massive amount of computing power. But instead of attempting to render everything in a scene, the technology leaves out a portion of the pixels. Then, it uses machine learning to create new pixels that fill in the blanks, resulting in sharp, high-resolution graphics at frame rates that would be out of reach if the GPU rendered every pixel natively.
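For a rough sense of the savings involved (the figures here are illustrative, not NVIDIA’s): a native 4K frame is 3840 × 2160, or about 8.3 million pixels, whereas an internal render at 2560 × 1440 is about 3.7 million pixels, so the shaders compute only around 44% of the pixels that reach the screen and the neural network reconstructs the rest.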
Instead of only creating new pixels, the DLSS 3 technology uses AI to generate entirely new frames, increasing frame rates by up to 4X compared with rendering without DLSS. The technology can boost performance even when a game is bottlenecked by the CPU.
All in the Family
NVIDIA said the flagship RTX 4090 is one of the most advanced GPUs on the market, equipped with 16,384 CUDA cores, up from 10,752 in its predecessor, while boosting the base clock frequency by more than 30%.
All of the new hardware features, coupled with a host of improvements in the Ada Lovelace architecture itself, mean the processor can display 4K-resolution gameplay at more than 100 frames/s.
The RTX 4090, accompanied by 24 GB of high-speed GDDR6X memory from Micron Technology, has the same 450-W power envelope as its predecessor. The chips use PCIe Gen 4 lanes for connectivity.
NVIDIA said the RTX 4090 brings up to 4X the performance of its current high-end graphics chip, the RTX 3090. It also delivers up to double the speed of its predecessor at the same level of power consumption.
The semiconductor giant also introduced a new mid-range graphics processor for the gaming family, called the RTX 4080. The new chip comes in two memory configurations: 12 or 16 GB of GDDR6X.
While neither of these configurations is as advanced as the RTX 4090, NVIDIA said the Lovelace-based GPUs can display higher-quality graphics with more realistic lighting faster than even the current RTX 3090.
The high-end RTX 4090 will cost $1,599 when it comes to market next month, while the mid-range RTX 4080 will sell for $899 (for the 12-GB GDDR6X configuration) and $1,199 (with 16 GB of GDDR6X).
Check out more coverage of GTC Fall 2022.