What you’ll learn:
- Why ray tracing is important on mobile.
- How Imagination’s CXT family delivers level 4 ray-tracing capabilities.
Imagination Technologies is well-known for pushing the limits of GPUs, and its new CXT family continues this trend. One of the key features of this GPU IP is level 4 ray-tracing hardware support. To date, ray tracing has been a feature found mainly on high-end systems that target gaming and video editing.
Ray tracing replicates real lighting and shading based on the reflection and refraction of light (Fig. 1). This process is data- and compute-intensive but delivers photorealistic images. It also provides visual cues we normally notice unconsciously that aren’t replicated in the bitmap imaging typically used to reduce processing and storage overhead.
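The reflection and refraction at the heart of ray tracing come down to a few lines of vector math. The following is a minimal sketch, not taken from Imagination's material: the function names `reflect` and `refract` are illustrative, directions and normals are assumed to be unit-length 3-tuples, and the normal is assumed to point against the incoming ray.

```python
import math

def reflect(d, n):
    """Mirror direction d about unit normal n: r = d - 2 (d . n) n."""
    k = 2.0 * sum(di * ni for di, ni in zip(d, n))
    return tuple(di - k * ni for di, ni in zip(d, n))

def refract(d, n, eta):
    """Bend unit direction d through a surface with unit normal n (Snell's law).

    eta is the ratio n1 / n2 of the refractive indices on the incoming and
    outgoing sides. Returns None on total internal reflection.
    """
    cos_i = -sum(di * ni for di, ni in zip(d, n))
    sin2_t = eta * eta * (1.0 - cos_i * cos_i)
    if sin2_t > 1.0:
        return None  # no transmitted ray exists
    cos_t = math.sqrt(1.0 - sin2_t)
    return tuple(eta * di + (eta * cos_i - cos_t) * ni for di, ni in zip(d, n))
```

Each bounce or transmission spawns a new ray that must itself be tested against the scene, which is why the workload grows so quickly.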
There are five levels associated with ray-tracing hardware:
- Level 1: Software using traditional GPUs
- Level 2: Ray-box and ray-triangle testers
- Level 3: Bounding-volume-hierarchy (BVH) processing in hardware
- Level 4: BVH support plus coherency sorting in hardware
- Level 5: Coherent BVH processing with BVH hardware builder
The more work that can be moved into hardware under this ray-tracing level system (RTLS), the better, because doing so improves performance and reduces power requirements. The downside is the need for more transistors to make it work. NVIDIA’s RTX 8000 with ray-tracing support is almost three years old, but that and subsequent hardware have been sitting on PCI Express cards for PCs and servers. Attaining this type of performance on mobile hardware is a bit more challenging.
The basis for Imagination’s ray-tracing support is the Photon Ray Acceleration Cluster (RAC) (Fig. 2). This starts with a box tester unit (BTU), a dual triangle tester unit (DTTU), and a procedural tester unit (PTU) that search for rays intersecting with an object in 3D space. This hardware tests rays against axis-aligned boxes from the 3D scene hierarchy. It effectively implements level 2 support.
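To make the box and triangle tests concrete, here is a software sketch of the two checks. It uses the textbook slab method for axis-aligned boxes and the Möller-Trumbore algorithm for triangles. These are standard techniques chosen for illustration; the article doesn't describe the algorithms inside the BTU or DTTU, and the function names are hypothetical.

```python
import math

def ray_box_hit(origin, direction, box_min, box_max, t_max=math.inf):
    """Slab test: does the ray hit the axis-aligned box within [0, t_max]?"""
    t_near, t_far = 0.0, t_max
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this pair of planes: it must start between them.
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False  # the per-axis intervals no longer overlap
    return True

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_triangle_hit(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore test: return the hit distance t, or None on a miss."""
    e1, e2 = _sub(v1, v0), _sub(v2, v0)
    p = _cross(direction, e2)
    det = _dot(e1, p)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle plane
    inv = 1.0 / det
    s = _sub(origin, v0)
    u = _dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = _cross(s, e1)
    v = _dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = _dot(e2, q) * inv
    return t if t > eps else None
```

Box tests are cheap and reject most of the scene early, while the more expensive triangle test runs only on the geometry inside the boxes that a ray actually hits.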
The addition of the box primitive scheduler (BPS), ray reference courier (RRC), ray store (RS), and ray task scheduler (RTS) brings the hardware up to level 3 RTLS. These dedicated blocks keep the ray data on-chip and handle ray traversal, tracking, and monitoring.
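Level 3 means the walk through the BVH itself runs in dedicated hardware rather than in shader code. The loop being offloaded looks roughly like the stack-based traversal below. This is a generic software sketch under simple assumptions (a hypothetical `Node` class, leaves holding primitive IDs), not a description of how the BPS, RRC, RS, or RTS work internally.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    box_min: tuple
    box_max: tuple
    children: list = field(default_factory=list)  # inner node: child Nodes
    prims: list = field(default_factory=list)     # leaf node: primitive IDs

def hits_box(origin, direction, lo_corner, hi_corner):
    """Slab test of a ray against an axis-aligned box."""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, lo_corner, hi_corner):
        if d == 0.0:
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True

def traverse(root, origin, direction):
    """Return (candidate primitive IDs, number of box tests) for one ray."""
    stack, candidates, box_tests = [root], [], 0
    while stack:
        node = stack.pop()
        box_tests += 1
        if not hits_box(origin, direction, node.box_min, node.box_max):
            continue  # prune this whole subtree
        if node.children:
            stack.extend(node.children)
        else:
            candidates.extend(node.prims)  # these go on to triangle tests
    return candidates, box_tests
```

In software, the per-ray stack and node fetches compete with shading work for registers and memory bandwidth. Keeping that state in dedicated on-chip storage is the point of the ray store and schedulers.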
The packet coherency gather (PCG) block brings the RAC up to level 4 RTLS. It analyzes all active rays and creates groups of coherent rays. These can be tested together against the 3D scene.
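The idea behind coherency gathering can be illustrated in software: rays that start near each other and head in similar directions tend to visit the same BVH nodes, so batching them lets one node fetch serve many rays. The sketch below buckets rays by direction octant and a coarse origin grid cell. That heuristic, the `cell` and `packet_size` parameters, and the function names are assumptions made for illustration only; the article doesn't say how the PCG decides which rays are coherent.

```python
from collections import defaultdict

def coherence_key(origin, direction, cell=4.0):
    """Bucket key: sign of each direction component plus a coarse origin cell."""
    octant = tuple(d >= 0.0 for d in direction)
    cell_id = tuple(int(o // cell) for o in origin)
    return octant, cell_id

def gather_packets(rays, packet_size=4, cell=4.0):
    """Group ray indices into packets of rays likely to traverse similar nodes.

    rays is a list of (origin, direction) tuples. Returns a list of lists of
    ray indices, each list at most packet_size long.
    """
    buckets = defaultdict(list)
    for ray_id, (origin, direction) in enumerate(rays):
        buckets[coherence_key(origin, direction, cell)].append(ray_id)
    packets = []
    for ids in buckets.values():
        for i in range(0, len(ids), packet_size):
            packets.append(ids[i:i + packet_size])
    return packets
```

Primary rays from a camera are naturally coherent, but rays produced by reflections off rough surfaces scatter widely, which is where regrouping pays off.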
A single RAC can handle 433 Mrays/s and 16 Gbox-tests/s. The hardware supports the Vulkan ray-tracing ray-query and ray-pipeline interfaces. It works on all ray-traced content and enables advanced ray-tracing effects on a mobile power budget.
The CXT RT3 combines three Photon RAC blocks with other GPU blocks, including Texture Processing Units (TPUs) and Unified Shading Clusters (USCs). Multiple CXT RT3s can be combined as well (Fig. 3). This just goes to show how many different tasks use GPUs. The CXT RT3 can deliver 30 to 60 frames/s of ray-traced content at 1080p.
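A quick back-of-the-envelope calculation puts these figures in perspective. The sketch below assumes that the three RACs scale linearly from the 433-Mrays/s per-RAC figure and that the peak rate is sustained, neither of which the article states, so treat the result as an upper bound.

```python
RAC_MRAYS_PER_S = 433        # per-RAC peak rate quoted in the article
RACS = 3                     # the CXT RT3 has three RACs
WIDTH, HEIGHT = 1920, 1080   # 1080p

def rays_per_pixel(fps, racs=RACS):
    """Peak ray budget per pixel per frame, assuming linear RAC scaling."""
    rays_per_s = racs * RAC_MRAYS_PER_S * 1e6
    return rays_per_s / (WIDTH * HEIGHT * fps)

for fps in (30, 60):
    print(f"{fps} frames/s: about {rays_per_pixel(fps):.1f} rays per pixel per frame")
# 30 frames/s: about 20.9 rays per pixel per frame
# 60 frames/s: about 10.4 rays per pixel per frame
```

A budget of 10 to 20 rays per pixel is enough for effects such as shadows and reflections layered onto a rasterized frame, rather than for fully path-traced rendering.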
Expect mobile devices like smartphones to make even better use of their hi-res displays. This moves mobile gaming and user interfaces to a new level.
Imagination provides developers with tools like PVRTune, which can examine low-level ray-tracing counters built into the hardware.