GPGPUs Put Machine Learning on the Fast Track

Sept. 17, 2018

NVIDIA’s AGX line of AI-augmented GPGPUs targets machine-learning applications from self-driving cars to medical instruments.

William G. Wong

NVIDIA’s support for machine learning using GPGPUs has been extensive. Its latest Turing architecture-based GPU, the RTX 8000, combines ray-tracing support with machine-learning (ML) acceleration. The Turing architecture includes Tensor Cores to accelerate ML applications.

NVIDIA’s T4 Tensor Core GPU, built on the Turing architecture with its Tensor Core support, targets hyperscale deployment where high-performance inference is needed (Fig. 1). The PCI Express (PCIe) card needs only 75 W. Its small form factor allows for very dense system design.


1. NVIDIA’s T4 is designed for hyperscale deployment and includes Turing Tensor Cores to accelerate machine-learning applications.

The T4’s Tensor Cores support INT4, INT8, FP16, and FP32 data types, enabling developers to optimize performance while minimizing size and the amount of computation needed for a particular deep-neural-network (DNN) model. The system is also designed to address video applications. It can analyze 38 full-HD video streams in real time.
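To see why narrower data types shrink a model at some cost in precision, consider a minimal sketch of symmetric linear quantization. This is a generic textbook technique, not NVIDIA's TensorRT calibration procedure, and the function names and sample weights are hypothetical.

```python
def quantize_symmetric(values, bits=8):
    """Map floats to signed integers of the given width using one shared scale."""
    qmax = 2 ** (bits - 1) - 1  # 127 for INT8, 7 for INT4
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]


def max_error(values, bits):
    """Largest absolute round-trip error for the given bit width."""
    quantized, scale = quantize_symmetric(values, bits)
    restored = dequantize(quantized, scale)
    return max(abs(v - r) for v, r in zip(values, restored))


# Hypothetical weights, chosen only for illustration.
weights = [0.82, -0.41, 0.05, -1.27, 0.33]
q8, s8 = quantize_symmetric(weights, bits=8)
q4, s4 = quantize_symmetric(weights, bits=4)
print("INT8 codes:", q8, "max error:", max_error(weights, 8))
print("INT4 codes:", q4, "max error:", max_error(weights, 4))
```

Each halving of the bit width halves the storage per weight, while the round-trip error grows because fewer integer steps cover the same value range.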

Each T4 has 2560 CUDA cores, 320 Turing Tensor Cores, and 16 GB of GDDR6 with a bandwidth over 320 GB/s. The board includes a x16 PCIe interface and is rated at 260 INT4 TOPS, 130 INT8 TOPS, 65 FP16 TFLOPS, and 8.1 FP32 TFLOPS.
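The ratings follow a simple pattern: peak throughput doubles each time the data width is halved from FP16 down to INT4. The short sketch below works through that arithmetic using only the figures quoted above and the card's 75-W rating. These are peak numbers, so real throughput depends on the model and workload, and the helper names are hypothetical.

```python
# Peak ratings quoted for the T4 (INT entries in TOPS, FP entries in TFLOPS).
T4_PEAK = {"INT4": 260.0, "INT8": 130.0, "FP16": 65.0, "FP32": 8.1}
T4_BOARD_POWER_W = 75.0  # card power quoted in the article


def speedup_over(baseline, peaks=T4_PEAK):
    """Ratio of each peak rating to the chosen baseline data type."""
    base = peaks[baseline]
    return {name: rate / base for name, rate in peaks.items()}


def per_watt(peaks=T4_PEAK, watts=T4_BOARD_POWER_W):
    """Peak operations per watt, assuming the full card power is drawn."""
    return {name: rate / watts for name, rate in peaks.items()}


for name, ratio in speedup_over("FP16").items():
    print(f"{name}: {ratio:.2f}x the FP16 peak")
for name, rate in per_watt().items():
    print(f"{name}: {rate:.2f} tera-ops per watt at peak")
```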

NVIDIA’s DRIVE AGX line (Fig. 2) brings the same architecture to automotive applications that had been served by NVIDIA’s P4 series. The top-end NVIDIA DRIVE AGX PEGASUS, which combines two NVIDIA Xavier processors and two Tensor Core-based GPUs, delivers 320 TOPS. The more compact NVIDIA DRIVE AGX Xavier has a single processor but only needs 30 W. Both are available as development kits.


2. The NVIDIA AGX line brings TensorRT acceleration to automotive applications.

These systems run DRIVE Software 1.0 that targets autonomous systems. The DriveNet DNN support allows vehicles to detect and classify objects in the surrounding environment and track them from one frame to the next. With the LaneNet and OpenRoadNet support, the system can identify lane markings and detect drivable spaces.

The DRIVE IX SDK also includes support for processing input from driver-facing cameras. It can recognize a driver’s facial expression to detect whether they are drowsy or paying attention to the road.

In addition, the software comes with a data-recording tool. Consequently, developers and manufacturers can collect real-time, time-stamped data from various sensors for training, testing, and system validation.


3. Clara AGX is designed for medical instrumentation.

Another application area that NVIDIA is targeting is medical instrumentation, where ML can provide additional support. The Clara AGX (Fig. 3) system is a combination of hardware and software. The Clara SDK provides developers with a set of GPU-accelerated libraries for computing, graphics, and AI designed for medical applications such as image processing and rendering, and computational workflows for CT, MRI, and ultrasound. The tools leverage CLARA containers and Kubernetes, allowing applications to scale.

The Jetson AGX Xavier platform is designed for mobile applications such as robotics (Fig. 4). The platform’s SoC includes a 512-core Volta GPGPU with Tensor Cores along with an 8-core ARM v8.2 64-bit CPU cluster with 8 MB of L2 cache and 4 MB of L3 cache. The module packs 16 GB of 256-bit LPDDR4x with a 137-GB/s bandwidth and 32 GB of eMMC 5.1 flash memory. Non-volatile storage can be expanded using an M.2 Key M (NVMe) or M.2 Key E interface, an SD/UFS socket, an eSATAp interface, or a USB 3.0 Type A connection.
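The 137-GB/s figure is consistent with the 256-bit bus width if the memory runs at 4266 megatransfers per second. That data rate is an assumption here, since the article does not state it. A quick sketch of the arithmetic, with a hypothetical helper name:

```python
def peak_bandwidth_gb_s(bus_width_bits, transfers_per_second):
    """Peak memory bandwidth in GB/s (decimal) for a given bus width and rate."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_second / 1e9


# Assumption: LPDDR4x at 4266 MT/s; this rate is not given in the article.
ASSUMED_LPDDR4X_RATE = 4266e6

jetson_bw = peak_bandwidth_gb_s(256, ASSUMED_LPDDR4X_RATE)
print(f"Peak LPDDR4x bandwidth: {jetson_bw:.1f} GB/s")
```

Under that assumption the result rounds to the quoted 137 GB/s.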


4. Jetson AGX Xavier development kit targets robotic and compact machine-learning applications.

The system also incorporates a pair of NVDLA deep-learning accelerators and a 7-way VLIW vision processor. Hardware encode/decode support can handle two 4Kp60 streams as well as a pair of High-Efficiency Video Coding (HEVC) streams at 4Kp60. The module measures 105 by 105 mm.

The Jetson AGX Xavier can handle 16 CSI-2 camera connections. It has two x8 PCI Express Gen 4 interfaces, plus a Gigabit Ethernet interface and two USB-C interfaces. One features DisplayPort support, while the other maintains Close-System Debug and Flashing support. A 40-pin header exposes serial and GPIO. The system offers High-Definition Audio, HDMI, and DisplayPort outputs.
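For a sense of what an x8 Gen 4 link provides, the sketch below estimates usable bandwidth per direction from the PCIe Gen 4 signaling rate of 16 GT/s per lane and its 128b/130b line coding. Packet and protocol overhead are ignored, so delivered throughput will be lower, and the function name is hypothetical.

```python
PCIE_GEN4_GT_PER_S = 16.0        # raw transfers per lane, per direction
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line coding


def pcie_gen4_gb_per_s(lanes):
    """Approximate usable GB/s per direction, ignoring protocol overhead."""
    usable_gbit_per_s = PCIE_GEN4_GT_PER_S * ENCODING_EFFICIENCY * lanes
    return usable_gbit_per_s / 8  # bits to bytes


x8_link = pcie_gen4_gb_per_s(8)
print(f"x8 Gen 4 link: about {x8_link:.2f} GB/s per direction")
```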

Versions are available that use as little as 10 W, with the top end using 30 W. The developer kit is priced at $2,499.


About the Author

William G. Wong | Senior Content Director - Electronic Design and Microwaves & RF

I am Editor of Electronic Design focusing on embedded, software, and systems. As Senior Content Director, I also manage Microwaves & RF and I work with a great team of editors to provide engineers, programmers, developers and technical managers with interesting and useful articles and videos on a regular basis. Check out our free newsletters to see the latest content.

You can send press releases for new products for possible coverage on the website. I am also interested in receiving contributed articles for publishing on our website. Use our template and send to me along with a signed release form.

Check out my blog, AltEmbedded on Electronic Design, as well as my latest articles on this site that are listed below.


I earned a Bachelor of Electrical Engineering at the Georgia Institute of Technology and a Master's in Computer Science from Rutgers University. I still do a bit of programming using everything from C and C++ to Rust and Ada/SPARK. I do a bit of PHP programming for Drupal websites. I have posted a few Drupal modules.

I still get hands-on with software and electronic hardware. Some of this can be found on our Kit Close-Up video series. You can also see me on many of our TechXchange Talk videos. I am interested in a range of projects from robotics to artificial intelligence.