Image credit: Intel

Intel Leverages Habana’s AI Chips to Train Self-Driving Cars

May 12, 2022
The company has also deployed more than 8,000 Gaudi2 server chips in its data centers, in part to inform further advances in the upcoming Gaudi3 chip.

Intel rolled out a new generation of its AI server chip that brings a major jump in compute, memory, and networking capabilities, as it vies with NVIDIA's GPUs for training deep-learning models in data centers.

The company is courting every major cloud-computing giant with the new Gaudi2, the second generation of the server chip that debuted in cloud services for training AI models offered by Amazon Web Services (AWS) last year. However, it’s also using the Habana Labs-designed chips in its own data centers to push the envelope in autonomous driving and other areas.

Intel’s Mobileye unit is using Habana’s first-generation Gaudi accelerators to train the AI at the heart of its self-driving vehicles to sense and understand their surroundings. Gaby Hayon, executive vice president of R&D at Mobileye, said that as training such models is time-consuming and costly, Mobileye is using Gaudi in AWS's cloud and on-premises in its data centers to get “significant cost savings” compared to GPUs.

Hayon said the use of Habana’s Gaudi accelerator cards is giving it “better time-to-market for existing models or training much larger and complex models aimed at exploiting the advantages of the Gaudi architecture.”

Intel has also deployed more than a thousand eight-card Gaudi2 servers in its data centers to support research and development on its Gaudi2 software and to inform further advances in its next-generation Gaudi3.

AI Cost Savings

Gaudi2 is based on the same heterogeneous architecture as its predecessor. But Habana upgraded to the 7-nm node to cram more compute engines, on-chip and on-package memory, and networking into the chip.

Intel said Gaudi2 can run AI workloads faster and more efficiently than its previous chips while bringing major performance leaps over NVIDIA's A100 GPU. But the main selling point, according to the company, is in reducing the total cost of ownership (TCO).

Last year, AWS rolled out a cloud-computing service based on the first-generation Gaudi. It claimed customers would get up to 40% better performance-per-dollar than instances running on NVIDIA’s GPUs.

The Gaudi2 integrates 24 Ethernet ports directly on the die, each running up to 100 Gb/s of RoCE—RDMA over Converged Ethernet—up from 10 ports of 100-Gb Ethernet in the first generation. That removes the need for a standalone networking card in every server, reducing system costs as a result. Integrating RoCE ports into the processor itself gives customers the ability to scale up to thousands of Gaudi2s using Ethernet.

“Reducing the number of components in the system reduces TCO for the end customer,” said Habana COO Eitan Medina. Using Ethernet also allows customers to avoid lock-in with proprietary interfaces such as NVIDIA’s NVLink GPU-to-GPU interconnect.

Most of the Ethernet ports are used to communicate with the other Gaudi2 processors in the server. The remainder supplies 2.4 Tb/s of networking throughput per server to other Gaudi2 servers in the data center or cluster.
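As a quick sanity check on those figures, the back-of-the-envelope arithmetic below lines up with the numbers in the article. It assumes an eight-chip server and a hypothetical split of each chip's 24 ports, with three reserved for traffic leaving the server; the split is an illustration, not a figure from Intel:

```python
# Illustrative arithmetic based on the figures cited in the article.
# The 21/3 port split per chip is an assumption for illustration.
PORTS_PER_CHIP = 24
GBPS_PER_PORT = 100
CHIPS_PER_SERVER = 8
SCALE_OUT_PORTS = 3  # ports assumed to carry traffic leaving the server

# Total Ethernet bandwidth integrated on each Gaudi2 die
total_per_chip_gbps = PORTS_PER_CHIP * GBPS_PER_PORT

# Aggregate scale-out bandwidth for an eight-chip server
scale_out_per_server_gbps = SCALE_OUT_PORTS * GBPS_PER_PORT * CHIPS_PER_SERVER

print(total_per_chip_gbps)        # 2400 Gb/s = 2.4 Tb/s per chip
print(scale_out_per_server_gbps)  # 2400 Gb/s = 2.4 Tb/s per server
```

Under those assumptions, the 2.4 Tb/s of scale-out throughput falls out directly from the port counts.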

Software Battle

Swiping market share from NVIDIA has been a challenge for Intel and other players in the AI chip landscape. The graphics-chip giant has invested aggressively in its AI software tools, including its CUDA development kit, to help run AI workloads on its GPUs.

The AI chip market is estimated to grow by about 25% per year over the next five years to $50 billion or so, said Sandra Rivera, executive vice president and general manager of Intel’s data center and AI group.
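Working backward from those figures gives a rough sense of the market's current size. This is a back-of-the-envelope calculation from the growth rate and target cited above, not a number from Intel:

```python
# Implied current market size if ~25% annual growth reaches
# roughly $50 billion in five years: 50 / 1.25**5
target_billions = 50
annual_growth = 1.25
years = 5

implied_today = target_billions / annual_growth**years
print(round(implied_today, 1))  # ≈ 16.4 (billion dollars)
```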

In addition to building a suite of server chips, including Habana’s AI accelerators and Arc general-purpose GPUs, Intel is trying to lure customers by making its open-source software development a bigger priority.

Habana’s Synapse AI software development kit (SDK) is open-standard and free to access. Customers can use Habana’s software to translate workloads from PyTorch and TensorFlow to tap into the processing power of Gaudi2 and its 24 Tensor Processor Cores (TPCs), which are based on a very long instruction word (VLIW) architecture. The SDK includes Habana’s compiler, runtime, libraries, firmware, drivers, and other tools.
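To illustrate that migration story: in Habana's PyTorch integration, models and tensors are moved to an "hpu" device once the `habana_frameworks` bridge is installed. The helper below is a hypothetical sketch, not Habana-official code; it simply probes for the bridge and falls back to CPU so the surrounding script stays portable:

```python
import importlib.util

def pick_device() -> str:
    """Return "hpu" when Habana's PyTorch bridge appears to be
    installed, otherwise fall back to "cpu".

    Hypothetical helper for illustration; the package and device
    names follow Habana's documented PyTorch integration.
    """
    if importlib.util.find_spec("habana_frameworks") is not None:
        return "hpu"
    return "cpu"

print(pick_device())
```

On a machine without the Habana stack this prints "cpu"; on a Gaudi system with the SDK installed, the same script would target the accelerator.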

Habana’s Medina said the Gaudi2 aims to make AI training more accessible. “The job of the software is to hide the complexity of the hardware underneath and to support customers where they are,” he added.

About the Author

James Morra | Senior Editor

James Morra is a senior editor for Electronic Design, covering the semiconductor industry and new technology trends, with a focus on power electronics and power management. He also reports on the business behind electrical engineering, including the electronics supply chain. He joined Electronic Design in 2015 and is based in Chicago, Illinois.
