Image credit: Meta Platforms

Meta Uses NVIDIA Chips to Build Sprawling New Supercomputer for AI

Feb. 2, 2022
Meta said the AI supercomputer would play a key role in its ambitions for the metaverse, a relatively new term that refers to virtual worlds where people can meet, work, or play.

Meta Platforms said it would use thousands of graphics processors designed by NVIDIA as the brains of a colossal supercomputer that the technology giant is building to push the envelope in artificial intelligence.

Meta, which changed its name from Facebook, said the AI Research Supercluster would contain a total of 16,000 of NVIDIA's graphics processing chips to train machine-learning models faster and more accurately than it has been able to previously. The company believes the system will rank as the fastest AI supercomputer in the world when it is completed by the end of this year and give a booster shot to its ambitions in the metaverse.

The software that it trains with the system will play a central role in Meta’s ambitions in the metaverse, which is a relatively new concept that broadly refers to virtual worlds that blur the line between physical and digital realms. People would use augmented- or virtual-reality hardware to access imaginary worlds where they can meet or overlay digital objects on a scene to collaborate on projects, such as designing a bridge or a vehicle.

Meta said the supercomputer would help it develop more accurate machine-learning models that can learn from trillions of bits of data, work with hundreds of languages, and analyze text, images, and video together at the same time. Meta said that these technologies are among the core building blocks for the metaverse.

Souped-Up Supercomputer

AI supercomputers are assembled out of thousands of graphics processors clustered in so-called "compute nodes," which communicate with each other rapidly using a high-performance network fabric. Meta said the system is already in use by the company's researchers and taps 760 NVIDIA DGX A100 systems, for a total of 6,080 of NVIDIA's flagship A100 GPUs, cementing its place as one of the fastest AI supercomputers in the world.

Every node in Meta's new system communicates via NVIDIA's InfiniBand switches, which bind the cluster together in a 200-Gb/s InfiniBand network fabric, the company said.

Meta said that as it stands the system is 20 times faster at computer vision tasks and three times faster for training voice-recognition tools than a previous computer system that uses 22,000 NVIDIA V100 GPUs.

Meta plans to attach a total of 16,000 chips to the system, which would transform it into the world’s fastest AI supercomputer, clocking in at more than a quintillion (1,000,000,000,000,000,000) operations per second of performance for handling huge machine-learning models. The finished supercomputer will have 2,000 compute nodes based on NVIDIA's DGX A100 system, which Meta expects to more than double the system's current performance.
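The node and GPU figures Meta cites are consistent with eight GPUs per compute node, and a quick back-of-the-envelope check makes that explicit. The sketch below is illustrative only: the function and constant names are hypothetical, the eight-GPU figure is inferred from the numbers in this article (6,080 divided by 760), and the ratio it prints compares GPU counts, not measured performance.

```python
# Back-of-the-envelope check of the GPU counts cited in the article.
# Assumption: 8 GPUs per node, inferred from 6,080 GPUs / 760 nodes.
GPUS_PER_NODE = 8


def total_gpus(nodes: int, gpus_per_node: int = GPUS_PER_NODE) -> int:
    """Return the total GPU count for a cluster of identical nodes."""
    return nodes * gpus_per_node


current = total_gpus(760)    # system as it stands today
planned = total_gpus(2000)   # system when fully built out
scale = planned / current    # growth in GPU count, not in performance

print(current, planned, round(scale, 2))
```

By this count, the full build-out would have roughly 2.6 times as many GPUs as the system has today, which squares with Meta's claim that performance would more than double.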

Meta said the computer system would help it build better AI models to boost its metaverse ambitions, such as software that can translate voices in real-time for large groups of people who speak different languages.

Meta reportedly runs the fourth-largest data center operation in the U.S., behind only the cloud-computing units of Amazon, Microsoft, and Google, and it is responsible for buying hundreds of millions of dollars of chips a year. Meta said the new AI supercomputer, which has been in development for close to three years at this point, required it to redesign the cooling, networking, and storage used in its server hardware from the ground up.

The Metaverse and NVIDIA

Meta and other companies investing in the metaverse are bound to use a vast amount of computing power from NVIDIA chips in the process. NVIDIA's GPUs have long been used to run high-end graphics in PCs, and the chips have become the gold standard for carrying out AI jobs in data centers. A GPU contains thousands of cores that work together to run the computations central to machine learning, workloads that can overwhelm a CPU.

NVIDIA is investing in the hardware and software technologies behind the metaverse as well. The company has rolled out a set of tools, called Omniverse, to help software developers build three-dimensional virtual worlds.

NVIDIA is also facing mounting competition in data centers from Intel, which upped its ambitions in AI with its $2 billion deal for Habana Labs in 2019. Intel is mounting a major push into graphics processors for PCs for the first time in years. In addition, it is rolling out a server graphics processor to run the U.S. Energy Department's Aurora supercomputer, which will become the world’s fastest for scientific research when it is finished later in 2022.

About the Author

James Morra | Senior Editor

James Morra is a senior editor for Electronic Design, covering the semiconductor industry and new technology trends, with a focus on power management. He also reports on the business behind electrical engineering, including the electronics supply chain. He joined Electronic Design in 2015 and is based in Chicago, Illinois.
