
Groq’s AI Accelerator Eyes Hyperscalers, Autonomous Vehicles

Nov. 18, 2019
This machine-learning acceleration chip zeros in on applications such as autonomous vehicles, the cloud, and the enterprise.

A machine-learning acceleration chip developed by startup Groq is set to take on a range of high-performance applications from autonomous vehicles to the cloud and the enterprise. Its Tensor Streaming Processor (TSP) is a single-threaded machine that handles integer (INT8) and floating-point models. Among the features is a global shared memory bandwidth of over 60 TB/s using on-chip memory. The three-quarter length PCI Express (PCIe) card has a x16 PCIe Gen 4 interface.
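For a rough sense of why that on-chip bandwidth matters, here's a back-of-the-envelope comparison against the card's host link. The PCIe figures below are generic, not Groq-supplied specs, and the script is purely illustrative.

```python
# Back-of-envelope comparison (illustrative numbers only): the cited on-chip
# memory bandwidth vs. the PCIe Gen 4 x16 link that connects the card to the host.
on_chip_bw_tb_s = 60                 # >60 TB/s of global shared on-chip memory bandwidth
pcie_gen4_x16_gb_s = 16 * 2.0        # ~2 GB/s per lane per direction, ~32 GB/s for x16

ratio = (on_chip_bw_tb_s * 1000) / pcie_gen4_x16_gb_s
print(f"On-chip bandwidth is roughly {ratio:,.0f}x the PCIe Gen 4 x16 link")
# ~1,900x -- which is why keeping weights and activations on-chip matters so much
```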

The TSP uses a different compute layout (see figure) designed to deliver high performance, with the architecture built around compiler-orchestrated execution. This allows for deterministic operation, since there is no caching system involved: runtime performance and power requirements are known at compile time. It also sidesteps attacks like Spectre and Meltdown, which exploit hardware speculation features that conventional processors have and the TSP does not.

The Tensor Streaming Processor (TSP) provides deterministic performance and eliminates tail latency.
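To make the compile-time determinism concrete, here is a minimal sketch; it is not Groq's compiler or instruction set, just an illustration of a statically scheduled program in which every operation's issue cycle is fixed ahead of time, so total runtime is known before any data arrives.

```python
# Minimal sketch (hypothetical, not Groq's toolchain): with no caches or speculation,
# a compiler can emit a fixed schedule whose timing is known before the program runs.
from typing import Callable, List, Tuple

# A "program" is just a list of (issue_cycle, operation, latency_in_cycles).
Schedule = List[Tuple[int, Callable[[], None], int]]

def compile_model() -> Schedule:
    # Hypothetical ops standing in for weight loads, matrix multiplies, etc.
    return [
        (0,  lambda: print("load weights"),   4),
        (4,  lambda: print("matmul layer 0"), 10),
        (14, lambda: print("matmul layer 1"), 10),
        (24, lambda: print("argmax"),         2),
    ]

def total_cycles(schedule: Schedule) -> int:
    # Deterministic: runtime is a property of the schedule, not of the input data.
    return max(start + latency for start, _, latency in schedule)

schedule = compile_model()
print("worst case == best case ==", total_cycles(schedule), "cycles")
for _, op, _ in schedule:
    op()
```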

The TSP is optimized for a batch size of one, which is preferable for latency-sensitive workloads such as self-driving cars and financial applications. The design also eliminates the tail latencies that can hamper enterprise operation in large servers.
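A quick illustration of why batch size matters for latency (the millisecond figures below are made up, not measured TSP numbers): a request that must wait for a batch to fill pays that wait in latency, while a batch-of-one design services each request as it arrives.

```python
# Illustrative numbers only: why batch-1 operation matters for latency.
# Suppose an accelerator needs 2 ms of setup per batch plus 1 ms per sample,
# and requests arrive roughly 1 ms apart.
setup_ms, per_sample_ms = 2.0, 1.0

def worst_case_latency(batch_size: int, arrival_gap_ms: float = 1.0) -> float:
    # The last request in a batch waits for the batch to fill,
    # then for the whole batch to execute.
    wait = (batch_size - 1) * arrival_gap_ms
    return wait + setup_ms + batch_size * per_sample_ms

for b in (1, 8, 32):
    print(f"batch={b:2d}: worst-case latency ~{worst_case_latency(b):5.1f} ms")
# batch=1 keeps latency tight; larger batches improve utilization but grow tail latency.
```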

The chip is single-threaded, but it can switch quickly between models and layers. As a result, it can handle multiple models without the extra data movement that would otherwise reduce performance.

Groq isn’t announcing performance numbers yet, but the company’s collection of talent bodes well for a high-end machine. Jonathan Ross, Groq’s technical founder and CEO, worked on Google’s TPU effort as a 20% project, designing and implementing the core elements of the original chip. CTO Dinesh Maheshwari worked on multi-threaded multiprocessor systems in the mid-1980s and served as CVP and CTO of the Memory Division at Cypress Semiconductor. And Michelle Tomasko, Vice President of Engineering, previously worked at Nvidia, Google Consumer HW, and Transmeta.

About the Author

William G. Wong | Senior Content Director - Electronic Design and Microwaves & RF

I am Editor of Electronic Design, focusing on embedded, software, and systems. As Senior Content Director, I also manage Microwaves & RF, and I work with a great team of editors to provide engineers, programmers, developers, and technical managers with interesting and useful articles and videos on a regular basis. Check out our free newsletters to see the latest content.

You can send press releases for new products for possible coverage on the website. I am also interested in receiving contributed articles for publishing on our website. Use our template and send it to me along with a signed release form.

Check out my blog, AltEmbedded on Electronic Design, as well as my latest articles on this site that are listed below.


I earned a Bachelor of Electrical Engineering at the Georgia Institute of Technology and a Master's in Computer Science from Rutgers University. I still do a bit of programming using everything from C and C++ to Rust and Ada/SPARK, along with some PHP programming for Drupal websites. I have posted a few Drupal modules.

I still get hands-on with software and electronic hardware. Some of this can be found on our Kit Close-Up video series. You can also see me on many of our TechXchange Talk videos. I am interested in a range of projects from robotics to artificial intelligence.
