Cortex-M Gets Heavy-Duty Machine Learning

Feb. 11, 2020

Like many, Arm is moving into machine learning in a big way. Its new Cortex-M55 and Ethos-U55 bring machine-learning acceleration to microcontrollers.

William G. Wong

Machine learning (ML) has taken the developer community by storm, but implementing many algorithms with any efficiency have required FPGAs, multicore CPUs, or high-performance GPGPUs. Though developers have used ML on microcontrollers, the ML models are more limited. ML performance on these smaller conventional platforms is also a bit slower—useful but not capable of handling more ambitious projects.

This is about to change with Arm’s announcement of the Cortex-M55 and the Ethos-U55. The Cortex-M55 can be used alone as it has its own ML augmentation. The Ethos-U55 can be added if ML applications are especially demanding. It could be paired with other Cortex-M platforms, too, but it will make more sense to be matched with the Cortex-M55, allowing ML support to be distributed.

The Cortex-M55 (Fig. 1) is built around a conventional Cortex-M that includes the Helium ARMv8.1-M machine-learning hardware-acceleration support. Helium adds 128-bit vector extensions including gather/scatter support and complex math support such an 8-bit vector dot product. This enables the Cortex-M55 to deliver 15X the performance of a basic Cortex-M platform.

1. The Cortex-M55 has a conventional Cortex-M that is augmented by the Helium ARMv8.1-M machine learning hardware acceleration.

The Ethos-U55 (Fig. 2) is what Arm calls a microNPU (neural processing unit). It can provide 32X the performance of a base Cortex-M. Together, the Ethos-U55 and Cortex-M55 and deliver up to 480X the ML performance of a non-accelerated microcontroller.

2. The Ethos-U55 microNPU can have up to 256 MACs. It supports on-the-fly weight decompression to make more efficient use of the available memory bandwidth.

The microNPU is designed to handle the ML heavy lifting while the Cortex-M manages the system and takes on lightweight ML models. This distributed approach allows many models to be employed in the system. It also lets developers balance performance and power. It's possible to employ just the Cortex-M55 for lighter-weight applications using less power while Ethos-U55 is powered down. Of course, the Ethos-U55 is able to efficiently handle video streams that would cripple lesser systems.

The Ethos-U55's scalable design can contain 32, 64, 128, or 256 MACs. DMA support for moving model weights supports on-the-fly decompression. There's also a built-in weight decoder. The accelerator communicates with the micro via system SRAM. The Ethos-U55 can be configured in a number of ways that run applications in parallel to the micro.

Both platforms support popular ML frameworks. For example, the Cortex-M55 can run TensorFlow Lite models. Both will handle multiple models at the same time. A unified development tool chain allows models to target either platform.

Expect to see chips based on the Cortex-M55 and the Cortex-M55 with an Ethos-U55 in about a year. A Cornerstone-300 reference design is available to get chip developers started.

“Enabling AI everywhere requires device makers and developers to deliver machine learning locally on billions, and ultimately trillions, of devices,” said Dipti Vachani, senior vice president and general manager, Automotive and IoT Line of Business, Arm. “With these additions to our AI platform, no device is left behind as on-device ML on the tiniest devices will be the new normal, unleashing the potential of AI securely across a vast range of life-changing applications.”

The Cortex-M55/Ethos-U55 sets the bar for microcontroller-based artificial intelligence (AI). ML applications that once targeted higher-end platforms will be practical on this micro-based solution that will consume less space and power.

About the Author

William G. Wong | Senior Content Director - Electronic Design and Microwaves & RF

I am Editor of Electronic Design focusing on embedded, software, and systems. As Senior Content Director, I also manage Microwaves & RF and I work with a great team of editors to provide engineers, programmers, developers and technical managers with interesting and useful articles and videos on a regular basis. Check out our free newsletters to see the latest content.

You can send press releases for new products for possible coverage on the website. I am also interested in receiving contributed articles for publishing on our website. Use our template and send to me along with a signed release form.

Check out my blog, AltEmbedded on Electronic Design, as well as his latest articles on this site that are listed below.

You can visit my social media via these links:

I earned a Bachelor of Electrical Engineering at the Georgia Institute of Technology and a Masters in Computer Science from Rutgers University. I still do a bit of programming using everything from C and C++ to Rust and Ada/SPARK. I do a bit of PHP programming for Drupal websites. I have posted a few Drupal modules.

I still get a hand on software and electronic hardware. Some of this can be found on our Kit Close-Up video series. You can also see me on many of our TechXchange Talk videos. I am interested in a range of projects from robotics to artificial intelligence.