New CPU/GPU Combo Targets High-End Mobile and Machine Learning

June 1, 2017
ARM’s latest Cortex-A55/A75 and Mali-G72 target AI applications in addition to high-end mobile applications.

ARM continues to push the envelope with its latest trio of cores, which includes the Cortex-A55 and Cortex-A75 CPUs (Fig. 1) and the Mali-G72 GPU. These take advantage of ARM’s recently announced DynamIQ architecture. The combination targets the high-end mobile space, as well as applications that utilize machine learning and augmented, mixed, and virtual reality (AR/MR/VR).

The cores in a cluster have private L2 caches and share a 4-Mbyte, 16-way set-associative L3 cache that can be partitioned into a maximum of four groups. Repartitioning can be done at runtime by the OS or hypervisor. The DynamIQ L3 cache snoop control unit (SCU) is shared by all cores in the cluster. The SCU is part of the DynamIQ shared unit (DSU), which also includes low-latency interfaces for closely coupled accelerators, in addition to advanced power management support.
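As a rough illustration of the partitioning described above, the following Python sketch models a 16-way cache whose ways are divided among at most four groups. The function name, the per-way granularity, and the group labels are hypothetical; the actual hardware programming interface is not modeled here, and real hardware may only allow coarser splits.

```python
WAYS = 16        # 16-way set-associative L3, per the article
MAX_GROUPS = 4   # at most four partition groups, per the article


def partition_ways(shares):
    """Assign contiguous cache ways to named groups.

    `shares` maps a group name to the number of ways it should own.
    This is a hypothetical software model, not the DSU's interface.
    Returns a dict of group name -> list of way indices.
    """
    if len(shares) > MAX_GROUPS:
        raise ValueError("at most four groups are supported")
    if any(count < 0 for count in shares.values()):
        raise ValueError("way counts must be non-negative")
    if sum(shares.values()) > WAYS:
        raise ValueError("more ways requested than the cache has")

    layout = {}
    next_way = 0
    for name, count in shares.items():
        layout[name] = list(range(next_way, next_way + count))
        next_way += count
    return layout


# Example: an OS keeps half the cache and two guests split the rest.
layout = partition_ways({"os": 8, "guest_a": 4, "guest_b": 4})
```

Repartitioning at runtime would amount to calling the function again with new shares; what happens to lines already cached in reassigned ways is outside the scope of this sketch.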

The cluster can contain any combination of up to eight CPU cores, from the typical four-plus-four big.LITTLE configuration to more device-specific mixes (like one Cortex-A75 and seven Cortex-A55s, or vice versa). This allows developers to choose the combination that works best for their application. The latest combination also supports DynamIQ Energy Aware Scheduling (EAS).

The Cortex-A75 delivers 50% more performance than its Cortex-A72 and Cortex-A73 siblings. Likewise, the Cortex-A55 is 2.5 times more power-efficient than the Cortex-A53, which is also found in big.LITTLE combos with the Cortex-A72 and Cortex-A73. The Cortex-A55 is built on the ARMv8.2 specification. The in-order CPU has a very small die area and is highly energy-efficient. The Cortex-A75 is 2.5 times larger in area than the Cortex-A55 but is more than 20% faster than the Cortex-A73.

The system allows fine-grained power management, from controlling cores individually to cache management. Parts of the L3 cache can also be turned off when they are not required, such as during audio or video playback, when much of the system can be shut down.

The Cortex-A75 and Cortex-A55 offer a number of enhancements over earlier platforms. These include the Virtualization Host Extensions (VHE) needed for Type 2 hypervisors like Linux’s KVM. The cores also support atomic operations, extended cache stashing, and the wider 256-bit AMBA 5 interface. The clean-to-persistent-memory feature is designed for future non-volatile memory hierarchies.

The Int8 byte-oriented dot product targets neural network and machine learning applications. Essentially, deep neural networks (DNNs) work very well with small weight values, and 8 bits is usually more than adequate. This allows the CPU to handle these matrix operations efficiently. The GPUs are also being tuned to handle 8-bit values instead of just larger integers or floating-point numbers.
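To make the idea concrete, here is a minimal Python sketch of an 8-bit dot product: floating-point weights are quantized to signed 8-bit integers, multiplied against 8-bit inputs, and summed in a wider accumulator. The function names and the scale factor are assumptions chosen for illustration; this shows the arithmetic only, not the CPU instruction itself.

```python
def to_int8(value):
    """Round and clamp a number to the signed 8-bit range [-128, 127]."""
    return max(-128, min(127, int(round(value))))


def quantize(values, scale):
    """Map floating-point values to int8 using a shared scale factor."""
    return [to_int8(v / scale) for v in values]


def dot_int8(a, b):
    """Dot product of two int8 vectors, accumulated at wider precision."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    acc = 0  # Python ints do not overflow; hardware would use a wider register
    for x, y in zip(a, b):
        if not (-128 <= x <= 127 and -128 <= y <= 127):
            raise ValueError("inputs must fit in signed 8 bits")
        acc += x * y
    return acc


# Hypothetical weights and already-quantized inputs.
scale = 0.01
weights_q = quantize([0.5, -0.25, 1.0], scale)
inputs_q = [10, 20, 30]
result = dot_int8(weights_q, inputs_q) * scale  # rescale back to real units
```

The products of 8-bit values need up to 16 bits each, which is why the running sum is kept in a wider accumulator rather than in 8 bits.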

ARM is also providing new branch prediction support that takes a neural net-like approach. This isn’t the first time this approach has been used. AMD’s Ryzen also uses a neural net structure for its branch prediction support.
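The article does not describe how either vendor's predictor is built. As a generic illustration of what a neural-net-like branch predictor can look like, the sketch below implements the textbook perceptron predictor: each branch address selects a small weight vector, and the prediction is the sign of its dot product with recent branch history. The class name, table size, and history length are hypothetical, and this is not ARM's or AMD's design.

```python
class PerceptronPredictor:
    """Textbook perceptron branch predictor (illustrative only)."""

    def __init__(self, history_len=8, table_size=64):
        # History entries are +1 (taken) or -1 (not taken).
        self.history = [1] * history_len
        # One weight vector per table entry; index 0 is the bias weight.
        self.weights = [[0] * (history_len + 1) for _ in range(table_size)]
        # Training threshold commonly quoted for this predictor.
        self.threshold = int(1.93 * history_len + 14)

    def _output(self, pc):
        w = self.weights[pc % len(self.weights)]
        return w[0] + sum(wi * hi for wi, hi in zip(w[1:], self.history))

    def predict(self, pc):
        """Return True if the branch at address `pc` is predicted taken."""
        return self._output(pc) >= 0

    def update(self, pc, taken):
        """Train on the real outcome, then shift it into the history."""
        y = self._output(pc)
        t = 1 if taken else -1
        w = self.weights[pc % len(self.weights)]
        # Train on a misprediction or when confidence is still low.
        if (y >= 0) != taken or abs(y) <= self.threshold:
            w[0] += t
            for i, h in enumerate(self.history):
                w[i + 1] += t * h
        self.history = self.history[1:] + [t]


# Example: a branch that alternates taken / not taken.
predictor = PerceptronPredictor()
correct = 0
for i in range(300):
    taken = (i % 2 == 0)
    if i >= 250 and predictor.predict(0x40) == taken:
        correct += 1
    predictor.update(0x40, taken)
```

A simple counter-based predictor struggles with a strictly alternating branch, while the perceptron learns that the outcome is the opposite of the most recent history bit.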

A typical system will often include the Mali-G72 GPU, along with display and video support that is already available, like the Mali-V550 video subsystem (Fig. 2). The Mali-G72 GPU extends ARM’s Bifrost architecture and provides 1.4 times the performance of earlier subsystems. It has also been enhanced to support DNNs. Its General Matrix Multiply (GEMM) support is 17% more energy-efficient than on earlier Mali GPUs.

The Mali-G72 GPU also targets the AR/MR/VR space with multiview drawing support. Multiview rendering draws two almost identical images, one for each eye. Software that supports AR/MR/VR can take advantage of this hardware acceleration, allowing higher frame rates with reduced overhead and lower power requirements.

The GPU includes additional AR/MR/VR enhancements such as multisample anti-aliasing and foveated rendering. Foveated rendering applies higher-definition processing to the area where the eyes are focused, which is determined by tracking where the eye is looking.
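A minimal sketch of the foveated rendering idea follows, assuming normalized screen coordinates and a gaze point supplied by an eye tracker. The function name, the radii, and the minimum resolution are hypothetical values chosen for illustration, not parameters of the Mali-G72.

```python
import math


def shading_scale(tile_center, gaze, inner_radius=0.15, outer_radius=0.40,
                  min_scale=0.25):
    """Return the fraction of full resolution to use for a screen tile.

    Coordinates are normalized to [0, 1]. Tiles within `inner_radius` of
    the gaze point get full resolution, tiles beyond `outer_radius` get
    `min_scale`, and tiles in between fall off linearly.
    """
    dx = tile_center[0] - gaze[0]
    dy = tile_center[1] - gaze[1]
    distance = math.hypot(dx, dy)

    if distance <= inner_radius:
        return 1.0
    if distance >= outer_radius:
        return min_scale
    t = (distance - inner_radius) / (outer_radius - inner_radius)
    return 1.0 - (1.0 - min_scale) * t


# Example: the viewer looks at the screen center; a corner tile is coarse.
center_scale = shading_scale((0.5, 0.5), gaze=(0.5, 0.5))
corner_scale = shading_scale((0.0, 0.0), gaze=(0.5, 0.5))
```

The savings come from the periphery: most of the screen area lies outside the inner radius, so it can be shaded at reduced resolution.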

The Mali-G72 GPU also has Adaptive Scalable Texture Compression (ASTC) support. The transaction elimination (TE) support works on 16- by 16-pixel blocks to identify identical blocks between two consecutive render targets. The Smart Composition feature extends TE to every stage of the user interface composition system. It eliminates the need to read and process identical information.
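The block-comparison idea behind transaction elimination can be sketched in a few lines of Python. This sketch assumes a per-block checksum is used to detect unchanged blocks; the article does not say how the hardware identifies identical blocks, so the CRC, the one-byte-per-pixel frame layout, and the function names here are illustrative assumptions.

```python
import zlib

TILE = 16  # 16- by 16-pixel blocks, per the article


def tile_signatures(frame, width, height):
    """Compute a checksum per tile for a frame of one-byte pixels."""
    signatures = {}
    for ty in range(0, height, TILE):
        for tx in range(0, width, TILE):
            rows = []
            for row in range(ty, min(ty + TILE, height)):
                start = row * width + tx
                end = row * width + min(tx + TILE, width)
                rows.append(frame[start:end])
            signatures[(tx // TILE, ty // TILE)] = zlib.crc32(b"".join(rows))
    return signatures


def tiles_to_write(previous_signatures, frame, width, height):
    """Return the tiles that changed, plus the new signature table."""
    current = tile_signatures(frame, width, height)
    changed = [tile for tile, sig in current.items()
               if previous_signatures.get(tile) != sig]
    return changed, current


# Example: a 32x32 frame where a single pixel changes between frames.
width = height = 32
frame_a = bytearray(width * height)
frame_b = bytearray(frame_a)
frame_b[5 * width + 20] = 255  # pixel at x=20, y=5

_, sigs_a = tiles_to_write({}, frame_a, width, height)
changed, _ = tiles_to_write(sigs_a, frame_b, width, height)
```

Only the blocks listed in `changed` would need to be written back to memory; the rest are skipped, which is where the bandwidth saving comes from.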

The high-fidelity gaming market is also supported by the Mali-G72. It has an 87% bandwidth savings compared to the Mali-G71. This is handled by the pixel local storage (PLS) G-Buffer.

The Cortex-A55, Cortex-A75, and Mali-G72 can be used by themselves, but they are designed to be integrated. They will likely wind up in high-end system-on-chip (SoC) solutions for the mobile space.

About the Author

William G. Wong | Senior Content Director - Electronic Design and Microwaves & RF

I am Editor of Electronic Design focusing on embedded, software, and systems. As Senior Content Director, I also manage Microwaves & RF and I work with a great team of editors to provide engineers, programmers, developers and technical managers with interesting and useful articles and videos on a regular basis. Check out our free newsletters to see the latest content.

I earned a Bachelor of Electrical Engineering at the Georgia Institute of Technology and a Masters in Computer Science from Rutgers University. I still do a bit of programming using everything from C and C++ to Rust and Ada/SPARK. I do a bit of PHP programming for Drupal websites. I have posted a few Drupal modules.  

I still get hands-on with software and electronic hardware. Some of this can be found in our Kit Close-Up video series. You can also see me on many of our TechXchange Talk videos. I am interested in a range of projects from robotics to artificial intelligence.
