If you are learning about machine learning (ML) or need to accelerate your latest artificial intelligence (AI) creation, Intel’s new Movidius Neural Compute Stick (Fig. 1) may be just what you’re looking for. The USB 3.0 device contains a 28-nm Movidius Myriad 2 MA2x5x vision processing unit (VPU). Intel acquired Movidius while the company was working on getting the dongle out the door. The stick complements Intel’s new Xeon and Xeon Phi processors, which have also been tuned to handle deep neural networks (DNNs). The stick is priced at only $79.
The VPU is optimized for vision applications but can handle all sorts of DNN applications. The chip has a pair of 32-bit RISC processors to manage resources, but the heavy lifting is done by the dozen 128-bit SHAVE vector processors. The SHAVE units support 16- and 32-bit floating point plus 8-, 16-, and 32-bit integer operations. The chip’s processors share 2 Mbytes of on-chip RAM plus a 256-Kbyte L2 cache. The DDR interface provides access to 1 or 4 Gbits of DDR memory in a stacked memory configuration. The stick has 4 Gbits of DRAM.
The chip runs at 600 MHz at 0.9 V, consuming less than 2.5 W of power. Its image signal processor (ISP) mode is designed to handle video streams directly, although the USB stick requires data to be loaded via the USB interface. The chip has a dozen 1.5 Gbit/s MIPI lanes that can be configured as CSI-2 or DSI interfaces. The chip can be configured directly using I2C or SPI. There is a 1 Gbit/s Ethernet interface as well. The 4 Gbit LPDDR chip comes in an 8-mm by 9.5-mm BGA package.
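To put the MIPI figures in perspective, the following minimal sketch multiplies out the lane count and per-lane rate quoted above and compares the result with one uncompressed camera stream. The camera parameters (1920 x 1080, 12 bits per pixel, 30 frames/s) are hypothetical and chosen only for illustration, and protocol overhead is ignored.

```python
# Figures quoted in the article.
MIPI_LANES = 12
GBIT_PER_LANE = 1.5


def aggregate_mipi_gbits(lanes=MIPI_LANES, gbit_per_lane=GBIT_PER_LANE):
    """Raw aggregate MIPI bandwidth in Gbit/s, ignoring protocol overhead."""
    return lanes * gbit_per_lane


def raw_video_gbits(width, height, bits_per_pixel, fps):
    """Uncompressed video bit rate in Gbit/s for one camera stream."""
    return width * height * bits_per_pixel * fps / 1e9


if __name__ == "__main__":
    total = aggregate_mipi_gbits()
    # Hypothetical camera: 1920x1080, 12 bits per pixel, 30 frames/s.
    stream = raw_video_gbits(1920, 1080, 12, 30)
    print(f"Aggregate MIPI bandwidth: {total:.1f} Gbit/s")
    print(f"One example stream:       {stream:.3f} Gbit/s")
```

With all twelve lanes in use, the raw aggregate works out to 18 Gbit/s, which leaves room for several streams of the assumed size.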
The chip can deliver 10 inferences/s using the standard GoogLeNet benchmark in continuous inference mode. It does so using only 1 W of power while delivering 100 GFLOPS of performance.
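The benchmark figures above can be turned into efficiency numbers with a little arithmetic. The sketch below is a back-of-the-envelope calculation, not a measurement. It assumes the 1 W, 10 inferences/s, and 100 GFLOPS figures all describe the same operating point, which the article implies but does not spell out.

```python
# Figures quoted in the article.
INFERENCES_PER_SECOND = 10.0
POWER_WATTS = 1.0
GFLOPS = 100.0


def joules_per_inference(power_w, inferences_per_s):
    """Energy spent on each inference (W divided by 1/s gives J)."""
    return power_w / inferences_per_s


def gflops_per_watt(gflops, power_w):
    """Compute efficiency in GFLOPS per watt."""
    return gflops / power_w


def gflop_budget_per_inference(gflops, inferences_per_s):
    """Upper bound on floating-point work available per inference."""
    return gflops / inferences_per_s


if __name__ == "__main__":
    energy = joules_per_inference(POWER_WATTS, INFERENCES_PER_SECOND)
    efficiency = gflops_per_watt(GFLOPS, POWER_WATTS)
    budget = gflop_budget_per_inference(GFLOPS, INFERENCES_PER_SECOND)
    print(f"Energy per inference: {energy:.2f} J")
    print(f"Efficiency:           {efficiency:.0f} GFLOPS/W")
    print(f"Budget per inference: {budget:.0f} GFLOP")
```

Under those assumptions, each inference costs about 0.1 J, and the chip has a budget of at most 10 GFLOP per inference.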
Multiple Myriad 2 chips can be ganged together for more performance (Fig. 3). Performance scales almost linearly since the units work in parallel. A USB hub can be used to connect multiple sticks, but a deployed system will likely use the chips directly. Multiple chips can also be used to run different DNNs instead of a single larger one, depending upon application requirements.
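A rough way to reason about "almost linear" scaling is to let each additional stick contribute slightly less than a full unit of throughput. The sketch below does that. The per-stick rate comes from the 10 inferences/s figure quoted earlier, while the scaling_efficiency value of 0.95 is a hypothetical placeholder, not a measured number.

```python
def estimated_throughput(num_sticks, per_stick_rate=10.0, scaling_efficiency=0.95):
    """Estimate aggregate inferences/s for several sticks working in parallel.

    The first stick runs at the full per-stick rate. Each additional stick
    adds per_stick_rate * scaling_efficiency, where scaling_efficiency is an
    assumed (hypothetical) factor covering hub and host overhead.
    """
    if num_sticks < 1:
        raise ValueError("num_sticks must be at least 1")
    if not 0.0 <= scaling_efficiency <= 1.0:
        raise ValueError("scaling_efficiency must be between 0 and 1")
    return per_stick_rate * (1 + (num_sticks - 1) * scaling_efficiency)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        rate = estimated_throughput(n)
        print(f"{n} stick(s): about {rate:.1f} inferences/s")
```

Setting scaling_efficiency to 1.0 gives the ideal linear case. Any real value would have to come from measurements on the target host and hub.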
The Myriad 2 chip is available to OEMs and has been used in applications like DJI’s SPARK drone (Fig. 4). The chip processes captured video to identify objects and avoid them. The drone uses other SoCs for flight control, communication, and so on. The VPU is also handy for chores such as 3D mapping. DJI uses the VPU for facial and gesture recognition, as well as to implement a safe landing mode. The SPARK is priced at $499.
ML techniques can also be used to implement a follow-me mode in which the drone tracks a person or object, orienting the camera to capture the scene in real time (Fig. 5). Features like gesture control are possible with this type of system.
The stick is a handy way to evaluate the chip and to implement DNN applications. It is much more convenient for software developers than the development board (Fig. 6). All the platforms are supported by the Myriad Development Kit (MDK). The MDK includes vision, imaging, and linear algebra libraries, plus reference processing pipeline examples (including source code). Solutions for 3D depth, object tracking, and natural user interfaces are also available.
The stick will be what most developers utilize. It can be used with any USB 3.0 platform, ranging from a PC to the Raspberry Pi. As noted, a primary target is vision applications. A typical combination with the stick would be a Raspberry Pi with a camera attached. In this case the camera data is streamed through the USB interface. In a final product, the camera would typically be attached directly to the chip. This provides more throughput and lowers power requirements.
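When the camera data has to travel through the USB interface, it helps to estimate how much input data the host must push to the stick each second. The sketch below assumes a 224 x 224 RGB input (the standard GoogLeNet input size), 16-bit floating-point values (one of the formats the SHAVE units support), and the 10 inferences/s rate quoted earlier. All three are illustrative assumptions about a particular setup, and the estimate covers only the preprocessed input tensors, not raw camera video, results, or protocol overhead.

```python
USB3_SIGNALING_MBITS = 5000.0  # nominal USB 3.0 signaling rate, 5 Gbit/s


def bytes_per_frame(width, height, channels, bytes_per_value):
    """Size of one input tensor in bytes."""
    return width * height * channels * bytes_per_value


def input_mbits_per_second(frame_bytes, frames_per_s):
    """Input data rate in Mbit/s for a given frame size and frame rate."""
    return frame_bytes * 8 * frames_per_s / 1e6


if __name__ == "__main__":
    # Assumed input: 224 x 224 pixels, 3 channels, 2 bytes (FP16) per value.
    frame = bytes_per_frame(224, 224, 3, 2)
    rate = input_mbits_per_second(frame, 10)
    share = 100.0 * rate / USB3_SIGNALING_MBITS
    print(f"Bytes per input frame: {frame}")
    print(f"Input data rate:       {rate:.2f} Mbit/s")
    print(f"Share of USB 3.0 rate: {share:.2f} %")
```

Under these assumptions, the input tensors alone use only a small fraction of the link. Host-side capture and preprocessing are not modeled here.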