Vision DSP Family Eyes Low- and High-End Apps

May 7, 2021

New additions to Cadence’s Vision DSP family slide into the low and top end of the system-solution spectrum.

Cadence has experienced great success with its Tensilica Vision DSP family, whose members share a common single-instruction multiple-data/very-long-instruction-word (SIMD/VLIW) architecture (Fig. 1). The company’s latest announcement expands this family of processor IP cores.

1. Cadence’s Tensilica DSPs share a common architecture. Configurations such as SIMD width, and options like floating point support, can be combined to create unique solutions.

The company’s latest low-end solutions target applications like smart sensors, mobile devices, and augmented reality (AR), where low-power, always-on operation is becoming more common. Features like user authentication via voice commands, face detection, and fingerprint recognition require artificial-intelligence/machine-learning (AI/ML) support. High-end solutions analyze multiple video streams in real time, and they also use multiple AI/ML models.

The SIMD/VLIW architecture includes multiple, wide, 2048-bit, dual load/store memory interfaces and scatter-gather support. The cores also feature a 128-/256-bit AXI iDMA interface.
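To make the scatter-gather term concrete, here is a minimal scalar model in Python of what a gather (indexed vector load) and a scatter (indexed vector store) do. It is only a sketch: the function names `gather` and `scatter`, the lookup table, and the eight-lane example are assumptions for illustration and are not Cadence APIs. On real hardware, each call would correspond to a single vector operation rather than a loop.

```python
def gather(memory, indices):
    """Read one element per lane from arbitrary (non-contiguous) addresses."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Write one element per lane to arbitrary addresses, in lane order."""
    for i, v in zip(indices, values):
        memory[i] = v


# Hypothetical use: apply a lookup table to eight pixels in one "vector" step.
lut = [255 - v for v in range(256)]          # illustrative inversion table
pixels = [0, 10, 20, 30, 40, 50, 60, 70]     # one eight-lane vector of indices
inverted = gather(lut, pixels)
print(inverted)  # [255, 245, 235, 225, 215, 205, 195, 185]
```

Table lookups, histogram updates, and image warps all have this access pattern, which is why indexed loads and stores matter for vision workloads.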

The Vision DSP family adds the new Tensilica Vision P1 at the low end and the multicore Tensilica Vision Q8 at the high end (Fig. 2). The Tensilica Vision P1’s 128-bit SIMD support can deliver over 0.256 TOPS of performance using one-third the area and power of its Tensilica Vision P6 sibling. This makes the P1 ideal for always-on applications that require minimal power.

2. The Vision DSP family ranges from the low-power Vision P1 to the multicore Vision Q8.

The single-core Tensilica Vision Q8 features a 1024-bit SIMD engine that delivers 3.8 TOPS and 129 GFLOPS of FP32 floating-point performance. That’s twice the performance of the Q7 DSP.
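The quoted throughput figures can be related to one another with some simple arithmetic. The sketch below uses only the numbers given in this article (0.256 TOPS for the P1, 3.8 TOPS for the Q8, and the Q8 being twice the Q7). The 1-GHz clock is purely an assumption chosen to make the per-cycle numbers easy to read, and counting one multiply-accumulate (MAC) as two operations is a common convention rather than something the announcement states.

```python
# Figures quoted in the article, in giga-operations per second (GOPS).
P1_GOPS = 256      # 0.256 TOPS
Q8_GOPS = 3800     # 3.8 TOPS

ASSUMED_CLOCK_HZ = 1_000_000_000  # assumption: 1 GHz, not a published spec
OPS_PER_MAC = 2                   # convention: multiply + add = 2 operations


def ops_per_cycle(gops, clock_hz):
    """Operations the core must complete each cycle to reach `gops`."""
    return gops * 10**9 / clock_hz


def macs_per_cycle(gops, clock_hz):
    """Implied MACs per cycle under the 2-ops-per-MAC convention."""
    return ops_per_cycle(gops, clock_hz) / OPS_PER_MAC


p1_macs = macs_per_cycle(P1_GOPS, ASSUMED_CLOCK_HZ)   # 128.0 at the assumed clock
q7_gops = Q8_GOPS / 2                                 # "twice the Q7" implies 1900.0
q8_vs_p1 = Q8_GOPS / P1_GOPS                          # 14.84375

print(p1_macs, q7_gops, q8_vs_p1)
```

Under these assumptions, the P1 figure works out to 128 MACs per cycle, and the Q8 offers roughly 15 times the peak throughput of the P1. A different clock rate would scale the per-cycle numbers accordingly.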

This family of DSP cores provides developers with a range of options to meet different power and performance requirements, from always-on, low-power solutions to high-performance, multistream machine-learning platforms. The cores share a common architecture and software solution from Cadence that allows for easy migration from one solution to another.

The company can provide developers with core IP or complete subsystems (Fig. 3). This is especially handy for more complex systems, particularly safety-related applications such as automotive platforms that must be certified to standards like ISO 26262 ASIL-D.

3. Cadence provides complete subsystem designs like this one based around the Tensilica Vision Q8 DSP core, which can deliver 800 GFLOPS of FP32 performance.

Cadence’s software support includes Halide, OpenCL, OpenVX graph, and C/C++ compiler support (Fig. 4). The runtime can work with Cadence’s own single-threaded XTOS or multithreaded XOS operating systems or with third-party RTOSes. Software libraries are provided for features like TensorFlow Lite for Microcontrollers support.

4. All Tensilica DSPs are supported by shared Xtensa C/C++ and OpenCL compilers. Halide is a language embedded in C++ for image and array processing.

The DSPs are supported by the Tensilica Xtensa Neural Network Compiler (XNNC). XNNC also can target Cadence’s AI/ML DNA 150 processor. XNNC supports AI/ML models from TensorFlow, Caffe2, Keras, PyTorch, and Chainer.

Floating point is optional in these platforms, since integer or fixed-point support is often sufficient for an application. The Tensilica DSP architecture handles all of the new compact numeric formats used by AI/ML applications, especially when scaling down to reduce compute and power requirements. The high-end platforms include complex floating-point support for FP16, FP32, and FP64 data formats. There are ADDSUB FFT enhancements for FP16 and FP32 as well.
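As an illustration of why integer or fixed-point math often suffices, the following sketch quantizes a small weight and activation vector to 8-bit integers, performs the dot product entirely with integer multiply-accumulates, and converts the result back to a real value. The function names, the power-of-two scales, and the sample data are all assumptions chosen for clarity; this is a generic symmetric quantization scheme, not a description of how Cadence’s tools quantize models.

```python
def quantize(values, scale):
    """Map floats to int8 using a symmetric scale (no zero point), with clamping."""
    return [max(-128, min(127, round(v / scale))) for v in values]


def int8_dot(a_q, b_q):
    """Integer multiply-accumulate, as a fixed-point MAC array would perform it."""
    return sum(x * y for x, y in zip(a_q, b_q))


def dequantize(acc, scale_a, scale_b):
    """Convert the integer accumulator back to a real value."""
    return acc * scale_a * scale_b


# Hypothetical data; power-of-two scales correspond to plain fixed-point formats.
weights = [0.5, -0.25, 0.75, 1.0]
activations = [1.0, 2.0, -1.0, 0.5]
W_SCALE = 1 / 64
A_SCALE = 1 / 32

w_q = quantize(weights, W_SCALE)        # [32, -16, 48, 64]
a_q = quantize(activations, A_SCALE)    # [32, 64, -32, 16]
acc = int8_dot(w_q, a_q)                # -512
result = dequantize(acc, W_SCALE, A_SCALE)
reference = sum(w * a for w, a in zip(weights, activations))

print(result, reference)  # -0.25 -0.25
```

Here the integer path reproduces the floating-point answer exactly because every value is representable at the chosen scales. With real model data, some rounding error is expected, and the application has to tolerate it.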

In addition, Cadence supports the Tensilica Instruction Extension (TIE) language. Designers can create new TIE instructions that are automatically handled by the optimizing compiler. Typically, these new instructions are hidden from high-level language programmers, with the compiler handling utilization and optimization of TIE instructions to deliver higher performance with less overhead. TIE instructions can be added while maintaining ISO 26262 certification.

About the Author

William G. Wong | Senior Content Director - Electronic Design and Microwaves & RF

I am Editor of Electronic Design focusing on embedded, software, and systems. As Senior Content Director, I also manage Microwaves & RF and I work with a great team of editors to provide engineers, programmers, developers and technical managers with interesting and useful articles and videos on a regular basis. Check out our free newsletters to see the latest content.

I earned a Bachelor of Electrical Engineering at the Georgia Institute of Technology and a Master’s in Computer Science from Rutgers University. I still do a bit of programming using everything from C and C++ to Rust and Ada/SPARK. I do a bit of PHP programming for Drupal websites. I have posted a few Drupal modules.

I still get hands-on with software and electronic hardware. Some of this can be found on our Kit Close-Up video series. You can also see me on many of our TechXchange Talk videos. I am interested in a range of projects from robotics to artificial intelligence.