Memory And Processor Advances Redefine Digital Technology
Designers are in luck. Many digital technologies like PCI Express Gen 3 and USB 3.0 are beginning to mature, providing more options to use instead of new specifications to covet. Still, other technologies will emerge and have a significant impact, including DDR4, MRAM dual-inline memory modules (DIMMs), and 64-bit Arm cores.
Memory And Storage
DDR4 will finally see the light of day this year. DDR3 will remain a major factor, especially in embedded applications, while DDR4 takes the high ground.
DDR4 increases throughput, starting at 2133 Mtransfers/s, and reduces power consumption, starting with a lower operating voltage range (1.05 to 1.2 V). It also changes from a multidrop bus to a point-to-point connection with one DIMM per memory channel. Switched systems are supported, although they’re likely to be found only in servers requiring a large number of DIMMs. There will still be action in DDR3, though. Non-volatile DDR3 DIMMs are now available in two forms.
The first is the flash/DRAM pairing, such as Viking Technology’s ArxCis-NV (see “Non-Volatile DIMMs And NVMe Spice Up The Flash Memory Summit”). This DIMM requires an external supercap, but it takes advantage of the capacity of the flash and the DRAM to deliver storage comparable to a standard DDR3 DRAM DIMM, albeit a more costly one. Data is copied to and from the flash during power transitions.
The second non-volatile DDR3 storage to watch is based on Everspin’s spin-torque MRAM (see “Magnetic DRAM Arrives”). The chips used in this DIMM are only 64 Mbits each, but the DIMM requires no extra power source or even refreshing.
Non-volatile storage is changing the way things work. Hybrid disk drives are on the rise as they’re more tightly integrated with operating systems like Windows 8 (see “Hybrids And The Cloud Mark 2012’s Key Storage Advances”). Hard drives continue to grow in capacity, but vendor consolidation has reduced competition.
Flash is where the action will be in the disk-drive and related storage market. Nonvolatile Memory Express (NVMe) is going to explode this year. Components like IDT’s NVMe controller simplify PCI Express connections to flash chips (see “Innovations Push Processors And Storage Into New Digital Roles”).
Flash performance has exceeded the bandwidth provided by the SATA and SAS disk interfaces, which will push demand for fast, scalable PCI Express connections. The lines are blurring as PCI Express moves to the drive form factor, and comparable functionality is available in PCI Express adapters (Fig. 1). SCSI Express (SCSI over PCI Express) will make a showing this year as well, but it will be focused on the enterprise.
Micros Big And Small
What do you do when you have lots of transistors to play with? At the top end of the spectrum, you put lots of cores on one chip. Intel’s Xeon Phi packs in 60 x86 cores with vector support (Fig. 2). It competes with general-purpose GPU (GPGPU) platforms like NVidia’s Tesla K20X and AMD’s high-end FirePro (see “Battle Of The Supercomputing Nodes”). These platforms target large high-performance computing (HPC) clusters, but a single chip may be just what’s needed to make an embedded application practical.
Move down the spectrum a little and you’ll find 64-bit Arm Cortex-A50 chips multiplying (see “Delivering 64-Bit Arm Platforms”). The long-awaited platform will target servers, but the chips’ low power requirements make them candidates even for mobile devices.
Intel’s 22-nm Haswell processors are starting to show up. Initially, they will target tablet and ultrabook platforms. The architecture doubles the graphics performance. The 14-nm Broadwell is on the horizon too.
Embedded developers will be looking toward the Atom “Avoton” release later in the year. Targeting the microserver architecture, it will have competition from AMD’s Opteron 3300 and 4300 chips as well as the 64-bit Arm processors. Avoton may have up to eight cores.
The accelerated processing unit (APU) approach popularized by AMD will be getting company from Arm platforms that pair Arm CPU cores with Arm’s Mali GPU. The latest Mali variants are the Arm CPUs’ equals when it comes to cache use.
The most action will be in the 32-bit space that Arm cores now dominate. Other 32-bit platforms are still very successful, but the core Arm architecture is found in platforms from the tiny Cortex-M0+ up to the Cortex-A50.
Asymmetric dual-core chips will become more common. The typical CPU/DSP pair will remain the primary configuration, with the CPU usually providing connectivity support. Look for improved tools for this mixed configuration, with better communication and debugging facilities.
Look for more heterogeneous multicore chips going into the cloud as well as more conventional application areas like communications. For example, Texas Instruments’ Keystone chip family blends multiple Cortex-A15 CPUs with C66x DSP cores (see “Delivering 64-Bit Arm Platforms”). The challenge is picking the right combination of CPUs and DSPs for an application.
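To make the communication challenge concrete, here is a minimal sketch of the kind of shared-memory mailbox a CPU/DSP pair might use. The addresses, layout, and command codes are hypothetical rather than taken from any particular part, and a production driver would use an interrupt rather than polling.

```c
/* Hypothetical shared-memory mailbox between a host CPU and a DSP.
 * Real parts document their own shared regions and interrupt lines. */
#include <stdint.h>

typedef struct {
    volatile uint32_t command;   /* written by the host CPU          */
    volatile uint32_t argument;  /* parameter for the DSP task       */
    volatile uint32_t status;    /* written by the DSP when finished */
} mailbox_t;

#define SHARED_MAILBOX ((mailbox_t *)0x20080000u)  /* hypothetical SRAM address */

enum { CMD_IDLE = 0, CMD_RUN_FILTER = 1, STATUS_DONE = 1 };

/* Host side: post a job to the DSP and wait for completion. */
static void run_dsp_filter(uint32_t block_index)
{
    mailbox_t *mb = SHARED_MAILBOX;
    mb->status   = 0;
    mb->argument = block_index;
    mb->command  = CMD_RUN_FILTER;   /* the DSP polls or is interrupted here */
    while (mb->status != STATUS_DONE)
        ;                            /* spin until the DSP reports done */
    mb->command  = CMD_IDLE;
}
```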
The 32-bit CPUs continue to push into the 8- and 16-bit microcontroller space. NXP’s LPC800 targets these form factors, prices, and power envelopes (see “32-Bit Micro Aims To Replace 8-Bit Counterparts”). The programming advantages and the upgrade path are clear. Most 8- and 16-bit micro vendors have also developed streamlined development tool suites that make migration from 8- to 32-bit platforms relatively painless by providing standard support for common peripherals.
There is still a lot of life in 8- and 16-bit platforms, though, and those extra transistors can be put to use in more, and often better, ways than just increasing register size. Switch matrix connectivity between pins and I/O interfaces is becoming more common. So is intelligent I/O linkage.
With intelligent I/O linkage, programmers can connect a pin or the output of one device to the input of another, possibly with additional logic control in between. Some systems essentially incorporate a basic programmable logic device (PLD) in the mix. The approach reduces the need for external support chips and can enable slower micros to handle chores that require faster response. These links also can offload the host, allowing asynchronous operation.
Some systems even extend this support to complex control chores such as motor control. The approach also can reduce power requirements by letting the host sleep while the peripherals continue to operate, waking the host only as necessary.
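As a rough sketch of how this looks from the programmer’s side, the fragment below routes a timer event to an ADC trigger through an event matrix and then lets the host sleep between conversions. Every register address, constant, and helper function here is invented for illustration; real parts document their own equivalents.

```c
/* Sketch only: the event-matrix registers and low-power helpers below are
 * hypothetical stand-ins for what a real microcontroller would provide. */
#include <stdbool.h>
#include <stdint.h>

#define EVENT_MATRIX_BASE   0x40010000u
#define EVT_CH0_SOURCE      (*(volatile uint32_t *)(EVENT_MATRIX_BASE + 0x00))
#define EVT_CH0_DEST        (*(volatile uint32_t *)(EVENT_MATRIX_BASE + 0x04))

#define EVT_SRC_TIMER0_OVF   0x11u   /* timer 0 overflow event        */
#define EVT_DST_ADC_TRIGGER  0x23u   /* ADC start-of-conversion input */

extern void enter_sleep_mode(void);  /* hypothetical vendor low-power call */
extern void process_sample(void);    /* hypothetical application work      */

static volatile bool sample_ready;

/* Hypothetical ADC interrupt handler: a finished conversion wakes the host. */
void ADC_IRQHandler(void)
{
    sample_ready = true;
}

int main(void)
{
    /* Timer and ADC setup omitted. One write pair links them: every timer
     * overflow now triggers a conversion with no CPU involvement. */
    EVT_CH0_SOURCE = EVT_SRC_TIMER0_OVF;
    EVT_CH0_DEST   = EVT_DST_ADC_TRIGGER;

    for (;;) {
        enter_sleep_mode();      /* CPU halts; timer and ADC keep running */
        if (sample_ready) {      /* the ADC interrupt woke the host       */
            sample_ready = false;
            process_sample();
        }
    }
}
```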
Power remains the watchword, from tiny micros to massive collections of processor cores. It is one reason power debugging continues to grow in importance and availability. Power estimation, power-use tracing, and power-mode management are turning from nice-to-have features into design requirements.
Programming FPGAs
Processing cores in FPGAs will continue to be the norm in new designs. The big question is whether they will be hard, soft, or both. Most FPGA vendors offer at least one hard-core option. Xilinx’s Zynq-7000 EPP FPGA family with dual Cortex-A9 cores (see “FPGA Packs In Dual Cortex-A9”) has been available on platforms like the open-source ZedBoard for a while (Fig. 3).
The lines blur between programming and FPGAs. FPGA lookup tables (LUTs) are configured or “programmed,” but this is a more static process than software running on a processing core. The advantages of each are clear, and the combination brings more, including the high-speed connection between the cores and the FPGA fabric. This gets even more interesting when they’re combined with OpenCL.
Altera has released a software development kit (SDK) for Altera FPGAs and OpenCL (see “How To Put OpenCL Into An FPGA”). FPGA fabrics typically are programmed using FPGA design tools and intellectual property (IP) written in hardware description languages such as Verilog and VHDL, or in SystemC, which are much different from plain C or C++. OpenCL is a specialization of C, and it is now commonly used for programming everything from GPUs to clusters of CPUs.
The initial implementation of Altera’s SDK targets off-chip hosts using a PCI Express interface, so the result works like GPGPUs where the host provides data and then processes the results. PCI Express is fast but not as fast as the FPGA fabric, and an on-chip core provides faster access. This will be cutting-edge work, but it is a technology to watch since it turns software into hardware without requiring major FPGA expertise.
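For readers who have not seen OpenCL, the kernel below gives a feel for the C dialect involved. It is a generic example, not taken from Altera’s SDK: a GPU driver would compile source like this for its shader cores, while an OpenCL-to-FPGA flow turns the same kernel into pipelined logic in the fabric.

```c
/* Generic OpenCL C kernel: element-wise vector addition. Each work-item
 * handles one array element; an FPGA flow can pipeline this body into
 * dedicated hardware. */
__kernel void vector_add(__global const float *a,
                         __global const float *b,
                         __global float *result)
{
    size_t i = get_global_id(0);   /* index of this work-item */
    result[i] = a[i] + b[i];
}
```

The host, whether a PC across PCI Express or an on-chip core, still uses the standard OpenCL API to allocate buffers and enqueue kernels, so the programming model stays familiar.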
Interconnects And Networking
PCI Express Gen 3 and USB 3.0 have lots in common, particularly stability. Both standards have been around for a long time, at least for the semiconductor industry, and they have growing support. On the PCI Express side, the growth is in host support. There is already significant support on the switch side, though there will be more advances on the enterprise end, where virtualization and shared resource access are increasing.
On the USB 3.0 side, switches finally will be showing up. Most clients will be available as well. USB 3.0 primarily has been used for storage. Connectivity to mobile devices is key when it comes to moving large video files.
MIPI (Mobile Industry Processor Interface), a high-speed interface common on mobile devices, is another area where PCI Express and USB meet. The latest support lets PCI Express and USB ride on MIPI, providing a useful bridge for these protocols. Last year, the standard was finished. This is the year for implementation.
In another area, PCI Express will be picking up a new micro cable connector. The PCI-SIG OCuLink looks to bring 32-Gbit/s bandwidth to external devices. It will compete with USB 3.0 and Thunderbolt in some applications (see “Thunderbolt And Light Peak”).
The migration from SATA and SAS to PCI Express has already been mentioned. PCI Express will absorb the functionality, but the protocols will still be available in different forms. In the case of SATA, there is SATA Express from SATA-IO. For SAS, there is SCSI Express via the SCSI Trade Association.
Sensors Everywhere
Sensors will appear everywhere and in myriad combinations. Accelerometers, pressure sensors, and other types of sensors are getting smaller and less expensive. They’re also being combined into multi-sensor units. Best of all, they’re more likely to have smart digital interfaces than analog interfaces, making integration easier.
Sensor integration and virtualization are also on the rise. Virtual sensors can mimic other sensors using data from a different type of sensor, typically a lower-power but less accurate one. They might be used during idle periods, with the more power-hungry but more accurate sensor awakened when needed.
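A minimal sketch of the idea, with every device and function name invented for illustration: a virtual orientation sensor serves low-power accelerometer data while the system is idle and only powers the more accurate gyroscope when motion picks up.

```c
/* Illustrative virtual-sensor wrapper. All device and function names are
 * hypothetical; real sensor hubs and vendor frameworks supply their own
 * equivalents of these calls. */
#include <stdbool.h>
#include <stdlib.h>

#define MOTION_THRESHOLD 50          /* raw delta that counts as "moving" */

extern int  accel_read(void);        /* low-power, always-on accelerometer  */
extern int  gyro_read(void);         /* accurate but power-hungry gyroscope */
extern void gyro_power(bool on);     /* hypothetical power control          */

static int last_accel;

/* Return an orientation estimate, waking the gyro only when needed. */
int virtual_orientation_read(void)
{
    int accel   = accel_read();
    bool moving = abs(accel - last_accel) > MOTION_THRESHOLD;
    last_accel  = accel;

    if (!moving) {
        gyro_power(false);           /* idle: the cheap estimate is enough  */
        return accel;
    }
    gyro_power(true);                /* motion: pay for the accurate sensor */
    return gyro_read();
}
```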
Another form of virtual sensor combines data from several sensors to provide more generic information, such as whether a user is close to a particular device like a specific display screen. Video cameras fall into this category. Sometimes they provide 3D information, which requires computing and storage resources, but the processing often can fit on a micro. And although it has yet to be proven a long-term alternative, 3D gesture recognition is on the rise.
Displays And Touch
3D gestures do not require anything as sophisticated as Microsoft’s Kinect. Alternatives include capacitive-based solutions, which require computing horsepower but less than what a video solution would use. Microchip’s MGC3130 supports basic gesture recognition using an on-chip 32-bit microcontroller (see “Innovations Push Processors And Storage Into New Digital Roles”).
3D gestures are fancy, but 2D multitouch is where the volume is, including its use in smart phones and tablets. The news this year is the mass migration to thinner touchscreens because of laminated touch sensor technology (see “Multitouch Sensor Brings Flexibility To Design”). These new displays offer better performance, lower cost, lower power requirements, and thinner packaging.
There will be the usual chaos in the underlying displays, including performance and cost improvements along with availability issues. Demand is high, but display vendors are having a challenging time given the competition.
Flexible displays are likely to see more traction this year. E-Ink has been shipping flexible grayscale displays for years. This year, there will be color displays along with rumors of flexible organic LED (OLED) displays.
GPUs primarily will be used to drive displays. At the high end, AMD and NVidia continue to battle with x16 PCI Express cards. Things get much more interesting down the scale, where integrated graphics reside and there is wider variance, even within processor families. For instance, NVidia’s popular Tegra 3, with four Arm Cortex-A9 cores, uses NVidia’s own GPU. Arm CPU cores are being combined with GPUs in this fashion, and Arm’s Mali GPU is pushing into the high-performance arena.
There is a definite push toward multiple-screen support even on smaller platforms. At the higher end, multidisplay support via standard interfaces like DisplayPort is common.
The link between displays is yet another area where change is occurring. Wireless HDMI technology, like IOGear’s WHDI-based plug-and-play wireless HDMI link, is readily available now (see “IOGear Delivers WHDI Wireless HDMI”). What will be interesting this year is how WHDI and the alternatives arrive as embedded solutions instead of add-ons. Multiple standards will make choosing products and technology a challenge since they aren’t interoperable. Also, the consumer wireless spectrum is already busy, and this adds more to the mix.
ASIC Design Trends
Silicon technology is driving from 20 nm down to 16-, 14-, and even 10-nm design nodes, giving designers access to a tremendous number of transistors. Designers are now building systems using blocks of IP from third parties. Standards like IEEE 1801-2009 (UPF) for low-power design and verification have helped make this interaction easier.
Look for continued improvements on the verification side of chip designs. The adoption of the Universal Verification Methodology (UVM) has had a major impact and will continue to do so this year. And, there will be more action in yield optimization tools.