High-end microcontrollers often use large, complex crossbar switches and other
technologies to maximize throughput and performance. Low-end microcontrollers
typically feature a simple bus structure. But as performance increases, so does
the need for more advanced architectures. Low- to mid-range microcontrollers
are now moving into a new realm where balance is key.
Crossbars offer some performance advantages. Unfortunately, they do not
scale well. This has led to interesting switching architectures like the communication
rings of the Element Interconnect Bus (EIB) inside IBM's Cell processor (see "CELL
Processor Gets Ready To Entertain The Masses" at www.electronicdesign.com,
ED Online 9748). Yet crossbars and even switched, on-chip communication
systems are too expensive for low- to mid-range MCUs. Instead, multiple-bus
architectures have found their way into a variety of novel designs. These
tend to be easier to implement while still meeting the performance requirements
to support a chip's memory and peripherals.
The need to meet performance requirements is key, of course. But issues such as power usage, multicore solutions, and load balancing also influence new chip designs.
MULTIPLE-BUS MICROCONTROLLERS
The growth of flash-memory sizes has had an interesting
impact on low- to mid-range microcontrollers. Larger flash memory takes up more
chip real estate. This increase does not translate into a corresponding space
increase for the processor and memory support, making the processor core a much
smaller percentage of the chip. This affects a chip designer's options for these
components.
One way to exploit the space is to move to a higher-performance processor core, such as the 32-bit ARM or Freescale ColdFire architectures, instead of an 8- or 16-bit platform. An equally interesting approach to enhancing a chip design is to add DMA support. A DMA channel is significantly simpler than a processor, so chip designers often can add a number of DMA channels.
Adding complexity to a DMA channel with features such as chained or double buffering is often a low-cost option. Of course, adding DMA or going with a wider processor bus increases bandwidth requirements.
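
To make the double-buffering idea concrete, here is a minimal host-side sketch in C. It is a software model only, not any vendor's DMA API: the names `pingpong_t`, `pingpong_dma_write`, `block_sum`, and the block length `BLOCK_SIZE` are all hypothetical. The "DMA" fills one buffer while the previously completed buffer stays untouched for the CPU to process.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 4u /* hypothetical block length, in samples */

/* Hypothetical software model of a double-buffered (ping-pong) DMA channel. */
typedef struct {
    uint16_t buf[2][BLOCK_SIZE]; /* the two halves of the ping-pong pair */
    unsigned active;             /* index of the buffer the "DMA" is filling */
    size_t   fill;               /* samples already written to that buffer */
} pingpong_t;

void pingpong_init(pingpong_t *pp)
{
    memset(pp, 0, sizeof *pp);
}

/*
 * Model one DMA write of a single sample.
 * Returns a pointer to the block that just completed, or NULL if the
 * active buffer is not full yet. On completion the channel swaps to the
 * other buffer, so the returned block is left alone until the next swap.
 */
const uint16_t *pingpong_dma_write(pingpong_t *pp, uint16_t sample)
{
    assert(pp->fill < BLOCK_SIZE);
    pp->buf[pp->active][pp->fill++] = sample;
    if (pp->fill < BLOCK_SIZE)
        return NULL;

    const uint16_t *done = pp->buf[pp->active];
    pp->active ^= 1u; /* swap buffers */
    pp->fill = 0;
    return done;
}

/* Stand-in for CPU-side processing of a completed block. */
uint32_t block_sum(const uint16_t *block)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < BLOCK_SIZE; i++)
        sum += block[i];
    return sum;
}
```

In this model the CPU must finish with a completed block before the other buffer fills, since the next swap hands the first buffer back to the DMA side. Real hardware typically signals the swap with an interrupt rather than a return value.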
Sharing the bandwidth of a single bus can be useful, especially if the aggregate
bus bandwidth meets the system design requirements. On the other hand, it's
frequently possible to dedicate one or more DMA channels to different buses.
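
The single-bus question above comes down to simple arithmetic: do the combined demands of the bus masters fit within the bus's peak capacity? The C sketch below is a rough first-pass check under stated assumptions. The function names and all figures are hypothetical, and the model ignores arbitration overhead and wait states, so a real design would need margin on top of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sum the bandwidth demands of all bus masters, in bytes per second. */
uint64_t total_demand(const uint32_t *demand_bytes_per_s, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += demand_bytes_per_s[i];
    return total;
}

/*
 * Return true if the aggregate demand fits within the peak capacity of a
 * single bus, taken here as one transfer of bus_width_bytes per clock.
 * Assumption: no arbitration overhead or wait states are modeled.
 */
bool fits_on_single_bus(const uint32_t *demand_bytes_per_s, size_t n,
                        uint32_t bus_hz, uint32_t bus_width_bytes)
{
    assert(bus_width_bytes > 0);
    uint64_t capacity = (uint64_t)bus_hz * bus_width_bytes;
    return total_demand(demand_bytes_per_s, n) <= capacity;
}
```

When the check fails, the alternative described in the text applies: move one or more DMA channels onto a separate bus so that each bus carries only part of the load.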
In many cases, the multiple buses aren't identical in nature. Rather, they form
a hierarchical interconnect, like Atmel's AVR32 (Fig. 1).
The AVR32 uses a type of crossbar switch for its top-level interconnect. The 32-bit processor and the DMA unit can access this high-speed interconnect, while the peripheral DMA can service the two advanced high-performance buses (AHBs). The processor can access the lower-speed peripherals off these two buses, but it's more efficient if the DMA can handle those chores.
Likewise, transfers on the faster AHB matrix can occur while slower transactions take place on the AHBs. Still, transfers for all buses occur between the on-chip memory or off-chip memory and the peripherals.
Less complex architectures don't necessarily mean lower performance. Instead,
chip designers look to match the capabilities of the architecture with the system
requirements. This is the case with Microchip's dsPIC line (Fig. 2).
The architecture features four buses and one dual-port RAM. Even so,
the main memory and code flash-memory buses resemble those of any Harvard-architecture
microcontroller.
The flash memory is dedicated to supplying instructions to the decoder, while the two peripheral buses are dedicated to the processor and the multichannel DMA controller. The buses let the processor and the DMA controller access peripherals simultaneously, as long as they don't target the same peripheral at the same time. This is commonly the case when DMA is used with a peripheral.
The other difference from most conventional microcontroller architectures is
the dual-port DMA memory, which is primarily intended for use by the DMA controller.
The CPU typically moves a block of data into or out of this memory after
a block transfer is completed.
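
The hand-off described above can be sketched as a small C model. This is not Microchip's register interface: `dma_model_t`, `dma_model_receive`, `cpu_fetch_block`, and the size `DMA_RAM_WORDS` are hypothetical names chosen for illustration. An array stands in for the dual-port DMA memory, and a flag stands in for the block-complete notification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_RAM_WORDS 8u /* hypothetical size of the dual-port DMA memory */

/* Hypothetical model: shared RAM plus a block-complete flag. */
typedef struct {
    volatile uint16_t dma_ram[DMA_RAM_WORDS]; /* stands in for the dual-port RAM */
    volatile bool     block_done;             /* set when a block transfer ends */
} dma_model_t;

/* Stand-in for the DMA controller moving n words from a peripheral. */
void dma_model_receive(dma_model_t *d, const uint16_t *src, size_t n)
{
    assert(n <= DMA_RAM_WORDS);
    for (size_t i = 0; i < n; i++)
        d->dma_ram[i] = src[i];
    d->block_done = true;
}

/*
 * CPU side: copy the block out of the DMA memory only after the transfer
 * has completed. Returns the number of words copied, or 0 if no completed
 * block is waiting.
 */
size_t cpu_fetch_block(dma_model_t *d, uint16_t *dst, size_t n)
{
    assert(n <= DMA_RAM_WORDS);
    if (!d->block_done)
        return 0;
    for (size_t i = 0; i < n; i++)
        dst[i] = d->dma_ram[i];
    d->block_done = false; /* memory is free for the next transfer */
    return n;
}
```

On real hardware the completion flag would normally be an interrupt or status bit, and the dual-port memory lets the copy proceed without stalling the DMA controller's own accesses.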