Electronic Design

  
[Leapfrog: Industry First]
A New Player In The 32-Bit Processor Field
The AVR32 architecture blends 32-bit power with the elegance of its 8-bit brethren.

William Wong  |   ED Online ID #11939  |   February 2, 2006


Atmel took a turn away from the pack when it designed its 8-bit AVR. Now, the company is bucking the trend toward the 32-bit ARM architecture with its AVR32 processor architecture. Needless to say, ARM and its partners don't have the 32-bit market sewn up by any means.

In fact, a range of popular 32-bit microprocessors is available, including three families from Freescale alone. The competition should be healthy. Still, the AVR32 melds system design components like DSP and single-instruction/multiple-data (SIMD) instructions, Java bytecode support, and a compact instruction set.

The AVR32 runs in standard and Java bytecode modes. In standard mode, it can execute 16- or 32-bit instructions without switching modes. Most instructions are 16 bits, which reduces code size and effectively improves cache performance because more instructions fit into the cache. This is one reason why the ARM Thumb and Thumb-2 instruction sets have become so popular. However, the AVR32 doesn't incur mode-switch overhead when the need arises for 32-bit instructions.
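The code-density argument comes down to simple arithmetic, sketched below in Python. The function names and the 75% share of 16-bit instructions are assumptions chosen for illustration; they are not Atmel figures.

```python
def code_size_bytes(n_instructions, compact_fraction):
    """Bytes needed when compact_fraction of the instructions are 16-bit
    and the remainder are 32-bit. The fraction is a hypothetical input."""
    compact = round(n_instructions * compact_fraction)
    extended = n_instructions - compact
    return 2 * compact + 4 * extended


def instructions_per_cache_line(line_bytes, compact_fraction):
    """Average number of instructions that fit in one cache line."""
    average_bytes = 2 * compact_fraction + 4 * (1 - compact_fraction)
    return line_bytes / average_bytes


# Hypothetical program: 1000 instructions, 75% of them 16-bit.
mixed = code_size_bytes(1000, 0.75)
fixed = code_size_bytes(1000, 0.0)   # every instruction 32-bit
print(mixed, fixed)
print(instructions_per_cache_line(32, 0.75))
```

Under those assumed numbers, the mixed encoding needs 2500 bytes instead of 4000, and a 32-byte cache line holds 12.8 instructions on average rather than 8.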

The AVR32 only requires a seven-stage pipeline (Fig. 1). A short pipeline reduces overhead due to stalls. It also allows for more aggressive analysis because timing constraints aren't as critical as they are with some other system architectures. As a result, features like dynamic branch prediction can essentially implement zero-cycle loop instructions, which are key to improving DSP performance.

Thanks to conditional return instructions, there's more inline execution versus the test/branch combination used with other architectures. Individually, the architectural fine-tuning may seem trivial. But combined, the features add up to greater performance. As a result, the AVR32 can do at a lower clock rate what other architectures need a higher clock rate to accomplish. Also, lower clock requirements reduce power consumption. This is vital for Atmel's targeted product areas, such as portable multimedia devices.

A REGULAR ARCHITECTURE
Atmel designers kept the system architecture simple. It uses a 16-register register file with a minimal number of mirrored registers for hardware context switching (Fig. 2). The AVR32 also features four levels of interrupt priority. It supports up to 64 interrupt groups and up to 32 interrupt lines per group, and each group has its own priority. This provides a very flexible interrupt control structure.
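The grouped interrupt scheme can be modeled in a few lines of Python. This is only a sketch built from the numbers above (64 groups, 32 lines per group, four priority levels). The class and method names are hypothetical, and the tie-breaking rule (lowest group number, then lowest line number) is an assumption, not documented AVR32 behavior.

```python
NUM_GROUPS = 64        # interrupt groups
LINES_PER_GROUP = 32   # interrupt lines per group
NUM_LEVELS = 4         # priority levels 0 (lowest) to 3 (highest)


class InterruptControllerModel:
    """Toy model: each group carries one priority level and a pending bitmask."""

    def __init__(self):
        self.level = [0] * NUM_GROUPS
        self.pending = [0] * NUM_GROUPS

    def set_level(self, group, level):
        if not 0 <= group < NUM_GROUPS or not 0 <= level < NUM_LEVELS:
            raise ValueError("group or level out of range")
        self.level[group] = level

    def raise_line(self, group, line):
        if not 0 <= group < NUM_GROUPS or not 0 <= line < LINES_PER_GROUP:
            raise ValueError("group or line out of range")
        self.pending[group] |= 1 << line

    def next_request(self):
        """Return (level, group, line) for the request to service, or None."""
        best = None
        for group in range(NUM_GROUPS):
            if self.pending[group] == 0:
                continue
            # Strict comparison: on equal levels the lower group number wins
            # (an assumed tie-break).
            if best is None or self.level[group] > self.level[best]:
                best = group
        if best is None:
            return None
        mask = self.pending[best]
        line = (mask & -mask).bit_length() - 1   # index of lowest set bit
        return self.level[best], best, line


ic = InterruptControllerModel()
ic.set_level(2, 1)
ic.set_level(5, 3)
ic.raise_line(2, 0)
ic.raise_line(5, 7)
print(ic.next_request())
```

In this run, group 5 is serviced first because its level is higher, even though group 2 has the lower group number.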

Interrupt 3, the highest-priority interrupt, mirrors a half-dozen additional registers. This allows many interrupt service routines to run without saving any additional system state. It also enables interrupts to be processed with minimal overhead.

The AVR32 includes a number of common instructions that typically take multiple instructions on other architectures. For example, certain instructions move selected blocks of registers. This is similar to instructions found in the new Texas Instruments MSP430X architecture (see "16-Bit Architecture Grows To 1 Mbyte" at www.elecdesign.com, ED Online 11528). Register-to-register block moves occur in a single cycle.

The AVR32 is a big-endian architecture. But it implements a host of pack and extract operations with a 32-bit barrel shifter that simplify little-endian support. These instructions also come in handy for structure manipulation. The processor can manipulate 64-bit values as well.
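To show what pack and extract operations accomplish, here is a minimal Python sketch of the shift-and-mask work they replace. The function names are illustrative only and are not AVR32 mnemonics.

```python
MASK32 = 0xFFFFFFFF


def swap_bytes32(word):
    """Reverse the byte order of a 32-bit word (big-endian <-> little-endian)."""
    word &= MASK32
    return (((word & 0x000000FF) << 24) |
            ((word & 0x0000FF00) << 8) |
            ((word >> 8) & 0x0000FF00) |
            (word >> 24))


def extract_field(word, offset, width):
    """Extract an unsigned bit field of `width` bits starting at bit `offset`."""
    return (word >> offset) & ((1 << width) - 1)


def pack_halfwords(high, low):
    """Pack two 16-bit values into one 32-bit word."""
    return ((high & 0xFFFF) << 16) | (low & 0xFFFF)


print(hex(swap_bytes32(0x12345678)))          # 0x78563412
print(hex(extract_field(0x12345678, 8, 8)))   # 0x56
print(hex(pack_halfwords(0x1234, 0x5678)))    # 0x12345678
```

Without dedicated instructions, each of these takes several shift, mask, and OR operations, which is why they matter for little-endian data and for packed structures.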

The balance of the system architecture is fairly conventional. Data and code caches provide better performance. The paged and segmented memory management unit (MMU) can handle any operating system.

However, Atmel designers still have a few tricks up their sleeves. For example, a four-entry circular buffer can hold return addresses pushed into memory. It allows the values to be used immediately from the buffer instead of being read from memory, delivering better performance. This is transparent to the application and compiler, though applications that use nonstandard returns must explicitly flush the buffer.
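A rough Python model of the idea follows. It assumes the buffer overwrites its oldest entry once more than four calls are nested and that a miss simply falls back to the value in memory. The class names, and details such as how validity is tracked, are hypothetical and are not taken from Atmel documentation.

```python
class ReturnAddressBuffer:
    """Four-entry circular buffer caching the most recent return addresses."""

    def __init__(self, size=4):
        self.size = size
        self.slots = [None] * size
        self.top = 0      # index of the next free slot
        self.valid = 0    # number of usable entries

    def push(self, address):
        self.slots[self.top] = address            # may overwrite the oldest entry
        self.top = (self.top + 1) % self.size
        self.valid = min(self.valid + 1, self.size)

    def pop(self):
        """Return the cached address, or None when the buffer is empty."""
        if self.valid == 0:
            return None
        self.top = (self.top - 1) % self.size
        self.valid -= 1
        return self.slots[self.top]

    def flush(self):
        """Invalidate every entry, as a nonstandard return would require."""
        self.valid = 0


class CallModel:
    """Return addresses always go to memory; the buffer only speeds up returns."""

    def __init__(self):
        self.memory_stack = []
        self.buffer = ReturnAddressBuffer()

    def call(self, return_address):
        self.memory_stack.append(return_address)
        self.buffer.push(return_address)

    def ret(self):
        address = self.memory_stack.pop()   # authoritative copy in memory
        cached = self.buffer.pop()
        return address, cached is not None  # (address, buffer hit?)


model = CallModel()
for address in (0x100, 0x200, 0x300, 0x400, 0x500):
    model.call(address)
print([model.ret() for _ in range(5)])
```

With five nested calls, the four most recent returns hit the buffer and only the outermost one has to be read back from memory.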

The DSP and SIMD sections use a straightforward design with a few interesting tweaks that increase performance, reduce overhead, and get the job done using less power. For instance, there's delayed writeback of the 48-bit temporary accumulator used in a multiply-accumulate mode.

That means each iteration of a loop only needs to load one value instead of the two typically used in other architectures. This can be employed to implement fast finite-impulse-response (FIR) filtering algorithms. The processor supports fractional multiplications with saturation, rounding, and scaling.
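The arithmetic behind such a filter can be sketched in Python as follows. This models only a fractional multiply-accumulate with rounding and saturation. It assumes Q15 operands and a 48-bit signed accumulator, does not model the delayed-writeback mechanism itself, and uses hypothetical names throughout.

```python
Q15_MIN, Q15_MAX = -(1 << 15), (1 << 15) - 1
ACC_MIN, ACC_MAX = -(1 << 47), (1 << 47) - 1   # 48-bit signed accumulator


def saturate(value, low, high):
    """Clamp value into the range [low, high]."""
    return max(low, min(high, value))


def fir_q15(samples, coefficients):
    """Compute one FIR output sample from Q15 samples and Q15 coefficients.

    Each product is in Q30 format. The running sum is clamped to the 48-bit
    accumulator range, then rounded, scaled back to Q15, and saturated.
    """
    accumulator = 0
    for sample, coefficient in zip(samples, coefficients):
        accumulator = saturate(accumulator + sample * coefficient,
                               ACC_MIN, ACC_MAX)
    rounded = (accumulator + (1 << 14)) >> 15   # round to nearest, rescale
    return saturate(rounded, Q15_MIN, Q15_MAX)


half = 1 << 14                                   # 0.5 in Q15
print(fir_q15([half, half], [half, half]))       # 0.5*0.5 + 0.5*0.5 = 0.5
print(fir_q15([Q15_MIN, Q15_MIN], [Q15_MIN, Q15_MIN]))   # saturates
```

The first call yields 16384 (0.5 in Q15). The second would be 2.0, which does not fit in Q15, so the result saturates at 32767.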

Likewise, SIMD support addresses common multimedia algorithms such as MPEG-4 motion compensation. MPEG-4 encoding software also uses instructions to handle operations like the sum of absolute differences. These types of operations are found in competing 32-bit multimedia architectures, but you won't see them in conventional 32-bit architectures.
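For reference, the sum of absolute differences is simple to state in Python. The block-matching helper illustrates how an encoder's motion search might use it. The function names and the pixel values are made up for the example.

```python
def sum_of_absolute_differences(block_a, block_b):
    """SAD of two equally sized sequences of pixel values."""
    if len(block_a) != len(block_b):
        raise ValueError("blocks must be the same size")
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def best_match(target, candidates):
    """Return (index, sad) of the candidate block closest to the target."""
    scores = [sum_of_absolute_differences(target, c) for c in candidates]
    index = min(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]


target = [10, 20, 30, 40]
candidates = [
    [0, 0, 0, 0],
    [10, 21, 29, 40],
    [40, 30, 20, 10],
]
print(best_match(target, candidates))   # (1, 2)
```

A SIMD implementation performs several of these subtract, absolute-value, and add steps per instruction, which is where the speedup for motion search comes from.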








