What you’ll learn:
- Why stack allocation is a good idea for embedded programming.
- Myths about stack allocation.
I started my programming career using BASIC and assembler on computers that had very limited support for a stack. Along the way, I bumped into all sorts of programming languages from APL to Pascal to Lisp to Haskell to Ada. Each has unique features that support a programming style which is often different from other programming languages. Learning to use different programming languages can be very useful in bringing techniques that are common in one to another where they may be applicable but less common.
Likewise, common features like object-oriented programming span many languages. Some features like loops, arrays, and memory allocation are found in almost all programming languages.
The focus of this article is on memory allocation and, very specifically, embedded programming and the use of the stack, which is often overlooked. The keyword here is “embedded,” where deterministic behavior is important. Some of the hardest bugs to find and fix are related to memory access and allocation; memory leaks in applications that use the heap fall into this category. The Rust programming language was specifically designed to address memory-related problems such as these.
But back to embedded programming.
Embedded applications are not all alike, but many tend to be very fixed and deterministic. High reliability and safety are often key to an application’s design. This usually means that the design of the application is such that programmers know what the requirements and limitations are within very specific specifications. This is where stack allocation shines.
Stack Allocation Myths
If someone wants to write an “11 Myths” article for me, that would be great, but I will only hit on a couple here so that we can move on to implementation details. By the way, I will be looking at this from an Ada/SPARK and then a C/C++ perspective, so hang in there. I know most of you will be C/C++ fans.
Myth: Stacks are small and should be used sparingly.
Many recommend that structures and arrays be allocated in the heap because stacks are small and overrunning them leads to stack-overflow errors. Actually, all memory comes from the same place, and in single-threaded applications, the stack often sits at the top of a block of memory with the heap at the bottom; hopefully the two never grow into each other.
Two things change this model. First is multitasking, where each task has its own stack and often a shared heap. Second, virtual memory or memory protection systems enable large memory spaces to have physical memory scattered about with the ability to move memory around on demand.
In any case, stack size is something that’s normally configurable. Therefore, if the amount of memory needed is known, as is often the case for embedded applications, then heap allocation, with its potential for memory fragmentation, can be avoided.
Myth: Stack allocation is only for fixed-size elements.
This is true for most programming languages. However, some languages, such as Ada, allow for variable-sized objects such as arrays to be allocated upon entry to a function or block.
For those languages that only have fixed-size elements, it may still be desirable to use the stack instead of the heap, depending on the application’s operation and specifications. Remember that all physical memory is shared, so it’s a matter of usage. If an application will need up to 1 MB of memory for a set of buffers, then it doesn’t matter if it comes from the heap or stack. The trick is not to run out of memory. Allocating one buffer or a thousand will still require the same amount of memory for the system to make sure an out-of-memory condition doesn’t arise.
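As a sketch of this idea in C (the packet routine and the worst-case size here are hypothetical, not from the article), a local array sized for the largest case can stand in for a heap allocation:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical worst-case size taken from the system specification. */
#define MAX_PACKET 1024

/* Sum the payload bytes of a packet. The working buffer lives on the
   stack and is sized for the worst case, so no heap call is needed
   regardless of the actual packet length. */
unsigned process_packet(const unsigned char *data, size_t len)
{
    unsigned char buf[MAX_PACKET];  /* fixed-size stack allocation */
    unsigned sum = 0;

    if (len > MAX_PACKET)
        return 0;                   /* reject rather than overflow */

    memcpy(buf, data, len);
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}
```

Whether this routine runs once or a thousand times, the worst-case memory cost is the same, which is the point made above: the system has to be sized for the maximum either way.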
Myth: Stack size is an unknown.
This can be true when using recursive functions. But those aside, the maximum amount of stack space required can be computed, even including interrupt support. The amount of memory needed is essentially the maximum sum of the amounts of memory needed by each procedure while walking the call graph of an application.
Determining stack size is actually a feature built into C/C++ compilers. The gcc compiler’s -fstack-usage flag will generate a table that has the amount of memory used by each procedure. Call-graph analysis tools like stack_usage are available that can use this information to determine the maximum amount of memory needed.
IAR’s Embedded Workbench has this capability built into the compiler. It’s part of the static analysis provided by IAR, which is very useful for the embedded developers the suite targets.
GNATstack is a tool from AdaCore that works with Ada, C, and C++. It can handle recursive functions if you provide a limit on the depth of recursion, and it can detect cycles of indirect recursion.
Heap Allocation Within a Procedure
One reason heap allocation is often used rather than stack allocation is to address changing sizes at runtime (Fig. 1). Normally the structure, such as an array, is allocated upon entry, used, and then released upon exit.
#include <stdlib.h>

void example(int size)
{
    int *ptr = (int *) malloc(size * sizeof(int));
    /* ... use the buffer ... */
    free(ptr);
}
1. Typically, a C/C++ procedure will allocate data upon entry and free it before exiting.
C++ uses new/delete, and its constructor/destructor support is far superior to C. Still, many embedded developers prefer C to C++.
The main reason to use this type of approach versus stack allocation as a local variable is the variable-size issue, since standard C can only allocate fixed-size data on the stack (C99 variable-length arrays are an exception, but they’re optional in C11 and often banned by embedded coding standards). As noted earlier, variable allocation can be addressed by using a fixed-size entity large enough for the largest size the application needs. Of course, this assumes the system has been designed with sufficient memory to address these needs, but that should be the case anyway. A problem can arise in a multitasking environment where multiple tasks need to share the memory, and there must be a way to make sure storage is sufficient to handle the application under any circumstances. In this case, it may not be possible to give each task a stack large enough to handle its needs.
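For toolchains that do support them, a C99 variable-length array provides the behavior of Figure 1 without touching the heap. This sketch (the function is a hypothetical illustration) reserves the array on the stack at block entry and releases it automatically on return:

```c
/* C99 variable-length array: 'size' elements are reserved on the
   stack when the block is entered and released automatically when
   it exits, so no malloc/free pair is needed. VLAs are optional in
   C11 and often prohibited by embedded coding standards, so check
   your toolchain and guidelines before relying on them. */
int sum_squares(int size)
{
    int vla[size];              /* variable-size stack allocation */
    int total = 0;

    for (int i = 0; i < size; i++)
        vla[i] = i * i;
    for (int i = 0; i < size; i++)
        total += vla[i];
    return total;
}
```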
Dynamic Stacking
Some programming languages like Ada allow variable structure sizes to be allocated on the stack (Fig. 2). In Ada’s case, it’s possible to do this allocation within a block in addition to a procedure or function entry.
procedure Main is
   procedure Dynamic1 (Size1, Size2 : Integer) is
      Var1 : array (1 .. Size1) of Integer;
   begin
      for I1 in 1 .. Size1 loop
         declare
            Var2 : array (1 .. Size2) of Integer;
         begin
            for I2 of Var2 loop
               null;
            end loop;
         end;
      end loop;
   end Dynamic1;
begin
   Dynamic1 (100, 10);
   Dynamic1 (200, 20);
end Main;
2. This Ada procedure has two dynamic array allocations. One is part of a procedure while the other is a local block.
The example also highlights Ada’s nested function and block definitions. These features aren’t unique to Ada, but they’re uncommon in other programming languages and aren’t available in C/C++.
In this particular example, the Dynamic1 procedure allocates two variable-length arrays, Var1 and Var2. The first is allocated upon procedure entry, while the other is within a block inside a for loop. Also note that the loop variables I1 and I2 are implicitly declared by their for loops, and that the for/of loop gets its number of iterations from the array itself, Var2.
Static Stacking and Simplifying Memory Management
Embedded applications often utilize global memory for static storage or allocate a predefined set of buffers from the heap at startup. These are never freed. Stack allocation can be used in the same fashion, providing some advantages such as the potential of eliminating the heap, its overhead, and potential problems like memory fragmentation.
In this case, procedures or functions would be written to allocate and initialize the desired data and pass references or pointers to procedures or functions that would use the data. The stack would grow and shrink as needed, so a function that sends a block of data and then receives a response would be provided with buffers allocated in the stack.
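A sketch of this pattern in C (all names and sizes here are hypothetical): the caller owns the buffers as locals, and the worker functions only fill storage they're handed, so nothing is ever allocated from the heap:

```c
#include <stddef.h>
#include <string.h>

#define BUF_LEN 256  /* hypothetical size from the application spec */

/* Workers receive already-allocated storage; they never allocate. */
static size_t build_request(unsigned char *out, size_t cap)
{
    const char msg[] = "PING";
    size_t n = sizeof msg - 1;
    if (n > cap)
        return 0;
    memcpy(out, msg, n);
    return n;
}

/* Stand-in for a send/receive pair: simply echo the request back. */
static size_t fake_transact(const unsigned char *req, size_t req_len,
                            unsigned char *resp, size_t cap)
{
    if (req_len > cap)
        return 0;
    memcpy(resp, req, req_len);
    return req_len;
}

/* The caller holds both buffers as locals. The stack grows on entry
   and shrinks on return: no heap, no fragmentation, no free() call. */
size_t exchange(void)
{
    unsigned char req[BUF_LEN];
    unsigned char resp[BUF_LEN];
    size_t n = build_request(req, sizeof req);
    return fake_transact(req, n, resp, sizeof resp);
}
```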
Stack allocation has the advantage over heap allocation when it comes to speed and simplicity. Heap allocation requires extracting a block when one is requested and returning it when the application is done with it. Overhead is often minimal, but it’s not usually deterministic when compared to stack allocation.
Another advantage to this approach for multitasking applications is that the memory requirement will be known for a particular task. It can be easily replicated, since it’s just a matter of starting up a new task with its own fixed-size stack.
Stacks that Grow and Overflows
Virtual-memory and memory-protection systems make stack allocation more interesting: overflows can be detected and addressed when they occur, unlike memory leaks, which often result in errors well past the occurrence of the root cause. Likewise, for embedded applications, an overflow usually indicates incorrect program design, since, in theory, a task should have a fixed upper limit on its stack usage.
Individual tasks rarely use all available memory, and most virtual-memory systems have an address space that’s significantly larger than physical memory. This means that unmapped guard regions can be left between blocks of memory; accessing an intervening region results in a fault because it’s considered an error. For stack support, this also can be used to initially provide a smaller block and then add more physical memory to the stack if an overflow is detected. Newer Arm Cortex-M microcontrollers (those based on Armv8-M) have stack-limit checking built into the hardware.
Challenges Using Libraries
Incorporating outside code such as libraries using a stack-first approach can be a challenge depending on the tools and libraries. That’s because, in many instances, the operation of the libraries with respect to stack usage is unknown. This is a general problem because even if large blocks aren’t allocated in the stack, but rather globally or in the heap, the use of library functions still requires stack space.
On the embedded side, libraries are often available in source form so that they can be analyzed. It’s also possible to do runtime estimates and then provide headroom so that applications don’t run the risk of stack overflow.
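One common runtime-estimation technique is “stack painting”: fill the stack with a known pattern at startup, then count untouched words later to find the high-water mark. This simplified C sketch simulates a task stack as a plain array, with the stack assumed to grow downward from the high index (names, sizes, and the fill pattern are hypothetical):

```c
#include <stdint.h>
#include <stddef.h>

#define STACK_WORDS 256
#define PAINT 0xDEADBEEFu   /* arbitrary fill pattern */

/* Simulated task stack; on a real RTOS, this would be the block
   handed to the task at creation time. */
static uint32_t stack_area[STACK_WORDS];

/* Fill the stack with a known pattern before the task starts. */
void paint_stack(void)
{
    for (size_t i = 0; i < STACK_WORDS; i++)
        stack_area[i] = PAINT;
}

/* Count still-painted words from the low end (the stack grows
   downward from the high index) to estimate how much headroom
   was never touched, i.e., the high-water mark. */
size_t stack_headroom_words(void)
{
    size_t free_words = 0;
    for (size_t i = 0; i < STACK_WORDS && stack_area[i] == PAINT; i++)
        free_words++;
    return free_words;
}
```

Measuring this periodically during stress testing, then adding margin, gives the headroom figure mentioned above.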
Wrapping Up
Many of you may already lean toward stack allocation, but for others, this design pattern or frame of mind may be less familiar. The approach doesn’t have to be used to the exclusion of heap allocation. However, like many techniques, it’s most useful when you examine the application from this perspective.
In many instances, it’s a matter of how you approach a problem. Heap-allocation procedures often return a block allocated from the heap. With a stack approach, you need to pass the already allocated stack entity to a function to do some work.
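The contrast between the two calling styles can be sketched in C (the greeting functions are hypothetical illustrations, not from the article):

```c
#include <stdlib.h>
#include <string.h>

/* Heap style: the callee allocates and the caller must remember
   to free, which is the classic source of leaks. */
char *make_greeting_heap(const char *name)
{
    size_t n = strlen(name) + sizeof "hello, ";
    char *s = malloc(n);
    if (s) {
        strcpy(s, "hello, ");
        strcat(s, name);
    }
    return s;            /* caller owns this and must free() it */
}

/* Stack style: the caller supplies storage (typically a local
   array) and the callee just fills it, so there's nothing to free. */
int make_greeting_stack(char *out, size_t cap, const char *name)
{
    size_t n = strlen(name) + sizeof "hello, ";
    if (n > cap)
        return -1;       /* caller's buffer is too small */
    strcpy(out, "hello, ");
    strcat(out, name);
    return 0;
}
```

A caller of the stack-style version simply declares a local buffer and passes it in, e.g., `char buf[32]; make_greeting_stack(buf, sizeof buf, "world");`, and the storage disappears when the caller returns.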
There are significant advantages to just using stack-based storage, including minimal overhead, automatic memory management, and knowing how much memory is really required by an application. Scaling via tasks with fixed-size stacks also can be a benefit.
Having a well-structured application can be helpful outside of embedded applications where a fixed amount of storage isn’t as much of a limitation. Nonetheless, the simplification of the application can have benefits in reducing some types of errors.