Forth Still Suits Embedded Applications

Although Sometimes Viewed Merely As An Interesting Relic, Forth Still Holds A Role In Today's Embedded Systems.

Nov. 22, 1999


No single solution exists for embedded programming. Projects differ too widely in scale. Real-time signal generators may need every instruction cycle counted, but the programs take only a few hundred lines of code. For such applications, assembly language is the only way to go. Other jobs require extensive user interfacing and hundreds of thousands of lines of code. There, the most economical solution is often a ready-built single-card PC, programmed in C and running an operating system.

For mid-sized applications that need only a few thousand lines of code, however, you may find that you're using an expensive microprocessor and a lot of PROM. And it's not because the task really needs it, but because the operating system and the language demand it. In such situations, choosing Forth lets engineers produce compact and reliable code quickly, run it on an inexpensive processor, and reuse the results in later applications.

Forth isn't a new language. It's been commercially available for over 25 years and has its own ANSI standard (X3.215-1994). But it's not widely used. There are probably fewer than a hundred full-time Forth programmers in the country. Instead, C is the language of choice.

Programmers learn and use C because everyone else uses it. Managers specify C for their projects because C programmers are easy to find. Yet if you talk to hardware engineers rather than programmers, you'll find many who are familiar with Forth and eager to use it. That's a point in Forth's favor because, in embedded-systems development, an engineer who understands the hardware is often a more effective programmer than a computer-science graduate who would rather be writing compilers.

Forth is a powerful language, but its syntax is simpler than Basic's and much simpler than C's. Despite its simplicity, you can extend the language to meet any requirement. You're never limited to what some compiler writer thought you might want to do. As with assembly language, you have to keep in mind what's happening at the hardware level. But beyond that point, Forth is an extremely easy language to learn and use.

If Forth were just another computer language, it wouldn't stand out from the rest. But it's conceptually different. For a start, the Forth kernel acts both as an interpreter and a compiler. So Forth programs run about as fast as if they were compiled, yet can be written and tested interactively. Any routine in a Forth program can be tested from the keyboard without a symbolic debugger. This tightly coupled code/test loop makes programmers extremely productive.

Forth achieves this seeming miracle by eliminating parsing. Every word in the source represents a single operation. (Parentheses delimit comments, not pending operations. Pending operations exist but are rare in Forth.) Every word knows where to go for its input parameters, how to process them, and where to leave the result. If you type a word at the keyboard, it will execute. You can test a word with any input parameters you please to check that it always works.

To write programs in Forth, you create the definitions of new words, extending the language until it can handle the task at hand. Each definition is made up from other words. Some of these words come from the library of predefined words, such as arithmetic functions, which form Forth itself. Others will be words that you yourself defined earlier in the program (see "A Sample Of Forth"). Forth makes no distinction between library and user-defined words. The highest-level word is the application program itself. Thus, Forth programs are designed from the top down but written and tested from the bottom up. Think of it as an executable program design language.
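As a minimal sketch of this bottom-up style, assuming an ANS-style Forth, the two words below are built one on the other. SQUARED and SUM-OF-SQUARES are hypothetical names chosen for illustration; they are not part of the application described in this article.

```forth
\ Hypothetical words for illustration only.
: SQUARED ( n -- n*n )
    DUP * ;                 \ Duplicate the top item and multiply

: SUM-OF-SQUARES ( a b -- a*a+b*b )
    SQUARED SWAP SQUARED    \ Square b, then bring a up and square it
    + ;                     \ Add the two squares
```

Typing 3 SQUARED . at the keyboard should print 9, and 3 4 SUM-OF-SQUARES . should print 25, so each word can be checked before the next one is built on it.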

You can do anything in Forth that the hardware itself can do. In effect, you write your own application language. A program to drive robotic arms would have words to turn an arm, while a benchtop instrument would have them to write characters to a display. Each new Forth word is inherently a reusable program module. On your next project, you can reuse the same words, thus steadily increasing your productivity.

Reusing words is easy even if you need to move to a new processor architecture. Only 80 or so of Forth's predefined words call routines written in the machine code of the host processor. The rest are written in Forth. So the language and all of your word definitions are easy to port to a new microprocessor. Only those 80 machine-code definitions—less than 2000 bytes—need to be changed.

That productivity enhancement applies to the entire project, as well. With adequate partitioning of the program's functions, coding can be divided among as many programmers as necessary. Once every low-level word has been written and tested, the higher-level code is almost sure to work. Integration and maintenance of code are also easy. Word names in Forth can be long and should be descriptive of their actions. Good Forth is almost self-commenting. Because definitions tend to be short, it's much easier to follow the program flow in Forth than in other languages.

Forth programs have a high degree of structure. All of the usual IF, ELSE, WHILE, and LOOP operators also are Forth words, but their usage is much clearer than in most languages. For example, comparisons are distinct from the actions to be taken. A comparison word creates a flag; a decision word then uses that flag. There's nary a GOTO in sight, unless you care to define one.
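The split between comparison and decision can be sketched as follows, assuming an ANS-style Forth. LIMIT and COUNTDOWN are hypothetical words, and the ceiling of 100 is an arbitrary value picked for the example.

```forth
\ Hypothetical word: clamp a reading to a ceiling of 100.
: LIMIT ( n -- n' )
    DUP 100 >           \ The comparison leaves a flag on the stack
    IF                  \ The decision word consumes that flag
        DROP 100        \ Over range: substitute the ceiling
    THEN ;

\ Hypothetical word: print n, n-1, ... down to 1.
: COUNTDOWN ( n -- )
    BEGIN
        DUP 0>          \ Flag: is the count still positive?
    WHILE               \ Continue only while the flag is true
        DUP . 1-        \ Print the count, then decrement it
    REPEAT
    DROP ;              \ Discard the exhausted count
```

Here 150 LIMIT leaves 100 on the stack, 42 LIMIT leaves 42, and 3 COUNTDOWN prints 3 2 1.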

Of course, a Forth word needs to know the location of those flags and other parameters. Like most compiled languages, then, Forth passes them on a stack. Each successive word processes the parameters left on the stack by the previous one. A parameter can be anything you need it to be: a number, a Boolean value, a variable address, or a pointer to a data structure. The next word in the code knows how to interpret it correctly.

All arithmetical and logical operations are done between such numbers on the parameter stack. Since Forth works by calling successive words, it needs a return stack to keep track of where it came from. The return stack also stores loop limits and indices.

Unlike other languages, Forth doesn't hide the stack from you. The programmer has complete access to the parameter stack and should be aware of what's on it. Since Forth keeps intermediate results on the stack, it uses fewer named variables than most languages. When programming, keep track of the stack rather than worry about whether a named variable has been overwritten. Luckily, one rarely needs to think about more than three or four stack items at any one time.
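To illustrate how intermediate results stay on the stack, here is a small sketch, assuming an ANS-style Forth. DISCRIMINANT is a hypothetical word; the comment on each line shows the stack contents after that line runs, with the top of the stack on the right.

```forth
\ Hypothetical word: b*b - 4*a*c with no named variables.
: DISCRIMINANT ( a b c -- d )
    ROT             \ b c a
    * 4 *           \ b 4ac
    SWAP DUP *      \ 4ac b*b
    SWAP - ;        \ b*b-4ac
```

For example, 1 5 6 DISCRIMINANT leaves 1 on the stack. At no point are more than three items in play.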

This stack access is one of the language's more powerful (and dangerous) features. Because you can manipulate the return stack, for example, you could drop a return address to abort execution of the current word. You also can push a temporarily inconvenient parameter stack value onto the return stack for deferred handling. (Forgetting to pop it again, however, can lead to spectacular crashes.)
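A minimal sketch of parking a value on the return stack, assuming an ANS-style Forth, might look like this. ADD-TO-SECOND is a hypothetical word. Note that every >R is matched by an R> before the definition ends.

```forth
\ Hypothetical word: add n to the item beneath the top of the stack.
: ADD-TO-SECOND ( a b n -- a+n b )
    SWAP >R         \ a n        Park b on the return stack
    +               \ a+n        Work on what remains
    R> ;            \ a+n b      Recover b before returning
```

With this definition, 10 20 5 ADD-TO-SECOND leaves 15 20 on the stack.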

Along with providing the programming power of stack manipulation, Forth harnesses the computational power of today's 32-bit processors. Early Forths standardized on 16-bit width for both the parameter and return stacks. This suited the addressing range of the first microprocessors, and allowed both integers and addresses to be processed by the same routines. (Forth allows arithmetical operations on addresses. Rather than defining an "array" data type, which might only be used once, it's common practice to add an offset to a variable address to fetch an array member or a string character.)
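The address-plus-offset practice can be sketched as follows, assuming an ANS-style Forth in which the standard word CELLS converts a count of stack elements into address units. READINGS and READING are hypothetical names, and the sketch performs no bounds checking.

```forth
\ Hypothetical table and accessor for illustration only.
CREATE READINGS  10 CELLS ALLOT     \ Room for 10 cell-sized entries

: READING ( index -- addr )         \ Address of entry number index
    CELLS READINGS + ;              \ Scale the index and add the base address

42 3 READING !                      \ Store 42 in entry 3
3 READING @ .                       \ Fetch it back and print it
```

The last line should print 42. Because READING returns an address, the same word serves for both fetching and storing.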

The earliest Forths even had some double precision (32-bit) operations. One number occupied two stack levels. Now, except on the smallest microprocessors, Forth uses 32-bit single-precision numbers. Double precision is available.

Numbers can be defined to be either integer or floating-point values. But beyond that, Forth has little or no data typing. It assumes that programmers don't need babysitting and would rather have flexibility than over-fussy error messages. If you have a good reason to add a Boolean value to a number, the compiler won't try to outguess you. Thus, you can mix and match values as you wish, so long as you're careful about it. Of course, you must comment your code thoroughly or you will exasperate future maintenance programmers.

This flexibility makes it easy to define new data types. I regularly use a Forth that offers 32-bit floating-point numbers. When writing a filter-analysis program, I defined pairs of them to be complex numbers. I wrote my own library of complex arithmetic routines. It took about 20 minutes—less time than it would have taken to find and understand a precoded library.
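The following sketch shows the general idea of treating a pair of stack items as one complex number. It is not the library described above: to stay within the core word set of an ANS-style Forth it uses integers rather than floating-point numbers, and the names Z+ and Z* are assumptions made for this example.

```forth
\ A complex number is two stack items: real part below, imaginary part on top.
\ Integer-only illustration; Z+ and Z* are hypothetical names.

: Z+ ( re1 im1 re2 im2 -- re im )
    ROT +               \ re1 re2 im1+im2
    >R + R> ;           \ re1+re2 im1+im2

: Z* ( a b c d -- re im )   \ (a+bi)(c+di)
    2OVER 2OVER         \ a b c d a b c d
    ROT * >R            \ a b c d a c       Return stack holds b*d
    * R> -              \ a b c d re        re = a*c - b*d
    >R                  \ a b c d           Return stack holds re
    SWAP ROT *          \ a d b*c
    >R * R> +           \ im                im = a*d + b*c
    R> SWAP ;           \ re im
```

As a check, 1 2 3 4 Z+ leaves 4 6, and 1 2 3 4 Z* leaves -5 10, which matches (1+2i)(3+4i) = -5+10i.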

It can take a little effort to become accustomed to Forth because it's conceptually different from other languages. For example, a variable name in Forth is a word that puts an address on the parameter stack. If you want to fetch the value of the variable or store the number on the stack as the variable's value, you follow the variable name with "fetch" or "store." These are abbreviated to @ and ! respectively, which look a little cryptic at first sight. If you leave out the @, you end up processing the variable's address rather than its contents. Since this too is a legitimate operation in Forth, no error message is generated.

Constants are another data type that Forth handles a bit differently. A constant is a word that puts a predetermined number on the stack. But if you don't want to define them as constants, Forth will happily compile numbers as literals.
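Variables, constants, and literals can be seen side by side in this short sketch, assuming an ANS-style Forth. SETPOINT, GAIN, and BUMP are hypothetical names, and the values are arbitrary.

```forth
\ Hypothetical names for illustration only.
VARIABLE SETPOINT               \ SETPOINT leaves its address on the stack
50 CONSTANT GAIN                \ GAIN leaves the number 50 on the stack

250 SETPOINT !                  \ Store 250 as the value of SETPOINT
SETPOINT @ GAIN * .             \ Fetch the value, scale it, and print it

: BUMP ( -- )                   \ Add the literal 10 to SETPOINT
    SETPOINT @ 10 + SETPOINT ! ;
```

The fourth line should print 12500. After one call to BUMP, SETPOINT @ leaves 260 on the stack.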

While producing code, a Forth compiler acts much like an assembler (see the figure). All it does is look up each word of the source in a dictionary. (Many Forths use hashing to speed the search.) If the compiler doesn't find the word, it tries to interpret it as a number. If that fails, it prints an error message. If the compiler does find the word, it does one of two things, depending on whether it's currently in the middle of compiling a definition or has finished one definition and is waiting for the start of the next.

When in the middle of compiling a definition, the compiler takes the address (or token) of the word from the dictionary and adds it to the object code. If the compiler has finished a definition, however, it executes the code for the word. This is what happens when you type a word at the keyboard. To execute the code for a word, many versions of Forth use a short machine-code routine to read each address in the word code and jump to it.
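The idea of stepping through a list of addresses and executing each one can be imitated in high-level Forth. This is only a toy sketch under the assumption of an ANS-style Forth; it is not the machine-code routine itself, and THREAD and RUN are hypothetical names. The standard word ' (tick) looks up a word's execution token, and EXECUTE runs it.

```forth
\ Toy model of threaded execution; not a real inner interpreter.
CREATE THREAD  ' DUP ,  ' * ,  ' . ,    \ A list of three execution tokens

: RUN ( addr count -- )         \ Execute count tokens starting at addr
    CELLS OVER + SWAP           \ Convert to end and start addresses
    ?DO
        I @ EXECUTE             \ Read the next token and run it
    1 CELLS +LOOP ;             \ Step to the following token

7 THREAD 3 RUN                  \ Runs DUP, then *, then .
```

The last line should print 49, just as typing 7 DUP * . would.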

A number of compiler control words always execute. One of these is "colon" (:), which tells the compiler to start adding a new definition to the dictionary. The next word in the source then becomes the name of the definition and is followed by the words that make up that definition. Each of these words gets compiled in turn. The definition terminates when the word "semicolon" (;) is encountered in the source.
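A short sketch, assuming an ANS-style Forth with 32-bit stack elements, shows compiler control words at work inside a colon definition. The standard words [ and ] switch the compiler out of and back into compiling, and LITERAL, which also executes during compilation, compiles the number it finds on the stack. SECONDS-PER-DAY is a hypothetical name.

```forth
\ Hypothetical word: the arithmetic runs once, at compile time.
: SECONDS-PER-DAY ( -- n )
    [ 60 60 * 24 * ]    \ Executed immediately, leaving 86400 on the stack
    LITERAL ;           \ Compile that number into the definition

SECONDS-PER-DAY .       \ Print the compiled-in value
```

The last line should print 86400. On a 16-bit Forth this value would not fit in a single stack element.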

Once a word has been defined, it can be used to define further words. In general, words must be defined before they can be used. But definitions can be DEFERed, and recursion is allowed. You also can FORGET, edit, and recompile part of a program without changing the earlier parts of the code.
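Deferred definitions and recursion might be sketched as follows. The sketch assumes a Forth that provides RECURSE from the ANS standard and DEFER and IS as later standardized in Forth 2012; older systems may spell the deferral words differently. REPORT, SHOW-DOUBLE, PRINT-IT, and FACTORIAL are hypothetical names.

```forth
\ Hypothetical words for illustration only.
DEFER REPORT ( n -- )           \ Name exists now; behavior is supplied later

: SHOW-DOUBLE ( n -- )
    2* REPORT ;                 \ Uses REPORT before it has any behavior

: PRINT-IT ( n -- ) . ;
' PRINT-IT IS REPORT            \ Now give REPORT its behavior

: FACTORIAL ( n -- n! )
    DUP 1 > IF
        DUP 1- RECURSE *        \ n * (n-1)!
    ELSE
        DROP 1                  \ 0! and 1! are both 1
    THEN ;

21 SHOW-DOUBLE                  \ Calls PRINT-IT through REPORT
```

The last line should print 42, and 5 FACTORIAL leaves 120 on the stack.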

The "fetch next code address and call it" manner in which Forth executes its programs takes slightly longer than calling a subroutine or executing in-line code. A Forth program will theoretically run more slowly than a compiled program in another language. (Some Forth compilers generate executable machine code and don't have this overhead.) In practice, the efficiency of Forth's arithmetic and the absence of OS calls and overhead, such as garbage collection, make Forth's execution fast as well as predictable. It also handles interrupts very smoothly. And Forth needn't save the context, since it's already on the stacks.

Forth's method of execution may result in slower operation than other compiled code, but the code itself is more compact. A Forth word compiles either to a token or to an address and so occupies only a byte or two in memory. An embedded Forth usually incorporates its own operating system. Even so, Forth and Forth programs are very compact. A compiler can be written in 1200 lines, which compiles to about 6 kbytes. A run-time kernel need take no more than 2 kbytes. For many years, I used a commercial Forth which ran under DOS on a PC. It contained a built-in source editor but occupied only 15 kbytes of memory. At one or two bytes per word and four or five words per line, a 5000-line program compiles to roughly 20 to 50 kbytes, so a microprocessor with a 64-kbyte address range can run it.

A Run-Only Option

During compilation and program debug, Forth needs to keep a dictionary of word definitions available. But in the final product, the dictionary names are just so much wasted space. They could, however, contain proprietary information. Fortunately, most Forth compilers allow you the option of creating a run-only version of the code. This version contains neither the dictionary structure nor the definitions of any standard words that haven't been used. This not only shortens the code, but also makes reverse engineering much trickier.

If you're going to write your embedded application in Forth, you have two approaches from which to choose. You can write and compile the program on a PC using the PC's disk and editing facilities. Then, just download the program to PROM, RAM, or EEPROM in the target system. You also may be able to emulate program execution on the PC.

The second alternative is to use the PC simply as a dumb terminal and to compile your program as you create it on the target system itself. This alternative requires that the target have some sort of serial port and enough RAM space to contain the Forth compiler. Yet it makes debugging vastly easier, because the code being tested is running at normal speed and interacting with the target hardware.

This second method also makes Forth a useful tool for debugging prototype hardware. It takes only moments to write a word to exercise a particular port bit, for example, to see where a signal gets lost on the board. I've also found Forth to be the perfect language for writing production test and debug programs.
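A word to exercise a port bit could be sketched along these lines, assuming an ANS-style Forth. Everything here is hypothetical: FAKE-PORT is an ordinary variable standing in for a memory-mapped port so that the sketch can run on any system, and bit 3 is an arbitrary choice. On real hardware you would substitute the port's address and, for a byte-wide port, use C@ and C! in place of @ and !.

```forth
\ Hypothetical port exerciser; FAKE-PORT stands in for real hardware.
VARIABLE FAKE-PORT   0 FAKE-PORT !
1 3 LSHIFT CONSTANT BIT3            \ Mask for bit 3

: BIT-ON ( mask addr -- )           \ Set the masked bits at addr
    DUP >R @ OR R> ! ;

: BIT-OFF ( mask addr -- )          \ Clear the masked bits at addr
    DUP >R @ SWAP INVERT AND R> ! ;

: PULSES ( n -- )                   \ Toggle bit 3 high then low, n times
    0 ?DO
        BIT3 FAKE-PORT BIT-ON
        BIT3 FAKE-PORT BIT-OFF
    LOOP ;
```

Typing 1000 PULSES would then produce a burst of pulses to follow across the board with an oscilloscope, and BIT3 FAKE-PORT BIT-ON on its own holds the line high for static probing.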

There should be no trouble getting it to run in your prototype system. Forth has been implemented on just about every microprocessor that's ever existed. Yet it runs most efficiently on processors that support two stacks and have a word length that matches the width of the parameter stack. A microcontroller with enough on-chip RAM to implement the stacks will run Forth particularly quickly.

Time and again, this platform's flexibility in implementation has been demonstrated. In 1985, Charles Moore, who invented Forth in the late 1960s, used a gate array to build a single-chip Forth engine (see "Fast Processor Chip Takes Its Instructions Directly From Forth," Electronic Design, March 21, 1985, p. 127). This design was further developed by Harris Semiconductor as its RTX 2000 series of chips which, unfortunately, are no longer available in production quantities.

Forth also has proven itself in many embedded applications. The Magellan probe's radar pictures of Venus passed through a data-block handler that I designed. It was controlled by a Forth program running on a Z-80. I wrote its operating system and, in about two weeks, taught a hardware engineer enough Forth to write its 2000-line program. We used a PC running a Forth program for the system tests. Whenever the customer dreamed up new tests, it was a matter of minutes to incorporate them into the program.

On projects in which the final program was written in another language, I've used Forth as an executable program design language (PDL). Once the program was debugged in Forth on a PC, it was relatively easy to recode it in assembly language, for example. I'm currently implementing a tokenized Forth on a PIC microcontroller.

At the other extreme, I do my general programming in Forth in a windowed environment. It has pull-down menus and slider input of parameters. Forth runs in one window, while the source editor runs in another. A mouse click moves you between the two.

Despite the advantages that Forth offers, however, you may find it difficult to employ on your next project because of corporate inertia. The last thing a harried project manager wants is to have today's project take longer and cost more as designers learn a new language. Writing and debugging an embedded program in less time is a good thing. And manufacturing with a cheaper microprocessor and a smaller PROM nearly always pays dividends. But changing how things are done costs time and money. Just balance the cost against the benefits of writing in a language which is fast and interactive, requires fewer programmer-hours, and leads to shorter, more reliable code.

The C juggernaut has a lot of corporate momentum behind it, so switching to Forth may be difficult to justify in the short term. If you plan to stay in the embedded-systems business, however, Forth's lower development costs and cheaper hardware can make it a profitable long-term option.

A Sample Of Forth

This Forth fragment from a real application accepts an input frequency between 0000 and 9999. It interpolates between entries in a 100-point calibration table to generate the nonlinear drive voltage for a voltage-controlled oscillator. Here, Forth code is written in upper case, with comments in lower case. Comments are demarcated by parentheses or by a backslash and the end of the line of code. Definitions begin with "colon" (:), end with a semicolon (;), and contain a "stack picture" to guide future programmers. The "picture" is the comment in parentheses following the definition's name word. It shows the parameters on the stack that the word will use (bottom to top), followed by those that it will leave on the stack when it finishes execution.

10000 CONSTANT MAX-INPUT
       \ Define a constant named MAX-INPUT.

100 CONSTANT WIDTH
       \ The separation between table entries.

CREATE TABLE1 MAX-INPUT WIDTH / 1+ CELL * ALLOT
       \ Generates TABLE1 having room for 100 stack elements plus one
       \ extra end point, which the top interval needs as its upper entry.
       \ CELL is the stack element size in bytes, typically 4.
       \ This table will be filled when the system is calibrated.

: INTERPOLATE ( point1 point2 fraction width --> value )
       \ fraction/width = interpolation factor between point1 and point2

    >R >R           \ Move width and fraction to the return stack
    OVER -          \ Size of interval = point2 - point1
    R> R>           \ Recover fraction and width
    */              \ Do size*fraction/width with double precision
    +               \ Add result to point1
;                   \ End of definition

: LOOK.UP ( user input --> equivalent value from table )

    DUP 0 MAX-INPUT WITHIN? NOT     \ Test user's input
    IF ERROR1 DROP 5000             \ Flag error and insert dummy
    THEN                            \ Some Forths use "ENDIF"
    WIDTH /MOD                      \ Get fraction and table index
    DUP CELL * TABLE1 + @           \ Look up lower table entry (index scaled to bytes)
    SWAP 1+ CELL * TABLE1 + @       \ Look up upper table entry
    ROT WIDTH                       \ Move fraction to top of stack and get width
    INTERPOLATE                     \ Call the interpolation routine
;                                   \ End of definition

For More Information On Forth

The starting place for learning more about Forth is the nonprofit Forth Interest Group (FIG). FIG publishes a journal, "Forth Dimensions," and sells books and public-domain versions of Forth. For reading I recommend Leo Brodie's classic but dated book, "Starting Forth." If you can't find it elsewhere, you can buy it from FIG. His "Thinking Forth" is also worth reading.

If you want to obtain a Forth compiler, versions for embedded systems range in price from about $30 for a public-domain version to around $2000 for a comprehensive cross-compiling version that runs on a PC. Forth Inc. is the biggest commercial Forth source for embedded and other applications. A number of smaller Forth vendors also advertise in "Forth Dimensions." Given some familiarity with Forth, it's not difficult to take a public-domain Forth and rewrite it to run on the micro of your choice. I've written five or six Forth compilers, all loosely based on the 1978 vintage FIG-Forth for the 8080.

Forth Inc.
111 N. Sepulveda Blvd.
Suite 300
Manhattan Beach, CA 90266
www.forth.com

Forth Interest Group
100 Dolores St.
Suite 183
Carmel, CA 93923
www.forth.org