Macros to Add Structured Control-flow to any Assembler (part 1) (version 2)

For those who just want to use it, and don’t care how it works, here’s the Quick Reference.

In memory of Wil Baden (1928-2016). Among Wil's many gifts to the world is the elegant control-flow implementation scheme in ANS Forth (1994), on which this work is based.

[The first version of this article required the following disclaimer:

“Well almost any. Unfortunately the following method cannot be used with the GNU assembler because it does not allow the assembly location to be moved backwards. For the GNU assembler, an attempt to .org backwards is an error.”

However, Bill Westfield was inspired by this to devise an org-free way of doing it in the GNU assembler, using computed local labels. I have now adopted some of Bill’s ideas to create an org-free variant of my method, below, that uses computed labels (ordinary, not local) and will therefore work with other assemblers in addition to the GNU assembler. It turns out to be simpler than my first method, and it generates far more readable listings. Thanks Bill.]

I assume that most programmers appreciate the advantages of structured control-flow over "spaghetti code" (using labels and jumps or "go-to"s). The basic problem is that a label is a "come-from" statement with no indication of where control can come from. With structured control-flow we have sets of matching reserved-words that indicate the direction of the jump (forward or backward) and we use indentation to indicate which words are connected.

e.g. Instead of                           we write

        tst     R8                        tst     R8
        jnz     label                     _IF     Z
        mov     R9,R8                         mov     R9,R8
label:                                    _ENDIF

which generates exactly the same machine code. This is not to be confused with conditional assembly.

And instead of                            we write

label:                                    _REPEAT
        sub     R9,R8                         sub     R9,R8
        jnc     label                     _UNTIL  C

You don’t need to know anything about Forth in order to use this method, and the words I have chosen for the macro names owe more to C, Pascal and BASIC. But I feel the need to say a few words to honour the source of the method. Users of Forth systems have had structured control-flow in their assembly languages since 1968, the same year Dijkstra's famous letter was published (Go To Statement Considered Harmful). I was pleased to learn that the x86 community at least, now has support for structured control-flow built into the assemblers MASM and NASM. It seems incredible to me that assemblers still exist with no support for structured control-flow. I program microcontrollers using the IAR Workbench assembler where this is the case.

The method used to implement structured control-flow in Forth-based assemblers (and in the Forth language itself) was nicely refined by the time of the first ANS/ISO standard Forth in 1994. If you’re curious, you can read about it at http://lars.nocrew.org/dpans/dpansa3.htm#A.3.2.3.2. But you don't need to, since what I've done below is translate the method into assembler macro definitions.

Some of the words used for control-flow in Forth are confusing when you try to relate them to programming languages in common use today. That's because those languages were only a glint in their creators' eyes when Chuck Moore included structured control-flow in Forth. So I have changed those words to more familiar ones. The main thing is the method, which can be adapted to add structured control-flow to any assembler, without requiring access to the assembler's source code. It only requires assembler variables and macros.

We can’t make it read exactly like any higher-level language because (a) the elegant stack-based implementation limits us to a context-free grammar, and (b) in assembly language, a conditional branch must come textually after the code that performs the test, i.e. IF and UNTIL are inherently postfix operations in assembly language. But I’ve done my best to make it read in a familiar manner.

The code below implements structured control-flow for a specific assembler (IAR) and a specific target processor (MSP430), but with minor modifications the method can be applied to any combination of assembler and target. Here’s the full source code for the IAR/MSP430 version. Please let me know if you port it, so I can include a link here.

In general terms, the method consists of macros that push some information (usually a label number) onto a stack when assembling the start of a structure, and other macros that pop it off and make use of it when assembling the end of the corresponding structure. The use of a stack allows structures to be nested. I once saw a forum post where someone claimed it was impossible to use such a method with most assemblers, because they do not provide an assembly-time stack. I thought so too, until one day I realised I could make such a stack, using assembler variables and macros, by “brute force” as it were. Garth Wilson came up with the same idea independently and has implemented it for the C32 assembler for 65C02 and MPASM for PIC16. All assemblers have a directive like SET, to assign variable values to labels, as opposed to constant values. So the stack can be implemented like this:

_CS_PUSH    MACRO   arg
_CS4        SET     _CS3
_CS3        SET     _CS2
_CS2        SET     _CS_TOP
_CS_TOP     SET     arg
            ENDM
 
_CS_DROP    MACRO
_CS_TOP     SET     _CS2
_CS2        SET     _CS3
_CS3        SET     _CS4
_CS4        SET     0
            ENDM

We use underscores at the start of all the macro and variable names so they don’t clash with assembler directives and are unlikely to clash with anything in the application code. "CS" stands for Control-flow Stack. We implement DROP instead of POP since we can just access the top item directly as _CS_TOP. I've only shown a four level stack, but you get the idea. I find that 12 levels are sufficient, but this will depend mostly on the maximum number of cases you want to have in a _CASE statement, which we will meet in part 2.
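As a minimal sketch of the stack in action, the trace below follows two pushes and two drops on the four-level stack. The four initialisation lines are an assumption, not something shown above: the macros read _CS_TOP, _CS2 and _CS3 before writing them, so it seems safest to define all four variables once before the first push. The values 100 and 101 are arbitrary illustrative label numbers.

```asm
; Assumed initialisation (not shown above): define the stack variables
; once, before the first _CS_PUSH reads them.
_CS_TOP     SET     0
_CS2        SET     0
_CS3        SET     0
_CS4        SET     0

            _CS_PUSH 100    ; _CS_TOP=100  _CS2=0    _CS3=0  _CS4=0
            _CS_PUSH 101    ; _CS_TOP=101  _CS2=100  _CS3=0  _CS4=0
            _CS_DROP        ; _CS_TOP=100  _CS2=0    _CS3=0  _CS4=0
            _CS_DROP        ; _CS_TOP=0    _CS2=0    _CS3=0  _CS4=0
```

Note that a fifth push would overwrite the old _CS4 value without any warning, so the stack needs to be at least as deep as the deepest nesting in the program.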

It turns out we need one more stack operation, SWAP, to implement some words that can occur in the middle of a structure, such as ELSE and WHILE. It swaps the top two elements. This one uses an XOR-swap because hey, they're tricky, and how often do you get the chance. :-)

_CS_SWAP    MACRO
_CS_TOP     SET     _CS_TOP ^ _CS2
_CS2        SET     _CS_TOP ^ _CS2
_CS_TOP     SET     _CS_TOP ^ _CS2
            ENDM
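To check that the three XORs really do exchange the two values, here is a hand trace, assuming the top two stack entries happen to be 101 and 100 (arbitrary example values).

```asm
_CS_TOP     SET     101
_CS2        SET     100
            _CS_SWAP
; Inside _CS_SWAP:
;   _CS_TOP SET _CS_TOP ^ _CS2    ; 101 ^ 100 = 1
;   _CS2    SET _CS_TOP ^ _CS2    ;   1 ^ 100 = 101  (the old _CS_TOP)
;   _CS_TOP SET _CS_TOP ^ _CS2    ;   1 ^ 101 = 100  (the old _CS2)
; Result: _CS_TOP = 100, _CS2 = 101
```

The deeper entries (_CS3 and below) are not touched. A plainer version using a scratch variable would work equally well, at the cost of one more variable name.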

 

Then we need a macro that will take an integer assembler variable and assemble a unique label based on its value, e.g. _L followed by the decimal representation of the value. This is easy in the GNU assembler, thanks to the % operator which becomes available following an .altmacro directive. This evaluates a numeric expression and turns it into a string. But in the IAR assembler I had to be more creative. I came up with the following recursive macro. At each level of the recursion, the number is divided by 10 and the string grows by one digit, until the number hits zero.

_LABEL  MACRO num, str ; "\2" below is equivalent to "str" (2nd argument) but can be concatenated
        IF num == 0
_L\2
        ELSE
            IF num % 10 == 0
               _LABEL num / 10, 0\2
            ENDIF
            IF num % 10 == 1
               _LABEL num / 10, 1\2
            ENDIF
            IF num % 10 == 2
               _LABEL num / 10, 2\2
            ENDIF
            IF num % 10 == 3
               _LABEL num / 10, 3\2
            ENDIF
            IF num % 10 == 4
               _LABEL num / 10, 4\2
            ENDIF
            IF num % 10 == 5
               _LABEL num / 10, 5\2
            ENDIF
            IF num % 10 == 6
               _LABEL num / 10, 6\2
            ENDIF
            IF num % 10 == 7
               _LABEL num / 10, 7\2
            ENDIF
            IF num % 10 == 8
               _LABEL num / 10, 8\2
            ENDIF
            IF num % 10 == 9
               _LABEL num / 10, 9\2
            ENDIF
        ENDIF
        ENDM
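The recursion is easier to follow with a concrete number. The trace below is a sketch that assumes _LABEL_NUM currently holds 123 (an arbitrary value); the comments show the value of num and the contents of str at each level.

```asm
        _LABEL  _LABEL_NUM      ; suppose _LABEL_NUM is 123; str starts empty
; Recursion, one digit per level:
;   num = 123, str = (empty)    123 % 10 == 3, recurse with str = 3
;   num = 12,  str = 3           12 % 10 == 2, recurse with str = 23
;   num = 1,   str = 23           1 % 10 == 1, recurse with str = 123
;   num = 0,   str = 123        recursion ends and the label is assembled:
;_L123
```

Each new digit is placed in front of the existing string, so the least significant digit is produced first but ends up last, giving the usual decimal order.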

 

And we need a macro that will take a condition code and a label number, and assemble the corresponding jump instruction. We use the same recursive conversion from label number to label string.

 

_JUMP   MACRO cond, num, str   ; "\3" below is equivalent to "str" (3rd argument) but can be concatenated
        IF num == 0
            J\1 _L\3
        ELSE
            IF num % 10 == 0
               _JUMP cond, num / 10, 0\3
            ENDIF
            IF num % 10 == 1
               _JUMP cond, num / 10, 1\3
            ENDIF
            IF num % 10 == 2
               _JUMP cond, num / 10, 2\3
            ENDIF
            IF num % 10 == 3
               _JUMP cond, num / 10, 3\3
            ENDIF
            IF num % 10 == 4
               _JUMP cond, num / 10, 4\3
            ENDIF
            IF num % 10 == 5
               _JUMP cond, num / 10, 5\3
            ENDIF
            IF num % 10 == 6
               _JUMP cond, num / 10, 6\3
            ENDIF
            IF num % 10 == 7
               _JUMP cond, num / 10, 7\3
            ENDIF
            IF num % 10 == 8
               _JUMP cond, num / 10, 8\3
            ENDIF
            IF num % 10 == 9
               _JUMP cond, num / 10, 9\3
            ENDIF
        ENDIF
        ENDM
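For instance, assuming a label number of 105 (an arbitrary value), the two invocations below should expand as shown in the comments. The first yields a real MSP430 instruction. The second yields the name JnotZ, which is not an instruction, so it only assembles if something with that name has been defined.

```asm
        _JUMP   NC, 105         ; assembles:  JNC   _L105
        _JUMP   notZ, 105       ; assembles:  JnotZ _L105
```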

 

Now we need to define some macros for jump instructions with the opposite condition from that used in the _IF or _UNTIL. Notice how, in the examples above, the _IF Z needs to assemble a jnz instruction and the _UNTIL C assembles a jnc.

 

; Translate the jump instructions generated by _JUMP above, when "not" is placed before the condition code.
; Used by _IF and _UNTIL.

JnotZ   MACRO   label
        JNZ     label
        ENDM

JnotNZ  MACRO   label
        JZ      label
        ENDM

JnotEQ  MACRO   label
        JNE     label
        ENDM

JnotNE  MACRO   label
        JEQ     label
        ENDM

JnotHS  MACRO   label
        JLO     label
        ENDM

JnotC   MACRO   label
        JNC     label
        ENDM

JnotNC  MACRO   label
        JC      label
        ENDM

JnotLO  MACRO   label
        JHS     label
        ENDM

JnotN   MACRO   label          ; MSP430 specific.
        JN      $+4            ; The best substitute for the non-existent JNN instruction
        JMP     label          ; Thanks to Anders Lindgren
        ENDM

JnotNN  MACRO   label
        JN      label
        ENDM

JnotL   MACRO   label
        JGE     label
        ENDM

JnotGE  MACRO   label
        JL      label
        ENDM

JnotNEVER MACRO label          ; An unconditional jump
        JMP     label
        ENDM
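The N condition is the odd one out, so it is worth tracing. The sketch below assumes a label number of 100 (an arbitrary value) and shows the chain of expansions for a jump that must be taken when N is clear.

```asm
; A jump on "not N" to label number 100 goes through:
;       _JUMP   notN, 100
;       JnotN   _L100
; and finally assembles:
        JN      $+4             ; N set: skip over the JMP (2 bytes each) and fall into the body
        JMP     _L100           ; N clear: jump to the label
```

This costs one more instruction word than the other conditions, which all assemble to a single jump.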

 

 

Now we initialise the variable used to generate unique labels beginning with _L.

 

_LABEL_NUM     SET     100

 

 

And we define a couple of macros that will improve the readability of the other macros below, by allowing us to move variable names away from the first column.

 

_INC    MACRO var

var SET var + 1

        ENDM

 

_SET    MACRO var, expr

var SET expr

        ENDM

 

 

Now we can start defining the actual control-flow macros that we will use in our source code. First we define the macros that let us write conditional execution like this:

 
        <test>
        _IF cc
            ...
        _ENDIF
or this:
        <test>
        _IF cc
            ...
        _ELSE
            ...
        _ENDIF
 

Note that I’m using “...” here to stand for any number of lines of assembly language. “<test>” also stands for any number of lines, but with the specific purpose of affecting some processor condition flag (status bit). And in the case of the MSP430 processor, “cc” stands for one of Z, NZ, EQ, NE, C, NC, HS, LO, N, NN, L, GE or NEVER.

 

 
; Mark the origin of a forward jump.
; Called by _ELSE and _WHILE.
 

_IF     MACRO  cond                   ; "\1" below is equivalent to "cond" (1st argument) but can be concatenated

        _JUMP   not\1, _LABEL_NUM      ; Assemble a conditional jump with the opposite condition
        _CS_PUSH       _LABEL_NUM      ; Push its label number
        _INC           _LABEL_NUM      ; Increment the label number
        ENDM
 
 
; Resolve a forward jump due to the most recent _IF, _ELSE or _WHILE.
; Called by _ELSE and _ENDW.
 
_ENDIF  MACRO
        _LABEL  _CS_TOP                ; Assemble the label for the previous _IF.
        _CS_DROP                       ; Drop its label number off the control-flow stack
        ENDM
 
 
; Mark the origin of a forward unconditional jump and
; resolve a forward jump due to an _IF.
 
_ELSE   MACRO
        _IF     NEVER                  ; Assemble an unconditional jump and push its label number
        _CS_SWAP                       ; Get the prior _IF’s label number back on top
        _ENDIF                         ; Assemble the label for the prior _IF, and drop its number
        ENDM
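As a worked illustration, here is what these macros would be expected to assemble for a small _IF ... _ELSE ... _ENDIF, assuming _LABEL_NUM still has its initial value of 100 and the control-flow stack is empty. The registers and instructions are made up for the example, which leaves the larger (unsigned) of R8 and R9 in R10. The comment beside each macro shows the code it generates.

```asm
        cmp     R9,R8           ; sets the flags from R8 - R9
        _IF     HS              ; JLO _L100       (R8 < R9: skip to the else part)
            mov     R8,R10
        _ELSE                   ; JMP _L101
                                ; _L100
            mov     R9,R10
        _ENDIF                  ; _L101
```

On the control-flow stack, _IF pushes 100; _ELSE pushes 101, swaps so that 100 is back on top, then assembles _L100 and drops it; _ENDIF assembles _L101 and drops it, leaving the stack empty again.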
 
 
Now we define the macros that let us write conditional loops like this:
 
        _REPEAT
            ...
            <test>                     ; post-tested loop
        _UNTIL cc
and this:
        _DO
            ...
            <test>                     ; pre or mid tested loop
        _WHILE cc
            ...
        _ENDW
 
 
; Mark a backward destination (i.e. the start of a loop).
 
_REPEAT MACRO
        _LABEL   _LABEL_NUM            ; Assemble a label
        _CS_PUSH _LABEL_NUM            ; Push the number of the label to jump back to
        _INC     _LABEL_NUM            ; Increment the label number
        ENDM
 
_DO     MACRO
        _REPEAT
        ENDM
 
 
; Resolve the most recent _REPEAT or _DO with a backward conditional jump.
; The end of a post-tested loop.
; Called by _ENDW.
 
_UNTIL  MACRO  cond                   ; "\1" below is equivalent to "cond" (1st argument) but can be concatenated
        _JUMP   not\1, _CS_TOP         ; Assemble a conditional jump back to the corresponding _REPEAT or _DO
        _CS_DROP                       ; Drop its label number off the control-flow stack
        ENDM
 
 
; Mark the origin of a forward conditional jump out of a loop.
; The test of a pre-tested or mid-tested loop.
 
_WHILE  MACRO   cond
        _IF     cond                  ; Assemble a conditional jump and push its label number
        _CS_SWAP                      ; Get the _DO label number back on top
        ENDM
 
 
; Resolve the most recent _DO with a backward unconditional jump and
; resolve a forward jump due to the most recent _WHILE.
; The end of a pre-tested or mid-tested loop.
 
_ENDW   MACRO
        _UNTIL  NEVER                 ; Assemble a jump back to the most recent _DO
        _ENDIF                        ; Assemble the label for the last _WHILE
        ENDM
 

Note that _DO and _REPEAT are equivalent. They both just generate a label to jump back to. But having different words improves readability by allowing you to tell if the loop is post-tested or not, right from its start.
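To make the loop macros concrete, here is a sketch of one pre-tested and one post-tested loop, with the expected expansion of each macro in the comment beside it. It assumes _LABEL_NUM is 100 and the control-flow stack is empty at the start; the registers and loop bodies are invented for illustration.

```asm
        _DO                     ; _L100
            tst     R8
        _WHILE  NZ              ; JZ  _L101       (R8 == 0: leave the loop)
            add     R9,R10
            dec     R8
        _ENDW                   ; JMP _L100
                                ; _L101

        _REPEAT                 ; _L102
            dec     R8
        _UNTIL  Z               ; JNZ _L102
```

In the first loop, _WHILE pushes 101 and then swaps, so that _ENDW finds the loop-start number 100 on top for its backward jump and the exit number 101 beneath it for the closing label.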

A loop with multiple exits is not strictly structured, but can be implemented by adding extra _WHILEs inside the loop. Each additional _WHILE must be matched by an additional _ENDIF or _ELSE ... _ENDIF following the loop. For more information on this see http://lars.nocrew.org/dpans/dpansa3.htm#figure.a.2.
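Here is a hedged sketch of a two-exit loop: a made-up search of a zero-terminated byte string at R12 for the byte in R14, with the result flag in R15. It assumes _LABEL_NUM is 100 and the stack is empty at the start, and the comments show the expected expansion of each macro. The last _WHILE is resolved by _ENDW, and the earlier one by the _ELSE ... _ENDIF after the loop.

```asm
        _DO                     ; _L100
            mov.b   @R12+,R13   ; fetch the next byte
            tst.b   R13
        _WHILE  NZ              ; JZ  _L101   exit 1: end of string
            cmp.b   R14,R13
        _WHILE  NE              ; JEQ _L102   exit 2: byte matches R14
        _ENDW                   ; JMP _L100
                                ; _L102
            mov     #1,R15      ; reached only via exit 2 (found)
        _ELSE                   ; JMP _L103
                                ; _L101
            mov     #0,R15      ; reached only via exit 1 (not found)
        _ENDIF                  ; _L103
```

The stack never holds more than three items here: the loop start and the two pending exits.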

These definitions are delightfully simple. In real life, someone would probably want us to clutter them up with some error checking and listing control.

The simplest error checking uses an assembler variable to keep a count of the items on the control-flow stack. We initialise it to zero.

 

_CS_COUNT SET 0

 

Then we add the following to the start of _CS_PUSH above.

 

_CS_COUNT SET _CS_COUNT+1

        IF _CS_COUNT > 12      ; Or whatever your stack size is

            #error "Control flow stack overflow"

        ENDIF

 

And we add the following to the start of _CS_DROP above.

 

_CS_COUNT SET _CS_COUNT-1

        IF _CS_COUNT < 0

            #error "Control flow stack underflow"

        ENDIF

 

Then we invoke the following macro at the end of the program, or anywhere that control-flow structures should all be complete, to check that the control-flow stack is empty and has not underflowed.

 

_CS_CHECK   MACRO

        IF _CS_COUNT != 0

            #error "Control-flow stack is unbalanced"

        ENDIF

        ENDM
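Putting the pieces together, a checked _CS_PUSH for the four-level stack shown earlier might look like the sketch below. The limit of 4 is an assumption chosen to match that four-level example; use whatever depth your own stack has. The second half shows the kind of mistake _CS_CHECK is meant to catch.

```asm
_CS_COUNT   SET     0

_CS_PUSH    MACRO   arg
_CS_COUNT   SET     _CS_COUNT+1
            IF _CS_COUNT > 4       ; four levels: _CS_TOP, _CS2, _CS3, _CS4
                #error "Control flow stack overflow"
            ENDIF
_CS4        SET     _CS3
_CS3        SET     _CS2
_CS2        SET     _CS_TOP
_CS_TOP     SET     arg
            ENDM

; Example of an unbalanced structure:
        tst     R8
        _IF     Z
            mov     R9,R8
                                ; _ENDIF forgotten here
        _CS_CHECK               ; _CS_COUNT is still 1, so the error is reported
```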

Notice that the 5 basic control-flow elements are _IF, _ENDIF, _REPEAT, _UNTIL and _CS_SWAP. All the other elements are defined using those. We can use these 5 to define other structures such as counted loops, switch/case statements and short-circuit-conditionals. You can read about these in Part 2.

Go to Part 2. [Note: This Go To is not considered harmful :-) ]

-- Dave Keenan, 2018-Jan-01 (last updated 2018-Jan-13)