7.10 Loops

Loops are the last of the basic control structures (sequences, decisions, and loops) that make up a typical program. As with so many other structures in assembly language, you'll find yourself using loops in places you never dreamed of using them. Most high-level languages have implied loop structures hidden away. For example, consider the BASIC statement if A$ = B$ then 100. This if statement compares two strings and jumps to statement 100 if they are equal. In assembly language, you would need to write a loop to compare each character in A$ to the corresponding character in B$ and then jump to statement 100 if and only if all the characters matched. In BASIC, there is no loop to be seen in the program; assembly language requires one to compare the individual characters in the strings.[109] This is but a small example that shows how loops seem to pop up everywhere.
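
The character-by-character comparison described above might be sketched as follows. This is only an illustration under stated assumptions: the program name, the variables strA and strB, and all label names are hypothetical, and the code relies on HLA string data ending with a zero byte.

```hla
program compareChars;
#include( "stdlib.hhf" )

static
     strA: string := "loop";     // hypothetical stand-in for A$
     strB: string := "loop";     // hypothetical stand-in for B$

begin compareChars;

     mov( strA, esi );           // esi points at the first character of strA
     mov( strB, edi );           // edi points at the first character of strB

     CmpLp:
          mov( [esi], al );      // fetch the next character of strA
          cmp( al, [edi] );      // compare it with the matching character of strB
          jne NotEqual;          // any mismatch means the strings differ
          cmp( al, 0 );          // both strings ended together: they are equal
          je AreEqual;
          inc( esi );
          inc( edi );
          jmp CmpLp;

     AreEqual:
          stdout.put( "Strings are equal" );
          jmp Done;

     NotEqual:
          stdout.put( "Strings differ" );

     Done:
          stdout.newln();

end compareChars;
```

With the two initializers shown, the program should report that the strings are equal; changing either initializer should send control to NotEqual instead.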

Program loops consist of three components: an optional initialization component, an optional loop termination test, and the body of the loop. The order in which you assemble these components can dramatically affect the loop's operation. Three permutations of these components appear frequently in programs. Because of their frequency, these loop structures are given special names in high-level languages: while loops, repeat..until loops (do..while in C/C++), and infinite loops (e.g., forever..endfor in HLA).

The most general loop is the while loop. In HLA's high-level syntax it takes the following form:

     while( expression ) do << statements >> endwhile;

There are two important points to note about the while loop. First, the test for termination appears at the beginning of the loop. Second, as a direct consequence of the position of the termination test, the body of the loop never executes if the boolean expression is false the first time it is tested.

Consider the following HLA while loop:

     mov( 0, i );
     while( i < 100 ) do

          inc( i );

     endwhile;

The mov( 0, i ); instruction is the initialization code for this loop. i is a loop-control variable, because it controls the execution of the body of the loop. i < 100 is the loop termination test: the loop keeps running as long as i is less than 100 and terminates once i reaches 100. The single instruction inc( i ); is the loop body that executes on each loop iteration.

Note that an HLA while loop can be easily synthesized using if and jmp statements. For example, you may replace the previous HLA while loop with the following HLA code:

     mov( 0, i );
     WhileLp:
     if( i < 100 ) then

          inc( i );
          jmp WhileLp;

     endif;

More generally, you can construct any while loop as follows:

     << Optional initialization code >>

     UniqueLabel:
     if( not_termination_condition ) then

          << Loop body >>
          jmp UniqueLabel;

     endif;

Therefore, you can use the techniques from earlier in this chapter to convert if statements to assembly language and add a single jmp instruction to produce a while loop. The example we've been looking at in this section translates to the following pure 80x86 assembly code:[110]

     mov( 0, i );
     WhileLp:
          cmp( i, 100 );
          jnl WhileDone;
          inc( i );
          jmp WhileLp;

     WhileDone:
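
The jnl instruction performs a signed comparison, so the translation above presumes that i is a signed variable (for example, int32). As a minimal sketch, assuming i were declared as an unsigned uns32 variable instead, the same loop would use the unsigned branch jnb:

```hla
     // Assumption: i is an uns32 variable rather than an int32.

     mov( 0, i );
     WhileLp:
          cmp( i, 100 );
          jnb WhileDone;      // unsigned "not below": exit once i >= 100
          inc( i );
          jmp WhileLp;

     WhileDone:
```

For values in the range 0..100 both versions behave the same; the two branches differ only when the operands could be interpreted as negative numbers.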

The repeat..until (do..while) loop tests for the termination condition at the end of the loop rather than at the beginning. In HLA high-level syntax, the repeat..until loop takes the following form:

     << Optional initialization code >>
     repeat

          << Loop body >>

     until( termination_condition );

This sequence executes the initialization code, then executes the loop body, and finally tests some condition to see if the loop should repeat. If the boolean expression evaluates to false, the loop repeats; otherwise the loop terminates. The two things you should note about the repeat..until loop are that the termination test appears at the end of the loop and, as a direct consequence of this, the loop body always executes at least once.

Like the while loop, the repeat..until loop can be synthesized with an if statement and a jmp. You could use the following:

     << Initialization code >>
     SomeUniqueLabel:

          << Loop body >>

     if( not_the_termination_condition ) then jmp SomeUniqueLabel; endif;

Based on the material presented in the previous sections, you can easily synthesize repeat..until loops in assembly language. The following is a simple example:

     repeat

          stdout.put( "Enter a number greater than 100: " );
          stdin.get( i );

     until( i > 100 );

// This translates to the following if/jmp code:

     RepeatLabel:

          stdout.put( "Enter a number greater than 100: " );
          stdin.get( i );

     if( i <= 100 ) then jmp RepeatLabel; endif;

// It also translates into the following "pure" assembly code:

     RepeatLabel:

          stdout.put( "Enter a number greater than 100: " );
          stdin.get( i );

     cmp( i, 100 );
     jng RepeatLabel;

If while loops test for termination at the beginning of the loop and repeat..until loops check for termination at the end of the loop, the only place left to test for termination is in the middle of the loop. The HLA high-level forever..endfor loop, combined with the break and breakif statements, provides this capability. The forever..endfor loop takes the following form:

     forever

          << Loop body >>

     endfor;

Note that there is no explicit termination condition. Unless otherwise provided for, the forever..endfor construct forms an infinite loop. A breakif statement usually handles loop termination. Consider the following HLA code that employs a forever..endfor construct:

     forever

          stdin.get( character );
          breakif( character = '.' );
          stdout.put( character );

     endfor;

Converting a forever loop to pure assembly language is easy. All you need is a label and a jmp instruction. The breakif statement in this example is really nothing more than an if and a jmp instruction. The pure assembly language version of the code above looks something like the following:

     foreverLabel:

          stdin.get( character );
          cmp( character, '.' );
          je ForIsDone;
          stdout.put( character );
          jmp foreverLabel;

     ForIsDone:

The for loop is a special form of the while loop that repeats the loop body a specific number of times. In HLA, the for loop takes the following form:

     for( Initialization_Stmt; Termination_Expression; inc_Stmt ) do

          << statements >>

     endfor;

This is completely equivalent to the following:

     Initialization_Stmt;
     while( Termination_Expression ) do

          << statements >>

          inc_Stmt;

     endwhile;

Traditionally, programs use the for loop to process arrays and other objects accessed in sequential order. One normally initializes a loop-control variable with the initialization statement and then uses the loop-control variable as an index into the array (or other data type). For example:

for( mov( 0, esi ); esi < 7; inc( esi )) do

     stdout.put( "Array Element = ", SomeArray[ esi*4 ], nl );

endfor;

To convert this to pure assembly language, begin by translating the for loop into an equivalent while loop:

          mov( 0, esi );
          while( esi < 7 ) do

               stdout.put( "Array Element = ", SomeArray[ esi*4 ], nl );

               inc( esi );
          endwhile;

Now, using the techniques from the section on while loops, translate the code into pure assembly language:

          mov( 0, esi );
          WhileLp:
          cmp( esi, 7 );
          jnl EndWhileLp;

               stdout.put( "Array Element = ", SomeArray[ esi*4 ], nl );

               inc( esi );
               jmp WhileLp;

          EndWhileLp:

The HLA break and continue statements both translate into a single jmp instruction. The break statement exits the loop that immediately contains the break statement; the continue statement restarts the loop that immediately contains the continue statement.

Converting a break statement to pure assembly language is very easy. Just emit a jmp instruction that transfers control to the first statement following the endxxxx (or until) clause of the loop to exit. You can do this by placing a label after the associated endxxxx clause and jumping to that label. The following code fragments demonstrate this technique for the various loops.

// Breaking out of a FOREVER loop:

forever
     << stmts >>
          // break;
          jmp BreakFromForever;
     << stmts >>
endfor;
BreakFromForever:

// Breaking out of a FOR loop:
for( initStmt; expr; incStmt ) do
     << stmts >>
          // break;
          jmp BrkFromFor;
     << stmts >>
endfor;
BrkFromFor:

// Breaking out of a WHILE loop:

while( expr ) do
     << stmts >>
          // break;
          jmp BrkFromWhile;
     << stmts >>
endwhile;
BrkFromWhile:

// Breaking out of a REPEAT..UNTIL loop:

repeat
     << stmts >>
          // break;
          jmp BrkFromRpt;
     << stmts >>
until( expr );
BrkFromRpt:

The continue statement is slightly more complex than the break statement. The implementation is still a single jmp instruction; however, the target label doesn't wind up going in the same spot for each of the different loops. Figure 7-2, Figure 7-3, Figure 7-4, and Figure 7-5 show where the continue statement transfers control for each of the HLA loops.

The following code fragments demonstrate how to convert the continue statement into an appropriate jmp instruction for each of these loop types.
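
As a sketch of these conversions, each loop is shown below as a pure-assembly skeleton. The label names are illustrative assumptions, and jcc stands for whichever conditional jump matches the loop's expression; neither is taken from the code HLA actually generates.

```hla
// continue in a FOREVER loop: jump back to the top of the loop.

     ForeverLbl:
          << stmts >>
               // continue;
               jmp ForeverLbl;
          << stmts >>
          jmp ForeverLbl;

// continue in a WHILE loop: jump back to the loop test.

     WhileLbl:
          << code to evaluate expr >>
          jcc EndOfWhile;          // taken when expr is false
          << stmts >>
               // continue;
               jmp WhileLbl;
          << stmts >>
          jmp WhileLbl;
     EndOfWhile:

// continue in a FOR loop: jump to the increment statement.

     << initStmt >>
     ForLbl:
          << code to evaluate expr >>
          jcc EndOfFor;            // taken when expr is false
          << stmts >>
               // continue;
               jmp ContFor;
          << stmts >>
     ContFor:
          << incStmt >>
          jmp ForLbl;
     EndOfFor:

// continue in a REPEAT..UNTIL loop: jump to the termination test.

     RepeatLbl:
          << stmts >>
               // continue;
               jmp ContRpt;
          << stmts >>
     ContRpt:
          << code to evaluate expr >>
          jcc RepeatLbl;           // taken when expr is false
```

In the forever and while skeletons the continue target is the label that already marks the top of the loop. The for and repeat..until skeletons need an extra label, because the increment statement and the termination test sit at the bottom of the loop.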

Given that the 80x86 accesses registers more efficiently than memory locations, registers are the ideal spot to place loop-control variables (especially for small loops). However, there are some problems associated with using registers within a loop. The primary problem with using registers as loop-control variables is that registers are a limited resource. The following will not work properly because it attempts to reuse a register (CX) that is already in use:

          mov( 8, cx );
          loop1:
               mov( 4, cx );
               loop2:
                    << stmts >>
                    dec( cx );
                    jnz loop2;
               dec( cx );
               jnz loop1;

The intent here, of course, was to create a set of nested loops, that is, one loop inside another. The inner loop (loop2) should repeat four times for each of the eight executions of the outer loop (loop1). Unfortunately, both loops use the same register as a loop-control variable. Therefore, this forms an infinite loop: CX always contains 0 when the inner loop finishes, so the second dec instruction wraps CX around to $FFFF, a nonzero result, and control always transfers back to the loop1 label. The solution here is to save and restore the CX register or to use a different register in place of CX for the outer loop:

          mov( 8, cx );
          loop1:
               push( cx );
               mov( 4, cx );
               loop2:
                    << stmts >>
                    dec( cx );
                    jnz loop2;

               pop( cx );
               dec( cx );
               jnz loop1;

or

          mov( 8, dx );
          loop1:
               mov( 4, cx );
               loop2:
                    << stmts >>
                    dec( cx );
                    jnz loop2;

               dec( dx );
               jnz loop1;
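
One way to confirm that the two-register version runs the inner body 8 × 4 = 32 times is to wrap it in a small test program. This is a sketch only: the program name, the count variable, and the label names are introduced here for illustration, and a single inc instruction stands in for << stmts >>.

```hla
program nestedLoops;
#include( "stdlib.hhf" )

static
     count: uns32 := 0;      // hypothetical counter, stands in for << stmts >>

begin nestedLoops;

     mov( 8, dx );           // outer loop counter lives in DX
     OuterLp:
          mov( 4, cx );      // inner loop counter lives in CX
          InnerLp:
               inc( count );
               dec( cx );
               jnz InnerLp;

          dec( dx );
          jnz OuterLp;

     stdout.put( "Inner loop body executed ", count, " times", nl );

end nestedLoops;
```

If the loops are nested correctly, the program should report 32 executions. Replacing DX with CX in the outer loop reproduces the infinite loop described earlier.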

Register corruption is one of the primary sources of bugs in loops in assembly language programs, so always keep an eye out for this problem.



[109] Of course, the HLA Standard Library provides the str.eq routine that compares the strings for you, effectively hiding the loop even in an assembly language program.

[110] Note that HLA will actually convert most while statements to different 80x86 code than this section presents. The reason for the difference appears in 7.11 Performance Improvements, when we explore how to write more efficient loop code.