In its most basic form, a decision is some sort of branch within the code that switches between two possible execution paths based on some condition. Normally (though not always), conditional instruction sequences are implemented with the conditional jump instructions. Conditional instructions correspond to the if..then..endif
statement in HLA:
if( expression
) then
<< statements >>
endif;
Assembly language, as usual, offers much more flexibility when dealing with conditional statements. Consider the following C/C++ statement:
if( (( x < y ) && ( z > t )) || ( a != b ) ) stmt1;
A "brute force" approach to converting this statement into assembly language might produce the following:
mov( x, eax ); cmp( eax, y ); setl( bl ); // Store x<y in bl. mov( z, eax ); cmp( eax, t ); setg( bh ); // Store z>t in bh. and( bh, bl ); // Put (x<y) && (z>t) into bl. mov( a, eax ); cmp( eax, b ); setne( bh ); // Store a != b into bh. or( bh, bl ); // Put (x<y) && (z>t) || (a!=b) into bl je SkipStmt1; // Branch if result is false. << Code for Stmt1 goes here. >> SkipStmt1:
As you can see, it takes a considerable number of conditional statements just to process the expression in the example above. This roughly corresponds to the (equivalent) C/C++ statements:
bl = x < y; bh = z > t; bl = bl && bh; bh = a != b; bl = bl || bh; if( bl ) << Stmt1 >>;
Now compare this with the following "improved" code:
mov( a, eax ); cmp( eax, b ); jne DoStmt; mov( x, eax ); cmp( eax, y ); jnl SkipStmt; mov( z, eax ); cmp( eax, t ); jng SkipStmt; DoStmt: << Place code for Stmt1 here. >> SkipStmt:
Two things should be apparent from the code sequences above: First, a single conditional statement in C/C++ (or some other HLL) may require several conditional jumps in assembly language; second, organization of complex expressions in a conditional sequence can affect the efficiency of the code. Therefore, you should exercise care when dealing with conditional sequences in assembly language.
Conditional statements may be broken down into three basic categories: if
statements, switch
/case
statements, and indirect jumps. The following sections describe these program structures, how to use them, and how to write them in assembly language.
The most common conditional statements are the if..then..endif
and if..then..else..endif
statements. These two statements take the form shown in Figure 7-1.
The if..then..endif
statement is just a special case of the if..then..else..endif
statement (with an empty else
block). Therefore, we'll consider only the more general if..then..else..endif
form. The basic implementation of an if..then..else..endif
statement in 80x86 assembly language looks something like this:
<< Sequence of statements to test some condition >>
jcc
ElseCode;
<< Sequence of statements corresponding to the THEN block >>
jmp EndOfIf;
ElseCode:
<< Sequence of statements corresponding to the ELSE block >>
EndOfIf:
Note that jcc represents some conditional jump instruction. For example, to convert the C/C++ statement
if( a == b ) c = d; else b = b + 1;
to assembly language, you could use the following 80x86 code:
mov( a, eax ); cmp( eax, b ); jne ElsePart; mov( d, c ); jmp EndOfIf; ElseBlk: inc( b ); EndOfIf:
For simple expressions like ( a == b )
generating the proper code for an if..then..else..endif
statement is almost trivial. Should the expression become more complex, the code complexity increases as well. Consider the following C/C++ if
statement presented earlier:
if( (( x > y ) && ( z < t )) || ( a != b ) ) c = d;
When processing complex if
statements such as this one, you'll find the conversion task easier if you break the if
statement into a sequence of three different if
statements as follows:
if( a != b ) c = d; else if( x > y) if( z < t ) c = d;
This conversion comes from the following C/C++ equivalents:
if(expr1
&&expr2
)stmt
;
is equivalent to
if(expr1
) if(expr2
)stmt
;
and
if(expr1
||expr2
)stmt
;
is equivalent to
if(expr1
)stmt
; else if(expr2
)stmt
;
In assembly language, the former if
statement becomes
// if( (( x > y ) && ( z < t )) || ( a != b ) ) // c = d; mov( a, eax ); cmp( eax, b ); jne DoIF; mov( x, eax ); cmp( eax, y ); jng EndOfIF; mov( z, eax ); cmp( eax, t ); jnl EndOfIf; DoIf: mov( d, eax ); mov( eax, c ); EndOfIf:
As you can see, testing a condition can easily become more complex than the statements appearing in the else
and then
blocks. Although it seems somewhat paradoxical that it may take more effort to test a condition than to act on the results of that condition, it happens all the time. Therefore, you should be prepared to accept this.
Probably the biggest problem with complex conditional statements in assembly language is trying to figure out what you've done after you've written the code. A big advantage high-level languages offer over assembly language is that expressions are much easier to read and comprehend. The high-level version is (more) self-documenting, whereas assembly language tends to hide the true nature of the code. Therefore, well-written comments are an essential ingredient to assembly language implementations of if..then..else..endif
statements. An elegant implementation of the example above is as follows:
// if ((x > y) && (z < t)) or (a != b) c = d; // Implemented as: // if (a != b) then goto DoIf; mov( a, eax ); cmp( eax, b ); jne DoIf; // if not (x > t) then goto EndOfIf; mov( x, eax ); cmp( eax, y ); jng EndOfIf; // if not (z < t) then goto EndOfIf; mov( z, eax ); cmp( eax, t ); jnl EndOfIf; // then block: DoIf: mov( d, eax ); mov( eax, c ); // End of if statement EndOfIf:
Admittedly, this appears to be going overboard for such a simple example. The following would probably suffice:
// if ( (( x > y ) && ( z < t )) || ( a != b ) ) c = d; // Test the boolean expression: mov( a, eax ); cmp( eax, b ); jne DoIf; mov( x, eax ); cmp( eax, y ); jng EndOfIf; mov( z, eax ); cmp( eax, t ); jnl EndOfIf; // then block: DoIf: mov( d, eax ); mov( eax, c ); // End of if statement EndOfIf:
However, as your if
statements become complex, the density (and quality) of your comments become more and more important.
Translating HLA if
statements into pure assembly language is very easy. The boolean expressions that the HLA if
statement supports were specifically chosen to expand into a few simple machine instructions. The following paragraphs discuss the conversion of each supported boolean expression into pure machine code.
This form is, perhaps, the easiest HLA if
statement to convert. To execute the code immediately following the then
keyword if a particular flag is set (or clear), all you need do is skip over the code if the flag is clear (set). This requires only a single conditional jump instruction for implementation, as the following examples demonstrate:
// if( @c ) then inc( eax ); endif; jnc SkipTheInc; inc( eax ); SkipTheInc: // if( @ns ) then neg( eax ); endif; js SkipTheNeg; neg( eax ); SkipTheNeg:
This form uses the test
instruction to check the specified register for 0. If the register contains 0 (false), then the program jumps around the statements after the then
clause with a jz
instruction. Converting this statement to assembly language requires a test
instruction and a jz
instruction, as the following examples demonstrate:
// if( eax ) then mov( false, eax ); endif; test( eax, eax ); jz DontSetFalse; mov( false, eax ); DontSetFalse: // if( al ) then mov( bl, cl ); endif; test( al, al ); jz noMove; mov( bl, cl ); noMove:
This form of the if
statement uses the test
instruction to check the specified register to see if it is 0. If the register is not 0 (true), then the program jumps around the statements after the then
clause with a jnz
instruction. Converting this statement to assembly language requires a test
instruction and a jnz
instruction in a manner identical to the previous examples.
This form of the if
statement compares the boolean variable against 0 (false) and branches around the statements if the variable contains false. HLA implements this statement by using the cmp
instruction to compare the boolean variable to 0, and then it uses a jz
(je
) instruction to jump around the statements if the variable is false. The following example demonstrates the conversion:
// if( bool ) then mov( 0, al ); endif; cmp( bool, false ); je SkipZeroAL; mov( 0, al ); SkipZeroAL:
This form of the if
statement compares the boolean variable against 0 (false) and branches around the statements if the variable contains true (the opposite condition of the previous example). HLA implements this statement by using the cmp
instruction to compare the boolean variable to 0 and then it uses a jnz
(jne
) instruction to jump around the statements if the variable contains true. The following example demonstrates the conversion:
// if( !bool ) then mov( 0, al ); endif; cmp( bool, false ); jne SkipZeroAL; mov( 0, al ); SkipZeroAL:
HLA translates this form of the if
statement into a cmp
instruction and a conditional jump that skips over the statements on the opposite condition specified by the relop
operator. Table 7-4 lists the correspondence between operators and conditional jump instructions.
Table 7-4. if
Statement Conditional Jump Instructions
Relational operation | Conditional jump instruction if both operands are unsigned | Conditional jump instruction if either operand is signed |
---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Here are a few examples of if
statements translated into pure assembly language that use expressions involving relational operators:
// if( al == ch ) then inc( cl ); endif; cmp( al, ch ); jne SkipIncCL; inc( cl ); SkipIncCL: // if( ch >= 'a' ) then and( $5f, ch ); endif; cmp( ch, 'a' ); jnae NotLowerCase and( $5f, ch ); NotLowerCase: // if( (type int32 eax ) < −5 ) then mov( −5, eax ); endif; cmp( eax, −5 ); jnl DontClipEAX; mov( −5, eax ); DontClipEAX: // if( si <> di ) then inc( si ); endif; cmp( si, di ); je DontIncSI; inc( si ); DontIncSI:
HLA translates this if
statement into a pair of cmp
instructions and a pair of conditional jump instructions. It compares the register or memory location against the lower-valued constant and jumps if less than (signed) or below (unsigned) past the statements after the then
clause. If the register or memory location's value is greater than or equal to LowConst
, the code falls through to the second cmp
and conditional jump pair that compares the register or memory location against the higher constant. If the value is greater than (above) this constant, a conditional jump instruction skips the statements in the then
clause.
Here's an example:
// if( eax in 1000..125_000 ) then sub( 1000, eax ); endif; cmp( eax, 1000 ); jb DontSub1000; cmp( eax, 125_000 ); ja DontSub1000; sub( 1000, eax ); DontSub1000: // if( i32 in −5..5 ) then add( 5, i32 ); endif; cmp( i32, −5 ); jl NoAdd5; cmp( i32, 5 ); jg NoAdd5; add(5, i32 ); NoAdd5:
This form of the HLA if
statement tests a register or memory location to see if its value is outside a specified range. The implementation is very similar to the previous code except you branch to the then
clause if the value is less than the LowConst
value or greater than the HiConst
value, and you branch over the code in the then
clause if the value is within the range specified by the two constants. The following examples demonstrate how to do this conversion:
// if( eax not in 1000..125_000 ) then add( 1000, eax ); endif; cmp( eax, 1000 ); jb Add1000; cmp( eax, 125_000 ); jbe SkipAdd1000; Add1000: add( 1000, eax ); SkipAdd1000: // if( i32 not in −5..5 ) then mov( 0, i32 ); endif; cmp( i32, −5 ); jl Zeroi32; cmp( i32, 5 ); jle SkipZero; Zeroi32: mov( 0, i32 ); SkipZero:
Many boolean expressions involve conjunction (and
) or disjunction (or
) operations. This section describes how to convert boolean expressions into assembly language. There are two different ways to convert complex boolean expressions involving conjunction and disjunction into assembly language: using complete boolean evaluation or using short-circuit boolean evaluation. This section discusses complete boolean evaluation. The next section discusses short-circuit boolean evaluation.
Conversion via complete boolean evaluation is almost identical to converting arithmetic expressions into assembly language. Indeed, the previous chapter on arithmetic covers this conversion process. About the only thing worth noting about that process is that you do not need to store the result in some variable; once the evaluation of the expression is complete, you check to see if you have a false (0) or true (1, or nonzero) result to take whatever action the boolean expression dictates. As you can see in the examples in the preceding sections, you can often use the fact that the last logical instruction (and
/or
) sets the zero flag if the result is false and clears the zero flag if the result is true. This lets you avoid explicitly testing for the result. Consider the following if
statement and its conversion to assembly language using complete boolean evaluation:
// if( (( x < y ) && ( z > t )) || ( a != b ) ) // << Stmt1 >>; mov( x, eax ); cmp( eax, y ); setl( bl ); // Store x<y in bl. mov( z, eax ); cmp( eax, t ); setg( bh ); // Store z>t in bh. and( bh, bl ); // Put (x<y) && (z>t) into bl. mov( a, eax ); cmp( eax, b ); setne( bh ); // Store a != b into bh. or( bh, bl ); // Put (x<y) && (z>t) || (a != b) into bl. je SkipStmt1; // Branch if result is false. << Code for Stmt1 goes here. >> SkipStmt1:
This code computes a boolean result in the BL register and then, at the end of the computation, tests this value to see if it contains true or false. If the result is false, this sequence skips over the code associated with Stmt1
. The important thing to note in this example is that the program will execute each and every instruction that computes this boolean result (up to the je
instruction).
If you are willing to expend a little more effort, you can usually convert a boolean expression to a much shorter and faster sequence of assembly language instructions using short-circuit boolean evaluation. Short-circuit boolean evaluation attempts to determine whether an expression is true or false by executing only some of the instructions that would compute the complete expression. For this reason, plus the fact that short-circuit boolean evaluation doesn't require the use of any temporary registers, HLA uses short-circuit evaluation when translating complex boolean expressions into assembly language.
Consider the expression a && b
. Once we determine that a
is false, there is no need to evaluate b
because there is no way the expression can be true. If and b
represent subexpressions rather than simple variables, the savings possible with short-circuit boolean evaluation are apparent. As a concrete example, consider the subexpression ((x<y) && (z>t))
from the previous section. Once you determine that x
is not less than y
, there is no need to check to see if z
is greater than t
because the expression will be false regardless of z
and t
's values. The following code fragment shows how you can implement short-circuit boolean evaluation for this expression:
// if( (x<y) && (z>t) ) then ... mov( x, eax ); cmp( eax, y ); jnl TestFails; mov( z, eax ); cmp( eax, t ); jng TestFails; << Code for THEN clause of IF statement >> TestFails:
Notice how the code skips any further testing once it determines that x
is not less than y
. Of course, if x
is less than y
, then the program has to test z
to see if it is greater than t
; if not, the program skips over the then
clause. Only if the program satisfies both conditions does the code fall through to the then
clause.
For the logical or
operation the technique is similar. If the first subexpression evaluates to true, then there is no need to test the second operand. Whatever the second operand's value is at that point, the full expression still evaluates to true. The following example demonstrates the use of short-circuit evaluation with disjunction (or
):
// if( ch < 'A' || ch > 'Z' ) // then stdout.put( "Not an uppercase char" ); // endif; cmp( ch, 'A' ); jb ItsNotUC cmp( ch, 'Z' ); jna ItWasUC; ItsNotUC: stdout.put( "Not an uppercase char" ); ItWasUC:
Because the conjunction and disjunction operators are commutative, you can evaluate the left or right operand first if it is more convenient to do so.[106] As one last example in this section, consider the full boolean expression from the previous section:
// if( (( x < y ) && ( z > t )) || ( a != b ) ) << Stmt1 >>; mov( a, eax ); cmp( eax, b ); jne DoStmt1; mov( x, eax ); cmp( eax, y ); jnl SkipStmt1; mov( z, eax ); cmp( eax, t ); jng SkipStmt1; DoStmt1: << Code for Stmt1 goes here. >> SkipStmt1:
Notice how the code in this example chose to evaluate a != b
first and the remaining subexpression last. This is a common technique assembly language programmers use to write better code.
When using complete boolean evaluation, every statement in the sequence for that expression will execute; short-circuit boolean evaluation, on the other hand, may not require the execution of every statement associated with the boolean expression. As you've seen in the previous two sections, code based on short-circuit evaluation is usually shorter and faster. So it would seem that short-circuit evaluation is the technique of choice when converting complex boolean expressions to assembly language.
Sometimes, unfortunately, short-circuit boolean evaluation may not produce the correct result. In the presence of side effects in an expression, short-circuit boolean evaluation will produce a different result than complete boolean evaluation. Consider the following C/C++ example:
if( ( x == y ) && ( ++z != 0 )) << Stmt >>;
Using complete boolean evaluation, you might generate the following code:
mov( x, eax ); // See if x == y. cmp( eax, y ); sete( bl ); inc( z ); // ++z cmp( z, 0 ); // See if incremented z is 0. setne( bh ); and( bh, bl ); // Test x == y && ++z != 0. jz SkipStmt; << Code for Stmt goes here. >> SkipStmt:
Using short-circuit boolean evaluation, you might generate the following code:
mov( x, eax ); // See if x == y. cmp( eax, y ); jne SkipStmt; inc( z ); // ++z cmp( z, 0 ); // See if incremented z is 0. je SkipStmt; << Code for Stmt goes here. >> SkipStmt:
Notice a very subtle but important difference between these two conversions: If x
is equal to y
, then the first version above still increments z and compares it to 0 before it executes the code associated with Stmt
; the short-circuit version, on the other hand, skips the code that increments z
if it turns out that x
is equal to y
. Therefore, the behavior of these two code fragments is different if x
is equal to y
. Neither implementation is particularly wrong; depending on the circumstances you may or may not want the code to increment z
if x
is equal to y
. However, it is important that you realize that these two schemes produce different results, so you can choose an appropriate implementation if the effect of this code on z
matters to your program.
Many programs take advantage of short-circuit boolean evaluation and rely on the fact that the program may not evaluate certain components of the expression. The following C/C++ code fragment demonstrates what is probably the most common example that requires short-circuit boolean evaluation:
if( Ptr != NULL && *Ptr == 'a' ) << Stmt >>;
If it turns out that Ptr
is NULL
, then the expression is false and there is no need to evaluate the remainder of the expression (and, therefore, code that uses short-circuit boolean evaluation will not evaluate the remainder of this expression). This statement relies on the semantics of short-circuit boolean evaluation for correct operation. Were C/C++ to use complete boolean evaluation, and the variable Ptr
contained NULL
, then the second half of the expression would attempt to dereference a NULL
pointer (which tends to crash most programs). Consider the translation of this statement using complete and short-circuit boolean evaluation:
// Complete boolean evaluation: mov( Ptr, eax ); test( eax, eax ); // Check to see if eax is 0 (NULL is 0). setne( bl ); mov( [eax], al ); // Get *Ptr into al. cmp( al, 'a' ); sete( bh ); and( bh, bl ); jz SkipStmt; << Code for Stmt goes here. >> SkipStmt:
Notice in this example that if Ptr
contains NULL
(0), then this program will attempt to access the data at location 0 in memory via the mov( [eax], al );
instruction. Under most operating systems this will cause a memory access fault (general protection fault).
Now consider the short-circuit boolean conversion:
// Short-circuit boolean evaluation mov( Ptr, eax ); // See if Ptr contains NULL (0) and test( eax, eax ); // immediately skip past Stmt if this jz SkipStmt; // is the case. mov( [eax], al ); // If we get to this point, Ptr contains cmp( al, 'a' ); // a non-NULL value, so see if it points jne SkipStmt; // at the character 'a'. << Code for Stmt goes here. >> SkipStmt:
As you can see in this example, the problem with dereferencing the NULL
pointer doesn't exist. If Ptr
contains NULL
, this code skips over the statements that attempt to access the memory address Ptr
contains.
Encoding if
statements efficiently in assembly language takes a bit more thought than simply choosing short-circuit evaluation over complete boolean evaluation. To write code that executes as quickly as possible in assembly language, you must carefully analyze the situation and generate the code appropriately. The following paragraphs provide some suggestions you can apply to your programs to improve their performance.
A mistake programmers often make is the assumption that data is random. In reality, data is rarely random, and if you know the types of values that your program commonly uses, you can use this knowledge to write better code. To see how, consider the following C/C++ statement:
if(( a == b ) && ( c < d )) ++i;
Because C/C++ uses short-circuit evaluation, this code will test to see if a
is equal to b
. If so, then it will test to see if c
is less than d
. If you expect a
to be equal to b
most of the time but don't expect c
to be less than d
most of the time, this statement will execute slower than it should. Consider the following HLA implementation of this code:
mov( a, eax ); cmp( eax, b ); jne DontIncI; mov( c, eax ); cmp( eax, d ); jnl DontIncI; inc( i ); DontIncI:
As you can see in this code, if a
is equal to b
most of the time and c
is not less than d
most of the time, you will have to execute all six instructions nearly every time in order to determine that the expression is false. Now consider the following implementation of the above C/C++ statement that takes advantage of this knowledge and the fact that the &&
operator is commutative:
mov( c, eax ); cmp( eax, d ); jnl DontIncI; mov( a, eax ); cmp( eax, b ); jne DontIncI; inc( i ); DontIncI:
In this example the code first checks to see if c
is less than d
. If most of the time c
is less than d
, then this code determines that it has to skip to the label DontIncI
after executing only three instructions in the typical case (compared with six instructions in the previous example). This fact is much more obvious in assembly language than in a high-level language; this is one of the main reasons why assembly programs are often faster than their high-level language counterparts: optimizations are more obvious in assembly language than in a high-level language. Of course, the key here is to understand the behavior of your data so you can make intelligent decisions such as the one above.
Even if your data is random (or you can't determine how the input values will affect your decisions), there may still be some benefit to rearranging the terms in your expressions. Some calculations take far longer to compute than others. For example, the div
instruction is much slower than a simple cmp
instruction. Therefore, if you have a statement like the following, you may want to rearrange the expression so that the cmp
comes first:
if( (x % 10 = 0 ) && (x != y ) ++x;
Converted to assembly code, this if
statement becomes:
mov( x, eax ); // Compute X % 10. cdq(); // Must sign extend eax -> edx:eax. imod( 10, edx:eax ); // Remember, remainder goes into edx. test( edx, edx ); // See if edx is 0. jnz SkipIf; mov( x, eax ); cmp( eax, y ); je SkipIf; inc( x ); SkipIf:
The imod
instruction is very expensive (often 50–100 times slower than most of the other instructions in this example). Unless it is 50–100 times more likely that the remainder is 0 rather than x
is equal to y
, it would be better to do the comparison first and the remainder calculation afterward:
mov( x, eax ); cmp( eax, y ); je SkipIf; mov( x, eax ); // Compute X % 10. cdq(); // Must sign extend eax -> edx:eax. imod( 10, edx:eax ); // Remember, remainder goes into edx. test( edx, edx ); // See if edx is 0. jnz SkipIf; inc( x ); SkipIf:
Of course, in order to rearrange the expression in this manner, the code must not assume the use of short-circuit evaluation semantics (because the &&
and ||
operators are not commutative if the code must compute one subexpression before another).
Although there are many good things to be said about structured programming techniques, there are some drawbacks to writing structured code. Specifically, structured code is sometimes less efficient than unstructured code. Most of the time this is tolerable because unstructured code is difficult to read and maintain; it is often acceptable to sacrifice some performance in exchange for maintainable code. In certain instances, however, you may need all the performance you can get. In those rare instances you might choose to compromise the readability of your code in order to gain some additional performance.
One classic way to do this is to use code movement to move code your program rarely uses out of the way of code that executes most of the time. For example, consider the following pseudo C/C++ statement:
if( See_If_an_Error_Has_Occurred ) { << Statements to execute if no error >> } else { << Error handling statements >> }
In normal code, one does not expect errors to be frequent. Therefore, you would normally expect the then
section of the above if
to execute far more often than the else
clause. The code above could translate into the following assembly code:
cmp( See_If_an_Error_Has_Occurred, true ); je HandleTheError; << Statements to execute if no error >> jmp EndOfIF; HandleTheError: << Error handling statements >> EndOfIf:
Notice that if the expression is false, this code falls through to the normal statements and then jumps over the error-handling statements. Instructions that transfer control from one point in your program to another (for example, jmp
instructions) tend to be slow. It is much faster to execute a sequential set of instructions rather than jump all over the place in your program. Unfortunately, the code above doesn't allow this. One way to rectify this problem is to move the else
clause of the code somewhere else in your program. That is, you could rewrite the code as follows:
cmp( See_If_an_Error_Has_Occurred, true ); je HandleTheError; << Statements to execute if no error >> EndOfIf:
At some other point in your program (typically after a jmp
instruction) you would insert the following code:
HandleTheError: << Error handling statements >> jmp EndOfIf;
Note that the program isn't any shorter. The jmp
you removed from the original sequence winds up at the end of the else
clause. However, because the else
clause rarely executes, moving the jmp
instruction from the then
clause (which executes frequently) to the else
clause is a big performance win because the then
clause executes using only straight-line code. This technique is surprisingly effective in many time-critical code segments.
There is a difference between writing destructured code and writing unstructured code. Unstructured code is written in an unstructured way to begin with. It is generally hard to read, difficult to maintain, and often contains defects. Destructured code, on the other hand, starts out as structured code, and you make a conscious decision to eliminate the structure in order to gain a small performance boost. Generally, you've already tested the code in its structured form before destructuring it. Therefore, destructured code is often easier to work with than unstructured code.
On many processors in the 80x86 family, branches ( jumps) are very expensive compared to many other instructions. For this reason it is sometimes better to execute more instructions in a sequence than fewer instructions that involve branching. For example, consider the simple assignment eax = abs( eax );
. Unfortunately, there is no 80x86 instruction that computes the absolute value of an integer. The obvious way to handle this is with an instruction sequence like the following:
test( eax, eax ); jns ItsPositive; neg( eax ); ItsPositive:
However, as you can plainly see in this example, it uses a conditional jump to skip over the neg
instruction (that creates a positive value in EAX if EAX was negative). Now consider the following sequence that will also do the job:
// Set edx to $FFFF_FFFF if eax is negative, $0000_0000 if eax is // 0 or positive: cdq(); // If eax was negative, the following code inverts all the bits in eax; // otherwise it has no effect on eax. xor( edx, eax ); // If eax was negative, the following code adds 1 to eax; otherwise // it doesn't modify eax's value. and( 1, edx ); // edx = 0 or 1 (1 if eax was negative). add( edx, eax );
This code will invert all the bits in EAX and then add 1 to EAX if EAX was negative prior to the sequence; that is, it negates the value in EAX. If EAX was 0 or positive, then this code does not change the value in EAX.
Note that this sequence takes four instructions rather than the three the previous example requires. However, because there are no transfer-of-control instructions in this sequence, it may execute faster on many CPUs in the 80x86 family.
The HLA switch
statement takes the following form:
switch(reg32
) case(const1
) << Stmts1: code to execute ifreg32
equalsconst1
>> case(const2
) << Stmts2: code to execute ifreg32
equalsconst2
>> . . . case(constn
) << Stmtsn: code to execute ifreg32
equalsconstn
>> default // Note that the default section is optional. << Stmts_default: code to execute ifreg32
does not equal any of the case values >> endswitch;
When this statement executes, it checks the value of the register against the constants const1
..
constn
. If a match is found, then the corresponding statements execute. HLA places a few restrictions on the switch
statement. First, the HLA switch
statement allows only a 32-bit register as the switch
expression. Second, all the constants in the case
clauses must be unique. The reason for these restrictions will become clear in a moment.
Most introductory programming texts introduce the switch/case
statement by explaining it as a sequence of if..then..elseif..else..endif
statements. They might claim that the following two pieces of HLA code are equivalent:
switch( eax ) case(0) stdout.put("i=0"); case(1) stdout.put("i=1"); case(2) stdout.put("i=2"); endswitch; if( eax = 0 ) then stdout.put("i=0") elseif( eax = 1 ) then stdout.put("i=1") elseif( eax = 2 ) then stdout.put("i=2"); endif;
While semantically these two code segments may be the same, their implementation is usually different. Whereas the if..then..elseif..else..endif
chain does a comparison for each conditional statement in the sequence, the switch
statement normally uses an indirect jump to transfer control to any one of several statements with a single computation. Consider the two examples presented above; they could be written in assembly language with the following code:
// if..then..else..endif form: mov( i, eax ); test( eax, eax ); // Check for 0. jnz Not0; stdout.put( "i=0" ); jmp EndCase; Not0: cmp( eax, 1 ); jne Not1; stdou.put( "i=1" ); jmp EndCase; Not1: cmp( eax, 2 ); jne EndCase; stdout.put( "i=2" ); EndCase: // Indirect Jump Version readonly JmpTbl:dword[3] := [ &Stmt0, &Stmt1, &Stmt2 ]; . . . mov( i, eax ); jmp( JmpTbl[ eax*4 ] ); Stmt0: stdout.put( "i=0" ); jmp EndCase; Stmt1: stdout.put( "I=1" ); jmp EndCase; Stmt2: stdout.put( "I=2" ); EndCase:
The implementation of the if..then..elseif..else..endif
version is fairly obvious and needs little in the way of explanation. The indirect jump version, however, is probably quite mysterious to you, so let's consider how this particular implementation of the switch
statement works.
Remember that there are three common forms of the jmp
instruction. The standard unconditional jmp
instruction, like the jmp EndCase
; instruction in the previous examples, transfers control directly to the statement label specified as the jmp
operand. The second form of the jmp
instruction—jmp(
reg32
);
— transfers control to the memory location specified by the address found in a 32-bit register. The third form of the jmp
instruction, the one the previous example uses, transfers control to the instruction specified by the contents of a double-word memory location. As this example clearly illustrates, that memory location can use any addressing mode. You are not limited to the displacement-only addressing mode. Now let's consider exactly how this second implementation of the switch
statement works.
To begin with, a switch
statement requires that you create an array of pointers with each element containing the address of a statement label in your code (those labels must be attached to the sequence of instructions to execute for each case in the switch
statement). In the example above, the JmpTbl
array serves this purpose. Note that this code initializes JmpTbl
with the address of the statement labels Stmt0
, Stmt1
, and Stmt2
. The program places this array in the readonly
section because the program should never change these values during execution.
Whenever you initialize an array with a set of addresses of statement labels as in this example, the declaration section in which you declare the array (e.g., readonly
in this case) must be in the same procedure that contains the statement labels.[107]
During the execution of this code sequence, the program loads the EAX register with i
's value. Then the program uses this value as an index into the JmpTbl
array and transfers control to the 4-byte address found at the specified location. For example, if EAX contains 0, the jmp( JmpTbl[eax*4] );
instruction will fetch the double word at address JmpTbl+0 ( eax*4=0 )
. Because the first double word in the table contains the address of Stmt0
, the jmp
instruction transfers control to the first instruction following the Stmt0
label. Likewise, if i
(and therefore, EAX) contains 1, then the indirect jmp
instruction fetches the double word at offset 4 from the table and transfers control to the first instruction following the Stmt1
label (because the address of Stmt1
appears at offset 4 in the table). Finally, if i
/EAX contains 2, then this code fragment transfers control to the statements following the Stmt2
label because it appears at offset 8 in the JmpTbl
table.
You should note that as you add more (consecutive) cases, the jump table implementation becomes more efficient (in terms of both space and speed) than the if/elseif
form. Except for simple cases, the switch
statement is almost always faster and usually by a large margin. As long as the case
values are consecutive, the switch
statement version is usually smaller as well.
What happens if you need to include nonconsecutive case
labels or you cannot be sure that the switch
value doesn't go out of range? With the HLA switch
statement, such an occurrence will transfer control to the first statement after the endswitch
clause (or to a default
case, if one is present in the switch). However, this doesn't happen in the example above. If variable i
does not contain 0, 1, or 2, executing the code above produces undefined results. For example, if i
contains 5 when you execute the code in the previous example, the indirect jmp
instruction will fetch the dword at offset 20 (5 * 4) in JmpTbl
and transfer control to that address. Unfortunately, JmpTbl
doesn't have six entries; so the program will wind up fetching the value of the third double word following JmpTbl
and use that as the target address. This will often crash your program or transfer control to an unexpected location.
The solution is to place a few instructions before the indirect jmp
to verify that the switch
selection value is within some reasonable range. In the previous example, we'd probably want to verify that i
's value is in the range 0..2 before executing the jmp
instruction. If i
's value is outside this range, the program should simply jump to the endcase
label (this corresponds to dropping down to the first statement after the endswitch
clause). The following code provides this modification:
readonly JmpTbl:dword[3] := [ &Stmt0, &Stmt1, &Stmt2 ]; . . . mov( i, eax ); cmp( eax, 2 ); // Verify that i is in the range ja EndCase; // 0..2 before the indirect jmp. jmp( JmpTbl[ eax*4 ] ); Stmt0: stdout.put( "i=0" ); jmp EndCase; Stmt1: stdout.put( "i=1" ); jmp EndCase; Stmt2: stdout.put( "i=2" ); EndCase:
Although the example above handles the problem of selection values being outside the range 0..2, it still suffers from a couple of severe restrictions:
The cases must start with the value 0. That is, the minimum case
constant has to be 0 in this example.
The case values must be contiguous.
Solving the first problem is easy, and you deal with it in two steps. First, you must compare the case selection value against a lower and upper bounds before determining if the case value is legal. For example:
// SWITCH statement specifying cases 5, 6, and 7: // WARNING: This code does *NOT* work. Keep reading to find out why. mov( i, eax ); cmp( eax, 5 ); jb EndCase cmp( eax, 7 ); // Verify that i is in the range ja EndCase; // 5..7 before the indirect jmp. jmp( JmpTbl[ eax*4 ] ); Stmt5: stdout.put( "i=5" ); jmp EndCase; Stmt6: stdout.put( "i=6" ); jmp EndCase; Stmt7: stdout.put( "i=7" ); EndCase:
As you can see, this code adds a pair of extra instructions, cmp
and jb
, to test the selection value to ensure it is in the range 5..7. If not, control drops down to the EndCase
label; otherwise control transfers via the indirect jmp
instruction. Unfortunately, as the comments point out, this code is broken. Consider what happens if variable i
contains the value 5: the code will verify that 5 is in the range 5..7 and then it will fetch the dword at offset 20 (5*@size(dword)
) and jump to that address. As before, however, this loads 4 bytes outside the bounds of the table and does not transfer control to a defined location. One solution is to subtract the smallest case selection value from EAX before executing the jmp
instruction, as shown in the following example.
// SWITCH statement specifying cases 5, 6, and 7: // WARNING: There is a better way to do this. Keep reading. readonly JmpTbl:dword[3] := [ &Stmt5, &Stmt6, &Stmt7 ]; . . . mov( i, eax ); cmp( eax, 5 ); jb EndCase cmp( eax, 7 ); // Verify that i is in the range ja EndCase; // 5..7 before the indirect jmp. sub( 5, eax ); // 5->0, 6->1, 7->2. jmp( JmpTbl[ eax*4 ] ); Stmt5: stdout.put( "i=5" ); jmp EndCase; Stmt6: stdout.put( "i=6" ); jmp EndCase; Stmt7: stdout.put( "i=7" ); EndCase:
By subtracting 5 from the value in EAX, this code forces EAX to take on the value 0, 1, or 2 prior to the jmp
instruction. Therefore, case-selection value 5 jumps to Stmt5
, case-selection value 6 transfers control to Stmt6
, and case-selection value 7 jumps to Stmt7
.
There is a sneaky way to improve the code above. You can eliminate the sub
instruction by merging this subtraction into the jmp
instruction's address expression. Consider the following code that does this:
// SWITCH statement specifying cases 5, 6, and 7: readonly JmpTbl:dword[3] := [ &Stmt5, &Stmt6, &Stmt7 ]; . . . mov( i, eax ); cmp( eax, 5 ); jb EndCase cmp( eax, 7 ); // Verify that i is in the range ja EndCase; // 5..7 before the indirect jmp. jmp( JmpTbl[ eax*4 - 5*@size(dword)] ); Stmt5: stdout.put( "i=5" ); jmp EndCase; Stmt6: stdout.put( "i=6" ); jmp EndCase; Stmt7: stdout.put( "i=7" ); EndCase:
The HLA switch
statement provides a default
clause that executes if the case-selection value doesn't match any of the case values. For example:
switch( ebx ) case( 5 ) stdout.put( "ebx=5" ); case( 6 ) stdout.put( "ebx=6" ); case( 7 ) stdout.put( "ebx=7" ); default stdout.put( "ebx does not equal 5, 6, or 7" ); endswitch;
Implementing the equivalent of the default
clause in pure assembly language is very easy. Just use a different target label in the jb
and ja
instructions at the beginning of the code. The following example implements an HLA switch
statement similar to the one immediately above:
// SWITCH statement specifying cases 5, 6, and 7 with a DEFAULT clause: readonly JmpTbl:dword[3] := [ &Stmt5, &Stmt6, &Stmt7 ]; . . . mov( i, eax ); cmp( eax, 5 ); jb DefaultCase; cmp( eax, 7 ); // Verify that i is in the range ja DefaultCase; // 5..7 before the indirect jmp. jmp( JmpTbl[ eax*4 - 5*@size(dword)] ); Stmt5: stdout.put( "i=5" ); jmp EndCase; Stmt6: stdout.put( "i=6" ); jmp EndCase; Stmt7: stdout.put( "i=7" ); jmp EndCase; DefaultCase: stdout.put( "i does not equal 5, 6, or 7" ); EndCase:
The second restriction noted earlier, that the case values need to be contiguous, is easy to handle by inserting extra entries into the jump table. Consider the following HLA switch
statement:
switch( ebx ) case( 1 ) stdout.put( "ebx = 1" ); case( 2 ) stdout.put( "ebx = 2" ); case( 4 ) stdout.put( "ebx = 4" ); case( 8 ) stdout.put( "ebx = 8" ); default stdout.put( "ebx is not 1, 2, 4, or 8" ); endswitch;
The minimum switch value is 1 and the maximum value is 8. Therefore, the code before the indirect jmp
instruction needs to compare the value in EBX against 1 and 8. If the value is between 1 and 8, it's still possible that EBX might not contain a legal case-selection value. However, because the jmp
instruction indexes into a table of double words using the case-selection table, the table must have eight double-word entries. To handle the values between 1 and 8 that are not case-selection values, simply put the statement label of the default
clause (or the label specifying the first instruction after the endswitch
if there is no default
clause) in each of the jump table entries that don't have a corresponding case
clause. The following code demonstrates this technique:
readonly JmpTbl2: dword := [ &Case1, &Case2, &dfltCase, &Case4, &dfltCase, &dfltCase, &dfltCase, &Case8 ]; . . . cmp( ebx, 1 ); jb dfltCase; cmp( ebx, 8 ); ja dfltCase; jmp( JmpTbl2[ ebx*4 - 1*@size(dword) ] ); Case1: stdout.put( "ebx = 1" ); jmp EndOfSwitch; Case2: stdout.put( "ebx = 2" ); jmp EndOfSwitch; Case4: stdout.put( "ebx = 4" ); jmp EndOfSwitch; Case8: stdout.put( "ebx = 8" ); jmp EndOfSwitch; dfltCase: stdout.put( "ebx is not 1, 2, 4, or 8" ); EndOfSwitch:
There is a problem with this implementation of the switch
statement. If the case
values contain nonconsecutive entries that are widely spaced, the jump table could become exceedingly large. The following switch
statement would generate an extremely large code file:
switch( ebx ) case( 1 ) << Stmt1 >>; case( 100 ) << Stmt2 >>; case( 1_000 ) << Stmt3 >>; case( 10_000 ) << Stmt4 >>; default << Stmt5 >>; endswitch;
In this situation, your program will be much smaller if you implement the switch
statement with a sequence of if
statements rather than using an indirect jump statement. However, keep one thing in mind—the size of the jump table does not normally affect the execution speed of the program. If the jump table contains two entries or two thousand, the switch
statement will execute the multiway branch in a constant amount of time. The if
statement implementation requires a linearly increasing amount of time for each case
label appearing in the case
statement.
Probably the biggest advantage to using assembly language over an HLL like Pascal or C/C++ is that you get to choose the actual implementation of statements like switch
. In some instances you can implement a switch
statement as a sequence of if..then..elseif
statements, or you can implement it as a jump table, or you can use a hybrid of the two:
switch( eax ) case( 0 ) << Stmt0 >>; case( 1 ) << Stmt1 >>; case( 2 ) << Stmt2 >>; case( 100 ) << Stmt3 >>; default << Stmt4 >>; endswitch;
This could become
cmp( eax, 100 ); je DoStmt3; cmp( eax, 2 ); ja TheDefaultCase; jmp( JmpTbl[ eax*4 ]); ...
Of course, HLA supports the following code high-level control structures:
if( ebx = 100 ) then << Stmt3 >>; else switch( eax ) case(0) << Stmt0 >>; case(1) << Stmt1 >>; case(2) << Stmt2 >>; Otherwise << Stmt4 >>; endswitch; endif;
But this tends to destroy the readability of the program. On the other hand, the extra code to test for 100 in the assembly language code doesn't adversely affect the readability of the program (perhaps because it's so hard to read already). Therefore, most people will add the extra code to make their program more efficient.
The C/C++ switch
statement is very similar to the HLA switch
statement. There is only one major semantic difference: The programmer must explicitly place a break
statement in each case
clause to transfer control to the first statement beyond the switch
. This break
corresponds to the jmp
instruction at the end of each case
sequence in the assembly code above. If the corresponding break
is not present, C/C++ transfers control into the code of the following case
. This is equivalent to leaving off the jmp
at the end of the case
's sequence:
switch (i) { case 0: << Stmt1 >>; case 1: << Stmt2 >>; case 2: << Stmt3 >>; break; case 3: << Stmt4 >>; break; default: << Stmt5 >>; }
This translates into the following 80x86 code:
readonly JmpTbl: dword[4] := [ &case0, &case1, &case2, &case3 ]; . . . mov( i, ebx ); cmp( ebx, 3 ); ja DefaultCase; jmp( JmpTbl[ ebx*4 ]); case0: Stmt1; case1: Stmt2; case2: Stmt3; jmp EndCase; // Emitted for the break stmt. case3: Stmt4; jmp EndCase; // Emitted for the break stmt. DefaultCase: Stmt5; EndCase:
[106] However, be aware of the fact that some expressions depend on the leftmost subexpression evaluating one way in order for the rightmost subexpression to be valid; for example, a common test in C/C++ is if( x != NULL && x->y )...
[107] If the switch
statement appears in your main program, you must declare the array in the declaration section of your main program.