4.20 Accessing Elements of a Single-Dimensional Array

To access an element of a zero-based array, you can use the simplified formula

Element_Address = Base_Address + index * Element_Size

For the Base_Address entry you can use the name of the array (because HLA associates the address of the first element of an array with the name of that array). The Element_Size entry is the number of bytes for each array element. If the object is an array of bytes, the Element_Size entry is 1 (resulting in a very simple computation). If each element of the array is a word (or other 2-byte type), then Element_Size is 2, and so on. To access an element of the SixteenInts array in the previous section, you'd use the following formula (the size is 4 because each element is an int32 object):

Element_Address = SixteenInts + index*4

The 80x86 code equivalent to the statement eax := SixteenInts[index] is

    mov( index, ebx );
    shl( 2, ebx );                     // Sneaky way to compute 4*ebx
    mov( SixteenInts[ ebx ], eax );

There are two important things to notice here. First, this code uses the shl instruction rather than the intmul instruction to compute 4*index. The main reason for choosing shl is efficiency: shl is a lot faster than intmul on many processors.

The second thing to note about this instruction sequence is that it does not explicitly compute the sum of the base address plus the index times 4. Instead, it relies on the indexed addressing mode to implicitly compute this sum. The instruction mov( SixteenInts[ ebx ], eax ); loads EAX from location SixteenInts + ebx, which is the base address plus index*4 (because EBX contains index*4). Sure, you could have used

    lea( eax, SixteenInts );
    mov( index, ebx );
    shl( 2, ebx );            // Sneaky way to compute 4*ebx
    add( eax, ebx );          // Compute base address plus index*4
    mov( [ebx], eax );

in place of the previous sequence, but why use five instructions where three will do the same job? This is a good example of why you should know your addressing modes inside and out. Choosing the proper addressing mode can reduce the size of your program, and fewer instructions often mean faster code as well.

Of course, as long as we're discussing efficiency improvements, it's worth pointing out that the 80x86 scaled indexed addressing modes let you automatically multiply an index by 1, 2, 4, or 8. Because this current example multiplies the index by 4, we can simplify the code even more by using the scaled indexed addressing mode:

    mov( index, ebx );
    mov( SixteenInts[ ebx*4 ], eax );
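
To see the scaled indexed addressing mode in context, here is a minimal sketch of a complete HLA program built around the two-instruction sequence. The program name, the initial values in SixteenInts, and the value of index are assumptions chosen purely for illustration; the declaration in the previous section may differ.

```hla
program ScaledIndexDemo;
#include( "stdlib.hhf" )

static
    // Illustrative values only: element i holds i*10.
    SixteenInts: int32[16] :=
        [   0,  10,  20,  30,  40,  50,  60,  70,
           80,  90, 100, 110, 120, 130, 140, 150 ];

    index: uns32 := 5;          // Hypothetical index for this demo

begin ScaledIndexDemo;

    mov( index, ebx );                  // ebx := index
    mov( SixteenInts[ ebx*4 ], eax );   // eax := SixteenInts[ index ]

    stdout.put( "SixteenInts[", index, "] = ", (type int32 eax), nl );

end ScaledIndexDemo;
```

With these assumed values, the program should print SixteenInts[5] = 50, because the CPU reads the double word at byte offset 5*4 = 20 from the start of the array.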

Note, however, that if you need to multiply by some constant other than 1, 2, 4, or 8, then you cannot use the scaled indexed addressing modes. Similarly, if you need to multiply by some element size that is not a power of 2, you cannot use a single shl instruction to multiply the index by the element size; instead, you will have to use intmul or some other instruction sequence to do the multiplication.
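
As a sketch of the non-power-of-2 case, the following program indexes an array whose elements are 10 bytes each, so neither a scale factor nor a single shift applies. The array TenByteItems, the program name, and the index value are hypothetical; real80 is used only because it is a 10-byte HLA type, and @size supplies the element size as a compile-time constant.

```hla
program IntmulIndexDemo;
#include( "stdlib.hhf" )

static
    // Hypothetical array of eight 10-byte elements.
    TenByteItems: real80[8];

    index: uns32 := 3;          // Hypothetical index for this demo

begin IntmulIndexDemo;

    mov( index, ebx );
    intmul( @size( real80 ), ebx );     // ebx := index * 10
    lea( eax, TenByteItems[ ebx ] );    // eax := address of element 3

    // Subtract the base address to show the computed byte offset.
    lea( ecx, TenByteItems );
    sub( ecx, eax );                    // eax := eax - ecx
    stdout.put( "Byte offset of element ", index, " = ", (type uns32 eax), nl );

end IntmulIndexDemo;
```

With index set to 3, the displayed offset should be 30, which is index times the 10-byte element size.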

The indexed addressing mode on the 80x86 is a natural fit for accessing elements of a single-dimensional array. Indeed, its syntax even suggests an array access. The important thing to keep in mind is that you must multiply the index by the size of an element. Failure to do so will produce incorrect results whenever the element size is greater than 1 byte.
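
The following sketch contrasts a correctly scaled access with one that omits the multiplication. The program name, the initial values, and the index are assumptions made for this demonstration only.

```hla
program MissingScaleDemo;
#include( "stdlib.hhf" )

static
    // Illustrative values only: element i holds i.
    SixteenInts: int32[16] :=
        [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 ];

    index: uns32 := 1;          // Hypothetical index for this demo

begin MissingScaleDemo;

    // Correct: the index is scaled by the 4-byte element size.
    mov( index, ebx );
    mov( SixteenInts[ ebx*4 ], eax );
    stdout.put( "Scaled access:   ", (type int32 eax), nl );

    // Incorrect: the unscaled index is treated as a byte offset,
    // so the load straddles elements 0 and 1.
    mov( index, ebx );
    mov( SixteenInts[ ebx ], eax );
    stdout.put( "Unscaled access: ", (type int32 eax), nl );

end MissingScaleDemo;
```

With these values the scaled access yields 1. The unscaled access reads the 4 bytes starting at byte offset 1, which are the upper 3 bytes of element 0 followed by the low byte of element 1; on the little-endian 80x86 that is $0100_0000, or 16777216, rather than the intended element.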