1.7 The Memory Subsystem

A typical 80x86 processor running a modern 32-bit OS can access a maximum of 2^32 different memory locations, or just over 4 billion bytes. A few years ago, 4 gigabytes of memory would have seemed like infinity; modern machines, however, exceed this limit. Nevertheless, because the 80x86 architecture supports a maximum 4GB address space when using a 32-bit operating system like Windows, Mac OS X, FreeBSD, or Linux, the following discussion will assume the 4GB limit.

Of course, the first question you should ask is, "What exactly is a memory location?" The 80x86 supports byte-addressable memory. Therefore, the basic memory unit is a byte, which is sufficient to hold a single character or a (very) small integer value (we'll talk more about that in Chapter 2).

Think of memory as a linear array of bytes. The address of the first byte is 0 and the address of the last byte is 2^32 − 1. For an 80x86 processor, the following pseudo-Pascal array declaration is a good approximation of memory:

Memory: array [0..4294967295] of byte;

C/C++ and Java users might prefer the following syntax:

byte Memory[4294967296];
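
If you want to check where the bounds in these declarations come from, the arithmetic can be sketched in C. The function names address_count and last_address are invented for this illustration; they are not part of any library.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct byte addresses reachable with the given address width.
   The width must be below 64 so the shift fits in a 64-bit integer. */
uint64_t address_count(unsigned address_bits)
{
    assert(address_bits < 64);
    return (uint64_t)1 << address_bits;
}

/* Highest valid address: one less than the count, because numbering starts at 0. */
uint64_t last_address(unsigned address_bits)
{
    return address_count(address_bits) - 1;
}
```

For a 32-bit address, address_count(32) is 4,294,967,296 (the element count in the C declaration) and last_address(32) is 4,294,967,295 (the upper bound in the Pascal declaration).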

To execute the equivalent of the Pascal statement Memory [125] := 0; the CPU places the value 0 on the data bus, places the address 125 on the address bus, and asserts the write line (this generally involves setting that line to 0), as shown in Figure 1-7.

Figure 1-7. Memory write operation

To execute the equivalent of CPU := Memory [125]; the CPU places the address 125 on the address bus, asserts the read line (because the CPU is reading data from memory), and then reads the resulting data from the data bus (see Figure 1-8).

Figure 1-8. Memory read operation
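
The write and read sequences just described can be mimicked with a toy software model. This is only a sketch under several assumptions: the Bus structure, the function names, and the 256-byte memory are all invented for illustration, and the control lines are modeled as simple flags (1 means asserted) rather than as the active-low electrical signals a real system would typically use.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_SIZE 256u   /* toy size; the real address space is far larger */

static uint8_t Memory[MEM_SIZE];

/* Hypothetical model of the buses and control lines described in the text. */
typedef struct {
    uint32_t address;   /* address bus */
    uint8_t  data;      /* data bus (one byte wide in this model) */
    int      write;     /* nonzero while the write line is asserted */
    int      read;      /* nonzero while the read line is asserted */
} Bus;

/* Memory's side of the transaction: act on whichever control line is asserted. */
static void memory_respond(Bus *bus)
{
    assert(bus->address < MEM_SIZE);
    if (bus->write)
        Memory[bus->address] = bus->data;   /* latch the byte from the data bus */
    else if (bus->read)
        bus->data = Memory[bus->address];   /* drive the byte onto the data bus */
}

/* Equivalent of  Memory[address] := value;  */
void cpu_write(Bus *bus, uint32_t address, uint8_t value)
{
    bus->data = value;        /* place the value on the data bus */
    bus->address = address;   /* place the address on the address bus */
    bus->write = 1;           /* assert the write line */
    memory_respond(bus);
    bus->write = 0;
}

/* Equivalent of  CPU := Memory[address];  */
uint8_t cpu_read(Bus *bus, uint32_t address)
{
    bus->address = address;   /* place the address on the address bus */
    bus->read = 1;            /* assert the read line */
    memory_respond(bus);
    bus->read = 0;
    return bus->data;         /* read the result from the data bus */
}
```

With a zero-initialized Bus, calling cpu_write(&bus, 125, 0) followed by cpu_read(&bus, 125) walks through the same steps as Figures 1-7 and 1-8.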

This discussion applies only when accessing a single byte in memory. So what happens when the processor accesses a word or a double word? Because memory consists of an array of bytes, how can we possibly deal with values larger than a single byte? Easy—to store larger values, the 80x86 uses a sequence of consecutive memory locations. Figure 1-9 shows how the 80x86 stores bytes, words (2 bytes), and double words (4 bytes) in memory. The memory address of each of these objects is the address of the first byte of each object (that is, the lowest address).
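
To make the idea of consecutive locations concrete, here is a small C sketch that stores words and double words into a byte array one byte at a time. The helper names and the 256-byte Memory array are assumptions made for this example. The byte order (low-order byte at the lowest address) is the 80x86's little-endian convention; the paragraph above does not spell it out, so treat it as background knowledge rather than something derived from this section.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_SIZE 256u   /* toy size for illustration */

static uint8_t Memory[MEM_SIZE];

/* Store a 2-byte word; its address is that of its first (lowest) byte. */
void store_word(uint32_t address, uint16_t value)
{
    assert(address + 1 < MEM_SIZE);
    Memory[address]     = (uint8_t)(value & 0xFF);   /* low-order byte  */
    Memory[address + 1] = (uint8_t)(value >> 8);     /* high-order byte */
}

/* Store a 4-byte double word in four consecutive locations. */
void store_dword(uint32_t address, uint32_t value)
{
    assert(address + 3 < MEM_SIZE);
    for (unsigned i = 0; i < 4; i++)
        Memory[address + i] = (uint8_t)(value >> (8 * i));
}

/* Reassemble a double word from its four consecutive bytes. */
uint32_t load_dword(uint32_t address)
{
    assert(address + 3 < MEM_SIZE);
    uint32_t value = 0;
    for (unsigned i = 0; i < 4; i++)
        value |= (uint32_t)Memory[address + i] << (8 * i);
    return value;
}
```

After store_dword(192, 0x12345678), the bytes at addresses 192 through 195 hold 0x78, 0x56, 0x34, and 0x12, and the double word as a whole is said to live at address 192.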

Modern 80x86 processors don't actually connect directly to memory. Instead, there is a special memory buffer on the CPU known as the cache (pronounced "cash") that acts as a high-speed intermediary between the CPU and main memory. Although the cache handles the details automatically for you, one fact you should know is that accessing data objects in memory is sometimes more efficient if the address of the object is an exact multiple of the object's size. Therefore, it's a good idea to align 4-byte objects (double words) on addresses that are multiples of 4. Likewise, it's most efficient to align 2-byte objects on even addresses. You can efficiently access single-byte objects at any address. You'll see how to set the alignment of memory objects in 3.4 HLA Support for Data Alignment.
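
The alignment rule amounts to a remainder test on the address. The following C sketch shows one way to express it; is_aligned and align_up are hypothetical helper names chosen for this example, and align_up assumes the size is a power of two (as 1, 2, and 4 are).

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero when address is a multiple of size. */
int is_aligned(uint32_t address, uint32_t size)
{
    assert(size != 0);
    return address % size == 0;
}

/* Rounds address up to the next multiple of size.
   Assumes size is a power of two, which lets a bit mask do the rounding. */
uint32_t align_up(uint32_t address, uint32_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (address + size - 1) & ~(size - 1);
}
```

For example, address 6 is suitably aligned for a word (is_aligned(6, 2) is nonzero) but not for a double word (is_aligned(6, 4) is zero); align_up(6, 4) gives 8, the next address at which a double word would be aligned.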

Figure 1-9. Byte, word, and double-word storage in memory

Before leaving this discussion of memory objects, it's important to understand the correspondence between memory and HLA variables. One of the nice things about using an assembler/compiler like HLA is that you don't have to worry about numeric memory addresses. All you need to do is declare a variable in HLA, and HLA takes care of associating that variable with some unique set of memory addresses. For example, if you have the following declaration section:

static
     i8          :int8;
     i16         :int16;
     i32         :int32;

HLA will find some unused 8-bit byte in memory and associate it with the i8 variable; it will find a pair of consecutive unused bytes and associate i16 with them; finally, HLA will find 4 consecutive unused bytes (32 bits) and associate i32 with them. You'll always refer to these variables by name. You generally don't have to concern yourself with their numeric addresses. Still, you should be aware that HLA is doing this for you behind your back.
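
The same bookkeeping happens in high-level languages, and C lets you peek at the result. The sketch below is a C analogue of the HLA declarations, not HLA itself; show_variable_addresses is a name made up for this example. The addresses printed will differ from system to system and run to run, and a C compiler is free to leave gaps between the variables, so don't read anything into the specific values.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* C counterparts of the HLA static variables i8, i16, and i32. */
static int8_t  i8;
static int16_t i16;
static int32_t i32;

/* Print the size and the numeric address the toolchain chose for each variable. */
void show_variable_addresses(void)
{
    printf("i8  occupies %zu byte(s) starting at %p\n", sizeof i8,  (void *)&i8);
    printf("i16 occupies %zu byte(s) starting at %p\n", sizeof i16, (void *)&i16);
    printf("i32 occupies %zu byte(s) starting at %p\n", sizeof i32, (void *)&i32);
}
```

Your program would normally never need these numbers; the point is only that each name stands for a fixed group of 1, 2, or 4 bytes somewhere in memory.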