3.2 Runtime Memory Organization

An operating system like Mac OS X, FreeBSD, Linux, or Windows tends to put different types of data into different sections (or segments) of memory. Although you can rearrange this layout by running the linker with various parameters, by default Windows loads an HLA program into memory using the organization appearing in Figure 3-7 (Linux, Mac OS X, and FreeBSD are similar, though they rearrange some of the sections).

Figure 3-7. HLA typical runtime memory organization

The operating system reserves the lowest memory addresses. Generally, your application cannot access data (or execute instructions) at these low addresses. One reason the operating system reserves this space is to help trap NULL pointer references. If you attempt to access memory location 0, the operating system will generate a general protection fault, meaning you've accessed a memory location that doesn't contain valid data. Because programmers often initialize pointers to NULL (0) to indicate that the pointer is not pointing anywhere, an access of location 0 typically means that the programmer has made a mistake and has not properly initialized a pointer to a legal (non-NULL) value.

The remaining six areas in the memory map hold different types of data associated with your program. These sections of memory include the stack section, the heap section, the code section, the readonly section, the static section, and the storage section. Each of these memory sections corresponds to some type of data you can create in your HLA programs. Each section is discussed in detail below.

The code section contains the machine instructions that appear in an HLA program. HLA translates each machine instruction you write into a sequence of one or more byte values. The CPU interprets these byte values as machine instructions during program execution.

By default, when HLA links your program it tells the system that your program can execute instructions in the code segment and you can read data from the code segment. Note, specifically, that you cannot write data to the code segment. The operating system will generate a general protection fault if you attempt to store any data into the code segment.

Remember, machine instructions are nothing more than data bytes. In theory, you could write a program that stores data values into memory and then transfers control to the data it just wrote, thereby producing a program that writes itself as it executes. This possibility produces romantic visions of Artificial Intelligence programs that modify themselves to produce some desired result. In real life, the effect is somewhat less glamorous. Generally, self-modifying programs are very difficult to debug because the instructions are constantly changing behind the programmer's back. Because most modern operating systems make it very difficult to write self-modifying programs, we will not consider them any further in this text.

HLA automatically stores the data associated with your machine code into the code section. In addition to machine instructions, you can also store data into the code section by using the following pseudo-opcodes:[37]

     byte      int8
     word      int16
     dword     int32
     uns8      boolean
     uns16     char
     uns32

The following byte statement exemplifies the syntax for each of these pseudo-opcodes:

byte comma_separated_list_of_byte_constants ;

Here are some examples:

     boolean     true;
     char        'A';
     byte        0, 1, 2;
     byte        "Hello", 0;
     word        0, 2;
     int8        -5;
     uns32       356789, 0;

If more than one value appears in the list of values after the pseudo-opcode, HLA emits each successive value to the code stream. So the first byte statement above emits 3 bytes to the code stream, the values 0, 1, and 2. If a string appears within a byte statement, HLA emits 1 byte of data for each character in the string. Therefore, the second byte statement above emits 6 bytes: the characters H, e, l, l, and o, followed by a 0 byte.

Keep in mind that the CPU will attempt to treat data you emit to the code stream as machine instructions unless you take special care not to allow the execution of the data. For example, if you write something like the following:

     mov( 0, ax );
     byte 0,1,2,3;
     add( bx, cx );

your program will attempt to execute the 0, 1, 2, and 3 byte values as machine instructions after executing the mov. Unless you know the machine code for a particular instruction sequence, sticking such data values into the middle of your code will generally crash your program. Typically when you place such data in your programs, you'll execute some code that transfers control around the data.
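As a minimal sketch of transferring control around embedded data, the following program places a jmp ahead of the byte statement so the CPU never reaches the data bytes. The program name skipData and the label pastData are hypothetical names chosen for this illustration.

```hla
program skipData;

begin skipData;

     mov( 0, ax );
     jmp pastData;        // Skip over the data embedded in the code stream.

     byte 0, 1, 2, 3;     // Emitted to the code section but never executed.

pastData:
     add( bx, cx );       // Execution resumes here after the jmp.

end skipData;
```

Without the jmp, execution would fall through from the mov into the four data bytes.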

The static section is where you will typically declare your variables. Although the static section syntactically appears as part of a program or procedure, keep in mind that HLA moves all static variables to the static section in memory. Therefore, HLA does not sandwich the variables you declare in the static section between procedures in the code section.

In addition to declaring static variables, you can also embed lists of data into the static declaration section. You use the same technique to embed data into your static section that you use to embed data into the code section: You use the byte, word, dword, uns32, and so on pseudo-opcodes. Consider the following example:

static
     b:   byte := 0;
          byte 1,2,3;

     u:   uns32 := 1;
          uns32 5,2,10;

     c:   char;
          char 'a', 'b', 'c', 'd', 'e', 'f';

     bn:  boolean;
          boolean true;

Data that HLA writes to the static memory segment using these pseudo-opcodes is written to the segment after the preceding variables. For example, the byte values 1, 2, and 3 are emitted to the static section after b's 0 byte. Because there aren't any labels associated with these values, you do not have direct access to these values in your program. You can use the indexed addressing modes to access these extra values (examples appear in Chapter 4).
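The following sketch suggests one way to reach the unlabeled bytes that follow b by indexing from b's address. It assumes the HLA Standard Library header stdlib.hhf is available; the program name unlabeledData is hypothetical.

```hla
program unlabeledData;
#include( "stdlib.hhf" )

static
     b:   byte := 0;
          byte 1, 2, 3;       // No label; these bytes sit at b+1, b+2, and b+3.

begin unlabeledData;

     mov( 2, ebx );           // Offset of the value we want, relative to b.
     mov( b[ebx], al );       // Indexed addressing: loads the byte at b+2 (the value 2).
     stdout.put( "Byte at b+2: ", (type uns8 al), nl );

end unlabeledData;
```

This works because HLA emits the unlabeled values immediately after b, so an offset from b selects one of them.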

In the examples above, note that the c and bn variables do not have an (explicit) initial value. However, if you don't provide an initial value, HLA will initialize the variables in the static section to all 0 bits, so HLA assigns the NUL character (ASCII code 0) to c as its initial value. Likewise, HLA assigns false as the initial value for bn. In particular, you should note that your variable declarations in the static section always consume memory, even if you haven't assigned them an initial value.

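The readonly data section holds constants, tables, and other data that your program cannot change during execution. You create read-only objects by declaring them in the readonly declaration section. The readonly section is very similar to the static section with three primary differences: the section begins with the reserved word readonly rather than static, every declaration in the section has an initializer, and the system does not let you store data into a readonly object while the program is running.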

Here's an example:

readonly
     pi:              real32 := 3.14159;
     e:               real32 := 2.71;
     MaxU16:          uns16 := 65_535;
     MaxI16:          int16 := 32_767;

All readonly object declarations must have an initializer because you cannot initialize the value under program control.[38] For all intents and purposes, you can think of readonly objects as constants. However, these constants consume memory, and other than the fact that you cannot write data to readonly objects, they behave like static variables. Because they behave like static objects, you cannot use a readonly object everywhere a constant is allowed; in particular, readonly objects are memory objects, so you cannot supply a readonly object (which you are treating like a constant) and some other memory object as the operands to an instruction.
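To illustrate the operand restriction, here is a small sketch that copies a readonly object into a static variable by way of a register. The names readonlyOperand and limit are hypothetical, and the example assumes stdlib.hhf is available for the output routine.

```hla
program readonlyOperand;
#include( "stdlib.hhf" )

readonly
     MaxU16:   uns16 := 65_535;

static
     limit:    uns16 := 0;

begin readonlyOperand;

     // mov( MaxU16, limit );   // Not allowed: both operands are memory objects.
     mov( MaxU16, ax );         // Load the readonly object into a register first,
     mov( ax, limit );          // then store the register into the variable.
     stdout.put( "limit = ", limit, nl );

end readonlyOperand;
```

A true constant such as 65_535 could appear directly as the source operand; MaxU16 cannot, because it lives in memory.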

As with the static section, you may embed data values in the readonly section using the byte, word, dword, and so on data declarations. For example:

readonly
     roArray: byte := 0;
              byte 1, 2, 3, 4, 5;
     qwVal:   qword := 1;
              qword 0;

The readonly section requires that you initialize all objects you declare. The static section lets you optionally initialize objects (or leave them uninitialized, in which case they have the default initial value of 0). The storage section completes the initialization coverage: you use it to declare variables that are always uninitialized when the program begins running. The storage section begins with the storage reserved word and contains variable declarations without initializers. Here is an example:

storage
     UninitUns32:     uns32;
     i:               int32;
     character:       char;
     b:               byte;

Linux, FreeBSD, Mac OS X, and Windows will initialize all storage objects to 0 when they load your program into memory. However, it's probably not a good idea to depend on this implicit initialization. If you need an object initialized with 0, declare it in a static section and explicitly set it to 0.

Variables you declare in the storage section may consume less disk space in the executable file for the program. This is because HLA writes out initial values for readonly and static objects to the executable file, but it may use a compact representation for uninitialized variables you declare in the storage section; note, however, that this behavior is OS- and object-module-format dependent.

Because the storage section does not allow initialized values, you cannot put unlabeled values in the storage section using the byte, word, dword, and so on pseudo-opcodes.

The @nostorage attribute lets you declare variables in the static data declaration sections (i.e., static, readonly, and storage) without actually allocating memory for the variable. The @nostorage option tells HLA to assign the current address in a declaration section to a variable but not to allocate any storage for the object. That variable will share the same memory address as the next object appearing in the variable declaration section. Here is the syntax for the @nostorage option:

variableName: varType; @nostorage;

Note that you follow the type name with @nostorage; rather than some initial value or just a semicolon. The following code sequence provides an example of using the @nostorage option in the readonly section:

readonly
     abcd: dword; @nostorage;
           byte 'a', 'b', 'c', 'd';

In this example, abcd is a double word whose L.O. byte contains 97 ('a'), byte 1 contains 98 ('b'), byte 2 contains 99 ('c'), and the H.O. byte contains 100 ('d'). HLA does not reserve storage for the abcd variable, so HLA associates the following 4 bytes in memory (allocated by the byte directive) with abcd.
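A short sketch of how a program might observe this overlay follows. It loads abcd into EAX and checks the L.O. byte; the program name nostorageDemo is hypothetical, and stdlib.hhf is assumed to be available.

```hla
program nostorageDemo;
#include( "stdlib.hhf" )

readonly
     abcd: dword; @nostorage;
           byte 'a', 'b', 'c', 'd';

begin nostorageDemo;

     mov( abcd, eax );        // Loads the 4 bytes emitted by the byte directive.
     if( al = 'a' ) then      // AL holds the L.O. byte of abcd.

          stdout.put( "The L.O. byte of abcd is 'a'", nl );

     endif;

end nostorageDemo;
```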

Note that the @nostorage attribute is legal only in the static, storage, and readonly sections (the so-called static declarations sections). HLA does not allow its use in the var section that you'll read about next.

HLA provides another variable declaration section, the var section, that you can use to create automatic variables. Your program will allocate storage for automatic variables whenever a program unit (i.e., main program or procedure) begins execution, and it will deallocate storage for automatic variables when that program unit returns to its caller. Of course, any automatic variables you declare in your main program have the same lifetime[39] as all the static, readonly, and storage objects, so the automatic allocation feature of the var section is wasted in the main program. In general, you should use automatic objects only in procedures (see Chapter 5 for details). HLA allows them in your main program's declaration section as a generalization.

Because variables you declare in the var section are created at runtime, HLA does not allow initializers on variables you declare in this section. So the syntax for the var section is nearly identical to that for the storage section; the only real difference in the syntax between the two is the use of the var reserved word rather than the storage reserved word.[40] The following example illustrates this:

var
     vInt:      int32;
     vChar:     char;

HLA allocates the variables you declare in the var section in the stack memory section. HLA does not allocate var objects at fixed locations; instead, it allocates these variables in an activation record associated with the current program unit. Chapter 5 discusses activation records in greater detail; for now it is important only to realize that HLA programs use the EBP register as a pointer to the current activation record. Therefore, whenever you access a var object, HLA automatically replaces the variable name with [EBP±displacement], where displacement is the offset of the object within the activation record. This means that you cannot use the full scaled-indexed addressing mode (a base register plus a scaled index register) with var objects because var objects already use the EBP register as their base register. Although you will not often use the two-register addressing modes directly, the fact that the var section has this limitation is a good reason to avoid using the var section in your main program.
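As a rough sketch of an automatic variable in its usual setting, the following program declares a var object inside a procedure. The names autoVars, showLocal, and local are hypothetical, stdlib.hhf is assumed to be available, and the procedure syntax itself is covered in Chapter 5.

```hla
program autoVars;
#include( "stdlib.hhf" )

     procedure showLocal;
     var
          local: int32;       // Lives in showLocal's activation record on the stack.
     begin showLocal;

          mov( 42, local );   // HLA encodes this access as an EBP-relative address.
          stdout.put( "local = ", local, nl );

     end showLocal;

begin autoVars;

     showLocal();             // Storage for local exists only during this call.

end autoVars;
```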

The static, readonly, storage, and var sections may appear zero or more times between the program header and the associated begin for the main program. Between these two points in your program, the declaration sections may appear in any order, as the following example demonstrates:

program demoDeclarations;

static
     i_static:     int32;

var
     i_auto:       int32;

storage
     i_uninit:     int32;

readonly
     i_readonly:   int32 := 5;

static
     j:            uns32;

var
     k:            char;

readonly
     i2:           uns8 := 9;

storage
     c:            char;

storage
     d:            dword;

begin demoDeclarations;

     << Code goes here. >>

end demoDeclarations;

In addition to demonstrating that the sections may appear in an arbitrary order, this example also demonstrates that a given declaration section may appear more than once in your program. When multiple declaration sections of the same type (for example, the three storage sections above) appear in the declaration section of your program, HLA combines them into a single group.



[37] This isn't a complete list. HLA generally allows you to use any scalar data type name as a statement to reserve storage in the code section. You'll learn more about the available data types in Chapter 4.

[38] There is one exception you'll see in Chapter 5.

[39] The lifetime of a variable is the point from which memory is first allocated to the point the memory is deallocated for that variable.

[40] Actually, there are a few other, minor, differences, but we won't deal with those differences in this text. See the HLA language reference manual for more details.