An operating system like Mac OS X, FreeBSD, Linux, or Windows tends to put different types of data into different sections (or segments) of memory. Although it is possible to reconfigure memory to your choice by running the linker and specifying various parameters, by default Windows loads an HLA program into memory using the organization appearing in Figure 3-7 (Linux, Mac OS X, and FreeBSD are similar, though they rearrange some of the sections).
The operating system reserves the lowest memory addresses. Generally, your application cannot access data (or execute instructions) at these low addresses. One reason the operating system reserves this space is to help trap NULL pointer references. If you attempt to access memory location 0, the operating system will generate a general protection fault, meaning you've accessed a memory location that doesn't contain valid data. Because programmers often initialize pointers to NULL (0) to indicate that the pointer is not pointing anywhere, an access of location 0 typically means that the programmer has made a mistake and has not properly initialized a pointer to a legal (non-NULL) value.
The remaining six areas in the memory map hold different types of data associated with your program. These sections of memory include the stack section, the heap section, the code section, the readonly section, the static section, and the storage section. Each of these memory sections corresponds to some type of data you can create in your HLA programs. Each section is discussed in detail below.
The code section contains the machine instructions that appear in an HLA program. HLA translates each machine instruction you write into a sequence of one or more byte values. The CPU interprets these byte values as machine instructions during program execution.
By default, when HLA links your program it tells the system that your program can execute instructions in the code segment and you can read data from the code segment. Note, specifically, that you cannot write data to the code segment. The operating system will generate a general protection fault if you attempt to store any data into the code segment.
Remember, machine instructions are nothing more than data bytes. In theory, you could write a program that stores data values into memory and then transfers control to the data it just wrote, thereby producing a program that writes itself as it executes. This possibility produces romantic visions of Artificial Intelligence programs that modify themselves to produce some desired result. In real life, the effect is somewhat less glamorous. Generally, self-modifying programs are very difficult to debug because the instructions are constantly changing behind the programmer's back. Because most modern operating systems make it very difficult to write self-modifying programs, we will not consider them any further in this text.
HLA automatically stores the data associated with your machine code into the code section. In addition to machine instructions, you can also store data into the code section by using the following pseudo-opcodes:[37]
    byte, word, dword, uns8, uns16, uns32, int8, int16, int32, boolean, char
The following byte statement exemplifies the syntax for each of these pseudo-opcodes:
    byte comma_separated_list_of_byte_constants;
Here are some examples:
    boolean true;
    char 'A';
    byte 0, 1, 2;
    byte "Hello", 0;
    word 0, 2;
    int8 -5;
    uns32 356789, 0;
If more than one value appears in the list of values after the pseudo-opcode, HLA emits each successive value to the code stream. So the first byte statement above emits 3 bytes to the code stream, the values 0, 1, and 2. If a string appears within a byte statement, HLA emits 1 byte of data for each character in the string. Therefore, the second byte statement above emits 6 bytes: the characters H, e, l, l, and o, followed by a 0 byte.
Keep in mind that the CPU will attempt to treat data you emit to the code stream as machine instructions unless you take special care not to allow the execution of the data. For example, if you write something like the following:
    mov( 0, ax );
    byte 0,1,2,3;
    add( bx, cx );
your program will attempt to execute the 0, 1, 2, and 3 byte values as machine instructions after executing the mov. Unless you know the machine code for a particular instruction sequence, sticking such data values into the middle of your code will generally crash your program. Typically when you place such data in your programs, you'll execute some code that transfers control around the data.
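One way to transfer control around embedded data is an unconditional jump over it. The following is only a minimal sketch: the program name skipEmbeddedData and the label pastData are made up for illustration, and the register values are arbitrary.

```hla
program skipEmbeddedData;
begin skipEmbeddedData;

    mov( 0, ax );
    jmp pastData;       // transfer control around the embedded bytes

    byte 0, 1, 2, 3;    // data in the code stream; the CPU never executes it

pastData:
    add( bx, cx );

end skipEmbeddedData;
```

Because the jmp instruction skips directly to pastData, the CPU never tries to decode the four data bytes as instructions.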
The static section is where you will typically declare your variables. Although the static section syntactically appears as part of a program or procedure, keep in mind that HLA moves all static variables to the static section in memory. Therefore, HLA does not sandwich the variables you declare in the static section between procedures in the code section.
In addition to declaring static variables, you can also embed lists of data into the static declaration section. You use the same technique to embed data into your static section that you use to embed data into the code section: You use the byte, word, dword, uns32, and similar pseudo-opcodes. Consider the following example:
static
    b:  byte := 0;
        byte 1,2,3;

    u:  uns32 := 1;
        uns32 5,2,10;

    c:  char;
        char 'a', 'b', 'c', 'd', 'e', 'f';

    bn: boolean;
        boolean true;
Data that HLA writes to the static memory segment using these pseudo-opcodes is written to the segment after the preceding variables. For example, the byte values 1, 2, and 3 are emitted to the static section after b's 0 byte. Because there aren't any labels associated with these values, you do not have direct access to these values in your program. You can use the indexed addressing modes to access these extra values (examples appear in Chapter 4).
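As a brief preview of what Chapter 4 covers properly, the sketch below reaches one of the unlabeled bytes by adding a constant offset to the labeled variable b. It assumes that HLA accepts a constant index on a scalar static variable and places the unlabeled bytes immediately after b; the program name is hypothetical.

```hla
program unlabeledStatic;
#include( "stdlib.hhf" )

static
    b:  byte := 0;
        byte 1, 2, 3;       // unlabeled bytes that follow b in memory

begin unlabeledStatic;

    mov( b[2], al );        // byte at address b+2, which holds the value 2
    stdout.put( "b[2] = ", (type uns8 al), nl );

end unlabeledStatic;
```

The (type uns8 al) coercion only asks stdout.put to display AL as an unsigned decimal value; it does not change the contents of the register.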
In the examples above, note that the c and bn variables do not have an (explicit) initial value. However, if you don't provide an initial value, HLA will initialize the variables in the static section to all 0 bits, so HLA assigns the NUL character (ASCII code 0) to c as its initial value. Likewise, HLA assigns false as the initial value for bn. In particular, you should note that your variable declarations in the static section always consume memory, even if you haven't assigned them an initial value.
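The readonly data section holds constants, tables, and other data that your program cannot change during execution. You create read-only objects by declaring them in the readonly declaration section. The readonly section is very similar to the static section with three primary differences: the readonly section begins with the reserved word readonly rather than static; every declaration in the readonly section must have an initializer; and the system does not allow you to store data into a readonly object while the program is running.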
Here's an example:
readonly
    pi:     real32 := 3.14159;
    e:      real32 := 2.71;
    MaxU16: uns16 := 65_535;
    MaxI16: int16 := 32_767;
All readonly object declarations must have an initializer because you cannot initialize the value under program control.[38] For all intents and purposes, you can think of readonly objects as constants. However, these constants consume memory, and other than the fact that you cannot write data to readonly objects, they behave like static variables. Because they behave like static objects, you cannot use a readonly object everywhere a constant is allowed; in particular, readonly objects are memory objects, so you cannot supply a readonly object (which you are treating like a constant) and some other memory object as the operands to an instruction.
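The operand restriction can be illustrated with a short sketch. The variable limit and the program name are hypothetical; the point is only that a readonly object must pass through a register before it can reach another memory object.

```hla
program readonlyOperand;
#include( "stdlib.hhf" )

static
    limit:  uns16;

readonly
    MaxU16: uns16 := 65_535;

begin readonlyOperand;

    // mov( MaxU16, limit );    // illegal: both operands are memory objects
    mov( MaxU16, ax );          // load the readonly value into a register
    mov( ax, limit );           // then store the register into the variable
    stdout.put( "limit = ", limit, nl );

end readonlyOperand;
```

Had MaxU16 been a true constant, a single mov with an immediate operand would have sufficed.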
As with the static section, you may embed data values in the readonly section using the byte, word, dword, and similar data declarations. For example:
readonly
    roArray: byte := 0;
             byte 1, 2, 3, 4, 5;

    qwVal:   qword := 1;
             qword 0;
The readonly section requires that you initialize all objects you declare. The static section lets you optionally initialize objects (or leave them uninitialized, in which case they have the default initial value of 0). The storage section completes the initialization coverage: you use it to declare variables that are always uninitialized when the program begins running. The storage section begins with the storage reserved word and contains variable declarations without initializers. Here is an example:
storage
    UninitUns32: uns32;
    i:           int32;
    character:   char;
    b:           byte;
Linux, FreeBSD, Mac OS X, and Windows will initialize all storage objects to 0 when they load your program into memory. However, it's probably not a good idea to depend on this implicit initialization. If you need an object initialized with 0, declare it in a static section and explicitly set it to 0.
Variables you declare in the storage section may consume less disk space in the executable file for the program. This is because HLA writes out initial values for readonly and static objects to the executable file, but it may use a compact representation for uninitialized variables you declare in the storage section; note, however, that this behavior is OS- and object-module-format dependent.
Because the storage section does not allow initialized values, you cannot put unlabeled values in the storage section using the byte, word, dword, and similar pseudo-opcodes.
The @nostorage attribute lets you declare variables in the static data declaration sections (i.e., static, readonly, and storage) without actually allocating memory for the variable. The @nostorage option tells HLA to assign the current address in a declaration section to a variable but not to allocate any storage for the object. That variable will share the same memory address as the next object appearing in the variable declaration section. Here is the syntax for the @nostorage option:
    variableName: varType; @nostorage;
Note that you follow the type name with @nostorage; rather than some initial value or just a semicolon. The following code sequence provides an example of using the @nostorage option in the readonly section:
readonly
    abcd: dword; @nostorage;
          byte 'a', 'b', 'c', 'd';
In this example, abcd is a double word whose L.O. byte contains 97 ('a'), byte 1 contains 98 ('b'), byte 2 contains 99 ('c'), and the H.O. byte contains 100 ('d'). HLA does not reserve storage for the abcd variable, so HLA associates the following 4 bytes in memory (allocated by the byte directive) with abcd.
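To see the overlay in action, a small test program can load abcd into a register. This is only a sketch under the byte layout just described; the program name nostorageDemo is invented for the example.

```hla
program nostorageDemo;
#include( "stdlib.hhf" )

readonly
    abcd:   dword; @nostorage;
            byte 'a', 'b', 'c', 'd';

begin nostorageDemo;

    mov( abcd, eax );       // EAX = $6463_6261: 'a' ($61) in the L.O. byte, 'd' ($64) in the H.O. byte
    stdout.put( "abcd = $", eax, nl );

end nostorageDemo;
```

The 4 bytes appear in reverse order within the 32-bit value because the x86 stores the L.O. byte of a double word at the lowest address.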
Note that the @nostorage attribute is legal only in the static, storage, and readonly sections (the so-called static declarations sections). HLA does not allow its use in the var section that you'll read about next.
HLA provides another variable declaration section, the var section, that you can use to create automatic variables. Your program will allocate storage for automatic variables whenever a program unit (i.e., main program or procedure) begins execution, and it will deallocate storage for automatic variables when that program unit returns to its caller. Of course, any automatic variables you declare in your main program have the same lifetime[39] as all the static, readonly, and storage objects, so the automatic allocation feature of the var section is wasted in the main program. In general, you should use automatic objects only in procedures (see Chapter 5 for details). HLA allows them in your main program's declaration section as a generalization.
Because variables you declare in the var section are created at runtime, HLA does not allow initializers on variables you declare in this section. So the syntax for the var section is nearly identical to that for the storage section; the only real difference in the syntax between the two is the use of the var reserved word rather than the storage reserved word.[40] The following example illustrates this:
var
    vInt:  int32;
    vChar: char;
HLA allocates variables you declare within the var section within the stack memory section. HLA does not allocate var objects at fixed locations; instead, it allocates these variables in an activation record associated with the current program unit. Chapter 5 discusses activation records in greater detail; for now it is important only to realize that HLA programs use the EBP register as a pointer to the current activation record. Therefore, whenever you access a var object, HLA automatically replaces the variable name with [EBP±displacement]. Displacement is the offset of the object within the activation record. This means that you cannot use the full scaled-indexed addressing mode (a base register plus a scaled index register) with var objects because var objects already use the EBP register as their base register. Although you will not directly use the two-register addressing modes often, the fact that the var section has this limitation is a good reason to avoid using the var section in your main program.
The static, readonly, storage, and var sections may appear zero or more times between the program header and the associated begin for the main program. Between these two points in your program, the declaration sections may appear in any order, as the following example demonstrates:
program demoDeclarations;

static
    i_static: int32;

var
    i_auto: int32;

storage
    i_uninit: int32;

readonly
    i_readonly: int32 := 5;

static
    j: uns32;

var
    k: char;

readonly
    i2: uns8 := 9;

storage
    c: char;

storage
    d: dword;

begin demoDeclarations;

    << Code goes here. >>

end demoDeclarations;
In addition to demonstrating that the sections may appear in an arbitrary order, this example also demonstrates that a given declaration section may appear more than once in your program. When multiple declaration sections of the same type (for example, the three storage sections above) appear in the declaration area of your program, HLA combines them into a single group.
[37] This isn't a complete list. HLA generally allows you to use any scalar data type name as a statement to reserve storage in the code section. You'll learn more about the available data types in Chapter 4.
[39] The lifetime of a variable is the span from the point at which memory is first allocated for that variable to the point at which that memory is deallocated.
[40] Actually, there are a few other, minor, differences, but we won't deal with those differences in this text. See the HLA language reference manual for more details.