1.10 Introduction to the HLA Standard Library

HLA is much easier to learn and use than standard assembly language for two reasons. The first is HLA's high-level syntax for declarations and control structures. This leverages your high-level language knowledge, allowing you to learn assembly language more efficiently. The second is the HLA Standard Library, which provides many common, easy-to-use assembly language routines that you can call without having to write the code yourself (and, more importantly, without having to learn how to write it yourself). This eliminates one of the larger stumbling blocks many people have when learning assembly language: the need for sophisticated I/O and support code in order to write basic statements. Prior to the advent of a standardized assembly language library, it often took considerable study before a new assembly language programmer could do as much as print a string to the display. With the HLA Standard Library, this roadblock is removed, and you can concentrate on learning assembly language concepts rather than learning low-level I/O details that are specific to a given operating system.

A wide variety of library routines is only part of HLA's support. After all, assembly language libraries have been around for quite some time.[11] HLA's Standard Library complements HLA by providing a high-level language interface to these routines. Indeed, the HLA language itself was originally designed specifically to allow the creation of a high-level set of library routines. This high-level interface, combined with the high-level nature of many of the routines in the library, packs a surprising amount of power in an easy-to-use package.

The HLA Standard Library consists of several modules organized by category. Table 1-5 lists many of the modules that are available.[12]

Table 1-5. HLA Standard Library Modules

Name        Description
----------  ------------------------------------------------------------
args        Command-line parameter-parsing support routines.
arrays      Array declarations and operations.
bits        Bit-manipulation functions.
blobs       Binary large objects: operations on large blocks of binary data.
bsd         OS API calls for FreeBSD (HLA FreeBSD version only).
chars       Operations on character data.
console     Portable console (text screen) operations (cursor movement, screen clears, etc.).
conv        Various conversions between strings and other values.
coroutines  Support for coroutines ("cooperative multitasking").
cset        Character set functions.
DateTime    Calendar, date, and time functions.
env         Access to OS environment variables.
excepts     Exception-handling routines.
fileclass   Object-oriented file input and output.
fileio      File input and output routines.
filesys     Access to the OS file system.
hla         Special HLA constants and other values.
Linux       Linux system calls (HLA Linux version only).
lists       An HLA class for manipulating linked lists.
mac         OS API calls for Mac OS X (HLA Mac OS X version only).
math        Extended-precision arithmetic, transcendental functions, and other mathematical functions.
memmap      Memory-mapped file operations.
memory      Memory allocation, deallocation, and support code.
patterns    The HLA pattern-matching library.
random      Pseudo-random number generators and support code.
sockets     A set of network communication functions and classes.
stderr      Provides user output and several other support functions.
stdin       User input routines.
stdio       A support module for stderr, stdin, and stdout.
stdout      Provides user output and several other support routines.
strings     HLA's powerful string library.
tables      Table (associative array) support routines.
threads     Support for multithreaded applications and process synchronization.
timers      Support for timing events in an application.
win32       Constants used in Windows calls (HLA Windows version only).
x86         Constants and other items specific to the 80x86 CPU.

Later sections of this text will explain many of these modules in greater detail. This section will concentrate on the most important routines (at least to beginning HLA programmers): those in the stdio library.

Perhaps the first place to start is with a description of some common constants that the stdio module defines for you. Consider the following (typical) example:

stdout.put( "Hello World", nl );

The nl appearing at the end of this statement stands for newline. The nl identifier is not a special HLA reserved word, nor is it specific to the stdout.put statement. Instead, it's simply a predefined constant that corresponds to the string containing the standard end-of-line sequence (a carriage return/line feed pair under Windows or just a line feed under Linux, FreeBSD, and Mac OS X).

In addition to the nl constant, the HLA standard I/O library module defines several other useful character constants, as listed in Table 1-6.

Except for nl, these characters appear in the stdio namespace[13] (and therefore require the stdio. prefix). The placement of these ASCII constants within the stdio namespace helps avoid naming conflicts with your own variables. The nl name does not appear within a namespace because you will use it very often, and typing stdio.nl would get tiresome very quickly.

Many of the HLA I/O routines have a stdin or stdout prefix. Technically, this means that the standard library defines these names in a namespace. In practice, this prefix suggests where the input is coming from (the standard input device) or going to (the standard output device). By default, the standard input device is the system keyboard. Likewise, the default standard output device is the console display. So, in general, statements that have stdin or stdout prefixes will read and write data on the console device.

When you run a program from the command-line window (or shell), you have the option of redirecting the standard input and/or standard output devices. A command-line parameter of the form >outfile redirects the standard output device to the specified file (outfile). A command-line parameter of the form <infile redirects the standard input so that its data comes from the specified input file (infile). The following examples demonstrate how to use these parameters when running a program named testpgm in the command window:[14]

testpgm <input.data
testpgm >output.txt
testpgm <in.txt >output.txt

The stdout.newln procedure prints a newline sequence to the standard output device. This is functionally equivalent to saying stdout.put( nl );. The call to stdout.newln is sometimes a little more convenient. For example:

stdout.newln();

The stdout.puti8, stdout.puti16, and stdout.puti32 library routines print a single parameter (one byte, two bytes, or four bytes, respectively) as a signed integer value. The parameter may be a constant, a register, or a memory variable, as long as the size of the actual parameter is the same as the size of the formal parameter.

These routines print the value of their specified parameter to the standard output device. These routines will print the value using the minimum number of print positions possible. If the number is negative, these routines will print a leading minus sign. Here are some examples of calls to these routines:

stdout.puti8( 123 );
stdout.puti16( dx );
stdout.puti32( i32Var );
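As a rough illustration of these calls in context, the following sketch wraps them in a complete program. The program name PutIntDemo, the initial value of i32Var, and the value loaded into DX are assumptions chosen purely for this illustration.

```hla
program PutIntDemo;
#include( "stdlib.hhf" )

static
    i32Var: int32 := -123456;   // hypothetical test value

begin PutIntDemo;

    stdout.puti8( 123 );        // 8-bit constant
    stdout.newln();

    mov( -320, dx );            // hypothetical 16-bit value in a register
    stdout.puti16( dx );
    stdout.newln();

    stdout.puti32( i32Var );    // 32-bit memory variable
    stdout.newln();

end PutIntDemo;
```

Each call is followed by stdout.newln because the stdout.putiX routines print only the digits (and any minus sign) and do not end the line themselves.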

The stdout.puti8Size, stdout.puti16Size, and stdout.puti32Size routines output signed integer values to the standard output, just like the stdout.putiX routines. These routines, however, provide more control over the output; they let you specify the (minimum) number of print positions the value will require on output. These routines also let you specify a padding character should the print field be larger than the minimum needed to display the value. These routines require the following parameters:

stdout.puti8Size( Value8, width, padchar );
stdout.puti16Size( Value16, width, padchar );
stdout.puti32Size( Value32, width, padchar );

The Value* parameter can be a constant, a register, or a memory location of the specified size. The width parameter can be any signed integer value between −256 and +256; you may supply it as a constant, a 32-bit register, or a 32-bit memory location. The padchar parameter should be a single-character value.

Like the stdout.putiX routines, these routines print the specified value as a signed integer to the standard output device. These routines, however, let you specify the field width for the value, that is, the minimum number of print positions they will use when printing the value. If the number requires more print positions (e.g., if you attempt to print 1234 with a field width of 2), then these routines will print however many characters are necessary to properly display the value. On the other hand, if the absolute value of the width parameter is greater than the number of character positions required to display the value, then these routines will print extra padding characters to ensure that the output occupies at least that many character positions. If the width value is negative, the number is left justified in the print field; if the width value is positive, the number is right justified in the print field.

If the absolute value of the width parameter is greater than the minimum number of print positions, then these stdout.putiXSize routines will print padding characters before or after the number. The padchar parameter specifies which character these routines will print. Most of the time you would specify a space as the pad character; for special cases, you might specify some other character. Remember, the padchar parameter is a character value; in HLA, character constants are surrounded by apostrophes, not quotation marks. You may also specify an 8-bit register as this parameter.
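The width and padding rules described above can be sketched with three calls on the same value. This is only an illustrative sketch; the program name PutSizeDemo and the specific value, widths, and pad characters are assumptions, not part of the original examples.

```hla
program PutSizeDemo;
#include( "stdlib.hhf" )

begin PutSizeDemo;

    // Positive width: right justified in an 8-position field,
    // padded with spaces.
    stdout.puti32Size( 1234, 8, ' ' );
    stdout.newln();

    // Negative width: left justified in an 8-position field,
    // padded with '*' (note the apostrophes around the character).
    stdout.puti32Size( 1234, -8, '*' );
    stdout.newln();

    // Width smaller than the value needs: all four digits
    // are still printed and no padding is added.
    stdout.puti32Size( 1234, 2, ' ' );
    stdout.newln();

end PutSizeDemo;
```

Using a visible pad character such as '*' while experimenting makes it easy to see which side of the number the padding lands on.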

Example 1-4 provides a short HLA program that demonstrates the use of the stdout.puti32Size routine to display a list of values in tabular form.

The stdout.put routine[15] is one of the most flexible output routines in the standard output library module. It combines most of the other output routines into a single, easy-to-use procedure.

The generic form for the stdout.put routine is the following:

stdout.put( list_of_values_to_output );

The stdout.put parameter list consists of one or more constants, registers, or memory variables, each separated by a comma. This routine displays the value associated with each parameter appearing in the list. Because we've already been using this routine throughout this chapter, you've already seen many examples of this routine's basic form. It is worth pointing out that this routine has several additional features not apparent in the examples appearing in this chapter. In particular, each parameter can take one of the following two forms:

value
value:width

The value may be any legal constant, register, or memory variable object. In this chapter, you've seen string constants and memory variables appearing in the stdout.put parameter list. These parameters correspond to the first form above. The second parameter form above lets you specify a minimum field width, similar to the stdout.putiXSize routines.[16] The program in Example 1-5 produces the same output as the program in Example 1-4; however, Example 1-5 uses stdout.put rather than stdout.puti32Size.
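A minimal sketch of the two parameter forms, side by side, might look like the following. The program name PutWidthDemo and the variables count and total (with their initial values) are hypothetical and exist only for this illustration.

```hla
program PutWidthDemo;
#include( "stdlib.hhf" )

static
    count: int32 := 42;     // hypothetical value
    total: int32 := 1234;   // hypothetical value

begin PutWidthDemo;

    // First form (value): each integer uses only as many
    // print positions as it needs.
    stdout.put( "count=", count, " total=", total, nl );

    // Second form (value:width): each integer is printed in a
    // field of at least 8 positions, padded with spaces.
    stdout.put( "count=", count:8, " total=", total:8, nl );

end PutWidthDemo;
```

Mixing both forms in one parameter list is fine; the width applies only to the value it is attached to.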

The stdout.put routine is capable of much more than the few attributes this section describes. This text will introduce those additional capabilities as appropriate.

The stdin.getc routine reads the next available character from the standard input device's input buffer.[17] It returns this character in the CPU's AL register. The program in Example 1-6 demonstrates a simple use of this routine.

This program uses the stdin.readLn routine to force a new line of input from the user. A description of stdin.readLn appears in 1.10.9 The stdin.readLn and stdin.flushInput Routines.
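Separately from that example, the following much smaller sketch shows stdin.getc on its own: it reads one character, copies it out of AL, and echoes it. The program name GetcDemo and the variable userChar are assumptions made for this illustration.

```hla
program GetcDemo;
#include( "stdlib.hhf" )

static
    userChar: char;     // hypothetical variable that holds the character read

begin GetcDemo;

    stdout.put( "Type a character and press Enter: " );
    stdin.getc();               // the character comes back in AL
    mov( al, userChar );        // save it before AL is reused

    stdout.put( "You typed '", userChar, "'", nl );

end GetcDemo;
```

Copying AL into a char variable before printing lets stdout.put treat the value as a character rather than as a raw byte register.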

The stdin.geti8, stdin.geti16, and stdin.geti32 routines read 8-, 16-, and 32-bit signed integer values from the standard input device. These routines return their values in the AL, AX, or EAX register, respectively. They provide the standard mechanism for reading signed integer values from the user in HLA.

Like the stdin.getc routine, these routines read a sequence of characters from the standard input buffer. They begin by skipping over any whitespace characters (spaces, tabs, and so on) and then convert the following stream of decimal digits (with an optional leading minus sign) into the corresponding integer. These routines raise an exception (that you can trap with the try..endtry statement) if the input sequence is not a valid integer string or if the user input is too large to fit in the specified integer size. Note that values read by stdin.geti8 must be in the range −128..+127; values read by stdin.geti16 must be in the range −32,768..+32,767; and values read by stdin.geti32 must be in the range −2,147,483,648..+2,147,483,647.

The sample program in Example 1-7 demonstrates the use of these routines.

You should compile and run this program and then test what happens when you enter a value that is out of range or enter an illegal string of characters.
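To trap those failures instead of letting the program stop, a sketch along the following lines could be used. The exception identifiers ex.ConversionError and ex.ValueOutOfRange are assumed here to be the ones the Standard Library raises for an invalid integer string and an out-of-range value, respectively; check the HLA documentation for the exact names in your version. The program name GetIntDemo is also hypothetical.

```hla
program GetIntDemo;
#include( "stdlib.hhf" )

static
    i8: int8;

begin GetIntDemo;

    try

        stdout.put( "Enter an integer between -128 and +127: " );
        stdin.geti8();          // result comes back in AL
        mov( al, i8 );
        stdout.put( "You entered ", i8, nl );

      // Assumed identifier: input was not a valid integer string.
      exception( ex.ConversionError );

        stdout.put( "That was not a valid integer.", nl );

      // Assumed identifier: value does not fit in eight bits.
      exception( ex.ValueOutOfRange );

        stdout.put( "That value is out of range for an 8-bit integer.", nl );

    endtry;

end GetIntDemo;
```

Entering text such as abc or a value such as 300 should exercise the two handlers, assuming the identifiers above match your library version.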

Whenever you call an input routine like stdin.getc or stdin.geti32, the program does not necessarily read the value from the user at that moment. Instead, the HLA Standard Library buffers the input by reading a whole line of text from the user. Calls to input routines will fetch data from this input buffer until the buffer is empty. While this buffering scheme is efficient and convenient, sometimes it can be confusing. Consider the following code sequence:

stdout.put( "Enter a small integer between -128 and +127: " );
stdin.geti8();
mov( al, i8 );

stdout.put( "Enter a small integer between -32768 and +32767: " );
stdin.geti16();
mov( ax, i16 );

Intuitively, you would expect the program to print the first prompt message, wait for user input, print the second prompt message, and wait for the second user input. However, this isn't exactly what happens. For example, if you run this code (from the sample program in the previous section) and enter the text 123 456 in response to the first prompt, the program will not stop for additional user input at the second prompt. Instead, the stdin.geti16 call will read the second integer (456) from the line that is already sitting in the input buffer.

In general, the stdin routines read text from the user only when the input buffer is empty. As long as the input buffer contains additional characters, the input routines will attempt to read their data from the buffer. You can take advantage of this behavior by writing code sequences such as the following:

stdout.put( "Enter two integer values: " );
stdin.geti32();
mov( eax, intval );
stdin.geti32();
mov( eax, AnotherIntVal );

This sequence allows the user to enter both values on the same line (separated by one or more whitespace characters), thus preserving space on the screen. So the buffered behavior of the input routines is sometimes desirable; at other times, however, it can be counterintuitive.

Fortunately, the HLA Standard Library provides two routines, stdin.readLn and stdin.flushInput, that let you control the standard input buffer. The stdin.readLn routine discards everything that is in the input buffer and immediately requires the user to enter a new line of text. The stdin.flushInput routine simply discards everything that is in the buffer. The next time an input routine executes, the system will require a new line of input from the user. You would typically call stdin.readLn immediately before some standard input routine; you would normally call stdin.flushInput immediately after a call to a standard input routine.

Note

If you are calling stdin.readLn and you find that you are having to input your data twice, this is a good indication that you should be calling stdin.flushInput rather than stdin.readLn. In general, you should always be able to call stdin.flushInput to flush the input buffer and read a new line of data on the next input call. The stdin.readLn routine is rarely necessary, so you should use stdin.flushInput unless you really need to immediately force the input of a new line of text.
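Applied to the two-prompt sequence shown earlier, the advice above might be sketched as follows. The program name FlushDemo and the final stdout.put line are additions for illustration only; the variables i8 and i16 follow the earlier fragment.

```hla
program FlushDemo;
#include( "stdlib.hhf" )

static
    i8:  int8;
    i16: int16;

begin FlushDemo;

    stdout.put( "Enter a small integer between -128 and +127: " );
    stdin.geti8();
    mov( al, i8 );
    stdin.flushInput();     // discard whatever else was typed on that line

    stdout.put( "Enter a small integer between -32768 and +32767: " );
    stdin.geti16();         // buffer is now empty, so this waits for a new line
    mov( ax, i16 );

    stdout.put( "i8=", i8, " i16=", i16, nl );

end FlushDemo;
```

With the stdin.flushInput call in place, typing 123 456 at the first prompt no longer satisfies the second prompt; the 456 is thrown away and the program waits for fresh input.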

The stdin.get routine combines many of the standard input routines into a single call, just as stdout.put combines most of the output routines into a single call. Actually, stdin.get is a bit easier to use than stdout.put because the only parameters to this routine are a list of variable (or register) names.

Let's rewrite the example given in the previous section:

stdout.put( "Enter two integer values: " );
stdin.geti32();
mov( eax, intval );
stdin.geti32();
mov( eax, AnotherIntVal );

Using the stdin.get routine, we could rewrite this code as:

stdout.put( "Enter two integer values: " );
stdin.get( intval, AnotherIntVal );

As you can see, the stdin.get routine is a little more convenient to use.

Note that stdin.get stores the input values directly into the memory variables you specify in the parameter list; it does not return the values in a register unless you actually specify a register as a parameter. The stdin.get parameters must all be variables or registers.
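For completeness, here is a minimal sketch of the stdin.get fragment as a whole program. The program name GetDemo and the closing stdout.put line are assumptions added for illustration; the variable names follow the fragment above.

```hla
program GetDemo;
#include( "stdlib.hhf" )

static
    intval:        int32;
    AnotherIntVal: int32;

begin GetDemo;

    stdout.put( "Enter two integer values: " );

    // stdin.get stores each value directly into the named variable;
    // nothing needs to be copied out of EAX afterward.
    stdin.get( intval, AnotherIntVal );

    stdout.put( "You entered ", intval, " and ", AnotherIntVal, nl );

end GetDemo;
```

Because of the input buffering described earlier, the user can type both values on one line or press Enter after each one.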



[11] For example, see the UCR Standard Library for 80x86 Assembly Language Programmers.

[12] Because the HLA Standard Library is expanding, this list is probably out of date. See the HLA documentation for a current list of Standard Library modules.

[13] Namespaces are the subject of Chapter 5.

[14] For Linux, FreeBSD, and Mac OS X users, depending on how your system is set up, you may need to type ./ in front of the program's name to actually execute the program (e.g., ./testpgm <input.data).

[15] stdout.put is actually a macro, not a procedure. The distinction between the two is beyond the scope of this chapter. Chapter 9 describes their differences.

[16] Note that you cannot specify a padding character when using the stdout.put routine; the padding character defaults to the space character. If you need to use a different padding character, call the stdout.putiXSize routines.

[17] Buffer is just a fancy term for an array.