Chapter 2. DATA REPRESENTATION

A major stumbling block many beginners encounter when attempting to learn assembly language is the common use of the binary and hexadecimal numbering systems. Although hexadecimal numbers are a little strange, their advantages outweigh their disadvantages by a large margin. Understanding the binary and hexadecimal numbering systems is important because their use simplifies the discussion of other topics, including bit operations, signed numeric representation, character codes, and packed data.

This chapter discusses several important concepts, including:

The binary and hexadecimal numbering systems
Binary data organization (bits, nibbles, bytes, words, and double words)
Signed and unsigned numbering systems
Arithmetic, logical, shift, and rotate operations on binary values
Bit fields and packed data

This is basic material, and the remainder of this text depends on your understanding of these concepts. If you are already familiar with these terms from other courses or study, you should at least skim this material before proceeding to the next chapter. If you are unfamiliar with this material, or only vaguely familiar with it, you should study it carefully before proceeding. All of the material in this chapter is important! Do not skip over any material.

2.1 Numbering Systems

Most modern computer systems do not represent numeric values using the decimal (base-10) system. Instead, they typically use a binary or two's complement numbering system.

2.1.1 A Review of the Decimal System

You've been using the decimal numbering system for so long that you probably take it for granted. When you see a number like 123, you don't think about the value 123; rather, you generate a mental image of how many items this value represents. In reality, however, the number 123 represents:

1*10² + 2*10¹ + 3*10⁰

100 + 20 + 3

In a decimal positional numbering system, each digit appearing to the left of the decimal point represents a value between 0 and 9 times an increasing power of 10. Digits appearing to the right of the decimal point represent a value between 0 and 9 times an increasing negative power of 10. For example, the value 123.456 means:

1*10² + 2*10¹ + 3*10⁰ + 4*10⁻¹ + 5*10⁻² + 6*10⁻³

100 + 20 + 3 + 0.4 + 0.05 + 0.006
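The positional expansion above can be sketched in a few lines of Python. This is only an illustration: the function names `positional_terms` and `positional_value` are hypothetical, and `Fraction` is used so that the negative powers of 10 are summed exactly rather than with floating-point rounding.

```python
from fractions import Fraction

def positional_terms(text):
    """Return (digit, exponent) pairs for a decimal numeral such as '123.456'."""
    whole, _, frac = text.partition(".")
    terms = []
    # Digits left of the decimal point get exponents len(whole)-1 down to 0.
    for i, ch in enumerate(whole):
        terms.append((int(ch), len(whole) - 1 - i))
    # Digits right of the decimal point get exponents -1, -2, -3, ...
    for i, ch in enumerate(frac, start=1):
        terms.append((int(ch), -i))
    return terms

def positional_value(text):
    """Sum digit * 10**exponent over all digit positions, exactly."""
    return sum(d * Fraction(10) ** e for d, e in positional_terms(text))

print(positional_terms("123.456"))
print(positional_value("123.456") == Fraction("123.456"))
```

Each pair printed by `positional_terms` corresponds to one term of the sum shown above, for example `(4, -1)` for 4*10⁻¹.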

2.1.2 The Binary Numbering System

Most modern computer systems operate using binary logic. The computer represents values using two voltage levels (usually 0 V and somewhere between +2.4 V and +5 V). Two such levels can represent exactly two unique values. These could be any two different values, but they typically represent the values 0 and 1. These values, coincidentally, correspond to the two digits in the binary numbering system.

The binary numbering system works just like the decimal numbering system, with two exceptions: Binary allows only the digits 0 and 1 (rather than 0..9), and binary uses powers of 2 rather than powers of 10. Therefore, it is very easy to convert a binary number to decimal. For each 1 in the binary string, add in 2ⁿ, where n is the zero-based position of the binary digit. For example, the binary value 11001010₂ represents:

1*2⁷ + 1*2⁶ + 0*2⁵ + 0*2⁴ + 1*2³ + 0*2² + 1*2¹ + 0*2⁰

= 128 + 64 + 8 + 2

= 202₁₀
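As a minimal sketch of this binary-to-decimal rule, the following Python function adds 2ⁿ for every 1 digit, counting positions from the right starting at zero. The name `binary_to_decimal` is a hypothetical helper chosen for this illustration.

```python
def binary_to_decimal(bits):
    """Add 2**n for each '1', where n is the zero-based position from the right."""
    total = 0
    for n, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** n
        elif bit != "0":
            raise ValueError(f"not a binary digit: {bit!r}")
    return total

print(binary_to_decimal("11001010"))  # 202
```

Python's built-in `int("11001010", 2)` performs the same conversion and gives a convenient cross-check.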

Converting decimal to binary is slightly more difficult. You must find those powers of 2 that, when added together, produce the decimal result.

A simple way to convert decimal to binary is the even/odd, divide-by-two algorithm. This algorithm uses the following steps:

1. If the number is even, emit a 0. If the number is odd, emit a 1.
2. Divide the number by 2 and throw away any fractional component or remainder.
3. If the quotient is 0, the algorithm is complete.
4. If the quotient is not 0 and is odd, insert a 1 before the current string; if the quotient is even, prefix your binary string with 0.
5. Go back to step 2 and repeat.
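The steps of the even/odd, divide-by-two algorithm might be sketched in Python as follows. The function name `decimal_to_binary` is hypothetical, and the sketch assumes a non-negative integer input.

```python
def decimal_to_binary(number):
    """Convert a non-negative integer to a binary digit string."""
    if number < 0:
        raise ValueError("this sketch handles non-negative integers only")
    # Emit 0 for an even number, 1 for an odd number.
    bits = "1" if number % 2 else "0"
    # Divide by 2, discarding any remainder.
    number //= 2
    # Stop when the quotient reaches 0.
    while number != 0:
        # Prefix the string with 1 if the quotient is odd, 0 if it is even.
        bits = ("1" if number % 2 else "0") + bits
        # Divide again and repeat.
        number //= 2
    return bits

print(decimal_to_binary(202))  # 11001010
```

Because each new digit is placed in front of the existing string, the first digit emitted ends up as the rightmost bit of the result.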

Binary numbers, although they have little importance in high-level languages, appear everywhere in assembly language programs. So you should be somewhat comfortable with them.

2.1.3 Binary Formats

In the purest sense, every binary number contains an infinite number of digits (or bits, which is short for binary digits). For example, we can represent the number 5 by any of the following:

101 00000101 0000000000101 ...000000000000101

Any number of leading zero digits may precede the binary number without changing its value.

Strictly speaking, we could ignore any leading zeros present in a value; 101₂, for example, represents the number 5. However, because the 80x86 typically works with groups of 8 bits, we'll find it much easier to zero-extend all binary numbers to some multiple of 4 or 8 bits. Therefore, following this convention, we'd represent the number 5 as 0101₂ or 00000101₂.

In the United States, most people separate every three digits with a comma to make larger numbers easier to read. For example, 1,023,435,208 is much easier to read and comprehend than 1023435208. We'll adopt a similar convention in this text for binary numbers. We will separate each group of four binary bits with an underscore. For example, we will write the binary value 1010111110110010 as 1010_1111_1011_0010.
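The zero-extension and underscore conventions can be illustrated with a short Python sketch. The helper `format_binary` and its parameters `min_bits` and `group` are assumptions made for this example, not part of any standard library.

```python
def format_binary(value, min_bits=4, group=4):
    """Zero-extend value to a multiple of `group` bits and separate groups with '_'."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    digits = format(value, "b")
    # Use at least min_bits digits, then round up to a whole number of groups.
    width = max(len(digits), min_bits)
    width = -(-width // group) * group
    digits = digits.zfill(width)
    return "_".join(digits[i:i + group] for i in range(0, width, group))

print(format_binary(5))                      # 0101
print(format_binary(5, min_bits=8))          # 0000_0101
print(format_binary(0b1010111110110010))     # 1010_1111_1011_0010
```

Python itself accepts the same underscore grouping in binary literals, so `0b1010_1111_1011_0010` denotes the same value as `0b1010111110110010`.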

We'll number each bit as follows:

The rightmost bit in a binary number is bit position 0.
Each bit to the left is given the next successive bit number.
An 8-bit binary value uses bits 0..7:
X₇ X₆ X₅ X₄ X₃ X₂ X₁ X₀
A 16-bit binary value uses bit positions 0..15:
X₁₅ X₁₄ X₁₃ X₁₂ X₁₁ X₁₀ X₉ X₈ X₇ X₆ X₅ X₄ X₃ X₂ X₁ X₀
A 32-bit binary value uses bit positions 0..31, and so on.

Bit 0 is the low-order (L.O.) bit (some refer to this as the least significant bit). The leftmost bit is called the high-order (H.O.) bit (or the most significant bit). We'll refer to the intermediate bits by their respective bit numbers.
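To make the bit-numbering convention concrete, here is a small Python sketch that reads individual bits by position. The helpers `get_bit` and `high_order_bit` are hypothetical names; the caller supplies the width (8, 16, 32, ...) because a Python integer has no fixed size.

```python
def get_bit(value, position):
    """Return the bit at `position`, where position 0 is the rightmost (L.O.) bit."""
    return (value >> position) & 1

def high_order_bit(value, width):
    """Return the H.O. bit of a value that is treated as `width` bits wide."""
    return get_bit(value, width - 1)

value = 0b1100_1010
# List bits 7 down to 0, matching the left-to-right layout X7 ... X0.
print([get_bit(value, n) for n in range(7, -1, -1)])  # [1, 1, 0, 0, 1, 0, 1, 0]
print(get_bit(value, 0))         # 0, the low-order bit
print(high_order_bit(value, 8))  # 1, the high-order bit of an 8-bit value
```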