Beginning Programming All-In-One Desk Reference For Dummies

Chapter 1: Structures and Arrays

In This Chapter

Using structures to store and retrieve data

Creating arrays

Using resizable arrays

Running multi-dimensional arrays

Combining structures with arrays

Detailing the drawbacks of arrays

All programs need to store data. If a program asks the user to type in her name, the program needs to store that name somewhere so it can find the name again. The most common way programs store data is to dump data in a variable.

Unfortunately, a variable can only hold one chunk of data at a time, such as a single number or a name. If you want to store a person’s first and last name along with their age, you have to create three separate variables, such as

Dim FirstName as String

Dim LastName as String

Dim Age as Integer

Creating separate variables to store related data can be like carrying around three separate wallets with one wallet holding your cash, a second wallet holding your credit cards, and a third wallet holding your driver’s license. Just as it’s more convenient to store your cash, credit cards, and driver’s license in a single wallet, so it’s also more convenient to store related data in a single variable. Two ways to store related data in one place are structures and arrays.

Because structures and arrays are two other ways to store data, they’re often called data structures.

Using Structures

A structure (also dubbed a record in some programming languages) does nothing more than group separate variables together. So rather than create and try to keep track of three separate variables, a structure lets you store multiple variables within another variable. So if you had three variables — FirstName, LastName, and Age — you could store them all within a structure, such as

Structure Person

Dim FirstName as String

Dim LastName as String

Dim Age as Integer

End Structure

A structure is a user-defined data type. You can’t use a structure until you declare a variable to represent that structure like this:

Dim Employee as Person

The preceding code creates an Employee variable that actually contains the FirstName, LastName, and Age variables, as shown in Figure 1-1.

Figure 1-1: A structure can contain multiple variables.

Storing data

To store data in a structure, you must

1. Identify the variable that represents that structure.

2. Identify the specific variable inside the structure to use.

So if you want to store the name Joe in the FirstName variable inside the Employee variable, you could do the following:

Employee.FirstName = "Joe"

If you wanted to store the name Smith in the LastName variable and the number 24 in the Age variable, inside the Employee variable, you could do the following:

Employee.LastName = "Smith"

Employee.Age = 24
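For comparison, here is a minimal C++ sketch of the same idea. C++ calls a structure a struct, and the dot between the variable name and the field name works the same way as in the BASIC code. The makeEmployee function name is made up for this example.

```cpp
#include <string>

// The same three related variables, grouped into one C++ struct.
struct Person {
    std::string firstName;
    std::string lastName;
    int age = 0;
};

// Hypothetical helper: builds one Person and fills in each field.
Person makeEmployee() {
    Person employee;                // the variable that represents the structure
    employee.firstName = "Joe";     // variable.field = value
    employee.lastName = "Smith";
    employee.age = 24;              // a number, so no quotation marks
    return employee;
}
```

Notice that Person by itself is only a data type. Nothing gets stored until you create a variable of that type, such as employee.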

Retrieving data

After you store data in a structure, you can always retrieve it again. Just identify

The variable that represents that structure

The actual variable name that holds the data

Suppose you defined a structure, as follows:

Structure Workers

Dim Name as String

Dim ID as Integer

Dim Salary as Single

End Structure

Before you can store any data in a structure, you must first declare a variable to represent that structure like this:

Dim Employees as Workers

Now you can store and retrieve data from a structure, as follows:

' This stores a name in the Employees structure

Employees.Name = "Jessie Balkins"

To retrieve data from this structure, identify the variable name that represents that structure and the variable that holds the data like this:

Print Employees.Name

This would retrieve the data in the Name variable, stored in the Employees variable structure, and print Jessie Balkins on-screen.
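Retrieving data works the same way in C++. The following sketch assumes a struct that mirrors the Workers structure; the describe function and the ID number 7 used in the note afterward are invented for illustration.

```cpp
#include <string>

struct Workers {
    std::string name;
    int id = 0;
    float salary = 0.0f;   // roughly what BASIC calls a Single
};

// Hypothetical helper: reads two fields back out and joins them into one line.
std::string describe(const Workers& w) {
    return w.name + " (ID " + std::to_string(w.id) + ")";
}
```

If you store Jessie Balkins in the name field and 7 in the id field of a Workers variable, describe returns the text Jessie Balkins (ID 7).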

Structures are just a way to cram multiple variables into a single variable. A structure can hold only one group of related data. To make structures more useful, programmers typically combine structures with another data structure: an array.

Using an Array

The problem with a single variable is that it can hold only a single chunk of data. So if you wanted to store a name, you could create a variable, such as

Dim Name as String

If you wanted to store a second name, you’d have to create a second variable, such as

Dim Name as String

Dim Name2 as String

The more names you want to store, the more separate variables you need to create. Because creating separate variables to store similar types of information can get tedious, computer scientists have created a “super” variable — an array. Unlike an ordinary variable that can hold only one chunk of data, an array can hold multiple chunks of data.

To create an array, you need to define these three items:

A variable name

The number of items you want to store (the array size)

The type of data to store (such as integers or strings)

So if you wanted to store 15 names in a variable, you could create a name array, such as

Dim NameArray(15) as String

The preceding code tells the computer to create a NameArray array, which can hold 15 strings (or 16 in BASIC dialects that number elements from 0; see “Default bounds” later in this chapter), as shown in Figure 1-2.

Figure 1-2: An array can hold multiple chunks of data.
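The same three items (a name, a size, and a data type) show up in other languages, too. Here is one way you might write it in C++, using the standard std::array type. The kNameCount constant and the makeNameArray function are hypothetical names chosen for this sketch.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Size: 15 elements. kNameCount is a hypothetical constant for this sketch.
constexpr std::size_t kNameCount = 15;

// Name: nameArray. Type of data: std::string.
std::array<std::string, kNameCount> makeNameArray() {
    std::array<std::string, kNameCount> nameArray;  // 15 empty strings
    return nameArray;
}
```

Because C++ numbers elements from 0, the 15 strings live in elements 0 through 14.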

Defining the size

An array acts like a bunch of buckets (dubbed elements), each of which can hold exactly one item. When you create an array, you must first define the size of the array, which determines how many chunks of data (elements) the array can hold.

Bounds

The size of an array is defined by two numbers:

The lower bound defines the number of the first array element.

The upper bound defines the number of the last array element.

Default bounds

The default value of the lower bound depends on the programming language:

Many programming languages, including the curly bracket language family of C and Java, always define the lower bound of an array starting with the number 0 (known as zero-based arrays).

Other programming languages always define the lower bound of an array starting with the number 1 (known as one-based arrays).

The following BASIC code actually creates a zero-based array that can hold six elements, numbered 0 through 5, as shown in Figure 1-3:

Dim LotteryNumbers(5) as Integer

Figure 1-3: One-based arrays number array elements differently than zero-based arrays.

If the programming language created a one-based array, the array would hold only five elements.

Zero-based arrays were made popular in the C language. As a result, languages derived from C, such as C++, C#, Java, and Objective-C, also use zero-based arrays. Because many programmers are familiar with zero-based arrays, many other programming languages also use them, such as Python, Visual Basic, and REALbasic. One-based arrays are less common but are found in some versions of BASIC along with less popular languages like Smalltalk, and by convention in Pascal, which lets you choose the bounds yourself.

When defining arrays, always make sure you know whether your programming language creates zero-based or one-based arrays. Otherwise, you may try to store data in non-existent array elements.
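To see what a non-existent array element looks like in practice, consider this C++ sketch of a six-element, zero-based array. It relies on the at() function of std::array, which checks the element number for you; the tryStore helper is a made-up name.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Six elements, numbered 0 through 5 (lower bound 0, upper bound 5).
using LotteryNumbers = std::array<int, 6>;

// Hypothetical helper: returns true if storing at this index succeeds.
bool tryStore(LotteryNumbers& numbers, std::size_t index, int value) {
    try {
        numbers.at(index) = value;   // at() checks the bounds; [] does not
        return true;
    } catch (const std::out_of_range&) {
        return false;                // the element does not exist
    }
}
```

Storing a value in element 0 or element 5 succeeds, but asking for element 6 fails because the last valid element number is always one less than the number of elements.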

Definable bounds

To avoid confusion, some programming languages (such as Pascal) let you define both the lower and upper bounds of an array.

If you wanted to create an array to hold five integers, you could use the following code:

Var

LotteryNumbers : array[1..5] of Integer;

This would number the LotteryNumbers array from 1 to 5. However, you could choose any number range of five like this:

Var

LotteryNumbers : array[33..37] of Integer;

This would create an array of five elements, numbered from 33 to 37, as shown in Figure 1-4.

Figure 1-4: Some programming languages let you define the numbering of an array.

One advantage of defining the numbering of an array is that you can use meaningful numbers. For example, if you wanted to store the names of employees in an array, you could number the array so each array element is identified by an employee number. So if Jan Howards has employee ID number 102, Mike Edwards has employee ID number 103, and John Perkins has employee ID number 104, you could create a three-element array, as shown in Figure 1-5, like this:

Var

EmployeeList : array[102..104] of String;

Figure 1-5: By defining your own numbering for an array, you can make those numbers useful and meaningful.
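Languages with zero-based arrays, such as C++, don't let you pick the bounds, but you can get a similar effect by subtracting the lower bound yourself. The following is only a sketch of that workaround; the EmployeeList class and its constants are hypothetical and aren't part of any standard library.

```cpp
#include <array>
#include <cstddef>
#include <string>

// C++ arrays always start at 0, so this hypothetical class subtracts the
// lower bound (102) to turn an employee ID into a zero-based position.
class EmployeeList {
public:
    static constexpr int kLowerBound = 102;
    static constexpr int kUpperBound = 104;

    // Lets you write list[103]; at() rejects IDs outside 102..104.
    std::string& operator[](int employeeId) {
        return names_.at(static_cast<std::size_t>(employeeId - kLowerBound));
    }

private:
    std::array<std::string, kUpperBound - kLowerBound + 1> names_;
};
```

With this class, storing Mike Edwards under list[103] actually puts the name in element 1 of the hidden zero-based array.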

Initializing

When you define an array, it’s a good idea to initialize that array. Initializing an array means filling it with initial data, such as

Zeroes for storing numbers in an array

Spaces for storing strings in an array

If you don’t initialize an array, its elements may contain whatever random data was left over in memory, which could confuse your program later.

Loops

To initialize an array, most programmers use a loop. This code uses a FOR-NEXT loop to initialize an array with zeroes:

Dim LotteryNumbers(5) as Integer

For I = 0 to 5

LotteryNumbers(I) = 0

Next I

Declarations

Some programming languages let you initialize an array without a loop. Instead, you declare an array and its initial data on the same line. This C++ code declares an array that can hold five integers and stores 0 in each array element:

int lotterynumbers[] = {0, 0, 0, 0, 0};
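The two approaches, a loop and a declaration, can be compared side by side. This sketch assumes a six-element array of integers and uses std::array; both function names are invented for the example.

```cpp
#include <array>
#include <cstddef>

// Fills every element with zero by using a loop, like a FOR-NEXT loop.
std::array<int, 6> zeroWithLoop() {
    std::array<int, 6> lotteryNumbers;          // contents undefined so far
    for (std::size_t i = 0; i < lotteryNumbers.size(); ++i) {
        lotteryNumbers[i] = 0;
    }
    return lotteryNumbers;
}

// Does the same job in the declaration: empty braces zero every element.
std::array<int, 6> zeroWithDeclaration() {
    std::array<int, 6> lotteryNumbers{};
    return lotteryNumbers;
}
```

Both functions hand back an array of six zeroes. The loop version is more typing, but it also works when each element needs a different starting value.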

Storing data

To store data in an array, you need to define two items:

The array name

The array element where you want to store the data

So if you wanted to store data in the first element of a zero-based array, you could do this:

int myarray[5];

myarray[0] = 357;

If you wanted to store data in the first element of a one-based array, you could do this:

Dim myarray(5) as Integer

myarray(1) = 357

You can store data in array elements in any order you want. For example, you could store the number 47 in the first array element, the number 91 in the fourth array element, and the number 6 in the second array element like this:

int myarray[5];

myarray[0] = 47;

myarray[3] = 91;

myarray[1] = 6;

Retrieving data

To retrieve data from an array, you need to identify

The array name

The array element number that contains the data you want to retrieve

Suppose you had the following BASIC code that creates an array that stores three names:

Dim Names(3) as String

Names(1) = "Nancy Titan"

Names(2) = "Johnny Orlander"

Names(3) = "Doug Slanders"

If you wanted to retrieve and print the data stored in the second element of the Names array, you could use the following:

Print Names(2)

This would print Johnny Orlander on-screen.
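If you translate this example to a zero-based language, every element number shifts down by one. Here is a rough C++ equivalent; the secondName function is a hypothetical wrapper so the example can hand back its result.

```cpp
#include <array>
#include <string>

// Hypothetical helper: stores three names, then retrieves the second one.
std::string secondName() {
    std::array<std::string, 3> names;
    names[0] = "Nancy Titan";      // first element
    names[1] = "Johnny Orlander";  // second element
    names[2] = "Doug Slanders";    // third element
    return names[1];               // the second element is number 1, not 2
}
```

The function returns Johnny Orlander, the same name the BASIC code prints.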

Working with Resizable Arrays

One problem with arrays is that you must define their size before you can use them:

If you define an array too large, you waste memory.

If you define an array too small, your program can’t store all the data it needs to keep.

To get around these problems, some programming languages let you create dynamic or resizable arrays. A resizable array lets you change the array’s size while your program is running.

Here are reasons for and against using resizable arrays:

Advantage: You can make the array grow or shrink as needed so you don’t waste memory creating an array too large or limit your program by creating an array too small.

Drawbacks: The nuisance of constantly defining the size of an array, and the possibility that some programming languages won’t let you preserve the contents of a resizable array each time the array grows or shrinks.

To create a resizable array, every programming language requires different steps. The following sections provide a couple of examples.

BASIC

In BASIC, you can declare an array, such as

Dim BigArray(5) as String

Then to change the size of that array, you have to use the ReDim command and define a new upper bound for the array, as shown in Figure 1-6, like this:

ReDim BigArray(2)

Figure 1-6: Resizing an array lets you expand or shrink an array.

Resizing an array erases everything currently stored in that array.

If you want to resize an array and save the data in the array, you can use the Preserve command like this:

ReDim Preserve BigArray(2)

Not every programming language lets you resize an array and preserve its contents.

C++

To create a resizable array in C++, you have to go through slightly different steps.

First, you must define a resizable array like this:

datatype *arrayname;

So if you wanted to create a resizable array of integers, you’d declare your array as follows:

int *numberarray;

Before you could store any data in this array, you’d have to define its size using the new command. The number in brackets is the number of elements, not the upper bound, so if you wanted the array to hold six elements (numbered 0 to 5), you could use the following:

int *numberarray;

numberarray = new int[6];

At this point, you could start storing data in your array like this:

int *numberarray;

numberarray = new int[6];

numberarray[0] = 23;

numberarray[5] = 907;

To resize an array again, you have to free the old array with the delete[] command and then use the new command along with a new size, such as

int *numberarray;

numberarray = new int[6];

numberarray[0] = 23;

numberarray[5] = 907;

delete[] numberarray;

numberarray = new int[3];

numberarray[1] = 48;

This C++ code first defines a resizable array and then gives it six elements (an upper bound of 5) to store the numbers 23 and 907 in the array elements numbered 0 and 5, respectively.

Then the delete[] command frees the original array, discarding all data stored in it, and the second new command creates a fresh array of three elements (an upper bound of 2). Finally, the code stores the number 48 into array element 1, as shown in Figure 1-7.

Figure 1-7: Resizing an array erases all data in the array.
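C++ has no built-in equivalent of the Preserve command, but you can copy the old data yourself before throwing the old array away. The sketch below shows one way to do that, followed by the standard std::vector type, which handles the copying for you. The resizePreserving and growVector names are made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: allocates a new array of newSize ints, copies over as
// many old values as fit, frees the old array, and returns the new one.
int* resizePreserving(int* oldArray, std::size_t oldSize, std::size_t newSize) {
    int* newArray = new int[newSize]();               // () zeroes the elements
    std::copy(oldArray, oldArray + std::min(oldSize, newSize), newArray);
    delete[] oldArray;
    return newArray;
}

// std::vector does the same bookkeeping for you.
std::vector<int> growVector() {
    std::vector<int> numbers = {23, 907};
    numbers.resize(4);     // existing values stay; new elements are 0
    return numbers;
}
```

When the array shrinks, any values that no longer fit are lost, so preserving data only goes as far as the new size allows.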

Working with Multi-Dimensional Arrays

Most arrays are one-dimensional because you define only the array’s length. However, you can create multi-dimensional arrays by defining multiple array sizes.

The most common multi-dimensional array is a two-dimensional array, which looks like a grid, as shown in Figure 1-8.

Figure 1-8: A two-dimensional array lets you store data in a grid.

You can create 3-, 4-, or even 19-dimensional arrays. However, after you get past a three-dimensional array, understanding how that array works can be too confusing, so most programmers stick to two-dimensional or three-dimensional arrays.

Creating a multi-dimensional array

To create a multi-dimensional array, you have to define another upper bound for an array. So if you wanted to create a 4 x 2 two-dimensional array, you could use the following BASIC code:

Dim BigArray(4,2) as String

To create a 4 x 2 array in C++, where the numbers in brackets are sizes rather than upper bounds (so the elements run from [0][0] to [3][1]), you could use the following code:

string bigarray[4][2];

To create arrays with three or more dimensions, keep adding bounds, such as

Dim BigArray(2,4,3,8) as String

The equivalent multi-dimensional array in C++ would look like this:

string bigarray[2][4][3][8];

Storing and retrieving data

To store data in a multi-dimensional array, you need to identify the exact array location. So if you had a two-dimensional array, you’d have to specify each of the two dimensions, such as

Dim BigArray(4,2) as String

BigArray(4,1) = "Ollie Bird"

After you store data in a multi-dimensional array, you can retrieve that data again by specifying the array name and the specific array element that contains the data you want. So if you had previously stored the string Ollie Bird in a two-dimensional array, you could retrieve the data stored in the 4,1 array element like this:

Print BigArray(4,1)

This command would print the string Ollie Bird.

The more dimensions you add to your array, the more space you create in your array, and the more memory your program needs. Don’t be afraid to use a multi-dimensional array; just don’t create one unless you really need one.

Two-dimensional arrays can be useful for modeling real-life items, such as checkerboards or tic-tac-toe games, which already look like two-dimensional arrays (grids) anyway.
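As a small illustration, here is how a tic-tac-toe board might look as a two-dimensional array in C++. This is only a sketch: the Board type and both function names are invented, and the code checks rows only, not columns or diagonals.

```cpp
#include <array>
#include <cstddef>

// A 3 x 3 grid of characters: ' ' means empty, 'X' and 'O' are the players.
using Board = std::array<std::array<char, 3>, 3>;

// Hypothetical helper: returns a board with every square set to a space.
Board emptyBoard() {
    Board board;
    for (auto& row : board) {
        row.fill(' ');
    }
    return board;
}

// Hypothetical helper: true if the given player fills an entire row.
bool ownsRow(const Board& board, std::size_t row, char player) {
    for (char cell : board[row]) {
        if (cell != player) {
            return false;
        }
    }
    return true;
}
```

To mark a square, you give both dimensions, such as board[1][2] = 'X' for the middle row and the last column.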

Using Structures with Arrays

All arrays can hold only one specific data type, such as integers or strings. So if you create an array that contains five elements, every element must contain the same data type, such as all integers.

Rather than define an array to contain a data type, like strings or integers, you can also define an array to contain a structure. A structure lets you cram multiple variables into a single variable, but a single structure by itself is fairly useless. After you store data in a single structure, you don’t have any room left to store anything else, as shown in Figure 1-9.

Figure 1-9: A structure can hold only one group of related data, but an array of structures can hold multiple groups of related data.

To use a structure with an array, you must first define a structure and the variables you want to store inside that structure. So if you want to store a company name, contact person, and total sales made to the company, you could define a structure like this:

Structure Company

Dim Name as String

Dim Contact as String

Dim Sales as Single

End Structure

Next, you can define your array, but instead of making your array hold a data type, like strings or integers, you can make your array hold your structure like this:

Dim Customers(3) as Company

This code creates an array, with elements numbered from 0 to 3, which holds the Company structure that you defined, as shown in Figure 1-10.

Figure 1-10: An array of structures acts like a Rolodex file or a simple database.

To store data in an array of structures, you need to identify the array element (in this example numbered 0 to 3) and the specific variable inside the structure to store your data. So if you wanted to store data in array element number 2, you could do the following:

Customers(2).Name = "Microsoft"

Customers(2).Contact = "Bill Gates"

Customers(2).Sales = 50195.27

Retrieving data from an array of structures means identifying the array name, the array element, and then the variable name stored in that structure. So if you wanted to print the name stored in the Contact variable of array element number 2, you could do the following:

Print Customers(2).Contact

This code would print Bill Gates on-screen. Storing and retrieving data from an array of structures means identifying the following items:

The array name (such as Customers)

The array element number (such as 2)

The variable inside the structure (such as Contact)
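Those three items look almost identical in C++. The following sketch assumes a struct that mirrors the Company structure and a four-element array (numbered 0 to 3); the CustomerList and makeCustomers names are hypothetical.

```cpp
#include <array>
#include <string>

struct Company {
    std::string name;
    std::string contact;
    float sales = 0.0f;
};

// Four elements, numbered 0 to 3.
using CustomerList = std::array<Company, 4>;

// Hypothetical helper: fills in element 2 and leaves the others empty.
CustomerList makeCustomers() {
    CustomerList customers;
    customers[2].name = "Microsoft";      // array name, element, variable
    customers[2].contact = "Bill Gates";
    customers[2].sales = 50195.27f;
    return customers;
}
```

Reading customers[2].contact from the returned array gives back Bill Gates, while the other three elements still hold empty strings and a sales figure of zero.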

Drawbacks of Arrays

Arrays can be handy for storing lists of related data in a single location. However, arrays have several drawbacks:

Large arrays take up space.

Arrays can hold only one data type at a time.

Searching and sorting arrays is difficult.

Inserting and removing data from arrays is clumsy.

Sizing

The biggest problem with arrays is defining the size of an array ahead of time:

If you know you need to store 20 names, it’s easy to define an array to hold 20 strings.

If you aren’t sure if your program needs to store 20 names or 20,000 names, you have to define the largest array you think your program will ever need, so most of the array will be empty and waste memory.

To get around the problem of creating large arrays that aren’t needed, you can create resizable or dynamic arrays that can grow or shrink as you need them (see the earlier section, “Working with Resizable Arrays”). Such resizable arrays can be convenient, but you have to make sure that each time you resize an array, you don’t accidentally delete data that you want to keep.

Data types

Another limitation of arrays is that they can hold only one data type at a time. So if you want to store a list of names and numbers, you have to create two separate arrays:

One array to store the names

Another array to store the numbers

Some programming languages allow you to create a data type called a variant. A variant data type can hold any type of data, so if you create an array of variant data types, you can create an array that can hold both strings and numbers.

Searching and sorting

Another problem with arrays is searching and sorting an array. If you create an array to hold 10,000 names, how can you find the name Bill Gates stored in that array? To search for data stored in an unsorted array, you have to check the elements one at a time, from the start until you find a match or reach the end. For a small array, this can be acceptable, but searching through an array that contains thousands of names or numbers can get tedious and slow, especially if you need to search through an array on a regular basis.

So if an array contains 10,000 names and the name you want is the last element in that array, you have to search through 10,000 array elements just to find the name you want.
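This start-to-finish approach is usually called a linear (or sequential) search. Here is a minimal C++ sketch of it, using std::vector to hold the names; the findName function is a made-up name for the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: checks each element in turn and returns the position
// of the first match, or -1 if the name is not in the array.
int findName(const std::vector<std::string>& names, const std::string& target) {
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (names[i] == target) {
            return static_cast<int>(i);
        }
    }
    return -1;   // looked at every element without finding it
}
```

The worst cases are a name stored in the last element and a name that isn't there at all, because both force the loop to look at every element.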

More cumbersome than searching an array is sorting an array. If you store 10,000 names in an array and suddenly decide you want to sort those names in alphabetical order, you have to move and sort the entire list one array element at a time. Doing this once may be acceptable, but doing this on a regular basis can be cumbersome and slow.
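Many languages include a ready-made sorting command so you don't have to write the moving and comparing yourself. In C++, that command is std::sort. This sketch wraps it in a hypothetical sortedNames function that returns a sorted copy and leaves the original list alone.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns a copy of the names in alphabetical order.
std::vector<std::string> sortedNames(std::vector<std::string> names) {
    std::sort(names.begin(), names.end());   // compares strings character by character
    return names;
}
```

Even with a built-in command, the computer still has to compare and move the names, so sorting a big array over and over takes time.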

Adding and deleting

Rather than dump all your data in an array and try to sort it out later, you might want to sort data while you store it. Adding data to an empty array is easy; dump the data in any array element. The problem comes when you want to add data in between two array elements.

Suppose you have the names Charles Green and Mary Halls in an array, as shown in Figure 1-11. If you wanted to insert the name Johnny Grey in between Charles Green and Mary Halls, you’d have to copy all array elements starting with Mary Halls and move them to the next array element.

Figure 1-11: Inserting data into an array means copying and moving data from one array element to another.

For a small array, this isn’t a problem, but for a large array of 10,000 names, repeatedly copying and moving several thousand names is cumbersome and slow.

Even worse, what if you want to delete an array element? It’s easy to delete an array element by just setting that array element to a blank value, such as zero or a space. However, the more items you delete from an array, the more empty spaces you have, wasting space.
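To make the copying and moving concrete, here is a C++ sketch of both operations on a partly filled, fixed-size array. Instead of leaving a blank behind, the delete version closes the gap by moving the later elements back. The insertAt and removeAt names, and the idea of tracking a separate count of used elements, are assumptions made for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: moves every element from position onward one slot to
// the right, then stores newName in the gap. Returns the new count, or the old
// count if the array is full or the position is out of range.
std::size_t insertAt(std::string names[], std::size_t count, std::size_t capacity,
                     std::size_t position, const std::string& newName) {
    if (count >= capacity || position > count) {
        return count;
    }
    for (std::size_t i = count; i > position; --i) {
        names[i] = names[i - 1];    // work backward so nothing is overwritten
    }
    names[position] = newName;
    return count + 1;
}

// Hypothetical helper: closes the gap left by a deleted element by moving the
// later elements one slot to the left. Returns the new count.
std::size_t removeAt(std::string names[], std::size_t count, std::size_t position) {
    if (position >= count) {
        return count;
    }
    for (std::size_t i = position; i + 1 < count; ++i) {
        names[i] = names[i + 1];
    }
    names[count - 1].clear();       // blank out the now-unused last slot
    return count - 1;
}
```

Inserting Johnny Grey at position 1 of an array holding Charles Green and Mary Halls moves Mary Halls to position 2 first. In a 10,000-name array, an insert near the front means thousands of those moves.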

The time to use arrays depends on both the amount of data you need to store and whether you need to manipulate that data later:

Perfect: Store a small, fixed-size list of one data type.

Not so good: Store large amounts of data that can change in quantity, that needs to be sorted or searched, or that contains different types of information, such as numbers and text.

In this case, arrays can be too restrictive. You may want to look at other data structures, such as collections (see Book III, Chapter 3).

The data structure you choose for your program can determine the speed and efficiency of your program:

Choose the right data structure, and writing your program is easy.

Choose the wrong data structure, and you may waste time writing code to overcome the limitations of your chosen data structure, such as writing code to sort an array that contains 10,000 names.