Chapter 1
IN THIS CHAPTER
Working with arrays and multidimensional arrays
Understanding the connection between arrays and pointers
Dealing with pointers in all their forms
Using reference variables
When the C programming language, predecessor to C++, came out in the early 1970s, it was a breakthrough because it was small. C had only a few keywords. Tasks like printing to the console were handled not by built-in keywords but by functions. Technically, C++ still has few keywords, so it’s still small. So what makes C++ big?
In this chapter, you encounter the full rundown of topics that lay the foundation for C++: arrays, pointers, and references. In C++, these items come up again and again.
This chapter assumes that you have a basic understanding of C++ — that you understand the material in Books 1 and 2. You know the basics of pointers and arrays (and maybe just a teeny bit about references) and you’re ready to grasp them thoroughly. When you finish this chapter, you’ll have expanded on your knowledge enough to perform some intermediate and advanced tasks with relative ease.
As you work with C++ arrays, it seems like you can do a million things with them. This section provides the complete details on arrays. The more you know about arrays, the less likely you are to use them incorrectly, which would result in a bug. (The following sections don’t tell you about the std::array
class described at https://en.cppreference.com/w/cpp/container/array
. Instead, they discuss basic C++ arrays. Book 3, Chapter 1 provides a simple std::array
example in the “Creating a declarative C++ example” section, and you get a more detailed look in the “Working with std::array” section in Chapter 6 of this minibook.)
The usual way of declaring an array is to line up the type name, followed by a variable name, followed by a size in brackets, as in this line of code:
int Numbers[10];
int Numbers[];
In certain situations, you can declare an array without putting a number in the brackets. For example, you can initialize an array without specifying the number of elements:
int MyNumbers[] = {1,2,3,4,5,6,7,8,9,10};
The compiler is smart enough to count how many elements you put inside the braces, and then the compiler makes that count the array size.
Specifying the array size helps decrease your chances of having bugs. Plus, it has the added benefit that, in the actual declaration, if the number in brackets does not match the number of elements inside braces, the compiler issues an error. The following
int MyNumbers[5] = {1,2,3,4,5,6,7,8,9,10};
yields this compiler error:
error: too many initializers for 'int [5]'
But if the number in brackets is greater than the number of elements, as in the following code, you won’t get an error. Instead, the remaining array elements aren’t defined, so you can’t access them. So be careful!
int MyNumbers[15] = {1,2,3,4,5,6,7,8,9,10};
You also can skip specifying the array size when you pass an array into a function, like this:
int AddUp(int Numbers[], int Count) {
int loop;
int sum = 0;
for (loop = 0; loop < Count; loop++) {
sum += Numbers[loop];
}
return sum;
}
This technique is particularly powerful because the AddUp()
function can work for any size array. You can call the function like this:
cout << AddUp(MyNumbers, 10) << endl;
But this way to do it is kind of annoying because you have to specify the size each time you call in to the function. However, you can get around this problem. Look at this line of code:
cout << AddUp(MyNumbers, sizeof(MyNumbers) / 4) << endl;
With the array, the sizeof
operator tells you how many bytes it uses. But the size of the array is usually the number of elements, not the number of bytes. So you divide the result of sizeof
by 4
(the size of each int
element).
But now you have that magic number, 4, sitting there. (By magic number, we mean a seemingly arbitrary number that’s stuffed somewhere into your code.) So a slightly better approach would be to enter this line:
cout << AddUp(MyNumbers, sizeof(MyNumbers) / sizeof(int))
<< endl;
Now this line of code works, and here’s why: The sizeof
the array divided by the sizeof
each element in the array gives the number of elements in the array. (You also see this technique demonstrated in the “Declaring and accessing an array” section of Book 2, Chapter 2.)
The name of the array is a pointer to the array itself. The array is a sequence of variables stored in memory. The array name points to the first item. The following sections discuss arrays and pointers.
An interesting question about arrays and pointers is whether it’s possible to use a function header, such as the following line, and rely on sizeof
to determine how many elements are in the array. If so, this function wouldn’t need to have the caller specify the size of the array.
int AddUp(int Numbers[]) {
Consider this function found in the Array01
example and a main()
that calls it:
#include <iostream>
using namespace std;
void ProcessArray(int Numbers[]) {
cout << "Inside function: Size in bytes is "
<< sizeof(Numbers) << endl;
}
int main(int argc, char *argv[]) {
int MyNumbers[] = {1,2,3,4,5,6,7,8,9,10};
cout << "Outside function: Size in bytes is ";
cout << sizeof(MyNumbers) << endl;
ProcessArray(MyNumbers);
return 0;
}
When you run this application, here’s what you see:
Outside function: Size in bytes is 40
Inside function: Size in bytes is 4
Outside the function, the code knows that the size of the array is 40 bytes. However, inside the function the size reported is 4 bytes. The reason is that even though it appears that you’re passing an array, you’re really passing a pointer to an array. The size of the pointer is just 4, and so that’s what the final cout
line prints.
Declaring arrays has a slight idiosyncrasy. When you declare an array by giving a definite number of elements, such as
int MyNumbers[5];
the compiler knows that you have an array, and the sizeof
operator gives you the size of the entire array. The array name, then, is both a pointer and an array! But if you declare a function header without an array size, such as
void ProcessArray(int Numbers[]) {
the compiler treats this as simply a pointer and nothing more. This last line is, in fact, equivalent to the following line:
void ProcessArray(int *Numbers) {
Thus, inside the functions that either line declares, the following two lines of code are equivalent:
Numbers[3] = 10;
*(Numbers + 3) = 10;
This equivalence means that if you use an extern
declaration on an array, such as
extern int MyNumbers[];
and then take the size of this array, the compiler will get confused. Here’s an example: If you have two files, numbers.cpp
and main.cpp
, where numbers.cpp
declares an array and main.cpp
externally declares it (as shown in the Array02
example), you will get a compiler error if you call sizeof
:
#include <iostream>
using namespace std;
extern int MyNumbers[];
int main(int argc, char *argv[]) {
cout << sizeof(MyNumbers) << endl;
return 0;
}
In Code::Blocks, the gcc compiler gives this error:
error: invalid application of 'sizeof' to incomplete type 'int []'
The solution is to put the size of the array inside brackets, such as extern int MyNumbers[10];
. Just make sure that the size is the same as in the other source code file! You can fake out the compiler by changing the number, and you won’t get an error. But that’s bad programming style and just asking for errors.
The converse is also true. Look at this function:
void ProcessArray(int *Numbers) {
cout << Numbers[1] << endl;
}
This function takes a pointer as a parameter, yet you access it as an array. Again, don’t write code like this; instead, you should try to understand why code like this works. That way, you gain a deeper knowledge of arrays and how they live inside the computer, and this knowledge, in turn, can help you write code that works properly.
Even though this chapter tells you that the array name is just a pointer, the name of an array of integers isn’t the exact same thing as a pointer to an integer. Check out these lines of code (found in the Array03
example):
#include <iostream>
using namespace std;
int main() {
int LotsONumbers[50];
int x;
LotsONumbers = &x;
}
The code tries to point the LotsONumbers
pointer to something different: something declared as an integer. The compiler doesn’t let you do this; you get an error. That wouldn’t be the case if LotsONumbers
were declared as int *LotsONumbers;
then this code would work. But as written, this code gives you a compiler error like this one:
error: incompatible types in assignment of 'int*' to 'int [50]'
This error implies the compiler does see a definite distinction between the two types, int*
and int[50]
. Nevertheless, the array name is indeed a pointer, and you can use it as one; you just can’t do everything with it that you can with a normal pointer, such as reassign it. These tips will help you keep your arrays bug-free:
&(MyNumbers[0])
if this makes the code clearer — though it’s equivalent to just MyNumbers
.extern
keyword to declare an array, also put the array size inside brackets. But be consistent! Don’t use one number one time and a different number another time. The easiest way to be consistent is to use a constant, such as const int ArraySize = 10;
in a common header file and then use that in your array declaration: int MyArray[ArraySize];
.Arrays do not have to be just one-dimensional. Dimensions make it possible to model data more realistically. For example, a three-dimensional array would allow you to better model a specific place in 3-D space. The following sections discuss using multidimensional arrays.
You can declare a multidimensional array using a technique similar to a single-dimensional array, as shown in the Array04
example in Listing 1-1. The difference is that you must declare each dimension separately.
LISTING 1-1: Using a Multidimensional Array
#include <iostream>
using namespace std;
int MemorizeThis[10][20];
int main() {
for (int x = 0; x < 10; x++) {
for (int y = 0; y < 20; y++ ) {
MemorizeThis[x][y] = x * y;
}
}
cout << MemorizeThis[9][13] << endl;
cout << sizeof(MemorizeThis) / sizeof(int) << endl;
return 0;
}
When you run this, MemorizeThis
gets filled with the multiplication tables. Here’s the output for the application, which is the contents of MemorizeThis[9][13]
, and then the size of the entire two-dimensional array:
117
200
And indeed, 9 times 13 is 117. The size of the array is 200 elements. Because each element, being an integer, is 4 bytes, the size of the array in bytes is 800.
int BigStuff[4][3][5][3][5][6][9];
And the following array has 4,838,400 elements, for a total of 19,353,600 bytes. That’s about 19 megabytes!
int ReallyBigStuff[8][6][10][6][5][7][12][4];
int GiantMonster[18][16][10][16][15][17][12][14];
You’ll get an error of:
error: size of array 'GiantMonster' is too large
The data type of your array also makes a difference. Here are some byte values for arrays of the same size, but using different types:
char CharArray[20][20]; // 400 bytes
short ShortArray[20][20]; // 800 bytes
long LongArray[20][20]; // 1,600 bytes
float FloatArray[20][20]; // 1,600 bytes
double DoubleArray[20][20]; // 3,200 bytes
Just as you can initialize a single-dimensional array by using braces and separating the elements by commas, you can initialize a multidimensional array with braces and commas and all that jazz, too. But to do this, you combine arrays inside arrays, as in this code:
int Numbers[5][6] = {
{1,2,3,4,5,6},
{7,8,9,10,12},
{13,14,15,16,17,18},
{19,20,21,22,23,24},
{25,26,27,28,29,30}
};
The hard part is remembering whether you put in five arrays containing six subarrays or six arrays containing five subarrays. Think of it like this: Each time you add another dimension, it goes inside the previous dimension. That is, you can write a single-dimensional array like this:
int MoreNumbers[5] = {
100,
200,
300,
400,
500,
};
Then, if you add a dimension to this array, each number in the initialization is replaced by an array initializer of the form {1,2,3,4,5,6}
. Then you end up with a properly formatted multidimensional array.
If you have to pass a multidimensional array to a function, things can get just a bit hairy. That’s because you don’t have as much freedom in leaving off the array sizes as you do with single-dimensional arrays. Suppose you have this function:
int AddAll(int MyGrid[5][6]) {
int x,y;
int sum = 0;
for (x = 0; x < 5; x++) {
for (y = 0; y < 6; y++) {
sum += MyGrid[x][y];
}
}
return sum;
}
So far, the function header is fine because it explicitly states the size of each dimension. But you may want to do this:
int AddAll(int MyGrid[][]) {
or maybe pass the sizes as well:
int AddAll(int MyGrid[][], int rows, int columns) {
But unfortunately, the compiler displays this error for both lines:
declaration of 'MyGrid' as multidimensional array
must have bounds for all dimensions except the first
The compiler is telling you that you must explicitly list all the dimensions, but it’s okay if you leave the first one blank, as with one-dimensional arrays.
So this crazy code will compile:
int AddAll(int MyGrid[][6]) {
The reason is that the compiler treats multidimensional arrays in a special way. A multidimensional array is not really a two-dimensional array, for example; rather, it’s an array of an array. Thus, deep down inside C++, the compiler treats the statement MyGrid[5][6]
as if it were MyGrid[5]
where each item in the array is itself an array of size 6. And you’re free not to specify the size of a one-dimensional array. Well, the first brackets represent the one-dimensional portion of the array. So you can leave that space blank, as you can with other one-dimensional arrays. But then, after that, you have to give the subarrays bounds, a specific number of entries.
int AddAll(int MyGrid[][6]) {
int AddAll(int MyGrid[][6], int count) {
Here’s a way around this problem: Use a typedef
, which is cleaner:
typedef int GridRow[6];
int AddAll(GridRow MyGrid[], int Size) {
int x,y;
int sum = 0;
for (x = 0; x < Size; x++) {
for (y = 0; y < 6; y++) {
sum += MyGrid[x][y];
}
}
return sum;
}
The typedef
line defines a new type called GridRow
. This type is an array of six integers. Then, in the function, you’re passing an array of GridRows
.
Using this typedef
is the same as simply using two brackets, except it emphasizes that you’re passing an array of an array — that is, an array in which each member is itself an array of type GridRow
.
In a typical C++ application, the main()
function receives an array and a count as command-line parameters — parameters provided as part of the command to execute that application at the command line. However, to beginning programmers, the parameters can look intimidating. But they’re not: Think of the two parameters as an array of strings and a size of the array. However, each string in this array of strings is actually a character array. In the old days of C, and earlier breeds of C++, no string
class was available. Thus strings were always character arrays, usually denoted as char *MyString
. (Remember, an array and a pointer can be used interchangeably for the most part). Thus you could take this thing and turn it into an array — either by throwing brackets at the end, as in char *MyString[]
, or by making use of the fact that an array is a pointer and adding a second pointer symbol, as in char **MyString
. The following code from the CommandLineParams
example shows how you can get the command-line parameters:
#include <iostream>
using namespace std;
int main(int argc, char *argv[]) {
int loop;
for (loop = 0; loop < argc; loop++) {
cout << argv[loop] << endl;
}
return 0;
}
Before you build your application, add some command-line arguments to it by choosing Project ⇒ Set Program’s Arguments to display the Select Target dialog box, shown in Figure 1-1. Type these arguments (one on each line) in the Program Arguments field and then click OK.
FIGURE 1-1: Use the Select Target dialog box to add program arguments.
You see the following output when you run the application. (Note that the application name comes in as the first parameter and the quoted items come in as a single parameter.)
C:\CPP_AIO4\BookV\Chapter01\CommandLineParams\bin\Debug\CommandLineParams.exe
abc
def
abc 123
The first argument is always the name of the executable. The executable name can be accompanied by the .exe
extension, the executable path, and the drive on which the executable resides. What you see depends on your IDE and compiler.
Arrays are useful, but it would be a bummer if the only way you could use them were as stack variables. This section shows an exception to not treating arrays as pointers by telling how you can allocate an array on the heap by using the new
keyword. (If you can’t quite remember the difference between the stack and the heap, check out the “Heaping and Stacking the Variables” section in Book 1, Chapter 8.) But first you need to know about a couple of little tricks to make it work.
You can easily declare an array on the heap by using new int[50]
, for example. But think about what this is doing: It declares 50 integers on the heap, and the new
word returns a pointer to the allocated array. But, unfortunately, the makers of C++ didn’t see it that way. For some reason, they based the array pointer type on the first element of the array (which is, of course, the same as all the elements in the array). Thus the call:
new int[50]
returns a pointer of type int *
,
not something that explicitly points to an array, just as this call does:
new int;
If you want to save the results of new int [50]
in a variable, you have to have a variable of type int *
, as in the following:
int *MyArray = new int[50];
In this case, an array name is a pointer and vice versa. So now that you have a pointer to an integer, you can treat it like an array:
MyArray[0] = 25;
When you finish using the array, you can call delete
. But you can’t just call delete MyArray;
. The reason is that the compiler knows only that MyArray
is a pointer to an integer; it doesn’t know that it’s an array! Thus delete MyArray
will only delete the first item in the array, leaving the rest of the elements sitting around on the heap, wondering when their time will come. So the makers of C++ gave us a special form of delete
to handle this situation. It looks like this:
delete[] MyArray;
If you’re really curious about the need for delete[]
and delete
, consider that there’s a distinction between allocating an array and allocating a single element on the stack. Look closely at these two lines:
int *MyArray = new int[50];
int *somenumber = new int;
The first allocates an array of 50 integers, while the second allocates a single array. But look at the types of the pointer variables: They’re both the same! They’re both pointers; each one points to an integer. And so the statement
delete something;
is ambiguous if something
is a pointer to an integer: Is it an array, or is it a single number? The designers of C++ knew this was a problem, so they unambiguated it. They declared and proclaimed that delete
shall delete only a single member. Then they invented a little extra that must have given the compiler writers a headache: They said that if you want to delete an array instead, just throw on an opening and closing bracket after the word delete
. And all will be good.
Because of the similarities between arrays and pointers, you are likely to encounter some strange notation. For example, in main()
itself, you have seen both of these at different times:
char **argc
char *argc[]
If you work with arrays of arrays and arrays of pointers, the best bet is to make sure that you completely understand what these kinds of statements mean. Remember that although you can treat an array name as a pointer, you’re in for some technical differences. The following lines of code show these differences. First, think about what happens if you initialize a two-dimensional array of characters like this:
char NameArray[][6] = {
{'T', 'o', 'm', '\0', '\0', '\0'},
{'S', 'u', 'z', 'y' , '\0', '\0'},
{'H', 'a', 'r', 'r' , 'y', '\0'}
};
This is an array of an array. Each inner array is an array of six characters. The outer array stores the three inner arrays. (The individual content of an array is sometimes called a member — the inner array has six members and the outer array has three members.) Inside memory, the 18 characters are stored in one consecutive row, starting with T
, then o
, and ending with m
and finally three copies of \0
, which is the null character. But now take a look at this:
char* NamePointers[] = {
"Tom",
"Suzy",
"Harry"
};
This is an array of character arrays as well, except that it’s not the same as the code that came just before it. This is actually an array holding three pointers: The first points to a character string in memory containing Tom
(which is followed by a null-terminator, \0
); the second points to a string in memory containing Suzy
ending with a null-terminator; and so on. Thus, if you look at the memory in the array, you won’t see a bunch of characters; instead, you see three numbers, each being a pointer.
char **PointerToPointer = {
"Tom",
"Suzy",
"Harry"
};
you will get an error:
error: initializer for scalar variable requires one element
A scalar is just another name for a regular variable that is not an array. In other words, the PointerToPointer
variable is a regular variable (that is, a scalar), not an array.
Yet, inside the function header for main()
, you can use char **
, and you can access this as an array. As usual, there’s a slight but definite difference between an array and a pointer. You can’t always treat a pointer as an array; for example, you can’t initialize a pointer as an array. But you can go the other way: You can take an array and treat it as a pointer most of the time. Thus you can do this:
char* NamePointers[] = {
"Tom",
"Harry",
"Suzy"
};
char **AnotherArray = NamePointers;
This code compiles, and you can access the strings through AnotherArray[0]
, for example. Yet you’re not allowed to skip a step and just start out initializing the AnotherArray
variable like so:
char** AnotherArray = {
"Tom",
"Harry",
"Suzy"
};
If you write the code that way, it’s the same as the code shown just before this example — and it yields a compiler error! This is one (perhaps obscure) example in which the slight differences between arrays and pointers become obvious, but it does help explain why you can see something like this:
int main(int argc, char **argv)
and you are free to use the argv
variable to access an array of pointers — specifically, in this case, an array of character pointers, also called strings.
If you have an array and you don’t want its contents to change, you can make it a constant array. The following lines of code, found in the Array05
example, demonstrate this approach:
const int Permanent[5] = { 1, 2, 3, 4, 5 };
cout << Permanent[1] << endl;
This array works like any other array, except you cannot change the numbers inside it. If you add a line like the following line, you get a compiler error, because the compiler is aware of constants:
Permanent[2] = 5;
Here’s the error you get when working in Code::Blocks:
error: assignment of read-only location 'Permanent[2]'
Arrays have a certain constancy built in. For example, you can’t assign one array to another. If you make the attempt (as shown in the Array06
example), the Code::Blocks compiler presents you with an error: invalid array assignment
error message.)
int NonConstant[5] = { 1, 2, 3, 4, 5 };
int OtherList[5] = { 10, 11, 12, 13, 14 };
OtherList = NonConstant;
In other words, that third line is saying, “Forget what OtherList
points to; instead, make it point to the NonConstant
array, {1,2,3,4,5}
!” The point is that arrays are always constant. If you want to make the array elements constant, you can precede its type with the word const
. When you do so, the array name is constant, and the elements inside the array are also constant.
To fully understand C++ and all its strangeness and wonders, you need to become an expert in pointers. (Fortunately, many modern innovations in C++ are making the need to know pointers less of an issue — you have other means at your disposal, as discussed in Book 1, Chapter 8.) One of the biggest sources of bugs is when programmers who have a so-so understanding of C++ work with pointers and mess them up. But what’s bad in such cases is that the application may run properly for a while and then suddenly not work. Those bugs are the hardest bugs to catch, because the user may see the problem occur and then report it, but programmers can’t reproduce the problem. In this section, you see how you can get the most out of pointers and use them correctly in your applications so that you won’t have these strange problems.
You could see a function header like this:
void MyFunction(char ***a) {
Yikes! What are all those asterisks for? Looks like a pointer to a pointer to a pointer to … something! How confusing. Some humans have brains that are more like computers, and they can look at that code and understand it just fine, but most people can’t. The following sections help you understand how passing pointers to functions works.
To understand the code, think about this: Suppose you have a pointer variable, and you want a function to change what the pointer variable points to. What this is saying is that the function wants to make the pointer point to something else, rather than change the contents of the thing that it points to. There’s a big difference between the two. Any time you want a function to change a variable, you have to either pass it by reference or pass its address. This process can get confusing with a pointer. One way to reduce the confusion is to define a new type — using the typedef
word. It goes like this (as shown in the Pointer01
example):
typedef char *PChar;
This is a new type called PChar
that is equivalent to char *
. That is, PChar
is a pointer to a character.
Now look at this function:
void MyFunction(PChar &x) {
x = new char('B');
}
This function takes a pointer variable and points it to the result of new char(’B’)
. That is, it points it to a newly allocated character variable containing the letter B
. Now, think this through carefully: A PChar
simply contains a memory address — really. You pass it by reference into the function, and the function modifies the PChar
so that the PChar
contains a different address. That is, the PChar
now points to something different from what it previously did.
To try this function, here’s some code you can put in main()
that tests MyFunction()
:
PChar ptr = new char('A');
PChar copy = ptr;
MyFunction(ptr);
cout << "ptr points to " << *ptr << endl;
cout << "copy points to " << *copy << endl;
The code creates two variables of type PChar
: ptr
and copy
. It assigns a new char
, ’A’
, to ptr
and then copies the address of ptr
to copy
so that they both point to the same location in memory. At this point, then, ptr
and copy
both have the same memory address in them.
Next, the code calls MyFunction()
, which changes where ptr
points in memory. On return from the function, the code prints two characters: the character that ptr
points to and the character that copy
points to. Here’s what you see when you run it:
ptr points to B
copy points to A
This means that MyFunction()
worked! The ptr
variable now points to the character allocated in MyFunction
(a B
), while the copy
variable still points to the original A
. In other words, they no longer point to the same thing: MyFunction()
managed to change what the variable points to.
Now consider the same function, but instead of using references, try it with pointers. Here’s a modified form (as found in the Pointer02
example):
typedef char *PChar;
void AnotherFunction(PChar *x) {
*x = new char('C');
}
The parameter is really a char **
in this case. You could create another typedef
to handle it, as in typedef char **PPChar;
. Because the parameter is a pointer, you have to dereference it to modify its value. Thus you see an asterisk, *
, at the beginning of the middle line. Here’s a modified main()
that calls this function:
PChar ptr = new char('A');
PChar copy = ptr;
AnotherFunction(&ptr);
cout << "ptr points to " << *ptr << endl;
cout << "copy points to " << *copy << endl;
Because the function uses a pointer rather than a reference, you have to pass the address of the ptr
variable, not the ptr
variable directly. So notice that the call to AnotherFunction()
has an ampersand, &
, in front of the ptr
. This code works as expected. When you run it, you see this output:
ptr points to C
copy points to A
This version of the function, called AnotherFunction()
, made a new character called C
. Indeed, it’s working correctly: ptr
now points to a C
character, and copy
hasn’t changed. Again, the function pointed ptr
to something else.
The previous examples created a typedef
to make it much easier to understand what the functions are doing. However, not everybody does it that way; therefore you have to understand what other people are doing when you attempt to fix their code. So here are the same two functions found in the previous sections, MyFunction()
and AnotherFunction()
, but without typedef
. Instead of using the new PChar
type, they directly use the equivalent char *
type:
void MyFunction(char *&x) {
x = new char('B');
}
void AnotherFunction(char **x) {
*x = new char('C');
}
To remove the use of the typedef
s, all you do is replace the PChar
in the two function headers and the variable declarations with its equivalent char *
. You can see that the headers now look goofier. But they mean exactly the same as before: The first is a reference to a pointer, and the second is a pointer to a pointer.
But think about char **x
for a moment. Because char *
is also the same as a character array in many regards, char **x
is a pointer to a character array. In fact, sometimes you may see the header for main()
written like this
int main(int argc, char **argv)
instead of
int main(int argc, char *argv[])
Notice the argv
parameter in the first of these two is the same type as we’ve been talking about: a pointer to a pointer (or, in a more easily understood manner, the address of a PChar
). But you know that the argument for main()
is an array of strings.
Now it’s time to consider what happens when you have a pointer that points to an array of strings and a function that is going to make it point to a different array of strings. The Pointer03
example begins by creating the required typedef
s:
typedef char **StringArray;
typedef char *PChar;
StringArray
is a type equivalent to an array of strings. In fact, if you put these two lines of code before your main()
, you can actually change your main()
header into the following and it will compile:
int main(int argc, StringArray argv)
Now here’s a function that will take as a parameter an array of strings, create a new array of strings, and set the original array of strings to point to this new array of strings:
void ChangeAsReference(StringArray &array) {
StringArray NameArray = new PChar[3];
NameArray[0] = "Tom";
NameArray[1] = "Suzy";
NameArray[2] = "Harry";
array = NameArray;
}
Just to make sure that it works, here’s something you can put in main()
:
StringArray OrigList = new PChar[3];
OrigList[0] = "John";
OrigList[1] = "Paul";
OrigList[2] = "George";
StringArray CopyList = OrigList;
ChangeAsReference(OrigList);
cout << OrigList[0] << endl;
cout << OrigList[1] << endl;
cout << OrigList[2] << endl << endl;
cout << CopyList[0] << endl;
cout << CopyList[1] << endl;
cout << CopyList[2] << endl;
The code creates a pointer to an array of three strings. It then stores three strings in the array. Next, the code saves a copy of the pointer in the variable called CopyList
, changes the OrigList
pointer by calling ChangeAsReference()
, and prints all the values. Here’s the output:
Tom
Suzy
Harry
John
Paul
George
The first three outputs are the elements in OrigList
, which were passed into ChangeAsReference()
. They no longer have the values John
,
Paul
, and George
. The three original Beatles names have been replaced by three new names: Tom
, Harry
, and Suzy
. However, the Copy
variable still points to the original string list. Thus, once again, changing the pointer reference worked.
The previous section uses typedef
s to work with string arrays, but you can also do it with pointers. Here’s the modified version of the function, this time using said pointers (as shown in the Pointer04
example):
void ChangeAsPointer(StringArray *array) {
StringArray NameArray = new PChar[3];
NameArray[0] = "Tom";
NameArray[1] = "Harry";
NameArray[2] = "Suzy";
*array = NameArray;
}
As before, here’s the slightly modified sample code that tests the function:
StringArray OrigList = new PChar[3];
OrigList[0] = "John";
OrigList[1] = "Paul";
OrigList[2] = "George";
StringArray CopyList = OrigList;
ChangeAsPointer(&OrigList);
cout << OrigList[0] << endl;
cout << OrigList[1] << endl;
cout << OrigList[2] << endl << endl;
cout << CopyList[0] << endl;
cout << CopyList[1] << endl;
cout << CopyList[2] << endl;
You can see that when the code calls ChangeAsPointer()
, it passes the address of OrigList
. The output of this version is the same as that of the previous version.
Here are the two function headers without using the typedef
s:
int ChangeAsReference(char **&array) {
and
int ChangeAsPointer(char ***array) {
You may see code like these two lines from time to time. Such code isn’t the easiest to understand, but after you know what these lines mean, you can interpret them.
When an application is running, the functions in the application exist in the memory; so just like anything else in memory, they have an address.
You can access the address of a function by taking the name of it and putting the address-of operator (&
) in front of the function name, like this:
address = &MyFunction;
But to make this work, you need to know what type to declare address
. The address
variable is a pointer to a function, and the cleanest way to assign a type is to use auto, like this:
auto address = &MyFunction;
The traditional method is to use a typedef
(as shown in the FunctionPointer01
example). Here’s the typedef
you need:
typedef int(*FunctionPtr)(int);
It’s hard to follow, which is why using auto
is better, but the name of the new type is FunctionPtr
. This defines a type called FunctionPtr
that returns an integer (the leftmost int
) and takes an integer as a parameter (the rightmost int
, which must be in parentheses). The middle part of this statement is the name of the new type, and you must precede it by an asterisk, which means that it’s a pointer to all the rest of the expression. Also, you must put the type name and its preceding asterisk inside parentheses. Now you’re ready to declare some variables! Here goes:
FunctionPtr address = &MyFunction;
This line declares address
as a pointer to a function and initializes it to MyFunction()
. For this to work, the code for MyFunction()
must have the same prototype declared in the typedef
: In this case, it must take an integer as a parameter and return an integer. So, for example, you may have a function like this:
int TheSecretNumber(int x) {
return x + 1;
}
Then you could have a main()
that stores the address of this function in a variable — and then calls the function by using the variable:
int main() {
typedef int (*FunctionPtr)(int);
int MyPasscode = 20;
FunctionPtr address = &TheSecretNumber;
cout << address(MyPasscode) << endl;
}
Using the typedef
approach has the advantage of specifying precisely what you want in the way of inputs, but the auto
form is shorter and easier to understand. Here’s main()
using the auto
form (you must be using C++ 11 or above to use this form):
int main() {
int MyPasscode = 20;
auto address = &TheSecretNumber;
cout << address(MyPasscode) << endl;
}
Just so you can say that you’ve seen it, here’s what the address
declaration would look like without using a typedef
:
int (*address)(int) = &TheSecretNumber;
The giveaway should be that you have two things in parentheses side by side, and the set on the right has only types inside it. The one on the left has a variable name. So this line is not declaring a type; rather, it’s declaring a variable.
When working with object-oriented programming (OOP), you need a way to access the methods within the object. Within an object’s code, you use the this
pointer to obtain the address of an object’s method so that you can access the method instance data directly.
Remember that each instance of a class gets its own copy of the properties unless the properties are static. But methods are shared throughout the class. Yes, you can distinguish static methods from nonstatic methods. But doing so just refers to the types of variables they access: Static methods can access only static properties, and you don’t need to refer to them with an instance. Nonstatic (that is, normal, regular) methods work with a particular instance. However, inside the memory, really only one copy of the method exists.
So how does the method know which instance to work with? The this
parameter gets passed into the method to differentiate between instances. Suppose you have a class called Gobstopper
that has a method called Chew()
. Next, you have an instance called MyGum
, and you call the Chew()
method, like so:
MyGum.Chew();
When the compiler generates assembly code for this, it actually passes a parameter into the function — the address of the MyGum
instance, also known as the this
pointer. Therefore only one Chew()
function is in the code, but to call it, you must use a particular instance of the class.
Because only one copy of the Chew()
method is in memory, you can take its address. But to do so requires some sort of cryptic-looking code. Here it is, quick and to the point. Suppose your class looks like this:
class Gobstopper {
public:
int WhichGobstopper;
int Chew(string name) {
cout << WhichGobstopper << endl;
cout << name << endl;
return WhichGobstopper;
}
};
The Chew()
method takes a string and returns an integer. Here’s a typedef
for a pointer to the Chew()
function:
typedef int (Gobstopper::*GobMember)(string);
And here’s a variable of the type GobMember
:
GobMember func = &Gobstopper::Chew;
auto func = &Gobstopper::Chew;
If you look closely at the typedef
, it looks similar to a regular function pointer. The only difference is that the class name and two colons precede the asterisk. Other than that, it’s a regular old function pointer.
But whereas a regular function pointer is limited to pointing to functions of a particular set of parameter types and a return type, this function pointer shares those restrictions but has a further limitation: It can point only to methods within the class Gobstopper
.
To call the function stored in the pointer, you need to have a particular instance. Notice that in the assignment of func
in the earlier code, there was no instance, just the class name and function, &Gobstopper::Chew
. So to call the function, grab an instance, add func
, and go! The FunctionPointer02
example, shown in Listing 1-2, contains a complete example with the class, the method address, and two separate instances.
LISTING 1-2: Taking the Address of a Method
#include <iostream>
using namespace std;
class Gobstopper {
public:
int WhichGobstopper;
int Chew(string name) {
cout << WhichGobstopper << endl;
cout << name << endl;
return WhichGobstopper;
}
};
int main() {
typedef int (Gobstopper::*GobMember)(string);
GobMember func = &Gobstopper::Chew;
Gobstopper inst;
inst.WhichGobstopper = 10;
Gobstopper another;
another.WhichGobstopper = 20;
(inst.*func)("Greg W.");
(another.*func)("Jennifer W.");
return 0;
}
The code begins by creating a typedef
called GobMember
, as discussed earlier in this section. It then creates a method pointer, func
, to access the method. When using C++ 11 or above, you can replace these two lines with the much easier-to-understand single line:
auto func = &Gobstopper::Chew;
Of course, when you use this alternative, the compiler must deduce the correct types, which it may not always do correctly. Using the typedef
gives you additional control at the cost of complexity.
The code then creates two instances of Gobstopper
, inst
and another
. In both cases, it directly assigns a value to WhichGobstopper
, which will vary depending on instance and accessed through this
. The final section calls the Chew()
method indirectly using func
in each instance and assigns a name as the input string.
When you run the code, you can see from the output that it is indeed calling the correct method for each instance:
10
Greg W.
20
Jennifer W.
Now, when you hear “the correct method for each instance,” what the statement really means is that the code is calling the same method each time but using a different instance. If you’re thinking in object-oriented terms, consider each instance as having its own copy of the method. Therefore it’s okay to say “the correct method for each instance.”
A static method is, in many senses, just a plain old function. The difference is that you have to use a class name to call a static function. But remember that a static method does not go with any particular instance of a class; therefore you don’t need to specify an instance when you call the static function.
Here’s an example class (as shown in the FunctionPointer03
example) with a static function:
class Gobstopper {
public:
static string MyClassName() {
return "Gobstopper!";
}
int WhichGobstopper;
int Chew(string name) {
cout << WhichGobstopper << endl;
cout << name << endl;
return WhichGobstopper;
}
};
And here’s some code that takes the address of the static function and calls it by using the address:
int main() {
typedef string (*StaticMember)();
StaticMember staticfunc = &Gobstopper::MyClassName;
cout << staticfunc() << endl;
return 0;
}
Note that the call staticfunc()
doesn’t refer to a specific instance and it doesn’t refer to the class, either. The application just called it. Because the truth is that deep down inside, the static function is just a plain old function.
This section discusses how to use references and assumes that you already know how to pass a parameter by reference when you’re writing a function. (For more information about passing parameters by reference, see Book 1, Chapter 8.) But you can use references for more than just parameter lists. You can declare a variable as a reference type. And just like job references, this use of references can be both good and devastating. So be careful when you use them.
Declaring a variable that is a reference is easy. Whereas the pointer uses an asterisk, *
, the reference uses an ampersand, &
. But it has a twist. You can’t just declare it, like this:
int &BestReference; // Nope! This won't work!
If you try this, you see an error that says BestReference declared as reference but not initialized
.
That sounds like a hint: Looks like you need to initialize it.
Yes, references need to be initialized. As the name implies, reference refers to another variable. Therefore, you need to initialize the reference so that it refers to some other variable, like so (as shown in the Reference01
example):
int ImSomebody;
int &BestReference = ImSomebody;
From this point on, the variable BestReference
refers to — that is, is an alias for — ImSomebody
. So, if you change the value of BestReference
, as shown here:
BestReference = 10;
you’ll really be setting ImSomebody
to 10
. Look at this code that could go inside main()
:
int ImSomebody;
int &BestReference = ImSomebody;
BestReference = 10;
cout << ImSomebody << endl;
When you run this code, you see the output
10
That is, setting BestReference
to 10
caused ImSomebody
to change to 10
, which you can see when you print the value of ImSomebody
. That’s what a reference does: It refers to another variable.
It’s possible to return a reference from a function. But be careful if you try to do this: You don’t want to return a reference to a local variable within a function, because when the function ends, the storage space for the local variables goes away.
But you can return a reference to a global variable. Or, if the function is a method, you can return a reference to a property.
For example, here’s a class found in the Reference02
example that has a function that returns a reference to one of its variables:
class DigInto {
private:
int secret;
public:
DigInto() { secret = 150; }
int &GetSecretVariable() { return secret; }
void Write() { cout << secret << endl; }
};
Notice that the constructor stores 150
in the secret
variable, which is private
. The GetSecretVariable()
function returns a reference to the private variable called secret
. The Write()
function writes out the value of the secret
variable. Lots of secrets here! And some surprises, too, which you discover shortly. You can use this class like so:
int main()
{
DigInto inst;
inst.Write();
int &pry = inst.GetSecretVariable();
pry = 30;
inst.Write();
auto &pry2 = inst.GetSecretVariable();
pry2 = 40;
inst.Write();
return 0;
}
The example uses two kinds of references, one int
and one auto
(you must have C++ 11 or above installed to use the second type). Notice that using auto
doesn’t eliminate the need for the &
to create a reference. If you were to write auto pry2 = inst.GetSecretVariable();
instead, you’d receive a warning message stating warning: variable ’pry2’ set but not used
. However, the code would still compile, and if you use the Build ⇒ Build and Run option, you might not even notice the warning.
When you run this example, you see the following output:
150
30
40
Here’s a look at the code in a little more detail. The first output line is the value in the secret
variable right after the application creates the instance. But look at the code carefully: The variable called pry
is a reference to an integer, and it gets the results of GetSecretVariable()
.What is that result? It’s a reference to the private variable called secret
— which means that pry
itself is now a reference to that variable. Yes, a variable outside the class now refers directly to a private member in the instance! After that, the code sets pry
to 30
. When the code calls Write()
again, the private variable will indeed change. (The same sequence occurs when you use the auto
variable pry2
.)
Creating code like this is a bad idea because it provides access to a private variable. The GetSecretVariable()
function pretty much wipes out any sense of the variable’s actually remaining private. The main()
function is able to grab a reference to it and poke around and change it however it wanted, as if it were not private!
That’s a problem with references: They can potentially leave your code wide open. Therefore, think twice before returning a reference to a variable. Here’s one of the biggest risks: Somebody else may be using this code, may not understand references, and may not realize that the variables called pry
and pry2
have a direct link to the private secret
variable. Such an inexperienced programmer might then write code that uses and changes pry
or pry2
— without realizing that the property is changing along with it. Later on, then, a bug results — a pretty nasty one at that!
However, now that you’ve received the usual warnings, know that references can be very powerful, provided that you understand what they do. When you use a reference, you can easily modify another variable without having to go through pointers — which can make life much easier sometimes. So, please: Use your newfound powers carefully.