string Class
I try to catch every sentence, every word you and I say, and quickly lock all these sentences and words away in my literary storehouse because they might come in handy.
ANTON CHEKHOV, The Seagull
In Section 8.1, we introduced C strings. These C strings were simply arrays of characters terminated with the null character '\0'. In order to manipulate these C strings, you needed to worry about all the details of handling arrays. For example, when you want to add characters to a C string and there is not enough room in the array, you must create another array to hold this longer string of characters. In short, C strings require the programmer to keep track of all the low-level details of how the C strings are stored in memory. This is a lot of extra work and a source of programmer errors. The ANSI/ISO standard for C++ specified that C++ must also have a class string that allows the programmer to treat strings as a basic data type without needing to worry about implementation details. In this section we introduce you to this string type.
string
The class string is defined in the library whose name is also <string>, and the definitions are placed in the std namespace. So, in order to use the class string, your code must contain the following (or something more or less equivalent):
#include <string>
using namespace std;
The class string allows you to treat string values and string expressions very much like values of a simple type. You can use the = operator to assign a value to a string variable, and you can use the + sign to concatenate two strings. For example, suppose s1, s2, and s3 are objects of type string and both s1 and s2 have string values. Then s3 can be set equal to the concatenation of the string value in s1 followed by the string value in s2 as follows:
s3 = s1 + s2;
There is no danger of s3 being too small for its new string value. If the sum of the lengths of s1 and s2 exceeds the capacity of s3, then more space is automatically allocated for s3.
As we noted earlier in this chapter, quoted strings are really C strings and so they are not literally of type string. However, C++ provides automatic type conversion of quoted strings to values of type string. So, you can use quoted strings as if they were literal values of type string, and we (and most others) will often refer to quoted strings as if they were values of type string. For example,
s3 = "Hello Mom!";
sets the value of the string variable s3 to a string object with the same characters as in the C string "Hello Mom!".
The class string has a default constructor that initializes a string object to the empty string. The class string also has a second constructor that takes one argument that is a standard C string and so can be a quoted string. This second constructor initializes the string object to a value that represents the same string as its C-string argument. For example,
string phrase;
string noun("ants");
The first line declares the string variable phrase and initializes it to the empty string. The second line declares noun to be of type string and initializes it to a string value equivalent to the C string "ants". Most programmers when talking loosely would say that “noun is initialized to "ants",” but there really is a type conversion here. The quoted string "ants" is a C string, not a value of type string. The variable noun receives a string value that has the same characters as "ants" in the same order as "ants", but the null character '\0' that ends the C string is not counted as part of the string value. In fact, in theory at least, you do not know or care whether the string value of noun is even stored in an array, as opposed to some other data structure.
There is an alternate notation for declaring a string variable and invoking a constructor. The following two lines are exactly equivalent:
string noun("ants");
string noun = "ants";
These basic details about the class string are illustrated in Display 8.4. Note that, as illustrated there, you can output string values using the operator <<.
Consider the following line from Display 8.4:
phrase = "I love " + adjective + " " + noun + "!";
C++ must do a lot of work to allow you to concatenate strings in this simple and natural fashion. The string constant "I love " is not an object of type string. A string constant like "I love " is stored as a C string (in other words, as a null-terminated array of characters). When C++ sees "I love " as an argument to +, it finds the definition (or overloading) of + that applies to a value such as "I love ". There are overloadings of the + operator that have a C string on the left and a string on the right, as well as the reverse of this positioning. Of course, there is also the overloading you expect, with the type string for both operands. There is no version that has a C string on both sides of the +, so at least one operand must be a string object. The line above satisfies this requirement because + is evaluated from left to right: "I love " + adjective produces a string object, and that object is the left operand of the next +.
C++ did not really need to provide all those overloading cases for +. If these overloadings were not provided, C++ would look for a constructor that could perform a type conversion to convert the C string "I love " to a value for which + did apply. In this case, the constructor with the one C-string parameter would perform just such a conversion. However, the extra overloadings are presumably more efficient.
The class string is often thought of as a modern replacement for C strings. However, in C++ you cannot easily avoid also using C strings when you program with the class string.
string
You can use the insertion operator << and cout to output string objects just as you do for data of other types. This is illustrated in Display 8.4. Input with the class string is a bit more subtle.
The extraction operator >> and cin work the same for string objects as for other data, but remember that the extraction operator ignores initial whitespace and stops reading when it encounters more whitespace. This is as true for strings as it is for other data. For example, consider the following code:
string s1, s2;
cin >> s1;
cin >> s2;
If the user types in
May the hair on your toes grow long and curly!
then s1 will receive the value "May" with any leading (or trailing) whitespace deleted. The variable s2 receives the string "the". Using the extraction operator >> and cin, you can only read in words; you cannot read in a line or other string that contains a blank. Sometimes this is exactly what you want, but sometimes it is not at all what you want.
If you want your program to read an entire line of input into a variable of type string, you can use the function getline. The syntax for using getline with string objects is a bit different from what we described for C strings in Section 8.1. You do not use cin.getline; instead, you make cin the first argument to getline. (Thus, this version of getline is not a member function.)
string line;
cout << "Enter a line of input:\n";
getline(cin, line);
cout << line << "END OF OUTPUT\n";
When embedded in a complete program, this code produces a dialogue like the following:
Enter a line of input:
Do bedo to you!
Do bedo to you!END OF OUTPUT
If there were leading or trailing blanks on the line, then they too would be part of the string value read by getline. This version of getline is in the library <string>. You can use a stream object connected to a text file in place of cin to do input from a file using getline.
You cannot use cin and >> to read in a blank character. If you want to read one character at a time, you can use cin.get, which we discussed in Chapter 6. The function cin.get reads values of type char, not of type string, but it can be helpful when handling string input. Display 8.5 contains a program that illustrates both getline and cin.get used for string input. The significance of the function newLine is explained in the Pitfall subsection entitled "Mixing cin >> variable and getline."
Consider the following code (and assume that it is embedded in a complete and correct program and then run):
string s1, s2;
cout << "Enter a line of input:\n";
cin >> s1 >> s2;
cout << s1 << "*" << s2 << "<END OF OUTPUT";
If the dialogue begins as follows, what will be the next line of output?
Enter a line of input:
A string is a joy forever!
Consider the following code (and assume that it is embedded in a complete and correct program and then run):
string s;
cout << "Enter a line of input:\n";
getline(cin, s);
cout << s << "<END OF OUTPUT";
If the dialogue begins as follows, what will be the next line of output?
Enter a line of input:
A string is a joy forever!
getline
So far, we have described the following way of using getline:
string line;
cout << "Enter a line of input:\n";
getline(cin, line);
This version stops reading when it encounters the end-of-line marker '\n'. There is a version that allows you to specify a different character to use as a stopping signal. For example, the following will stop when the first question mark is encountered:
string line;
cout << "Enter some input:\n";
getline(cin, line, '?');
It makes sense to use getline as if it were a void function, but it actually returns a reference to its first argument, which is cin in the code above. Thus, the following will read a line of text into s1 and a string of nonwhitespace characters into s2:
string s1, s2;
getline(cin, s1) >> s2;
The invocation getline(cin, s1) returns a reference to cin, so that after the invocation of getline, the next thing to happen is equivalent to
cin >> s2;
This kind of use of getline seems to have been designed for use in a C++ quiz show rather than to meet any actual programming need, but it can come in handy sometimes.
string
The class string allows you to perform the same operations that you can perform with the C strings we discussed in Section 8.1 and more. You can access the characters in a string object in the same way that you access array elements, so string objects have all the advantages of arrays of characters plus a number of advantages that arrays do not have, such as automatically increasing their capacity. If lastName is the name of a string object, then lastName[i] gives access to the character at index i in the string represented by lastName; because indexes start at 0, this is the (i + 1)th character. This use of array square brackets is illustrated in Display 8.6.
Display 8.6 also illustrates the member function length. Every string object has a member function named length that takes no arguments and returns the length of the string represented by the string object. Thus, not only can a string object be used like an array but the length member function makes it behave like a partially filled array that automatically keeps track of how many positions are occupied.
When used with an object of the class string, the array square brackets do not check for illegal indexes. If you use an illegal index (that is, an index that is greater than or equal to the length of the string in the object), then the results are unpredictable but are bound to be bad. You may just get strange behavior without any error message that tells you that the problem is an illegal index value.
There is a member function named at that does check for illegal index values. This member function behaves basically the same as the square brackets, except for two points: You use function notation with at, so instead of a[i], you use a.at(i); and the at member function checks to see if i evaluates to an illegal index. If the value of i in a.at(i) is an illegal index, then at throws an out_of_range exception, which, if it is not caught, ends the program with a run-time error message. In the following two example code fragments, the attempted access is out of range, yet the first of these probably will not produce an error message, although it will be accessing a nonexistent indexed variable:
string str("Mary");
cout << str[6] << endl;
The second example, however, will cause the program to terminate abnormally, so you at least know that something is wrong:
string str("Mary");
cout << str.at(6) << endl;
But be warned that some systems give very poor error messages when str.at(i) has an illegal index i.
You can change a single character in the string by assigning a char value to the indexed variable, such as str[i]. This may also be done with the member function at. For example, to change the third character in the string object str to 'X', you can use either of the following code fragments:
str.at(2) = 'X';
or
str[2] = 'X';
As in an ordinary array of characters, character positions for objects of type string are indexed starting with 0, so the third character in a string is in index position 2.
Display 8.7 gives a partial list of the member functions of the class string. In many ways, objects of the class string are better behaved than the C strings we introduced in Section 8.1. In particular, the == operator on objects of the string class returns a result that corresponds to our intuitive notion of strings being equal: it returns true if the two strings contain the same characters in the same order, and returns false otherwise. Similarly, the comparison operators <, >, <=, >= compare string objects using lexicographic ordering. (Lexicographic ordering is alphabetic ordering using the order of symbols given in the ASCII character set in Appendix 3. If the strings consist of all letters and are both either all uppercase or all lowercase letters, then for this case lexicographic ordering is the same as everyday alphabetical ordering.)
string
Example | Remarks |
---|---|
Constructors | |
string str; | Default constructor creates empty string object str. |
string str("sample"); | Creates a string object with data "sample". |
string str(aString); | Creates a string object str that is a copy of aString; aString is an object of the class string. |
Accessors | |
str[i] | Returns read/write reference to character in str at index i. Does not check for illegal index. |
str.at(i) | Returns read/write reference to character in str at index i. Same as str[i], but this version checks for illegal index. |
str.substr(position, length) | Returns the substring of the calling object starting at position and having length characters. |
str.length( ) | Returns the length of str. |
Assignment/Modifiers | |
str1 = str2; | Initializes str1 to str2's data. |
str1 += str2; | Character data of str2 is concatenated to the end of str1. |
str.empty( ) | Returns true if str is an empty string; false otherwise. |
str1 + str2 | Returns a string that has str2's data concatenated to the end of str1's data. |
str.insert(pos, str2); | Inserts str2 into str beginning at position pos. |
str.erase(pos, length); | Removes substring of size length, starting at position pos. |
Comparison | |
str1 == str2, str1 != str2 | Compare for equality or inequality; returns a Boolean value. |
str1 < str2, str1 > str2, str1 <= str2, str1 >= str2 | Four comparisons. All are lexicographical comparisons. |
Finds | |
str.find(str1) | Returns index of the first occurrence of str1 in str. If str1 is not found, then the special value string::npos is returned. |
str.find(str1, pos) | Returns index of the first occurrence of string str1 in str; the search starts at position pos. |
str.find_first_of(str1, pos) | Returns the index of the first instance in str of any character in str1, starting the search at position pos. |
str.find_first_not_of(str1, pos) | Returns the index of the first instance in str of any character not in str1, starting the search at position pos. |
A palindrome is a string that reads the same front to back as it does back to front. The program in Display 8.8 tests an input string to see if it is a palindrome. Our palindrome test will disregard all spaces and punctuation and will consider upper- and lowercase versions of a letter to be the same when deciding if something is a palindrome. Some palindrome examples are as follows:
Able was I ere I saw Elba.
I Love Me, Vol. I.
Madam, I’m Adam.
A man, a plan, a canal, Panama.
Rats live on no evil star.
radar
deed
mom
racecar
The removePunct function is of interest in that it uses the string member functions substr and find. The member function substr extracts a substring of the calling object, given the position and length of the desired substring.
The first three lines of removePunct declare variables for use in the function. The for loop runs through the characters of the parameter s one at a time and tries to find each of them in the punct string. To do this, a string that is the substring of s, of length 1 at each character position, is extracted. The position of this substring in the punct string is determined using the find member function. If this one-character string is not in the punct string, then the one-character string is concatenated to the noPunct string that is to be returned.
= and == Are Different for strings and C Strings
The operators =, ==, !=, <, >, <=, >=, when used with the standard C++ type string, produce results that correspond to our intuitive notion of how strings compare. They do not misbehave as they do with the C strings, as we discussed in Section 8.1.
Consider the following code:
string s1, s2("Hello");
cout << "Enter a line of input:\n";
cin >> s1;
if (s1 == s2)
cout << "Equal\n";
else cout << "Not equal\n";
If the dialogue begins as follows, what will be the next line of output?
Enter a line of input:
Hello friend!
What is the output produced by the following code?
string s1, s2("Hello");
s1 = s2;
s2[0] = 'J';
cout << s1 << " " << s2;
string Objects and C Strings
You have already seen that C++ will perform an automatic type conversion to allow you to store a C string in a variable of type string. For example, the following will work fine:
char aCString[] = "This is my C string.";
string stringVariable;
stringVariable = aCString;
However, the following will produce a compiler error message:
aCString = stringVariable; //ILLEGAL
The following is also illegal:
strcpy(aCString, stringVariable); //ILLEGAL
strcpy cannot take a string object as its second argument, and there is no automatic conversion of string objects to C strings, which is the problem we cannot seem to get away from.
To obtain the C string corresponding to a string object, you must perform an explicit conversion. This can be done with the string member function c_str( ). The correct version of the copying we have been trying to do is the following:
strcpy(aCString, stringVariable.c_str( )); //Legal;
Note that you need to use the strcpy function to do the copying. The member function c_str( ) returns the C string corresponding to the string calling object. As we noted earlier in this chapter, the assignment operator does not work with C strings. So, just in case you thought the following might work, we should point out that it too is illegal.
aCString = stringVariable.c_str( ); //ILLEGAL
Prior to C++11 it was a bit complicated to convert between strings and numbers, but in C++11 it is simply a matter of calling a function. Use stof, stod, stoi, or stol to convert a string to a float, double, int, or long, respectively. Use to_string to convert a numeric type to a string. These functions are illustrated in the following example:
int i;
double d;
string s;
i = stoi("35"); // Converts the string "35" to an integer 35
d = stod("2.5"); // Converts the string "2.5" to the double 2.5
s = to_string(d*2); // Converts the double 5.0 to a string "5.000000"
cout << i << " " << d << " " << s << endl;
The output is 35 2.5 5.000000