Let’s start by leaving out lines that have no content at
all. There’s a special constant for the empty string; we saw it earlier:
String.Empty. Let’s see what happens if
we use the code in Example 10-75, which
writes the line to the console only if it is not equal to String.Empty.
Example 10-75. Detecting empty strings
foreach (string line in strings)
{
    if (line != String.Empty)
    {
        output.AppendLine(line);
    }
    else
    {
        System.Diagnostics.Debug.WriteLine("Found a blank line");
    }
}
You might be wondering exactly how string comparisons are
performed. Some languages base string comparison on object identity so
that "Abc" is not equal to a different
string object that also contains "Abc".
(That may seem weird, but in one sense it’s consistent: comparing
reference types always means asking “do these two variables refer to the
same thing?”) But in C#, when you have distinct string objects, it
performs a “character-like” comparison between strings, so any two strings
containing the same sequence of characters are equal. This is different
from how most reference types work, but by treating strings as a special
case, the result is closer to what most people would expect. (Or at least
to what most people who hadn’t already become accustomed to the oddities
of another language might expect.)
Because not all languages use by-value string comparison, the .NET
Framework supports the by-identity style too. Consequently, you get
by-value comparison only if the C# compiler knows it’s dealing with
strings. If you store two strings in variables of type object, the C#
compiler loses track of the fact that they are strings, so if you compare
these variables with the == operator, it doesn’t
know it should provide the string-specific by-value comparison, and will
instead do the default by-identity comparison you get for most reference
types.
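The difference is easy to see in a small experiment. The following is a minimal sketch rather than part of the chapter's example; the class and method names (StringEqualitySketch, EqualAsStrings, EqualAsObjects) are made up for illustration. It builds the second string from a char array because two identical literals would typically end up sharing a single object, which would hide the effect.

```csharp
using System;

static class StringEqualitySketch
{
    // Operands are statically typed as string, so == compares characters.
    public static bool EqualAsStrings(string x, string y)
    {
        return x == y;
    }

    // Operands are statically typed as object, so == compares references.
    public static bool EqualAsObjects(object x, object y)
    {
        return x == y;
    }

    static void Main()
    {
        string a = "Abc";
        // A second, distinct string object containing the same characters.
        string b = new string(new[] { 'A', 'b', 'c' });

        Console.WriteLine(EqualAsStrings(a, b));          // True
        Console.WriteLine(EqualAsObjects(a, b));          // False
        // Equals is virtual, so the string override still compares characters.
        Console.WriteLine(((object)a).Equals(b));         // True
        Console.WriteLine(Object.ReferenceEquals(a, b));  // False
    }
}
```

If you need a by-value comparison but only have object references, calling Equals (as in the third line of output) is one way to get it.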
For the sake of working out what is going on, we’re also writing a message to the debug output each time we find a blank line.
If we build and run, the output to the console looks like this:
To be, or not to be--that is the question:
Whether 'tis nobelr in the mind to suffer,
The slings and arrows of outrageous fortune ,
Or to take arms against a sea of troubles,
And by opposing end them.
The debug output indicates that the code found and removed eight blank lines. (If you can’t see the Output panel in Visual Studio, you can show it with the View→Output menu item. Ensure that the “Show output from” drop-down has Debug selected.) But judging by the output, it apparently missed some.
So which are the eight “blank” lines—that is, the lines that are the
equivalent of String.Empty? If you
single-step through the debugger, you’ll see that they are the ones that
look like "" and String.Empty.
The ones that contain just whitespace account for some of
the remaining blanks in the output. While visibly blank, these are clearly
not “empty”—they contain whitespace characters. We’ll deal with that in a
minute. The other line that looks “empty” but isn’t is the null string.
As we said earlier, strings are reference types. There is,
therefore, a considerable difference between a null reference to a string,
and an empty string, as far as the .NET runtime is concerned. However, a
lot of applications don’t care about this distinction, so it can sometimes
be useful to treat a null string in much the same way as an empty string. The
String class offers a static method
that lets us test for nullness-or-emptiness with a single call, which
Example 10-76 uses.
Example 10-76. Testing for either blank or null
foreach (string line in strings)
{
    if (!String.IsNullOrEmpty(line))
    {
        output.AppendLine(line);
    }
    else
    {
        System.Diagnostics.Debug.WriteLine("Found a blank line");
    }
}
Notice we have to use the ! operator, as the static method returns true
if the string is null or empty. Our output is now stripped of “blank”
lines except the one that
contains just whitespace. If you check the debug output panel, you’ll see
that nine lines have been ignored:
To be, or not to be--that is the question:
Whether 'tis nobelr in the mind to suffer,
The slings and arrows of outrageous fortune ,
Or to take arms against a sea of troubles,
And by opposing end them.
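To see why the two tests catch different lines, it can help to run both against each kind of “blank” string. This is a small sketch and not part of the numbered examples; the BlankChecks class and its Describe method are hypothetical names introduced only for this illustration.

```csharp
using System;

static class BlankChecks
{
    // Reports how a single line fares under each of the two tests.
    public static string Describe(string line)
    {
        bool equalsEmpty = line == String.Empty;
        bool nullOrEmpty = String.IsNullOrEmpty(line);
        return "== String.Empty: " + equalsEmpty +
               ", IsNullOrEmpty: " + nullOrEmpty;
    }

    static void Main()
    {
        // An empty string passes both tests.
        Console.WriteLine(Describe(""));
        // prints "== String.Empty: True, IsNullOrEmpty: True"

        // A null reference is caught only by IsNullOrEmpty.
        Console.WriteLine(Describe(null));
        // prints "== String.Empty: False, IsNullOrEmpty: True"

        // A whitespace-only line is caught by neither test.
        Console.WriteLine(Describe("   "));
        // prints "== String.Empty: False, IsNullOrEmpty: False"
    }
}
```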
So, what can we do about that remaining blank line at the start? We can deal with this by stripping out spurious whitespace, and then looking to see whether anything is left. Not only will this fix our blank-line problem, but it will also remove any whitespace that the user has left at the start and end of the line.
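As a rough sketch of that idea (one possible approach, not necessarily the code the chapter goes on to use), we could call String.Trim on each non-null line and keep it only if something remains. The TrimSketch class and its TrimOrNull helper are hypothetical names chosen for this illustration.

```csharp
using System;

static class TrimSketch
{
    // Returns the line with leading and trailing whitespace removed,
    // or null if the line is null or nothing is left after trimming.
    public static string TrimOrNull(string line)
    {
        if (line == null)
        {
            return null;
        }
        string trimmed = line.Trim();
        return trimmed.Length == 0 ? null : trimmed;
    }

    static void Main()
    {
        string[] samples = { "   To be   ", "   ", "", null };
        foreach (string sample in samples)
        {
            string cleaned = TrimOrNull(sample);
            if (cleaned != null)
            {
                Console.WriteLine("[" + cleaned + "]");  // prints "[To be]"
            }
            else
            {
                Console.WriteLine("Found a blank line");
            }
        }
    }
}
```

Note that Trim affects only the start and end of the string; whitespace in the middle of a line is left alone.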