C# 4.0: The Complete Reference

CHAPTER 3
Data Types, Literals, and Variables

This chapter examines three fundamental elements of C#: data types, literals, and variables. In general, the types of data that a language provides define the kinds of problems to which the language can be applied. As you might expect, C# offers a rich set of built-in data types, which makes C# suitable for a wide range of applications. You can create variables of any of these types, and you can specify constants of each type, which in the language of C# are called literals.

Why Data Types Are Important

Data types are especially important in C# because it is a strongly typed language. This means that, as a general rule, all operations are type-checked by the compiler for type compatibility. Illegal operations will not be compiled. Thus, strong type-checking helps prevent errors and enhances reliability. To enable strong type-checking, all variables, expressions, and values have a type. There is no concept of a “typeless” variable, for example. Furthermore, a value’s type determines what operations are allowed on it. An operation allowed on one type might not be allowed on another.

NOTE C# 4.0 adds a new data type called dynamic, which causes type checking to be deferred until runtime, rather than occurring at compile time. Thus, the dynamic type is an exception to C#’s normal compile-time type checking. The dynamic type is discussed in Chapter 17.

C#’s Value Types

C# contains two general categories of built-in data types: value types and reference types. The difference between the two types is what a variable contains. For a value type, a variable holds an actual value, such as 3.1416 or 212. For a reference type, a variable holds a reference to the value. The most commonly used reference type is the class, and a discussion of classes and reference types is deferred until later in this book. The value types are described here.
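The difference between holding a value and holding a reference shows up when one variable is assigned to another. The following is a minimal sketch, not a program from this chapter. The class and method names are illustrative only, and an int array is used as the reference type because arrays are reference types in C#, even though arrays are not covered until later.

```csharp
// A sketch contrasting value-type and reference-type assignment.
// The class and method names are illustrative only.

using System;

class ValueVsReference {
  // Copy a value-type variable, change the copy, and return the original.
  public static int CopyValue() {
    int a = 10;
    int b = a;   // b receives its own copy of the value 10
    b = 20;      // changing b has no effect on a
    return a;
  }

  // Copy a reference, change the array through the copy, and
  // return the element as seen through the original variable.
  public static int CopyReference() {
    int[] x = { 10 };
    int[] y = x; // y refers to the same array as x
    y[0] = 20;   // the change is visible through x
    return x[0];
  }

  static void Main() {
    Console.WriteLine("Value type: a is " + CopyValue());
    Console.WriteLine("Reference type: x[0] is " + CopyReference());
  }
}
```

In the value-type case the original keeps the value 10. In the reference-type case both variables refer to one array, so the element reads 20 through either of them.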

At the core of C# are the 13 value types shown in Table 3-1. Collectively, these are referred to as the simple types. They are called simple types because they consist of a single value. (In other words, they are not a composite of two or more values.) They form the foundation of C#’s type system, providing the basic, low-level data elements upon which a program operates. The simple types are also sometimes referred to as primitive types.

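TABLE 3-1 The C# Value Types

Type      Meaning
bool      Represents true/false values
byte      8-bit unsigned integer
char      Character
decimal   Numeric type for financial calculations
double    Double-precision floating point
float     Single-precision floating point
int       Integer
long      Long integer
sbyte     8-bit signed integer
short     Short integer
uint      Unsigned integer
ulong     Unsigned long integer
ushort    Unsigned short integer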

C# strictly specifies a range and behavior for each value type. Because of portability requirements, C# is uncompromising on this account. For example, an int is the same in all execution environments. There is no need to rewrite code to fit a specific platform. Although strictly specifying the size of the value types may cause a small loss of performance in some environments, it is necessary in order to achieve portability.

NOTE In addition to the simple types, C# defines three other categories of value types. These are enumerations, structures, and nullable types, all of which are described later in this book.

Integers

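C# defines nine integer types: char, byte, sbyte, short, ushort, int, uint, long, and ulong. However, the char type is primarily used for representing characters, and it is discussed later in this chapter. The remaining eight integer types are used for numeric calculations. Their bit widths and ranges are shown here:

Type     Width in Bits   Range
byte     8               0 to 255
sbyte    8               –128 to 127
short    16              –32,768 to 32,767
ushort   16              0 to 65,535
int      32              –2,147,483,648 to 2,147,483,647
uint     32              0 to 4,294,967,295
long     64              –9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
ulong    64              0 to 18,446,744,073,709,551,615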

As the table shows, C# defines both signed and unsigned versions of the various integer types. The difference between signed and unsigned integers is in the way the high-order bit of the integer is interpreted. If a signed integer is specified, then the C# compiler will generate code that assumes the high-order bit of an integer is to be used as a sign flag. If the sign flag is 0, then the number is positive; if it is 1, then the number is negative. Negative numbers are almost always represented using the two’s complement approach. In this method, a value is negated by reversing all of its bits and then adding 1 to the result.

Signed integers are important for a great many algorithms, but they have only half the absolute magnitude of their unsigned relatives. For example, as a short, here is 32,767:

0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

For a signed value, if the high-order bit were set to 1, the number would then be interpreted as –1 (assuming the two’s complement format). However, if you declared this to be a ushort, then when the high-order bit was set to 1, the number would become 65,535.
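The effect of the high-order bit can be checked directly. Here is a short sketch that reads the same 16-bit patterns as both short and ushort values. The class name and the AsSigned( ) helper are illustrative only, and the sketch relies on casts and the unchecked keyword, which are described later in this book.

```csharp
// A sketch showing one 16-bit pattern read as signed and as unsigned.
// The names SignBitDemo and AsSigned are illustrative only.

using System;

class SignBitDemo {
  // Reinterpret a 16-bit pattern as a signed short.
  public static short AsSigned(ushort bits) {
    return unchecked((short) bits);
  }

  static void Main() {
    ushort maxShort = 0x7FFF; // high-order bit clear, all others set
    ushort allOnes = 0xFFFF;  // all 16 bits set

    Console.WriteLine("0x7FFF as a short:  " + AsSigned(maxShort));
    Console.WriteLine("0xFFFF as a ushort: " + allOnes);
    Console.WriteLine("0xFFFF as a short:  " + AsSigned(allOnes));

    // Two's complement negation: reverse the bits, then add 1.
    short five = 5;
    short negated = (short) (~five + 1);
    Console.WriteLine("~5 + 1 is " + negated);
  }
}
```

The pattern with every bit set reads as 65,535 when unsigned and as –1 when signed, and reversing the bits of 5 and adding 1 yields –5.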

Probably the most commonly used integer type is int. Variables of type int are often employed to control loops, to index arrays, and for general-purpose integer math. When you need an integer that has a range greater than int, you have many options. If the value you want to store is unsigned, you can use uint. For large signed values, use long. For large unsigned values, use ulong. For example, here is a program that computes the distance from the Earth to the sun, in inches. Because this value is so large, the program uses a long variable to hold it.

// Compute the distance from the Earth to the sun, in inches.

using System;

class Inches {
  static void Main() {
    long inches;
    long miles;

    miles = 93000000; // 93,000,000 miles to the sun

    // 5,280 feet in a mile, 12 inches in a foot.
    inches = miles * 5280 * 12;

    Console.WriteLine("Distance to the sun: " +
                      inches + " inches.");
  }
}

Here is the output from the program:

Distance to the sun: 5892480000000 inches.

Clearly, the result could not have been held in an int or uint variable.
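To see why, compare the result against the largest values that int and uint can hold. The following sketch uses the MaxValue and MinValue fields that the integer types provide. The class name and the FitsInInt( ) helper are assumptions made for this illustration.

```csharp
// A sketch that checks a long value against the range of int.
// The names RangeCheck and FitsInInt are illustrative only.

using System;

class RangeCheck {
  // Return true if value lies within the range of int.
  public static bool FitsInInt(long value) {
    return value >= int.MinValue && value <= int.MaxValue;
  }

  static void Main() {
    // The L suffix makes the whole calculation use long arithmetic.
    long inches = 93000000L * 5280 * 12;

    Console.WriteLine("int.MaxValue:  " + int.MaxValue);
    Console.WriteLine("uint.MaxValue: " + uint.MaxValue);
    Console.WriteLine("inches:        " + inches);
    Console.WriteLine("Fits in int?   " + FitsInInt(inches));
  }
}
```

The distance in inches is roughly a thousand times larger than uint.MaxValue, so only long (or ulong) can hold it.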

The smallest integer types are byte and sbyte. The byte type is an unsigned value between 0 and 255. Variables of type byte are especially useful when working with raw binary data, such as a byte stream produced by some device. For small signed integers, use sbyte. Here is an example that uses a variable of type byte to control a for loop that produces the summation of the integers from 1 through 100.

// Use byte.

using System;

class Use_byte {
  static void Main() {
    byte x;
    int sum;

    sum = 0;
    for(x = 1; x <= 100; x++)
      sum = sum + x;

    Console.WriteLine("Summation of 100 is " + sum);
  }
}

The output from the program is shown here:

Summation of 100 is 5050

Since the for loop runs only from 1 to 100, which is well within the range of a byte, there is no need to use a larger type variable to control it.
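The range matters because a byte cannot count past 255. The sketch below shows what happens at the boundary when overflow checking is off, which is why a byte loop control variable is safe for a limit of 100 but would never terminate with a condition such as x <= 255. The class and method names are illustrative, and the unchecked keyword is described later in this book.

```csharp
// A sketch showing a byte wrapping around at the top of its range.
// The names ByteWrap and Increment are illustrative only.

using System;

class ByteWrap {
  // Add 1 to a byte with overflow checking explicitly turned off.
  public static byte Increment(byte b) {
    unchecked { b++; }
    return b;
  }

  static void Main() {
    Console.WriteLine("100 + 1 as a byte is " + Increment(100));
    Console.WriteLine("255 + 1 as a byte is " + Increment(255));
  }
}
```

Incrementing 255 wraps around to 0 rather than producing 256.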

When you need an integer that is larger than a byte or sbyte, but smaller than an int or uint, use short or ushort.

Floating-Point Types

The floating-point types can represent numbers that have fractional components. There are two kinds of floating-point types, float and double, which represent single- and double-precision numbers, respectively. The type float is 32 bits wide and has an approximate range of 1.5E–45 to 3.4E+38. The double type is 64 bits wide and has an approximate range of 5E–324 to 1.7E+308.
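The difference in precision between the two types can be observed by storing the same fraction in each. This is only a sketch; the class name and the Difference( ) helper are illustrative. No output is listed because the number of digits displayed for floating-point values varies between versions of the runtime.

```csharp
// A sketch comparing float and double approximations of 1/3.
// The names FloatVsDouble and Difference are illustrative only.

using System;

class FloatVsDouble {
  // Return the gap between the float and double approximations of 1/3.
  public static double Difference() {
    float f = 1.0F / 3.0F;
    double d = 1.0 / 3.0;
    return Math.Abs(d - f); // f is widened to double for the subtraction
  }

  static void Main() {
    Console.WriteLine("float occupies " + sizeof(float) + " bytes");
    Console.WriteLine("double occupies " + sizeof(double) + " bytes");
    Console.WriteLine("1/3 as float and as double differ by " + Difference());
  }
}
```

The gap is small but not zero, because a float carries roughly 7 significant decimal digits while a double carries roughly 15 to 16.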

Of the two, double is the more commonly used. One reason for this is that many of the math functions in C#’s class library (which is the .NET Framework library) use double values. For example, the Sqrt( ) method (which is defined by the library class System.Math) returns a double value that is the square root of its double argument. Here, Sqrt( ) is used to compute the radius of a circle given the circle’s area:

// Find the radius of a circle given its area.

using System;

class FindRadius {
  static void Main() {
    double r;
    double area;

    area = 10.0;

    r = Math.Sqrt(area / 3.1416);
    Console.WriteLine("Radius is " + r);
  }
}

The output from the program is shown here:

Radius is 1.78412203012729

One other point about the preceding example. As mentioned, Sqrt( ) is a member of the Math class. Notice how Sqrt( ) is called; it is preceded by the name Math. This is similar to the way Console precedes WriteLine( ). Although not all standard methods are called by specifying their class name first, several are, as the next example shows.

The following program demonstrates several of C#’s trigonometric functions, which are also part of C#’s math library. They also operate on double data. The program displays the sine, cosine, and tangent for the angles (measured in radians) from 0.1 to 1.0.

//  Demonstrate Math.Sin(), Math.Cos(), and Math.Tan().

using System;

class Trigonometry {
  static void Main() {
    double theta; // angle in radians

    for(theta = 0.1; theta <= 1.0; theta = theta + 0.1) {
      Console.WriteLine("Sine of " + theta + "  is " +
                         Math.Sin(theta));
      Console.WriteLine("Cosine of " + theta + "  is " +
                         Math.Cos(theta));
      Console.WriteLine("Tangent of " + theta + "  is " +
                         Math.Tan(theta));
      Console.WriteLine();
    }
  }
}

Here is a portion of the program’s output:

Sine of 0.1  is 0.0998334166468282
Cosine of 0.1  is 0.995004165278026
Tangent of 0.1  is 0.100334672085451

Sine of 0.2  is 0.198669330795061
Cosine of 0.2  is 0.980066577841242
Tangent of 0.2  is 0.202710035508673

Sine of 0.3  is 0.29552020666134
Cosine of 0.3  is 0.955336489125606
Tangent of 0.3  is 0.309336249609623

To compute the sine, cosine, and tangent, the standard library methods Math.Sin( ), Math.Cos( ), and Math.Tan( ) are used. Like Math.Sqrt( ), the trigonometric methods are called with a double argument, and they return a double result. The angles must be specified in radians.

The decimal Type

Perhaps the most interesting C# numeric type is decimal, which is intended for use in monetary calculations. The decimal type utilizes 128 bits to represent values within the range 1E–28 to 7.9E+28. As you may know, normal floating-point arithmetic is subject to a variety of rounding errors when it is applied to decimal values, because many decimal fractions, such as 0.1, have no exact binary representation. The decimal type avoids these errors by storing values in base 10, and it can accurately represent up to 28 decimal places (or 29 places in some cases). This ability to represent decimal values without rounding errors makes it especially useful for computations that involve money.
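The rounding problem is easy to demonstrate. The following sketch adds 0.1 and 0.2 using both double and decimal and tests whether the sum equals 0.3. The class and method names are illustrative only.

```csharp
// A sketch comparing double and decimal arithmetic on decimal fractions.
// The class and method names are illustrative only.

using System;

class DecimalVsDouble {
  // Add 0.1 and 0.2 as doubles and test the sum against 0.3.
  public static bool DoubleSumIsExact() {
    double a = 0.1, b = 0.2;
    return a + b == 0.3;
  }

  // Perform the same test using decimal values.
  public static bool DecimalSumIsExact() {
    decimal a = 0.1m, b = 0.2m;
    return a + b == 0.3m;
  }

  static void Main() {
    Console.WriteLine("double:  0.1 + 0.2 == 0.3 is " + DoubleSumIsExact());
    Console.WriteLine("decimal: 0.1 + 0.2 == 0.3 is " + DecimalSumIsExact());
  }
}
```

The double comparison reports False because neither 0.1 nor 0.2 can be stored exactly in binary, while the decimal comparison reports True.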

Here is a program that uses a decimal type in a financial calculation. The program computes the discounted price given the original price and a discount percentage.

// Use the decimal type to compute a discount.

using System;

class UseDecimal {
  static void Main() {
    decimal price;
    decimal discount;
    decimal discounted_price;

    // Compute discounted price.
    price = 19.95m;
    discount = 0.15m; // discount rate is 15%

    discounted_price = price - ( price * discount);

    Console.WriteLine("Discounted price: $" + discounted_price);
  }
}

The output from this program is shown here:

Discounted price: $16.9575

In the program, notice that the decimal constants are followed by the m suffix. This is necessary because without the suffix, these values would be interpreted as standard floating-point constants, which are not compatible with the decimal data type. You can assign an integer value, such as 10, to a decimal variable without the use of the m suffix, though. (A detailed discussion of numeric constants is found later in this chapter.)

Here is another example that uses the decimal type. It computes the future value of an investment that has a fixed rate of return over a period of years.

/*
   Use the decimal type to compute the future value
   of an investment.
*/

using System;

class FutVal {
  static void Main() {
    decimal amount;
    decimal rate_of_return;
    int years, i;

    amount = 1000.0M;
    rate_of_return = 0.07M;
    years = 10;

    Console.WriteLine("Original investment: $" + amount);
    Console.WriteLine("Rate of return: " + rate_of_return);
    Console.WriteLine("Over " + years + " years");

    for(i = 0; i < years; i++)
       amount = amount + (amount * rate_of_return);

    Console.WriteLine("Future value is $" + amount);
  }
}

Here is the output:

Original investment: $1000.0
Rate of return: 0.07
Over 10 years
Future value is $1967.151357289565322490000

Notice that the result is accurate to several decimal places—more than you would probably want! Later in this chapter you will see how to format such output in a more appealing fashion.

Characters

In C#, characters are not 8-bit quantities like they are in many other computer languages, such as C++. Instead, C# uses Unicode, and its char type is 16 bits wide. Unicode defines a character set that is large enough to represent the characters found in all human languages. Although many languages, such as English, French, and German, use relatively small alphabets, some languages, such as Chinese, use very large character sets that cannot be represented using just 8 bits. To address this situation, in C#, char is an unsigned 16-bit type having a range of 0 to 65,535. The standard ASCII character set is a subset of Unicode and ranges from 0 to 127. Thus, the ASCII characters are still valid C# characters.

A character variable can be assigned a value by enclosing the character inside single quotes. For example, this assigns X to the variable ch:

char ch;
ch = 'X';

You can output a char value using a WriteLine( ) statement. For example, this line outputs the value in ch:

Console.WriteLine("This is ch: " + ch);

Although char is defined by C# as an integer type, it cannot be freely mixed with integers in all cases. This is because there are no automatic type conversions from integer to char.

For example, the following fragment is invalid:

char ch;

ch = 88; // error, won't work

The reason the preceding code will not work is that 88 is an integer value, and it won’t automatically convert to a char. If you attempt to compile this code, you will see an error message. To make the assignment legal, you would need to employ a cast, which is described later in this chapter.
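As a preview, here is a sketch of how a cast makes the conversion explicit. It also shows that the opposite direction, from char to int, requires no cast. The class and method names are illustrative only.

```csharp
// A sketch of converting between char and int.
// The names CharCast, FromCode, and ToCode are illustrative only.

using System;

class CharCast {
  // Integer to char requires an explicit cast.
  public static char FromCode(int code) {
    return (char) code;
  }

  // Char to int is an automatic conversion.
  public static int ToCode(char ch) {
    return ch;
  }

  static void Main() {
    char ch = FromCode(88);

    Console.WriteLine("(char) 88 is " + ch);
    Console.WriteLine("The code for " + ch + " is " + ToCode(ch));
    Console.WriteLine("Largest char code is " + ToCode(char.MaxValue));
  }
}
```

The value 88 is the character code for X, so the cast produces the same character that the literal 'X' would.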

The bool Type

The bool type represents true/false values. C# defines the values true and false using the reserved words true and false. Thus, a variable or expression of type bool will be one of these two values. Furthermore, there is no conversion defined between bool and integer values. For example, 1 does not convert to true, and 0 does not convert to false.

Here is a program that demonstrates the bool type:

// Demonstrate bool values.

using System;

class BoolDemo {
  static void Main() {
    bool b;

    b = false;
    Console.WriteLine("b is " + b);
    b = true;
    Console.WriteLine("b is " + b);

    // A bool value can control the if statement.
    if(b) Console.WriteLine("This is executed.");

    b = false;
    if(b) Console.WriteLine("This is not executed.");

    // Outcome of a relational operator is a bool value.
    Console.WriteLine("10 > 9 is " + (10 > 9));
  }
}

The output generated by this program is shown here:

b is False
b is True
This is executed.
10 > 9 is True

There are three interesting things to notice about this program. First, as you can see, when a bool value is output by WriteLine( ), “True” or “False” is displayed. Second, the value of a bool variable is sufficient, by itself, to control the if statement. There is no need to write an if statement like this:

if(b == true) ...

Third, the outcome of a relational operator, such as <, is a bool value. This is why the expression 10 > 9 displays the value “True.” Further, the extra set of parentheses around 10 > 9 is necessary because the + operator has a higher precedence than the >.

Some Output Options

Up to this point, when data has been output using a WriteLine( ) statement, it has been displayed using the default format. However, the .NET Framework defines a sophisticated formatting mechanism that gives you detailed control over how data is displayed. Although formatted I/O is covered in detail later in this book, it is useful to introduce some formatting options at this time. Using these options, you will be able to specify the way values look when output via a WriteLine( ) statement. Doing so enables you to produce more appealing output. Keep in mind that the formatting mechanism supports many more features than described here.

When outputting lists of data, you have been separating each part of the list with a plus sign, as shown here:

Console.WriteLine("You ordered " + 2 + " items at $" + 3 + " each.");

While very convenient, outputting numeric information in this way does not give you any control over how that information appears. For example, for a floating-point value, you can’t control the number of decimal places displayed. Consider the following statement:

Console.WriteLine("Here is 10/3: " + 10.0/3.0);

It generates this output:

Here is 10/3: 3.33333333333333

Although this might be fine for some purposes, displaying so many decimal places could be inappropriate for others. For example, in financial calculations, you will usually want to display two decimal places.

To control how numeric data is formatted, you will need to use a second form of WriteLine( ), shown here, which allows you to embed formatting information:

WriteLine("format string", arg0, arg1, ..., argN);

In this version, the arguments to WriteLine( ) are separated by commas and not + signs. The format string contains two items: regular, printing characters that are displayed as-is, and format specifiers. Format specifiers take this general form:

{argnum, width: fmt}

Here, argnum specifies the number of the argument (starting from zero) to display. The minimum width of the field is specified by width, and the format is specified by fmt. The width and fmt are optional.

During execution, when a format specifier is encountered in the format string, the corresponding argument, as specified by argnum, is substituted and displayed. Thus, the position of a format specification within the format string determines where its matching data will be displayed. Because width and fmt can be omitted, a format specifier in its simplest form simply indicates which argument to display. For example, {0} indicates arg0, {1} specifies arg1, and so on.
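Here is a sketch that uses all three parts of a format specifier at once. It calls String.Format( ), which accepts the same kind of format string as WriteLine( ) but returns the result as a string. The class name and the Show( ) helper are illustrative, and CultureInfo.InvariantCulture is supplied only so that the decimal point is displayed the same way on every system. Notice that the specifier is written without spaces, because characters that follow the colon are treated as part of the format.

```csharp
// A sketch that combines argnum, width, and fmt in one specifier.
// The names FormatSpecDemo and Show are illustrative only.

using System;
using System.Globalization;

class FormatSpecDemo {
  // {0,10:#.##} means: argument 0, minimum width 10, template #.##
  public static string Show(double value) {
    return String.Format(CultureInfo.InvariantCulture,
                         "[{0,10:#.##}]", value);
  }

  static void Main() {
    Console.WriteLine(Show(3.14159));
    Console.WriteLine(Show(123456.789));
  }
}
```

The brackets make the field visible. Each value is rounded to two decimal places and then right-justified within a field that is at least 10 characters wide.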

Let’s begin with a simple example. The statement

Console.WriteLine("February has {0} or {1} days.", 28, 29);

produces the following output:

February has 28 or 29 days.

As you can see, the value 28 is substituted for {0}, and 29 is substituted for {1}. Thus, the format specifiers identify the location at which the subsequent arguments, in this case 28 and 29, are displayed within the string. Furthermore, notice that the additional values are separated by commas, not + signs.

Here is a variation of the preceding statement that specifies minimum field widths:

Console.WriteLine("February has {0,10} or {1,5} days.", 28, 29);

It produces the following output:

February has         28 or    29 days.

As you can see, spaces have been added to fill out the unused portions of the fields. Remember, a minimum field width is just that: the minimum width. Output can exceed that width if needed.
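The following sketch makes the padding easier to see by enclosing each field in brackets. It also shows a negative width, which left-justifies the value within the field. The class and method names are illustrative, and String.Format( ) is used so that the formatted text can be returned as a string.

```csharp
// A sketch of minimum field widths, including a value that is too wide.
// The names FieldWidthDemo, Right, and Left are illustrative only.

using System;

class FieldWidthDemo {
  // Right-justify within a field at least 6 characters wide.
  public static string Right(int value) {
    return String.Format("[{0,6}]", value);
  }

  // A negative width left-justifies within the field.
  public static string Left(int value) {
    return String.Format("[{0,-6}]", value);
  }

  static void Main() {
    Console.WriteLine(Right(28));
    Console.WriteLine(Left(28));
    Console.WriteLine(Right(1234567)); // wider than the field
  }
}
```

The seven-digit value is displayed in full even though the field width is 6, which confirms that the width is only a minimum.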

Of course, the arguments associated with a format command need not be constants. For example, this program displays a table of squares and cubes. It uses format commands to output the values.

// Use format commands.

using System;

class DisplayOptions {
  static void Main() {
    int i;

    Console.WriteLine("Value\tSquared\tCubed");

    for(i = 1; i < 10; i++)
      Console.WriteLine("{0}\t{1}\t{2}", i, i*i, i*i*i);
  }
}

The output is shown here:

Value   Squared Cubed
1       1       1
2       4       8
3       9       27
4       16      64
5       25      125
6       36      216
7       49      343
8       64      512
9       81      729

In the preceding examples, no formatting was applied to the values themselves. Of course, the purpose of using format specifiers is to control the way the data looks. The types of data most commonly formatted are floating-point and decimal values. One of the easiest ways to specify a format is to describe a template that WriteLine( ) will use. To do this, show an example of the format that you want, using #s to mark the digit positions. You can also specify the decimal point and commas. For example, here is a better way to display 10 divided by 3:

Console.WriteLine("Here is 10/3: {0:#.##}", 10.0/3.0);

The output from this statement is shown here:

Here is 10/3: 3.33

In this example, the template is #.##, which tells WriteLine( ) to display two decimal places. It is important to understand, however, that WriteLine( ) will display more than one digit to the left of the decimal point, if necessary, so as not to misrepresent the value.

Here is another example. This statement

Console.WriteLine("{0:###,###.##}", 123456.56);

generates this output:

123,456.56
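One detail of the # placeholder is worth trying for yourself: it displays a digit only when that digit is significant. The sketch below applies several templates through the ToString( ) method of double, which accepts the same templates as a format specifier. The class name and the Apply( ) helper are illustrative, and CultureInfo.InvariantCulture is an assumption made so that the decimal point and comma are displayed the same way on every system.

```csharp
// A sketch applying several numeric templates to double values.
// The names TemplateDemo and Apply are illustrative only.

using System;
using System.Globalization;

class TemplateDemo {
  // Format value using the given template.
  public static string Apply(string template, double value) {
    return value.ToString(template, CultureInfo.InvariantCulture);
  }

  static void Main() {
    Console.WriteLine(Apply("#.##", 10.0 / 3.0));
    Console.WriteLine(Apply("###,###.##", 123456.56));
    Console.WriteLine(Apply("#.##", 1234.5));
    Console.WriteLine(Apply("#.##", 0.5));
    Console.WriteLine(Apply("0.00", 0.5));
  }
}
```

With the #.## template, 1234.5 is displayed as 1234.5 and 0.5 is displayed as .5, with no leading or trailing zero. When you want the zeros to appear, as in 0.50, use 0 rather than # to mark the digit positions.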

If you want to display monetary values, use the C format specifier. For example:

decimal balance;

balance = 12323.09m;
Console.WriteLine("Current balance is {0:C}", balance);

The output from this sequence is shown here (in U.S. dollar format):

Current balance is $12,323.09

The C format can be used to improve the output from the price discount program shown earlier:

// Use the C format specifier to output dollars and cents.

using System;
class UseDecimal {
  static void Main() {
    decimal price;
    decimal discount;
    decimal discounted_price;

    // Compute discounted price.
    price = 19.95m;
    discount = 0.15m; // discount rate is 15%

    discounted_price = price - ( price * discount);

    Console.WriteLine("Discounted price: {0:C}", discounted_price);
  }
}

Here is the way the output now looks:

Discounted price: $16.96

Literals

In C#, literals refer to fixed values that are represented in their human-readable form. For example, the number 100 is a literal. For the most part, literals and their usage are so intuitive that they have been used in one form or another by all the preceding sample programs. Now the time has come to explain them formally.

C# literals can be of any simple type. The way each literal is represented depends upon its type. As explained earlier, character literals are enclosed between single quotes. For example, 'a' and '%' are both character literals.

Integer literals are specified as numbers without fractional components. For example, 10 and –100 are integer literals. Floating-point literals require the use of the decimal point followed by the number’s fractional component. For example, 11.123 is a floating-point literal. C# also allows you to use scientific notation for floating-point numbers.

Since C# is a strongly typed language, literals, too, have a type. Naturally, this raises the following question: What is the type of a numeric literal? For example, what is the type of 12, 123987, or 0.23? Fortunately, C# specifies some easy-to-follow rules that answer these questions.

First, for integer literals, the type of the literal is the smallest integer type that will hold it, beginning with int. Thus, an integer literal is either of type int, uint, long, or ulong, depending upon its value. Second, floating-point literals are of type double.

If C#’s default type is not what you want for a literal, you can explicitly specify its type by including a suffix. To specify a long literal, append an l or an L. For example, 12 is an int, but 12L is a long. To specify an unsigned integer value, append a u or U. Thus, 100 is an int, but 100U is a uint. To specify an unsigned, long integer, use ul or UL. For example, 984375UL is of type ulong.

To specify a float literal, append an F or f to the constant. For example, 10.19F is of type float. Although redundant, you can specify a double literal by appending a D or d. (As just mentioned, floating-point literals are double by default.)

To specify a decimal literal, follow its value with an m or M. For example, 9.95M is a decimal literal.

Although integer literals create an int, uint, long, or ulong value by default, they can still be assigned to variables of type byte, sbyte, short, or ushort as long as the value being assigned can be represented by the target type.
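You can confirm these rules by asking each literal for its type at runtime. Here is a sketch that does so using the GetType( ) method, which every value provides. The class name and the TypeOf( ) helper are illustrative. The names displayed are the .NET names for the types: Int32 is int, UInt32 is uint, Int64 is long, UInt64 is ulong, Single is float, Double is double, and Decimal is decimal.

```csharp
// A sketch that reports the type the compiler gives to various literals.
// The names LiteralTypes and TypeOf are illustrative only.

using System;

class LiteralTypes {
  // Return the .NET name of the type of value.
  public static string TypeOf(object value) {
    return value.GetType().Name;
  }

  static void Main() {
    Console.WriteLine("12 is " + TypeOf(12));
    Console.WriteLine("3000000000 is " + TypeOf(3000000000));
    Console.WriteLine("5000000000 is " + TypeOf(5000000000));
    Console.WriteLine("12L is " + TypeOf(12L));
    Console.WriteLine("100U is " + TypeOf(100U));
    Console.WriteLine("984375UL is " + TypeOf(984375UL));
    Console.WriteLine("0.23 is " + TypeOf(0.23));
    Console.WriteLine("10.19F is " + TypeOf(10.19F));
    Console.WriteLine("9.95M is " + TypeOf(9.95M));
  }
}
```

Notice that 3000000000 is too large for an int but fits in a uint, so it is given the type uint, while 5000000000 requires a long.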

Hexadecimal Literals

As you probably know, in programming it is sometimes easier to use a number system based on 16 instead of 10. The base 16 number system is called hexadecimal and uses the digits 0 through 9 plus the letters A through F, which stand for 10, 11, 12, 13, 14, and 15. For example, the hexadecimal number 10 is 16 in decimal. Because of the frequency with which hexadecimal numbers are used, C# allows you to specify integer literals in hexadecimal format. A hexadecimal literal must begin with 0x (a 0 followed by an x). Here are some examples:

count = 0xFF; // 255 in decimal
incr = 0x1a; // 26 in decimal
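A hexadecimal literal is simply another way to write an integer value, as the following sketch shows. It also uses the X format, which displays an integer in hexadecimal. The class name and the ToHex( ) helper are illustrative only.

```csharp
// A sketch showing that hexadecimal literals are ordinary integers.
// The names HexDemo and ToHex are illustrative only.

using System;

class HexDemo {
  // Return the hexadecimal representation of value.
  public static string ToHex(int value) {
    return value.ToString("X");
  }

  static void Main() {
    int count = 0xFF;
    int incr = 0x1a;

    Console.WriteLine("0xFF is " + count + " in decimal");
    Console.WriteLine("0x1a is " + incr + " in decimal");
    Console.WriteLine("count + incr is " + (count + incr));
    Console.WriteLine("26 in hexadecimal is " + ToHex(26));
  }
}
```

Once the values are stored, nothing distinguishes them from values written in decimal: count holds 255, incr holds 26, and their sum is 281.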

Character Escape Sequences

Enclosing character literals in single quotes works for most printing characters, but a few characters, such as the carriage return, pose a special problem when a text editor is used. In addition, certain other characters, such as the single and double quotes, have special meaning in C#, so you cannot use them directly. For these reasons, C# provides special escape sequences, sometimes referred to as backslash character constants, shown in Table 3-2. These sequences are used in place of the characters they represent.

For example, this assigns ch the tab character:

ch = '\t';

The next example assigns a single quote to ch:

ch = '\'';

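TABLE 3-2 Character Escape Sequences

Escape Sequence   Description
\a                Alert (bell)
\b                Backspace
\f                Form feed
\n                Newline (linefeed)
\r                Carriage return
\t                Horizontal tab
\v                Vertical tab
\0                Null
\'                Single quote
\"                Double quote
\\                Backslash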
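Each escape sequence stands for exactly one character, which you can verify by displaying the character codes. The following is a sketch; the class name and the CodeOf( ) helper are illustrative only.

```csharp
// A sketch that displays the character codes of several escape sequences.
// The names EscapeCodes and CodeOf are illustrative only.

using System;

class EscapeCodes {
  // Return the numeric code of a character.
  public static int CodeOf(char ch) {
    return ch;
  }

  static void Main() {
    Console.WriteLine("Tab:             " + CodeOf('\t'));
    Console.WriteLine("Newline:         " + CodeOf('\n'));
    Console.WriteLine("Carriage return: " + CodeOf('\r'));
    Console.WriteLine("Single quote:    " + CodeOf('\''));
    Console.WriteLine("Double quote:    " + CodeOf('\"'));
    Console.WriteLine("Backslash:       " + CodeOf('\\'));
    Console.WriteLine("Null:            " + CodeOf('\0'));
  }
}
```

The codes displayed are 9, 10, 13, 39, 34, 92, and 0, which are the values that these characters have in ASCII and therefore in Unicode.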

String Literals

C# supports one other type of literal: the string. A string literal is a set of characters enclosed by double quotes. For example,

"this is a test"

is a string. You have seen examples of strings in many of the WriteLine( ) statements in the preceding sample programs.

In addition to normal characters, a string literal can also contain one or more of the escape sequences just described. For example, consider the following program. It uses the \n and \t escape sequences.

// Demonstrate escape sequences in strings.

using System;

class StrDemo {
  static void Main() {
    Console.WriteLine("Line One\nLine Two\nLine Three");
    Console.WriteLine("One\tTwo\tThree");
    Console.WriteLine("Four\tFive\tSix");

    // Embed quotes.
    Console.WriteLine("\"Why?\", he asked.");
  }
}

The output is shown here:

Line One
Line Two
Line Three
One     Two     Three
Four    Five    Six
"Why?", he asked.

Notice how the \n escape sequence is used to generate a new line. You don’t need to use multiple WriteLine( ) statements to get multiline output. Just embed \n within a longer string at the points where you want the new lines to occur. Also note how a quotation mark is generated inside a string.

In addition to the form of string literal just described, you can also specify a verbatim string literal. A verbatim string literal begins with an @, which is followed by a quoted string. The contents of the quoted string are accepted without modification and can span two or more lines. Thus, you can include newlines, tabs, and so on, but you don’t need to use the escape sequences. The only exception is that to obtain a double quote ("), you must use two double quotes in a row (""). Here is a program that demonstrates verbatim string literals:

// Demonstrate verbatim string literals.

using System;

class Verbatim {
  static void Main() {
    Console.WriteLine(@"This is a verbatim
string literal
that spans several lines.
");
    Console.WriteLine(@"Here is some tabbed output:
1       2       3       4
5       6       7       8
");
    Console.WriteLine(@"Programmers say, ""I like C#.""");
  }
}

The output from this program is shown here:

This is a verbatim
string literal
that spans several lines.

Here is some tabbed output:
1       2       3       4
5       6       7       8

Programmers say, "I like C#."

The important point to notice about the preceding program is that the verbatim string literals are displayed precisely as they are entered into the program.
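Because a backslash has no special meaning inside a verbatim string literal, this form is also convenient for text that contains many backslashes. The sketch below compares verbatim literals with regular literals that describe the same characters. The class and method names, and the sample path, are illustrative only.

```csharp
// A sketch comparing verbatim and regular string literals.
// The class name, method names, and sample path are illustrative only.

using System;

class VerbatimCompare {
  // A backslash needs no escape in a verbatim literal.
  public static bool SamePath() {
    return @"C:\temp\new" == "C:\\temp\\new";
  }

  // A double quote is written twice in a verbatim literal.
  public static bool SameQuote() {
    return @"Say ""hi""" == "Say \"hi\"";
  }

  static void Main() {
    Console.WriteLine(@"C:\temp\new");
    Console.WriteLine("Paths match: " + SamePath());
    Console.WriteLine("Quotes match: " + SameQuote());
  }
}
```

Both comparisons report True. In the regular literal, \t and \n would have become a tab and a newline had the backslashes not been doubled.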

The advantage of verbatim string literals is that you can specify output in your program exactly as it will appear on the screen. However, in the case of multiline strings, the wrapping will obscure the indentation of your program. For this reason, the programs in this book will make only limited use of verbatim string literals. That said, they are still a wonderful benefit for many formatting situations.

One last point: Don’t confuse strings with characters. A character literal, such as 'X', represents a single character of type char. A string containing only one character, such as "X", is still a string.

A Closer Look at Variables

Variables are declared using this form of statement:

type var-name;

where type is the data type of the variable and var-name is its name. You can declare a variable of any valid type, including the value types just described. It is important to understand that a variable’s capabilities are determined by its type. For example, a variable of type bool cannot be used to store floating-point values. Furthermore, the type of a variable cannot change during its lifetime. An int variable cannot turn into a char variable, for example.

All variables in C# must be declared prior to their use. As a general rule, this is necessary because the compiler must know what type of data a variable contains before it can properly compile any statement that uses the variable. It also enables the compiler to perform strict type-checking.

C# defines several different kinds of variables. The kind that we have been using is called a local variable because it is declared within a method.

Initializing a Variable

One way to give a variable a value is through an assignment statement, as you have already seen. Another way is by giving it an initial value when it is declared. To do this, follow the variable’s name with an equal sign and the value being assigned. The general form of initialization is shown here:

type var-name = value;

Here, value is the value that is given to the variable when it is created. The value must be compatible with the specified type.

Here are some examples:

int count = 10; // give count an initial value of 10
char ch = 'X'; // initialize ch with the letter X
float f = 1.2F; // f is initialized with 1.2

When declaring two or more variables of the same type using a comma-separated list, you can give one or more of those variables an initial value. For example:

int a, b = 8, c = 19, d; // b and c have initializations

In this case, only b and c are initialized.

Dynamic Initialization

Although the preceding examples have used only constants as initializers, C# allows variables to be initialized dynamically, using any expression valid at the point at which the variable is declared. For example, here is a short program that computes the hypotenuse of a right triangle given the lengths of its other two sides.

// Demonstrate dynamic initialization.

using System;

class DynInit {
  static void Main() {
    // Length of sides.
    double s1 = 4.0;
    double s2 = 5.0;

    // Dynamically initialize hypot.
    double hypot = Math.Sqrt( (s1 * s1) + (s2 * s2) );

    Console.Write("Hypotenuse of triangle with sides " +
                  s1 + " by " + s2 + " is ");

    Console.WriteLine("{0:#.###}.", hypot);
  }
}

Here is the output:

Hypotenuse of triangle with sides 4 by 5 is 6.403.

Here, three local variables—s1, s2, and hypot—are declared. The first two, s1 and s2, are initialized by constants. However, hypot is initialized dynamically to the length of the hypotenuse. Notice that the initialization involves calling Math.Sqrt( ). As explained, you can use any expression that is valid at the point of the initialization. Since a call to Math.Sqrt( ) (or any other library method) is valid at this point, it can be used in the initialization of hypot. The key point here is that the initialization expression can use any element valid at the time of the initialization, including calls to methods, other variables, or literals.

Implicitly Typed Variables

As explained, in C# all variables must be declared. Normally, a declaration includes the type of the variable, such as int or bool, followed by the name of the variable. However, beginning with C# 3.0, it became possible to let the compiler determine the type of a local variable based on the value used to initialize it. This is called an implicitly typed variable.

An implicitly typed variable is declared using the keyword var, and it must be initialized. The compiler uses the type of the initializer to determine the type of the variable. Here is an example:

var e = 2.7183;

Because e is initialized with a floating-point literal (whose type is double by default), the type of e is double. Had e been declared like this:

var e = 2.7183F;

then e would have the type float, instead.
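One way to confirm the type that the compiler infers is to display the runtime type of each variable. The following is a minimal sketch, not part of the original examples. VarTypes and TypeOf are hypothetical names, and GetType( ) reports the .NET name of each type (Double for double, Single for float, and so on).

```csharp
// Sketch: show the type inferred for several var declarations.

using System;

class VarTypes {
  // Returns the .NET name of the runtime type of a value.
  public static string TypeOf(object value) {
    return value.GetType().Name;
  }

  static void Main() {
    var e1 = 2.7183;   // floating-point literal: double
    var e2 = 2.7183F;  // F suffix: float
    var e3 = 2.7183M;  // M suffix: decimal
    var n  = 10;       // integer literal: int
    var ch = 'X';      // character literal: char

    Console.WriteLine("e1 is " + TypeOf(e1)); // Double
    Console.WriteLine("e2 is " + TypeOf(e2)); // Single
    Console.WriteLine("e3 is " + TypeOf(e3)); // Decimal
    Console.WriteLine("n is " + TypeOf(n));   // Int32
    Console.WriteLine("ch is " + TypeOf(ch)); // Char
  }
}
```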

The following program demonstrates implicitly typed variables. It reworks the program shown in the preceding section so that all variables are implicitly typed.

//  Demonstrate implicitly typed variables.

using System;

class ImplicitlyTypedVar {
  static void Main() {

    // These are now implicitly typed variables. They
    // are of type double because their initializing
    // expressions are of type double.
    var s1 = 4.0;
    var s2 = 5.0;

    // Now, hypot is implicitly typed.  Its type is double
    // because the return type of Sqrt() is double.
    var hypot = Math.Sqrt( (s1 * s1) + (s2 * s2) );

    Console.Write("Hypotenuse of triangle with sides " +
                  s1 + " by " + s2 + " is ");
    Console.WriteLine("{0:#.###}.", hypot);

    // The following statement will not compile because
    // s1 is a double and cannot be assigned a decimal value.
    // s1 = 12.2M;  // Error!
  }
}

The output is the same as before.

It is important to emphasize that an implicitly typed variable is still a strongly typed variable. Notice this commented-out line in the program:

// s1 = 12.2M; // Error!

This assignment is invalid because s1 is of type double. Thus, it cannot be assigned a decimal value. The only difference between an implicitly typed variable and a “normal” explicitly typed variable is how the type is determined. Once that type has been determined, the variable has a type, and this type is fixed throughout the lifetime of the variable. Thus, the type of s1 cannot be changed during execution of the program.

Implicitly typed variables were not added to C# to replace “normal” variable declarations. Instead, implicitly typed variables are designed to handle some special-case situations, the most important of which relate to Language-Integrated Query (LINQ), which is described in Chapter 19. Therefore, for most variable declarations, you should continue to use explicitly typed variables because they make your code easier to read and easier to understand.

One last point: Only one implicitly typed variable can be declared at any one time. Therefore, the following declaration,

var s1 = 4.0, s2 = 5.0; // Error!

is wrong and won’t compile because it attempts to declare both s1 and s2 at the same time.
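The remedy is simply to give each implicitly typed variable its own declaration statement. Here is a minimal sketch. VarSeparate and Hypot are hypothetical names, and the side lengths 3 and 4 are chosen only because they yield an exact result.

```csharp
// Sketch: declare implicitly typed variables one per statement.

using System;

class VarSeparate {
  // Computes a hypotenuse using separately declared var variables.
  public static double Hypot() {
    // var s1 = 3.0, s2 = 4.0; // Error! Only one variable per var declaration.

    var s1 = 3.0; // OK: one declaration
    var s2 = 4.0; // OK: a second, separate declaration

    return Math.Sqrt( (s1 * s1) + (s2 * s2) );
  }

  static void Main() {
    Console.WriteLine("Hypotenuse: " + Hypot());
  }
}
```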

The Scope and Lifetime of Variables

So far, all of the variables that we have been using are declared at the start of the Main( ) method. However, C# allows a local variable to be declared within any block. As explained in Chapter 1, a block begins with an opening curly brace and ends with a closing curly brace. A block defines a scope. Thus, each time you start a new block, you are creating a new scope. A scope determines what names are visible to other parts of your program without qualification. It also determines the lifetime of local variables.

The most important scopes in C# are those defined by a class and those defined by a method. A discussion of class scope (and variables declared within it) is deferred until later in this book, when classes are described. For now, we will examine only the scopes defined by or within a method.

The scope defined by a method begins with its opening curly brace and ends with its closing curly brace. However, if that method has parameters, they too are included within the scope defined by the method.

As a general rule, local variables declared inside a scope are not visible to code that is defined outside that scope. Thus, when you declare a variable within a scope, you are protecting it from access or modification from outside the scope. Indeed, the scope rules provide the foundation for encapsulation.

Scopes can be nested. For example, each time you create a block of code, you are creating a new, nested scope. When this occurs, the outer scope encloses the inner scope. This means that local variables declared in the outer scope will be visible to code within the inner scope. However, the reverse is not true. Local variables declared within the inner scope will not be visible outside it.

To understand the effect of nested scopes, consider the following program:

// Demonstrate block scope.

using System;

class ScopeDemo {
  static void Main() {
    int x; // known to all code within Main()

    x = 10;
    if(x == 10) { // start new scope
      int y = 20; // known only to this block

      // x and y both known here.
      Console.WriteLine("x and y: " + x + " " + y);
      x = y * 2;
    }
    // y = 100; // Error! y not known here.

    // x is still known here.
    Console.WriteLine("x is " + x);
  }
}

As the comments indicate, the variable x is declared at the start of Main( )’s scope and is accessible to all subsequent code within Main( ). Within the if block, y is declared. Since a block defines a scope, y is visible only to other code within its block. This is why outside of its block, the line y = 100; is commented out. If you remove the leading comment symbol, a compile-time error will occur because y is not visible outside of its block. Within the if block, x can be used because code within a block (that is, a nested scope) has access to variables declared by an enclosing scope.

Within a block, variables can be declared at any point, but are valid only after they are declared. Thus, if you define a variable at the start of a method, it is available to all of the code within that method. Conversely, if you declare a variable at the end of a block, it is effectively useless, because no code will have access to it.
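The rule that a variable is valid only after its declaration can be seen in the following minimal sketch. DeclOrder and Total are hypothetical names used only for this illustration.

```csharp
// Sketch: a variable cannot be used before the point of its declaration.

using System;

class DeclOrder {
  public static int Total() {
    // count = 5; // Error! count is used before it is declared.

    int count = 5;         // from this point on, count is valid
    int total = count * 2; // OK: count has already been declared

    return total;
  }

  static void Main() {
    Console.WriteLine("Total: " + Total());
  }
}
```

If you remove the comment symbol from the first statement, the program will not compile.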

If a variable declaration includes an initializer, then that variable will be reinitialized each time the block in which it is declared is entered. For example, consider this program:

// Demonstrate lifetime of a variable.

using System;

class VarInitDemo {
  static void Main() {
    int x;

    for(x = 0; x < 3; x++) {
      int y = -1; // y is initialized each time block is entered
      Console.WriteLine("y is: " + y); // this always prints -1
      y = 100;
      Console.WriteLine("y is now: " + y);
    }
  }
}

The output generated by this program is shown here:

y is: -1
y is now: 100
y is: -1
y is now: 100
y is: -1
y is now: 100

As you can see, y is reinitialized to –1 each time the block of the for loop is entered. Even though it is subsequently assigned the value 100, this value is lost.
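For contrast, here is a minimal sketch of what happens when the variable is declared outside the loop instead. Because the declaration is executed only once, changes made inside the loop carry over from one iteration to the next. VarLifetimeSketch and Run are hypothetical names.

```csharp
// Sketch: a variable declared outside the loop keeps its value.

using System;

class VarLifetimeSketch {
  // Runs the loop three times and returns the final value of y.
  public static int Run() {
    int y = -1; // declared outside the loop, so initialized only once

    for(int x = 0; x < 3; x++) {
      Console.WriteLine("y is: " + y); // prints -1, then 99, then 199
      y = y + 100; // this change survives into the next iteration
    }

    return y;
  }

  static void Main() {
    Console.WriteLine("Final y: " + Run());
  }
}
```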

There is one quirk to C#’s scope rules that may surprise you: Although blocks can be nested, no variable declared within an inner scope can have the same name as a variable declared by an enclosing scope. For example, the following program, which tries to declare two separate variables with the same name, will not compile.

/*
   This program attempts to declare a variable
   in an inner scope with the same name as one
   defined in an outer scope.

   *** This program will not compile. ***
*/

using System;

class NestVar {
  static void Main() {
    int count;

    for(count = 0; count < 10; count = count+1) {
      Console.WriteLine("This is count: " + count);

      int count; // illegal!!!
      for(count = 0; count < 2; count++)
        Console.WriteLine("This program is in error!");
    }
  }
}

If you come from a C/C++ background, then you know that there is no restriction on the names you give variables declared in an inner scope. Thus, in C/C++ the declaration of count within the block of the outer for loop is completely valid. However, in C/C++, such a declaration hides the outer variable. The designers of C# felt that this type of name hiding could easily lead to programming errors and disallowed it.
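The usual fix is to give the inner variable a different name, as the following minimal sketch shows. The sketch also goes one step beyond the preceding discussion: two scopes that are not nested one inside the other can each declare a variable with the same name, because neither encloses the other. NestVarFixed and Run are hypothetical names.

```csharp
// Sketch: avoiding a name conflict between nested scopes.

using System;

class NestVarFixed {
  // Returns the total number of loop-body executions counted below.
  public static int Run() {
    int count;
    int total = 0;

    for(count = 0; count < 3; count++) {
      int inner; // a distinct name, so there is no conflict with count
      for(inner = 0; inner < 2; inner++)
        total++;
    }

    // These two loops are separate, non-nested scopes,
    // so each can declare its own variable called i.
    for(int i = 0; i < 2; i++) total++;
    for(int i = 0; i < 2; i++) total++;

    return total;
  }

  static void Main() {
    Console.WriteLine("Total: " + Run());
  }
}
```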

Type Conversion and Casting

In programming, it is common to assign a value of one type to a variable of another type. For example, you might want to assign an int value to a float variable, as shown here:

int i;
float f;

i = 10;
f = i; // assign an int to a float

When compatible types are mixed in an assignment, the value of the right side is automatically converted to the type of the left side. Thus, in the preceding fragment, the value in i is converted into a float and then assigned to f. However, because of C#’s strict type-checking, not all types are compatible, and thus, not all type conversions are implicitly allowed. For example, there is no implicit conversion from double to int, and bool is not compatible with int at all. Fortunately, many conversions that are not allowed implicitly, such as double to int, can still be obtained by using a cast. (A cast cannot, however, convert between bool and the numeric types.) A cast performs an explicit type conversion. Both automatic type conversion and casting are examined here.

Automatic Conversions

When one type of data is assigned to another type of variable, an implicit type conversion will take place automatically if

• The two types are compatible.

• The destination type has a range that is greater than that of the source type.

When these two conditions are met, a widening conversion takes place. For example, the int type is always large enough to hold all valid byte values, and both int and byte are compatible integer types, so an implicit conversion can be applied.
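The following minimal sketch passes one value through a chain of widening conversions, none of which requires a cast. WidenSketch and Widen are hypothetical names chosen for this illustration.

```csharp
// Sketch: a chain of implicit widening conversions.

using System;

class WidenSketch {
  // Passes a byte value through successively wider types.
  public static double Widen(byte b) {
    int i = b;    // byte to int: implicit
    long l = i;   // int to long: implicit
    float f = l;  // long to float: implicit
    double d = f; // float to double: implicit

    // byte back = i; // Error! int to byte is not a widening conversion.

    return d;
  }

  static void Main() {
    byte b = 200;
    Console.WriteLine("Widened value: " + Widen(b));
  }
}
```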

For widening conversions, the numeric types, including integer and floating-point types, are compatible with each other. For example, the following program is perfectly valid since long to double is a widening conversion that is automatically performed.

// Demonstrate implicit conversion from long to double.

using System;

class LtoD {
  static void Main() {
    long L;
    double D;

    L = 100123285L;
    D = L;

    Console.WriteLine("L and D: " + L + " " + D);
  }
}

Although there is an implicit conversion from long to double, there is no implicit conversion from double to long since this is not a widening conversion. Thus, the following version of the preceding program is invalid:

// *** This program will not compile. ***

using System;

class LtoD {
  static void Main() {
    long L;
    double D;

    D = 100123285.0;
    L = D; // Illegal!!!

    Console.WriteLine("L and D: " + L + " " + D);
  }
}

In addition to the restrictions just described, there are no implicit conversions between decimal and float or double, or from the numeric types to char or bool. Also, char and bool are not compatible with each other.
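The following minimal sketch gathers three of these cases in one place. In each, the commented-out assignment would be rejected by the compiler, and a cast (described in the next section) is used to request the conversion explicitly. NoImplicitSketch and its three methods are hypothetical names.

```csharp
// Sketch: conversions that are not performed implicitly.

using System;

class NoImplicitSketch {
  // double to long must be requested with a cast.
  public static long ToLong(double d) {
    // long n = d; // Error! No implicit conversion from double to long.
    return (long) d;
  }

  // double to decimal must also be requested with a cast.
  public static decimal ToDecimal(double d) {
    // decimal m = d; // Error! No implicit conversion from double to decimal.
    return (decimal) d;
  }

  // int to char requires a cast, too.
  public static char ToChar(int code) {
    // char c = code; // Error! No implicit conversion from int to char.
    return (char) code;
  }

  static void Main() {
    Console.WriteLine("double to long:    " + ToLong(100123285.0));
    Console.WriteLine("double to decimal: " + ToDecimal(2.5));
    Console.WriteLine("int to char:       " + ToChar(88));
  }
}
```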

Casting Incompatible Types

Although the implicit type conversions are helpful, they will not fulfill all programming needs because they apply only to widening conversions between compatible types. For all other cases you must employ a cast. A cast is an instruction to the compiler to convert the outcome of an expression into a specified type. Thus, it requests an explicit type conversion. A cast has this general form:

(target-type) expression

Here, target-type specifies the desired type to convert the specified expression to. For example, given

double x, y;

if you want the type of the expression x/y to be int, you can write

(int) (x / y)

Here, even though x and y are of type double, the cast converts the outcome of the expression to int. The parentheses surrounding x / y are necessary. Otherwise, the cast to int would apply only to the x and not to the outcome of the division. The cast is necessary here because there is no implicit conversion from double to int.
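The effect of the parentheses can be checked with a minimal sketch such as the following. CastParens, Whole, and Partial are hypothetical names, and the values 7.5 and 2.0 are chosen only to make the difference easy to see.

```csharp
// Sketch: where the parentheses go changes what is cast.

using System;

class CastParens {
  // Divides first, then converts the quotient to int.
  public static int Whole(double x, double y) {
    return (int) (x / y);
  }

  // Converts only x to int, then divides. The result is a double.
  public static double Partial(double x, double y) {
    return (int) x / y;
  }

  static void Main() {
    double x = 7.5, y = 2.0;

    Console.WriteLine("(int) (x / y): " + Whole(x, y));
    Console.WriteLine("(int) x / y:   " + Partial(x, y));
  }
}
```

With these values, the first expression divides 7.5 by 2.0 and truncates 3.75 to 3. The second truncates 7.5 to 7 and then divides by 2.0, producing the double value 3.5.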

When a cast involves a narrowing conversion, information might be lost. For example, when casting a long into an int, information will be lost if the long’s value is outside the range of an int because its high-order bits are removed. When a floating-point value is cast to an integer type, the fractional component will also be lost due to truncation. For example, if the value 1.23 is cast to an integer type, the resulting value will simply be 1. The 0.23 is lost.

The following program demonstrates some type conversions that require casts. It also shows some situations in which the casts cause data to be lost.
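The listing itself does not appear at this point in the text. What follows is a sketch of such a program, written to be consistent with the output and the discussion that come next. The variable names (x, y, b, s, u, l, and ch) are taken from that discussion, but the class name CastDemo, the comments, and the value 10.0 / 3.0 used for x / y are assumptions.

```csharp
// Demonstrate casting (a reconstruction, not the original listing).

using System;

class CastDemo {
  static void Main() {
    double x, y;
    byte b;
    int i;
    char ch;
    uint u;
    short s;
    long l;

    x = 10.0;
    y = 3.0;

    // Cast double to int: fractional component lost.
    i = (int) (x / y);
    Console.WriteLine("Integer outcome of x / y: " + i);
    Console.WriteLine();

    // Cast an int into a byte: no data lost.
    i = 255;
    b = (byte) i;
    Console.WriteLine("b after assigning 255: " + b +
                      " -- no data lost.");

    // Cast an int into a byte: data lost.
    i = 257;
    b = (byte) i;
    Console.WriteLine("b after assigning 257: " + b +
                      " -- data lost.");
    Console.WriteLine();

    // Cast a uint into a short: no data lost.
    u = 32000;
    s = (short) u;
    Console.WriteLine("s after assigning 32000: " + s +
                      " -- no data lost.");

    // Cast a uint into a short: data lost.
    u = 64000;
    s = (short) u;
    Console.WriteLine("s after assigning 64000: " + s +
                      " -- data lost.");
    Console.WriteLine();

    // Cast a long into a uint: no data lost.
    l = 64000;
    u = (uint) l;
    Console.WriteLine("u after assigning 64000: " + u +
                      " -- no data lost.");

    // Cast a long into a uint: data lost.
    l = -12;
    u = (uint) l;
    Console.WriteLine("u after assigning -12: " + u +
                      " -- data lost.");
    Console.WriteLine();

    // Cast a byte into a char.
    b = 88; // character code for X
    ch = (char) b;
    Console.WriteLine("ch after assigning 88: " + ch);
  }
}
```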

The output from the program is shown here:

Integer outcome of x / y: 3

b after assigning 255: 255 -- no data lost.
b after assigning 257: 1 -- data lost.

s after assigning 32000: 32000 -- no data lost.
s after assigning 64000: -1536 -- data lost.

u after assigning 64000: 64000 -- no data lost.
u after assigning -12: 4294967284 -- data lost.

ch after assigning 88: X

Let’s look at each assignment. The cast of (x / y) to int results in the truncation of the fractional component, and information is lost.

No loss of information occurs when b is assigned the value 255 because a byte can hold the value 255. However, when the attempt is made to assign b the value 257, information loss occurs because 257 exceeds a byte’s range. In both cases the casts are needed because there is no implicit conversion from int to byte.

When the short variable s is assigned the value 32,000 through the uint variable u, no data is lost because a short can hold the value 32,000. However, in the next assignment, u has the value 64,000, which is outside the range of a short, and data is lost. In both cases the casts are needed because there is no implicit conversion from uint to short.

Next, u is assigned the value 64,000 through the long variable l. In this case, no data is lost because 64,000 is within the range of a uint. However, when the value –12 is assigned to u, data is lost because a uint cannot hold negative numbers. In both cases the casts are needed because there is no implicit conversion from long to uint.

Finally, no information is lost, but a cast is needed when assigning a byte value to a char.

Type Conversion in Expressions

In addition to occurring within an assignment, type conversions also take place within an expression. In an expression, you can freely mix two or more different types of data as long as they are compatible with each other. For example, you can mix short and long within an expression because they are both numeric types. When different types of data are mixed within an expression, they are converted to the same type, on an operation-by-operation basis.

The conversions are accomplished through the use of C#’s type promotion rules. Here is the algorithm that they define for binary operations:

IF one operand is a decimal, THEN the other operand is promoted to decimal (unless it is of type float or double, in which case an error results).

ELSE IF one operand is a double, the second is promoted to double.

ELSE IF one operand is a float, the second is promoted to float.

ELSE IF one operand is a ulong, the second is promoted to ulong (unless it is of type sbyte, short, int, or long, in which case an error results).

ELSE IF one operand is a long, the second is promoted to long.

ELSE IF one operand is a uint and the second is of type sbyte, short, or int, both are promoted to long.

ELSE IF one operand is a uint, the second is promoted to uint.

ELSE both operands are promoted to int.

There are a couple of important points to be made about the type promotion rules. First, not all types can be mixed in an expression. Specifically, there is no implicit conversion from float or double to decimal, and it is not possible to mix ulong with any signed integer type. To mix these types requires the use of an explicit cast.

Second, pay special attention to the last rule. It states that if none of the preceding rules applies, then all other operands are promoted to int. Therefore, in an expression, all char, sbyte, byte, ushort, and short values are promoted to int for the purposes of calculation. This is called integer promotion. It also means that the outcome of all arithmetic operations will be no smaller than int.
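The promotion rules can be observed directly by displaying the type of the result of several mixed expressions. The following is a minimal sketch. PromoteTypes and TypeOf are hypothetical names, and GetType( ) reports .NET type names (Int32 for int, Int64 for long, UInt32 for uint, Single for float). Variables are used rather than literals because the compiler applies additional conversion rules to constant expressions.

```csharp
// Sketch: observe the result type of mixed-type expressions.

using System;

class PromoteTypes {
  // Returns the .NET name of the runtime type of a value.
  public static string TypeOf(object value) {
    return value.GetType().Name;
  }

  static void Main() {
    byte b = 1;
    short s = 2;
    int i = 3;
    uint u = 4;
    long l = 5;
    float f = 6.0F;
    double d = 7.0;
    decimal m = 8.0M;

    Console.WriteLine("byte + short:   " + TypeOf(b + s)); // Int32
    Console.WriteLine("uint + byte:    " + TypeOf(u + b)); // UInt32
    Console.WriteLine("uint + int:     " + TypeOf(u + i)); // Int64
    Console.WriteLine("long + float:   " + TypeOf(l + f)); // Single
    Console.WriteLine("float + double: " + TypeOf(f + d)); // Double
    Console.WriteLine("int + decimal:  " + TypeOf(i + m)); // Decimal

    // Console.WriteLine(m + d); // Error! decimal cannot be mixed with double.

    // With an explicit cast, the two can be combined.
    Console.WriteLine("decimal + (decimal) double: " +
                      TypeOf(m + (decimal) d)); // Decimal
  }
}
```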

It is important to understand that type promotions only apply to the values operated upon when an expression is evaluated. For example, if the value of a byte variable is promoted to int inside an expression, outside the expression, the variable is still a byte. Type promotion only affects the evaluation of an expression.

Type promotion can, however, lead to somewhat unexpected results. For example, when an arithmetic operation involves two byte values, the following sequence occurs. First, the byte operands are promoted to int. Then the operation takes place, yielding an int result. Thus, the outcome of an operation involving two byte values will be an int. This is not what you might intuitively expect. Consider the following program.

// A promotion surprise!

using System;

class PromDemo {
  static void Main() {
    byte b;

    b = 10;
    b = (byte) (b * b); // cast needed!!

    Console.WriteLine("b: "+ b);
  }
}

Somewhat counterintuitively, a cast to byte is needed when assigning b * b back to b! The reason is that in b * b, the value of b is promoted to int when the expression is evaluated. Thus, b * b results in an int value, which cannot be assigned to a byte variable without a cast. Keep this in mind if you get unexpected type-incompatibility error messages on expressions that would otherwise seem perfectly correct.

This same sort of situation also occurs when performing operations on chars. For example, in the following fragment, the cast back to char is needed because of the promotion of ch1 and ch2 to int within the expression:

char ch1 = 'a', ch2 = 'b';

ch1 = (char) (ch1 + ch2);

Without the cast, the result of adding ch1 to ch2 would be int, which can’t be assigned to a char.

Type promotions also occur when a unary operation, such as the unary –, takes place. For the unary operations, operands smaller than int (byte, sbyte, short, and ushort) are promoted to int. Also, a char operand is converted to int. Furthermore, if a uint value is negated, it is promoted to long.
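Here is a minimal sketch that shows the unary promotions at work by displaying both the value and the type of several negated operands. UnaryPromote and TypeOf are hypothetical names, and GetType( ) reports .NET type names (Int32 for int and Int64 for long).

```csharp
// Sketch: type promotion performed by the unary minus.

using System;

class UnaryPromote {
  // Returns the .NET name of the runtime type of a value.
  public static string TypeOf(object value) {
    return value.GetType().Name;
  }

  static void Main() {
    byte b = 5;
    char ch = 'A'; // character code 65
    uint u = 12;

    // byte is promoted to int.
    Console.WriteLine("-b:  " + (-b) + " " + TypeOf(-b));   // -5 Int32

    // char is converted to int.
    Console.WriteLine("-ch: " + (-ch) + " " + TypeOf(-ch)); // -65 Int32

    // uint is promoted to long, which can hold the negative result.
    Console.WriteLine("-u:  " + (-u) + " " + TypeOf(-u));   // -12 Int64
  }
}
```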

Using Casts in Expressions

A cast can be applied to a specific portion of a larger expression. This gives you fine-grained control over the way type conversions occur when an expression is evaluated. For example, consider the following program. It displays the square roots of the numbers from 1 to 10. It also displays the whole number portion and the fractional part of each result, separately. To do so, it uses a cast to convert the result of Math.Sqrt( ) to int.

// Using casts in an expression.

using System;

class CastExpr {
  static void Main() {
    double n;

    for(n = 1.0; n <= 10; n++) {
      Console.WriteLine("The square root of {0} is {1}",
                        n, Math.Sqrt(n));
      Console.WriteLine("Whole number part: {0}",
                        (int) Math.Sqrt(n));
      Console.WriteLine("Fractional part: {0}",
                        Math.Sqrt(n) - (int) Math.Sqrt(n));
      Console.WriteLine();
    }
  }
}

Here is the output from the program:

The square root of 1 is 1
Whole number part: 1
Fractional part: 0

The square root of 2 is 1.4142135623731
Whole number part: 1
Fractional part: 0.414213562373095

The square root of 3 is 1.73205080756888
Whole number part: 1
Fractional part: 0.732050807568877

The square root of 4 is 2
Whole number part: 2
Fractional part: 0

The square root of 5 is 2.23606797749979
Whole number part: 2
Fractional part: 0.23606797749979

The square root of 6 is 2.44948974278318
Whole number part: 2
Fractional part: 0.449489742783178

The square root of 7 is 2.64575131106459
Whole number part: 2
Fractional part: 0.645751311064591

The square root of 8 is 2.82842712474619
Whole number part: 2
Fractional part: 0.82842712474619

The square root of 9 is 3
Whole number part: 3
Fractional part: 0

The square root of 10 is 3.16227766016838
Whole number part: 3
Fractional part: 0.16227766016838

As the output shows, the cast of Math.Sqrt( ) to int results in the whole number component of the value. In this expression

Math.Sqrt(n) - (int) Math.Sqrt(n)

the cast to int obtains the whole number component, which is then subtracted from the complete value, yielding the fractional component. Thus, the outcome of the expression is double. Only the value of the second call to Math.Sqrt( ) is cast to int.