This chapter is a terse but comprehensive introduction to Java syntax. It is written primarily for readers who are new to the language but have some previous programming experience. Determined novices with no prior programming experience may also find it useful. If you already know Java, you should find it a useful language reference. The chapter includes some comparisons of Java to C and C++ for the benefit of programmers coming from those languages.
This chapter documents the syntax of Java programs by starting at the very lowest level of Java syntax and building from there, moving on to increasingly higher orders of structure. It covers:
The characters used to write Java programs and the encoding of those characters.
Literal values, identifiers, and other tokens that comprise a Java program.
The data types that Java can manipulate.
The operators used in Java to group individual tokens into larger expressions.
Statements, which group expressions and other statements to form logical chunks of Java code.
Methods, which are named collections of Java statements that can be invoked by other Java code.
Classes, which are collections of methods and fields. Classes are the central program element in Java and form the basis for object-oriented programming. Chapter 3 is devoted entirely to a discussion of classes and objects.
Packages, which are collections of related classes.
Java programs, which consist of one or more interacting classes that may be drawn from one or more packages.
The syntax of most programming languages is complex, and Java is no exception. In general, it is not possible to document all elements of a language without referring to other elements that have not yet been discussed. For example, it is not really possible to explain in a meaningful way the operators and statements supported by Java without referring to objects. But it is also not possible to document objects thoroughly without referring to the operators and statements of the language. The process of learning Java, or any language, is therefore an iterative one.
Before we begin our bottom-up exploration of Java syntax, let’s take a moment for a top-down overview of a Java program. Java programs consist of one or more files, or compilation units, of Java source code. Near the end of the chapter, we describe the structure of a Java file and explain how to compile and run a Java program. Each compilation unit begins with an optional package declaration followed by zero or more import declarations. These declarations specify the namespace within which the compilation unit will define names, and the namespaces from which the compilation unit imports names. We’ll see package and import again later in this chapter in “Packages and the Java Namespace”.
The optional package and import declarations are followed by zero or more reference type definitions. We will meet the full variety of possible reference types in Chapters 3 and 4, but for now, we should note that these are most often either class or interface definitions.
Within the definition of a reference type, we will encounter members such as fields, methods, and constructors. Methods are the most important kind of member. Methods are blocks of Java code composed of statements.
With these basic terms defined, let’s start by approaching a Java program from the bottom up by examining the basic units of syntax—often referred to as lexical tokens.
This section explains the lexical structure of a Java program. It starts with a discussion of the Unicode character set in which Java programs are written. It then covers the tokens that comprise a Java program, explaining comments, identifiers, reserved words, literals, and so on.
Java programs are written using Unicode. You can use Unicode characters anywhere in a Java program, including comments and identifiers such as variable names. Unlike the 7-bit ASCII character set, which is useful only for English, and the 8-bit ISO Latin-1 character set, which is useful only for major Western European languages, the Unicode character set can represent virtually every written language in common use on the planet.
If you do not use a Unicode-enabled text editor, or if you do not want to force other programmers who view or edit your code to use a Unicode-enabled editor, you can embed Unicode characters into your Java programs using the special Unicode escape sequence \uxxxx, that is, a backslash and a lowercase u, followed by four hexadecimal characters. For example, \u0020 is the space character, and \u03c0 is the character π.
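As a minimal sketch of the escape mechanism just described, the following class shows escapes used inside a string literal and a character literal. The class and method names are illustrative only and are not taken from this chapter.

```java
public class UnicodeEscapeDemo {
    // Returns the Greek letter pi, written with a Unicode escape (code point U+03C0).
    static String piFromEscape() {
        return "\u03c0";
    }

    // Returns the space character, written with a Unicode escape (code point U+0020).
    static char spaceFromEscape() {
        return '\u0020';
    }

    public static void main(String[] args) {
        // The escape denotes exactly the character with that code point.
        System.out.println(piFromEscape().codePointAt(0) == 0x03c0); // true
        System.out.println(spaceFromEscape() == ' ');                // true
    }
}
```

Note that the compiler translates Unicode escapes very early, even inside comments, so a malformed escape in a comment can still cause a compilation error.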
Java has invested a large amount of time and engineering effort in ensuring that its Unicode support is first class. If your business application needs to deal with global users, especially in non-Western markets, then the Java platform is a great choice. Java also has support for multiple encodings and character sets, in case applications need to interact with non-Java applications that do not speak Unicode.
Java is a case-sensitive language. Its keywords are written in lowercase and must always be used that way. That is, While and WHILE are not the same as the while keyword. Similarly, if you declare a variable named i in your program, you may not refer to it as I.
In general, relying on case sensitivity to distinguish identifiers is a terrible idea. Do not use it in your own code, and in particular never give an identifier the same name as a keyword but differently cased.
Java ignores spaces, tabs, newlines, and other whitespace, except when it appears within quoted characters and string literals. Programmers typically use whitespace to format and indent their code for easy readability, and you will see common indentation conventions in this book’s code examples.
Comments are natural-language text intended for human readers of a program. They are ignored by the Java compiler. Java supports three types of comments. The first type is a single-line comment, which begins with the characters // and continues until the end of the current line. For example:
int i = 0;  // Initialize the loop variable
The second kind of comment is a multiline comment. It begins with the characters /* and continues, over any number of lines, until the characters */. Any text between the /* and the */ is ignored by javac. Although this style of comment is typically used for multiline comments, it can also be used for single-line comments. This type of comment cannot be nested (i.e., one /* */ comment cannot appear within another). When writing multiline comments, programmers often use extra * characters to make the comments stand out. Here is a typical multiline comment:
/*
* First, establish a connection to the server.
* If the connection attempt fails, quit right away.
*/
The third type of comment is a special case of the second. If a comment begins with /**, it is regarded as a special doc comment. Like regular multiline comments, doc comments end with */ and cannot be nested. When you write a Java class you expect other programmers to use, provide doc comments to embed documentation about the class and each of its methods directly into the source code. A program named javadoc extracts these comments and processes them to create online documentation for your class. A doc comment can contain HTML tags and can use additional syntax understood by javadoc. For example:
/**
* Upload a file to a web server.
*
* @param file The file to upload.
* @return <tt>true</tt> on success,
* <tt>false</tt> on failure.
* @author David Flanagan
*/
See Chapter 7 for more information on the doc comment syntax and Chapter 13 for more information on the javadoc program.
Comments may appear between any tokens of a Java program, but may not appear within a token. In particular, comments may not appear within double-quoted string literals. A comment within a string literal simply becomes a literal part of that string.
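The rule about comments and tokens can be illustrated with a short sketch. The class and method names below are hypothetical and exist only for this example.

```java
public class CommentInString {
    // The comment delimiters inside the quotes are ordinary characters of the string.
    static String notAComment() {
        return "a /* still part of the string */ b";
    }

    // A comment placed between two tokens is ignored: this is simply 1 + 2.
    static int commentBetweenTokens() {
        return 1 + /* ignored by the compiler */ 2;
    }

    public static void main(String[] args) {
        System.out.println(notAComment().contains("/*")); // true
        System.out.println(commentBetweenTokens());       // 3
    }
}
```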
The following words are reserved in Java (they are part of the syntax of the language and may not be used to name variables, classes, and so forth):
abstract const final int public throw assert continue finally interface return throws boolean default float long short transient break do for native static true byte double goto new strictfp try case else if null super void catch enum implements package switch volatile char extends import private synchronized while class false instanceof protected this
Of these, true, false, and null are technically literals.
The sequence var is not a keyword, but instead indicates that the type of a local variable should be type-inferred.
The character sequence consisting of a single underscore, _, is also disallowed as an identifier.
There are also 10 restricted keywords, which are considered keywords only within the context of declaring a Java platform module.
We’ll meet each of these reserved words again later in this book. Some of them are the names of primitive types and others are the names of Java statements, both of which are discussed later in this chapter. Still others are used to define classes and their members (see Chapter 3).
Note that const and goto are reserved but aren’t actually used in the language, and that interface has an additional variant form, @interface, which is used when defining types known as annotations. Some of the reserved words (notably final and default) have a variety of meanings depending on context.
An identifier is simply a name given to some part of a Java program, such as a class, a method within a class, or a variable declared within a method. Identifiers may be of any length and may contain letters and digits drawn from the entire Unicode character set. An identifier may not begin with a digit.
In general, identifiers may not contain punctuation characters. Exceptions include the dollar sign ($) as well as other Unicode currency symbols such as £ and ¥.
The ASCII underscore (_) also deserves special mention. Originally, the underscore could be freely used as an identifier, or part of one. However, in recent versions of Java, including Java 11, the underscore may not be used as an identifier.
The underscore character can still appear in a Java identifier, but it is no longer legal as a complete identifier by itself. This is to support an expected forthcoming language feature whereby the underscore will acquire a special new syntactic meaning.
Currency symbols are intended for use in automatically generated source code, such as code produced by javac. By avoiding the use of currency symbols in your own identifiers, you don’t have to worry about collisions with automatically generated identifiers.
The usual Java convention is to name variables using camel case. This means that the first letter of a variable name should be lowercase, but the first letter of every subsequent word in the identifier should be uppercase.
Formally, the characters allowed at the beginning of and within an identifier are defined by the methods isJavaIdentifierStart() and isJavaIdentifierPart() of the class java.lang.Character.
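The two Character methods named above can be combined into a simple character-level check. This is only a sketch under stated assumptions: the class and method names are invented for illustration, and the check deliberately does not reject reserved words or the lone underscore, both of which the compiler also disallows.

```java
public class IdentifierCheck {
    // Returns true if every character of s is permitted at its position in an
    // identifier. Reserved words and the single underscore are NOT rejected here.
    static boolean looksLikeIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Step by code point so that supplementary characters are handled correctly.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIdentifier("theCurrentTime")); // true
        System.out.println(looksLikeIdentifier("x1"));             // true
        System.out.println(looksLikeIdentifier("1x"));             // false: starts with a digit
        System.out.println(looksLikeIdentifier("a-b"));            // false: punctuation
    }
}
```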
The following are examples of legal identifiers:
i
x1
theCurrentTime
current
獺
Note in particular the example of a non-ASCII identifier, 獺. This is the Kanji character for “otter” and is perfectly legal as a Java identifier. The usage of non-ASCII identifiers is unusual in programs predominantly written by Westerners, but is sometimes seen.
Literals are sequences of source characters that directly represent constant values that appear as-is in Java source code. They include integer and floating-point numbers, single characters within single quotes, strings of characters within double quotes, and the reserved words true, false, and null. For example, the following are all literals:
1
1.0
'1'
1L
"one"
true
false
null
The syntax for expressing numeric, character, and string literals is detailed in “Primitive Data Types”.
Java also uses a number of punctuation characters as tokens. The Java Language Specification divides these characters (somewhat arbitrarily) into two categories, separators and operators. The 12 separators are:
( ) { } [ ] ... @ :: ; , .
The operators are:
+ - * / % & | ^ << >> >>> += -= *= /= %= &= |= ^= <<= >>= >>>= = == != < <= > >= ! ~ && || ++ -- ? : ->
We’ll see separators throughout the book, and will cover each operator individually in “Expressions and Operators”.
Java supports eight basic data types known as primitive types as described in Table 2-1. The primitive types include a Boolean type, a character type, four integer types, and two floating-point types. The four integer types and the two floating-point types differ in the number of bits that represent them and therefore in the range of numbers they can represent.
Type | Contains | Default | Size | Range |
---|---|---|---|---|
boolean | true or false | false | 1 bit | NA |
char | Unicode character | \u0000 | 16 bits | \u0000 to \uFFFF |
byte | Signed integer | 0 | 8 bits | –128 to 127 |
short | Signed integer | 0 | 16 bits | –32768 to 32767 |
int | Signed integer | 0 | 32 bits | –2147483648 to 2147483647 |
long | Signed integer | 0 | 64 bits | –9223372036854775808 to 9223372036854775807 |
float | IEEE 754 floating point | 0.0 | 32 bits | 1.4E–45 to 3.4028235E+38 |
double | IEEE 754 floating point | 0.0 | 64 bits | 4.9E–324 to 1.7976931348623157E+308 |
The next section summarizes these primitive data types. In addition to these primitive types, Java supports nonprimitive data types known as reference types, which are introduced in “Reference Types”.
The boolean type represents truth values. This type has only two possible values, representing the two Boolean states: on or off, yes or no, true or false. Java reserves the words true and false to represent these two Boolean values.
Programmers coming to Java from other languages (especially JavaScript and C) should note that Java is much stricter about its Boolean values than other languages; in particular, a boolean is neither an integral nor an object type, and incompatible values cannot be used in place of a boolean. In other words, you cannot take shortcuts such as the following in Java:
Object o = new Object();
int i = 1;

if (o) {          // Does not compile: o is not a boolean
  while (i) {     // Does not compile: i is not a boolean
    //...
  }
}
Instead, Java forces you to write cleaner code by explicitly stating the comparisons you want:
if (o != null) {
  while (i != 0) {
    // ...
  }
}
The char type represents Unicode characters. Java takes a somewhat unusual approach to representing characters: javac accepts identifiers and literals as UTF-8 (a variable-width encoding) in input. Internally, however, a char is a fixed-width 16-bit value. In Java 9 and later, a string whose characters all fit in ISO-8859-1 (an 8-bit encoding, used for Western European languages, also called Latin-1) may be stored internally using one byte per character.
This distinction between external and internal representation does not normally need to concern the developer. In most cases, all that is required is to remember the rule that to include a character literal in a Java program, simply place it between single quotes (apostrophes):
char c = 'A';
You can, of course, use any Unicode character as a character literal, and you can use the \u Unicode escape sequence. In addition, Java supports a number of other escape sequences that make it easy both to represent commonly used nonprinting ASCII characters, such as newline, and to escape certain punctuation characters that have special meaning in Java. For example:
char tab = '\t', nul = '\000', aleph = '\u05D0', slash = '\\';
Table 2-2 lists the escape characters that can be used in char literals. These characters can also be used in string literals, which are covered in the next section.
Escape sequence | Character value |
---|---|
\b | Backspace |
\t | Horizontal tab |
\n | Newline |
\f | Form feed |
\r | Carriage return |
\" | Double quote |
\' | Single quote |
\\ | Backslash |
\xxx | The Latin-1 character with the encoding xxx, where xxx is an octal (base 8) number between 000 and 377 |
\uxxxx | The Unicode character with encoding xxxx, where xxxx is four hexadecimal digits |
char values can be converted to and from the various integral types, and the char data type is a 16-bit integral type. Unlike byte, short, int, and long, however, char is an unsigned type. The Character class defines a number of useful static methods for working with characters, including isDigit(), isJavaLetter(), isLowerCase(), and toUpperCase().
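A brief sketch of char behaving as an integral type, together with a few of the Character methods mentioned above. The class name and the helper methods codeOf() and next() are illustrative assumptions, not part of any library.

```java
public class CharAsNumber {
    // A char widens to an int automatically; the result is its numeric encoding.
    static int codeOf(char c) {
        return c;
    }

    // char + int yields an int, so a cast is needed to get back to char.
    static char next(char c) {
        return (char) (c + 1);
    }

    public static void main(String[] args) {
        System.out.println(codeOf('A'));                 // 65
        System.out.println(next('A'));                   // B
        System.out.println(Character.isDigit('7'));      // true
        System.out.println(Character.isLowerCase('q'));  // true
        System.out.println(Character.toUpperCase('q'));  // Q
    }
}
```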
The Java language and its char type were designed with Unicode in mind. The Unicode standard is evolving, however, and each new version of Java adopts a new version of Unicode. Java 7 uses Unicode 6.0 and Java 8 uses Unicode 6.2.
Recent releases of Unicode include characters whose encodings, or codepoints, do not fit in 16 bits. These supplementary characters, which are mostly infrequently used Han (Chinese) ideographs, require up to 21 bits and cannot be represented in a single char value. Instead, you must use an int value to hold the codepoint of a supplementary character, or you must encode it into a so-called “surrogate pair” of two char values.
Unless you commonly write programs that use Asian languages, you are unlikely to encounter any supplementary characters. If you do anticipate having to process characters that do not fit into a char, methods have been added to the Character, String, and related classes for working with text using int codepoints.
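To make the surrogate-pair idea concrete, here is a small sketch using codepoint-aware methods of Character and String. The codepoint 0x20000 is chosen here as an example of a supplementary Han ideograph, and the class and method names are hypothetical.

```java
public class SupplementaryDemo {
    // An example supplementary codepoint (assumption: chosen only for illustration).
    static final int CODE_POINT = 0x20000;

    // Builds a String holding the single supplementary character.
    static String asString() {
        return new String(Character.toChars(CODE_POINT));
    }

    public static void main(String[] args) {
        String s = asString();
        System.out.println(s.length());                       // 2: two char values
        System.out.println(s.codePointCount(0, s.length()));  // 1: one character
        System.out.println(Character.isSurrogatePair(s.charAt(0), s.charAt(1))); // true
        System.out.println(s.codePointAt(0) == CODE_POINT);   // true
    }
}
```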
In addition to the char type, Java also has a data type for working with strings of text (usually simply called strings). The String type is a class, however, and is not one of the primitive types of the language. Because strings are so commonly used, though, Java does have a syntax for including string values literally in a program. A String literal consists of arbitrary text within double quotes (as opposed to the single quotes for char literals). For example:
"Hello World"
"'This' is a string!"
String literals can contain any of the escape sequences that can appear as char literals (see Table 2-2). Use the \" sequence to include a double quote within a String literal.
Because String is a reference type, string literals are described in more detail later in this chapter in “Object Literals”. Chapter 9 contains more details on some of the ways you can work with String objects in Java.
The integer types in Java are byte, short, int, and long. As shown in Table 2-1, these four types differ only in the number of bits and, therefore, in the range of numbers each type can represent. All integral types represent signed numbers; there is no unsigned keyword as there is in C and C++.
Literals for each of these types are written exactly as you would expect: as a sequence of decimal digits, optionally preceded by a minus sign. Here are some legal integer literals:
0
1
123
-42000
Integer literals are 32-bit values (and so are taken to be the Java type int) unless they end with the character L or l, in which case they are 64-bit values (and are understood to be the Java type long):
1234        // An int value
1234L       // A long value
0xffL       // Another long value
Integer literals can also be expressed in hexadecimal, binary, or octal notation. A literal that begins with 0x or 0X is taken as a hexadecimal number, using the letters A to F (or a to f) as the additional digits required for base-16 numbers.
Integer binary literals start with 0b and may, of course, only feature the digits 1 or 0. As binary literals can be very long, underscores are often used as part of a binary literal. The underscore character is ignored whenever it is encountered in any numerical literal; it’s allowed purely to help with readability of literals.
Java also supports octal (base-8) integer literals. These literals begin with a leading 0 and cannot include the digits 8 or 9. They are not often used and should be avoided unless needed. Legal hexadecimal, binary, and octal literals include:
0xff          // Decimal 255, expressed in hexadecimal
0377          // The same number, expressed in octal (base 8)
0b0010_1111   // Decimal 47, expressed in binary
0xCAFEBABE    // A magic number used to identify Java class files
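The values claimed in the comments above can be checked with a short sketch. The class and method names are illustrative; note in particular that 0xCAFEBABE, read as a 32-bit int, is a negative number because its top bit is set.

```java
public class LiteralBases {
    static int hex()    { return 0xff; }          // hexadecimal
    static int octal()  { return 0377; }          // octal: leading zero
    static int binary() { return 0b0010_1111; }   // binary, underscore for readability

    // As an int, the high bit is set, so the value is negative.
    static int magicAsInt()   { return 0xCAFEBABE; }

    // With the L suffix the same digits form a positive long.
    static long magicAsLong() { return 0xCAFEBABEL; }

    public static void main(String[] args) {
        System.out.println(hex());          // 255
        System.out.println(octal());        // 255
        System.out.println(binary());       // 47
        System.out.println(magicAsInt());   // -889275714
        System.out.println(magicAsLong());  // 3405691582
    }
}
```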
Integer arithmetic in Java never signals an overflow or an underflow when you exceed the range of a given integer type. Instead, numbers just wrap around. For example, let’s look at an overflow:
byte b1 = 127, b2 = 1;        // Largest byte is 127
byte sum = (byte)(b1 + b2);   // Sum wraps to -128, the smallest byte
and the corresponding underflow behavior:
byte b3 = -128, b4 = 5;        // Smallest byte is -128
byte sum2 = (byte)(b3 - b4);   // Difference wraps to a large byte value, 123
Neither the Java compiler nor the Java interpreter warns you in any way when this occurs. When doing integer arithmetic, you simply must ensure that the type you are using has a sufficient range for the purposes you intend. Integer division by zero and modulo by zero are illegal and cause an ArithmeticException to be thrown.
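The following sketch contrasts silent wraparound with the two situations in which integer arithmetic does throw. It uses Math.addExact(), a standard library method (available since Java 8) that this chapter does not otherwise discuss, as one possible way to detect overflow explicitly; the class and helper names are hypothetical.

```java
public class OverflowDemo {
    // Ordinary addition wraps around silently on overflow.
    static int wrappingAdd(int a, int b) {
        return a + b;
    }

    // Math.addExact throws ArithmeticException instead of wrapping.
    static boolean exactAddOverflows(int a, int b) {
        try {
            Math.addExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    // Integer division by zero throws ArithmeticException.
    static boolean divisionThrows(int numerator, int divisor) {
        try {
            int ignored = numerator / divisor;
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(wrappingAdd(Integer.MAX_VALUE, 1) == Integer.MIN_VALUE); // true
        System.out.println(exactAddOverflows(Integer.MAX_VALUE, 1));                // true
        System.out.println(divisionThrows(1, 0));                                   // true
    }
}
```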
Each integer type has a corresponding wrapper class: Byte, Short, Integer, and Long. Each of these classes defines MIN_VALUE and MAX_VALUE constants that describe the range of the type. The classes also define useful static methods, such as Byte.parseByte() and Integer.parseInt(), for converting strings to integer values.
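A minimal sketch of the wrapper-class constants and parsing methods just mentioned. The class name and the fitsInByte() helper are assumptions made for this example; the parse methods throw NumberFormatException when the text is not a number in the type's range.

```java
public class WrapperDemo {
    // Converts decimal text to an int.
    static int parse(String s) {
        return Integer.parseInt(s);
    }

    // Returns true if the text parses as a value within the range of byte.
    static boolean fitsInByte(String s) {
        try {
            Byte.parseByte(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(Byte.MIN_VALUE + " to " + Byte.MAX_VALUE); // -128 to 127
        System.out.println(parse("42") + 1);                          // 43
        System.out.println(fitsInByte("13"));                         // true
        System.out.println(fitsInByte("200"));                        // false: out of range
    }
}
```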
Real numbers in Java are represented by the float and double data types. As shown in Table 2-1, float is a 32-bit, single-precision floating-point value, and double is a 64-bit, double-precision floating-point value. Both types adhere to the IEEE 754-1985 standard, which specifies both the format of the numbers and the behavior of arithmetic for the numbers.
Floating-point values can be included literally in a Java program as an optional string of digits, followed by a decimal point and another string of digits. Here are some examples:
123.45
0.0
.01
Floating-point literals can also use exponential, or scientific, notation, in which a number is followed by the letter e or E (for exponent) and another number. This second number represents the power of 10 by which the first number is multiplied. For example:
1.2345E02   // 1.2345 * 10^2 or 123.45
1e-6        // 1 * 10^-6 or 0.000001
6.02e23     // Avogadro's Number: 6.02 * 10^23
Floating-point literals are double values by default. To include a float value literally in a program, follow the number with f or F:
double d = 6.02E23;
float f = 6.02e23f;
Floating-point literals cannot be expressed in binary or octal notation. Java does accept hexadecimal floating-point literals, such as 0x1.8p1 (the value 3.0), but they are rarely used.
In addition to representing ordinary numbers, the float and double types can also represent four special values: positive and negative infinity, zero, and NaN. The infinity values result when a floating-point computation produces a value that overflows the representable range of a float or double.
When a floating-point computation underflows the representable range of a float or a double, a zero value results.
We can imagine repeatedly dividing the double value 1.0 by 2.0 (e.g., in a while loop). In mathematics, no matter how often we perform the division, the result will never become equal to zero. However, in a floating-point representation, after enough divisions, the result will eventually be so small as to be indistinguishable from zero.
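The thought experiment above can be run directly. This is a sketch only, with a hypothetical class and method name; it counts how many times 1.0 can be halved before the result underflows to zero.

```java
public class UnderflowDemo {
    // Halves 1.0 until the result is indistinguishable from zero and
    // returns how many divisions that took.
    static int halvingsUntilZero() {
        double x = 1.0;
        int count = 0;
        while (x != 0.0) {
            x /= 2.0;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("Divisions before reaching zero: " + halvingsUntilZero());
        // Halving the smallest positive double underflows to zero.
        System.out.println(Double.MIN_VALUE / 2.0 == 0.0); // true
    }
}
```

The loop terminates after a little over a thousand iterations, because a double has only a finite range of exponents available.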
The Java floating-point types make a distinction between positive zero and negative zero, depending on the direction from which the underflow occurred. In practice, positive and negative zero behave pretty much the same. Finally, the last special floating-point value is NaN, which stands for “Not-a-Number.” The NaN value results when an illegal floating-point operation, such as 0.0/0.0, is performed. Here are examples of statements that result in these special values:
double inf = 1.0/0.0;       // Infinity
double neginf = -1.0/0.0;   // Negative infinity
double negzero = -1.0/inf;  // Negative zero
double NaN = 0.0/0.0;       // Not a Number
The float and double primitive types have corresponding classes, named Float and Double. Each of these classes defines the following useful constants: MIN_VALUE, MAX_VALUE, NEGATIVE_INFINITY, POSITIVE_INFINITY, and NaN.
Java floating-point types can handle overflow to infinity and underflow to zero and have a special NaN value. This means floating-point arithmetic never throws exceptions, even when performing illegal operations, like dividing zero by zero or taking the square root of a negative number.
The infinite floating-point values behave as you would expect. Adding or subtracting any finite value to or from infinity, for example, yields infinity. Negative zero behaves almost identically to positive zero, and, in fact, the == equality operator reports that negative zero is equal to positive zero. One way to distinguish negative zero from positive, or regular, zero is to divide by it: 1.0/0.0 yields positive infinity, but 1.0 divided by negative zero yields negative infinity. Finally, because NaN is Not a Number, the == operator says that it is not equal to any other number, including itself!
double NaN = 0.0/0.0;   // Not a Number
NaN == NaN;             // false
Double.isNaN(NaN);      // true
To check whether a float or double value is NaN, you must use the Float.isNaN() and Double.isNaN() methods.
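The behavior of the special values can be sketched as follows. The helper isNegativeZero() is a hypothetical method written for this example; it applies the divide-by-it trick described above.

```java
public class SpecialValues {
    // Distinguishes -0.0 from 0.0 by dividing by it, as == cannot tell them apart.
    static boolean isNegativeZero(double d) {
        return d == 0.0 && 1.0 / d == Double.NEGATIVE_INFINITY;
    }

    public static void main(String[] args) {
        System.out.println(0.0 == -0.0);                    // true
        System.out.println(isNegativeZero(-0.0));           // true
        System.out.println(isNegativeZero(0.0));            // false
        System.out.println(Double.isNaN(Math.sqrt(-1.0)));  // true: no exception is thrown
        System.out.println(Double.POSITIVE_INFINITY + 1.0
                == Double.POSITIVE_INFINITY);               // true
    }
}
```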
Java allows conversions between integer values and floating-point values. In addition, because every character corresponds to a number in the Unicode encoding, char values can be converted to and from the integer and floating-point types. In fact, boolean is the only primitive type that cannot be converted to or from another primitive type in Java.
There are two basic types of conversions. A widening conversion occurs when a value of one type is converted to a wider type, that is, one that has a larger range of legal values. For example, Java performs widening conversions automatically when you assign an int literal to a double variable or a char literal to an int variable.
Narrowing conversions are another matter, however. A narrowing conversion occurs when a value is converted to a type that is not wider than it is. Narrowing conversions are not always safe: it is reasonable to convert the integer value 13 to a byte, for example, but it is not reasonable to convert 13,000 to a byte, because byte can hold only numbers between –128 and 127. Because you can lose data in a narrowing conversion, the Java compiler complains when you attempt any narrowing conversion, even if the value being converted would in fact fit in the narrower range of the specified type:
int i = 13;
// byte b = i;    // Incompatible types: possible lossy conversion
                  // from int to byte
The one exception to this rule is that you can assign an integer literal (an int value) to a byte or short variable if the literal falls within the range of the variable.
byte b = 13;
If you need to perform a narrowing conversion and are confident you can do so without losing data or precision, you can force Java to perform the conversion using a language construct known as a cast. Perform a cast by placing the name of the desired type in parentheses before the value to be converted. For example:
int i = 13;
byte b = (byte) i;    // Force the int to be converted to a byte
i = (int) 13.456;     // Force this double literal to the int 13
Casts of primitive types are most often used to convert floating-point values to integers. When you do this, the fractional part of the floating-point value is simply truncated (i.e., the floating-point value is rounded toward zero, not toward the nearest integer). The static methods Math.round(), Math.floor(), and Math.ceil() perform other types of rounding.
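The difference between a truncating cast and the Math rounding methods is easiest to see with a negative value. The sketch below uses a hypothetical class and helper name.

```java
public class RoundingDemo {
    // A cast to int discards the fractional part (rounds toward zero).
    static int truncate(double d) {
        return (int) d;
    }

    public static void main(String[] args) {
        double d = -3.7;
        System.out.println(truncate(d));     // -3   (toward zero)
        System.out.println(Math.round(d));   // -4   (nearest integer, as a long)
        System.out.println(Math.floor(d));   // -4.0 (toward negative infinity)
        System.out.println(Math.ceil(d));    // -3.0 (toward positive infinity)
        System.out.println(truncate(3.7));   // 3
    }
}
```

Note that Math.floor() and Math.ceil() return double values, while Math.round() applied to a double returns a long.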
The char type acts like an integer type in most ways, so a char value can be used anywhere an int or long value is required. Recall, however, that the char type is unsigned, so it behaves differently than the short type, even though both are 16 bits wide:
short s = (short) 0xffff;   // These bits represent the number -1
char c = '\uffff';          // The same bits, as a Unicode character
int i1 = s;                 // Converting the short to an int yields -1
int i2 = c;                 // Converting the char to an int yields 65535
Table 2-3 shows which primitive types can be converted to which other types and how the conversion is performed. The letter N in the table means that the conversion cannot be performed. The letter Y means that the conversion is a widening conversion and is therefore performed automatically and implicitly by Java. The letter C means that the conversion is a narrowing conversion and requires an explicit cast.
Finally, the notation Y* means that the conversion is an automatic widening conversion, but that some of the least significant digits of the value may be lost in the conversion. This can happen when you are converting an int or long to a floating-point type (see the table for details). The floating-point types have a larger range than the integer types, so any int or long can be represented by a float or double. However, the floating-point types are approximations of numbers and cannot always hold as many significant digits as the integer types (see Chapter 9 for some more detail about floating-point numbers).
Convert from (rows) / to (columns) | boolean | byte | short | char | int | long | float | double |
---|---|---|---|---|---|---|---|---|
boolean | - | N | N | N | N | N | N | N |
byte | N | - | Y | C | Y | Y | Y | Y |
short | N | C | - | C | Y | Y | Y | Y |
char | N | C | C | - | Y | Y | Y | Y |
int | N | C | C | C | - | Y | Y* | Y |
long | N | C | C | C | C | - | Y* | Y* |
float | N | C | C | C | C | C | - | Y |
double | N | C | C | C | C | C | C | - |
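The Y* entries in the table can be demonstrated with a round trip through a floating-point type. This is a sketch with invented class and method names; the specific input values are chosen only because they need more significant bits than the target type can hold.

```java
public class PrecisionLossDemo {
    // int -> float is an automatic widening conversion, but may lose low-order digits.
    static int throughFloat(int i) {
        float f = i;        // no cast required
        return (int) f;     // narrowing back requires a cast
    }

    // long -> double is also automatic and may also lose low-order digits.
    static long throughDouble(long l) {
        double d = l;
        return (long) d;
    }

    public static void main(String[] args) {
        System.out.println(throughFloat(13));            // 13: small values survive
        System.out.println(throughFloat(123456789));     // 123456792: low digits changed
        long big = (1L << 53) + 1;
        System.out.println(throughDouble(big) == big);   // false
    }
}
```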
So far in this chapter, we’ve learned about the primitive types that Java programs can manipulate and seen how to include primitive values as literals in a Java program. We’ve also used variables as symbolic names that represent, or hold, values. These literals and variables are the tokens out of which Java programs are built.
An expression is the next higher level of structure in a Java program. The Java interpreter evaluates an expression to compute its value. The very simplest expressions are called primary expressions and consist of literals and variables. So, for example, the following are all expressions:
1.7         // A floating-point literal
true        // A Boolean literal
sum         // A variable
When the Java interpreter evaluates a literal expression, the resulting value is the literal itself. When the interpreter evaluates a variable expression, the resulting value is the value stored in the variable.
Primary expressions are not very interesting. More complex expressions are made by using operators to combine primary expressions. For example, the following expression uses the assignment operator to combine two primary expressions—a variable and a floating-point literal—into an assignment expression:
sum = 1.7
But operators are used not only with primary expressions; they can also be used with expressions at any level of complexity. The following are all legal expressions:
sum = 1 + 2 + 3 * 1.2 + (4 + 8)/3.0
sum/Math.sqrt(3.0 * 1.234)
(int)(sum + 33)
The kinds of expressions you can write in a programming language depend entirely on the set of operators available to you. Java has a wealth of operators, but to work effectively with them, you must understand two important concepts: precedence and associativity. These concepts—and the operators themselves—are explained in more detail in the following sections.
The P column of Table 2-4 specifies the precedence of each operator. Precedence specifies the order in which operations are performed. Operations that have higher precedence are performed before those with lower precedence. For example, consider this expression:
a + b * c
The multiplication operator has higher precedence than the addition operator, so a is added to the product of b and c, just as we expect from elementary mathematics. Operator precedence can be thought of as a measure of how tightly operators bind to their operands. The higher the number, the more tightly they bind.
Default operator precedence can be overridden through the use of parentheses that explicitly specify the order of operations. The previous expression can be rewritten to specify that the addition should be performed before the multiplication:
(a + b) * c
The default operator precedence in Java was chosen for compatibility with C; the designers of C chose this precedence so that most expressions can be written naturally without parentheses. There are only a few common Java idioms for which parentheses are required. Examples include:
// Class cast combined with member access
((Integer) o).intValue();
// Assignment combined with comparison
while ((line = in.readLine()) != null) { ... }
// Bitwise operators combined with comparison
if ((flags & (PUBLIC | PROTECTED)) != 0) { ... }
Associativity is a property of operators that defines how to evaluate expressions that would otherwise be ambiguous. This is particularly important when an expression involves several operators that have the same precedence.
Most operators are left-to-right associative, which means that the operations are performed from left to right. The assignment and unary operators, however, have right-to-left associativity. The A column of Table 2-4 specifies the associativity of each operator or group of operators. The value L means left to right, and R means right to left.
The additive operators are all left-to-right associative, so the expression a+b-c is evaluated from left to right: (a+b)-c. Unary operators and assignment operators are evaluated from right to left. Consider this complex expression:
a = b += c = -~d
This is evaluated as follows:
a = (b += (c = -(~d)))
As with operator precedence, operator associativity establishes a default order of evaluation for an expression. This default order can be overridden through the use of parentheses. However, the default operator associativity in Java has been chosen to yield a natural expression syntax, and you should rarely need to alter it.
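The following sketch illustrates both directions of associativity. The class and method names are hypothetical; the chained assignment is the same expression discussed above, wrapped in a method so that its results can be inspected.

```java
final class AssociativityDemo {
    // Left-to-right: a - b - c groups as (a - b) - c.
    static int subtractChain(int a, int b, int c) {
        return a - b - c;
    }

    // Right-to-left: evaluated as a = (b += (c = -(~d))).
    // Returns the final values of a, b, and c, in that order.
    static int[] assignChain(int b, int d) {
        int a;
        int c;
        a = b += c = -~d;
        return new int[] { a, b, c };
    }

    public static void main(String[] args) {
        System.out.println(subtractChain(10, 4, 3));   // (10 - 4) - 3 = 3
        int[] r = assignChain(10, 5);                  // ~5 is -6, so c = 6, b = 16, a = 16
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```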
Table 2-4 summarizes the operators available in Java. The P and A columns of the table specify the precedence and associativity of each group of related operators, respectively. You should use this table as a quick reference for operators (especially their precedence) when required.
| P | A | Operator | Operand type(s) | Operation performed |
|---|---|---|---|---|
| 16 | L | . | object, member | Object member access |
| | | [ ] | array, int | Array element access |
| | | ( args ) | method, arglist | Method invocation |
| | | ++, -- | variable | Post-increment, post-decrement |
| 15 | R | ++, -- | variable | Pre-increment, pre-decrement |
| | | +, - | number | Unary plus, unary minus |
| | | ~ | integer | Bitwise complement |
| | | ! | boolean | Boolean NOT |
| 14 | R | new | class, arglist | Object creation |
| | | ( type ) | type, any | Cast (type conversion) |
| 13 | L | *, /, % | number, number | Multiplication, division, remainder |
| 12 | L | +, - | number, number | Addition, subtraction |
| | | + | string, any | String concatenation |
| 11 | L | << | integer, integer | Left shift |
| | | >> | integer, integer | Right shift with sign extension |
| | | >>> | integer, integer | Right shift with zero extension |
| 10 | L | <, <= | number, number | Less than, less than or equal |
| | | >, >= | number, number | Greater than, greater than or equal |
| | | instanceof | reference, type | Type comparison |
| 9 | L | == | primitive, primitive | Equal (have identical values) |
| | | != | primitive, primitive | Not equal (have different values) |
| | | == | reference, reference | Equal (refer to same object) |
| | | != | reference, reference | Not equal (refer to different objects) |
| 8 | L | & | integer, integer | Bitwise AND |
| | | & | boolean, boolean | Boolean AND |
| 7 | L | ^ | integer, integer | Bitwise XOR |
| | | ^ | boolean, boolean | Boolean XOR |
| 6 | L | \| | integer, integer | Bitwise OR |
| | | \| | boolean, boolean | Boolean OR |
| 5 | L | && | boolean, boolean | Conditional AND |
| 4 | L | \|\| | boolean, boolean | Conditional OR |
| 3 | R | ? : | boolean, any | Conditional (ternary) operator |
| 2 | R | = | variable, any | Assignment |
| | | *=, /=, %=, +=, -=, <<=, >>=, >>>=, &=, ^=, \|= | variable, any | Assignment with operation |
| 1 | R | -> | arglist, method body | Lambda expression |
The fourth column of Table 2-4 specifies the number and type of the operands expected by each operator. Some operators operate on only one operand; these are called unary operators. For example, the unary minus operator changes the sign of a single number:
-n    // The unary minus operator
Most operators, however, are binary operators that operate on two operand values. The - operator actually comes in both forms:
a - b    // The subtraction operator is a binary operator
Java also defines one ternary operator, often called the conditional operator. It is like an if statement inside an expression. Its three operands are separated by a question mark and a colon; the second and third operands must be convertible to the same type:
x > y ? x : y    // Ternary expression; evaluates to larger of x and y
In addition to expecting a certain number of operands, each operator also expects particular types of operands. The fourth column of the table lists the operand types. Some of the codes used in that column require further explanation:
number
An integer, floating-point value, or character (i.e., any primitive type except boolean). Auto-unboxing (see “Boxing and Unboxing Conversions”) means that the wrapper classes (such as Character, Integer, and Double) for these types can be used in this context as well.
integer
A byte, short, int, long, or char value (long values are not allowed for the array access operator [ ]). With auto-unboxing, Byte, Short, Integer, Long, and Character values are also allowed.
reference
An object or array.
variable
A variable or anything else, such as an array element, to which a value can be assigned.
Just as every operator expects its operands to be of specific types, each operator produces a value of a specific type. The arithmetic operators return a double if at least one of the operands is a double. They return a float if at least one of the operands is a float. They return a long if at least one of the operands is a long. Otherwise, they return an int, even if both operands are byte, short, or char types that are narrower than int. The bitwise operators apply the same long-or-int rule to their integer operands, the shift operators consider only the type of their left operand, and the increment and decrement operators produce a value of the same type as their operand.
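The result-type rules for arithmetic can be observed by boxing the result of an expression and asking for its wrapper class. This is only a sketch: the class name and the typeOf helper are hypothetical, and the helper relies on autoboxing to reveal the static type of each expression.

```java
final class PromotionDemo {
    // Hypothetical helper: boxing reveals the type the expression evaluated to.
    static String typeOf(Object boxed) {
        return boxed.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        byte b1 = 10;
        byte b2 = 20;
        // byte sum = b1 + b2;            // Does not compile: b1 + b2 is an int
        byte narrowed = (byte) (b1 + b2); // An explicit cast is required
        System.out.println(narrowed);

        System.out.println(typeOf(b1 + b2));    // Integer
        System.out.println(typeOf('a' + 1));    // Integer
        System.out.println(typeOf(b1 + 1L));    // Long
        System.out.println(typeOf(b1 + 1.0f));  // Float
        System.out.println(typeOf(1L + 2.0));   // Double
    }
}
```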
The comparison, equality, and Boolean operators always return boolean values. Each assignment operator returns whatever value it assigned, which is of a type compatible with the variable on the left side of the expression. The conditional operator returns the value of its second or third operand (which must be convertible to the same type).
Every operator computes a value based on one or more operand values. Some operators, however, have side effects in addition to their basic evaluation. If an expression contains side effects, evaluating it changes the state of a Java program in such a way that evaluating the expression again may yield a different result.
For example, the ++ increment operator has the side effect of incrementing a variable. The expression ++a increments the variable a and returns the newly incremented value. If this expression is evaluated again, the value will be different. The various assignment operators also have side effects. For example, the expression a*=2 can also be written as a=a*2. The value of the expression is the value of a multiplied by 2, but the expression has the side effect of storing that value back into a.
The method invocation operator () has side effects if the invoked method has side effects. Some methods, such as Math.sqrt(), simply compute and return a value without side effects of any kind. Typically, however, methods do have side effects. Finally, the new operator has the profound side effect of creating a new object.
When the Java interpreter evaluates an expression, it performs the various operations in an order specified by the parentheses in the expression, the precedence of the operators, and the associativity of the operators. Before any operation is performed, however, the interpreter first evaluates the operands of the operator. (The exceptions are the &&, ||, and ?: operators, which do not always evaluate all their operands.) The interpreter always evaluates operands in order from left to right. This matters if any of the operands are expressions that contain side effects. Consider this code, for example:
int a = 2;
int v = ++a + ++a * ++a;
Although the multiplication is performed before the addition, the operands of the + operator are evaluated first, from left to right. The operand of each ++ operator is a, so the three ++a subexpressions evaluate to 3, 4, and 5 in turn, and the whole expression evaluates to 3 + 4 * 5, or 23.
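To make the distinction between operand evaluation order and operator precedence visible, the sketch below records the order in which operands are evaluated. The class name and the operand helper are hypothetical and exist only for this illustration.

```java
final class EvaluationOrderDemo {
    private static final StringBuilder trace = new StringBuilder();

    // Hypothetical helper: records its name, then returns its value.
    static int operand(String name, int value) {
        trace.append(name);
        return value;
    }

    // The example from the text: the operands become 3, 4, and 5.
    static int sourceExample() {
        int a = 2;
        return ++a + ++a * ++a;   // 3 + (4 * 5)
    }

    // Operands are evaluated left to right even though * is applied before +.
    static String tracedOrder() {
        trace.setLength(0);
        int ignored = operand("x", 1) + operand("y", 2) * operand("z", 3);
        return trace.toString();
    }

    public static void main(String[] args) {
        System.out.println(sourceExample());  // 23
        System.out.println(tracedOrder());    // xyz
    }
}
```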
The arithmetic operators can be used with integers, floating-point numbers, and even characters (i.e., they can be used with any primitive type other than boolean). If either of the operands is a floating-point number, floating-point arithmetic is used; otherwise, integer arithmetic is used. This matters because integer arithmetic and floating-point arithmetic differ in the way division is performed and in the way underflows and overflows are handled, for example. The arithmetic operators are:
Addition (+)
The + operator adds two numbers. As we’ll see shortly, the + operator can also be used to concatenate strings. If either operand of + is a string, the other one is converted to a string as well. Be sure to use parentheses when you want to combine addition with concatenation. For example:
System.out.println("Total: " + 3 + 4);    // Prints "Total: 34", not 7!
Subtraction (-)
When the - operator is used as a binary operator, it subtracts its second operand from its first. For example, 7-3 evaluates to 4. The - operator can also perform unary negation.
Multiplication (*)
The * operator multiplies its two operands. For example, 7*3 evaluates to 21.
Division (/)
The / operator divides its first operand by its second. If both operands are integers, the result is an integer, and any remainder is lost. If either operand is a floating-point value, however, the result is a floating-point value. When you divide two integers, division by zero throws an ArithmeticException. For floating-point calculations, however, division by zero simply yields an infinite result or NaN:
7 / 3        // Evaluates to 2
7 / 3.0f     // Evaluates to 2.333333f
7 / 0        // Throws an ArithmeticException
7 / 0.0      // Evaluates to positive infinity
0.0 / 0.0    // Evaluates to NaN
Modulo (%)
The % operator computes the first operand modulo the second operand (i.e., it returns the remainder when the first operand is divided by the second operand an integral number of times). For example, 7%3 is 1. The sign of the result is the same as the sign of the first operand. While the modulo operator is typically used with integer operands, it also works for floating-point values. For example, 4.3%2.1 evaluates to approximately 0.1 (the exact result is subject to floating-point rounding). When you are operating with integers, trying to compute a value modulo zero causes an ArithmeticException. When you are working with floating-point values, anything modulo 0.0 evaluates to NaN, as does infinity modulo anything.
Unary minus (-)
When the - operator is used as a unary operator (that is, before a single operand), it performs unary negation. In other words, it converts a positive value to an equivalently negative value, and vice versa.
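The differences between integer and floating-point arithmetic described in the list above can be checked with a short program. This is a sketch; the class and method names are hypothetical.

```java
final class ArithmeticDemo {
    // Integer remainder: the sign of the result follows the first operand.
    static int remainder(int a, int b) {
        return a % b;
    }

    // Integer division by zero throws; returns true if the exception was seen.
    static boolean throwsOnIntegerDivideByZero(int numerator) {
        int zero = 0;
        try {
            int unused = numerator / zero;
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(7 / 3);                           // 2 (remainder discarded)
        System.out.println(-7 / 3);                          // -2 (truncates toward zero)
        System.out.println(remainder(-7, 3));                // -1
        System.out.println(remainder(7, -3));                // 1
        System.out.println(throwsOnIntegerDivideByZero(7));  // true
        System.out.println(7 / 0.0);                         // Infinity
        System.out.println(Double.isNaN(0.0 / 0.0));         // true
        System.out.println(Double.isNaN(5.0 % 0.0));         // true
        // Floating-point remainder is only approximately 0.1.
        System.out.println(Math.abs(4.3 % 2.1 - 0.1) < 1e-9);
    }
}
```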
In addition to adding numbers, the + operator (and the related += operator) also concatenates, or joins, strings. If either of the operands to + is a string, the operator converts the other operand to a string. For example:
// Prints "Quotient: 2.3333333"
System.out.println("Quotient: " + 7 / 3.0f);
As a result, you must be careful to put any addition expressions in parentheses when combining them with string concatenation. If you do not, the addition operator is interpreted as a concatenation operator.
Java has built-in string conversions for all primitive types. An object is converted to a string by invoking its toString() method. Some classes define custom toString() methods so that objects of that class can easily be converted to strings in this way. An array is converted to a string by invoking the built-in toString() method, which, unfortunately, does not return a useful string representation of the array contents.
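The following sketch shows the effect of parentheses on concatenation and one common way to get a readable string for an array. The use of java.util.Arrays.toString() is a suggestion that goes beyond the text above, and the class and method names are hypothetical.

```java
import java.util.Arrays;

final class ConcatDemo {
    static String withoutParens() {
        return "Total: " + 3 + 4;      // Concatenation twice: "Total: 34"
    }

    static String withParens() {
        return "Total: " + (3 + 4);    // Addition first: "Total: 7"
    }

    static String additionThenConcat() {
        return 3 + 4 + " total";       // Left to right: 3 + 4 is added first
    }

    public static void main(String[] args) {
        int[] data = { 1, 2, 3 };
        System.out.println(withoutParens());
        System.out.println(withParens());
        System.out.println(additionThenConcat());
        System.out.println("" + data);              // Type code and hash, e.g. [I@...
        System.out.println(Arrays.toString(data));  // [1, 2, 3]
    }
}
```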
The ++ operator increments its single operand, which must be a variable, an element of an array, or a field of an object, by 1. The behavior of this operator depends on its position relative to the operand. When used before the operand, where it is known as the pre-increment operator, it increments the operand and evaluates to the incremented value of that operand. When used after the operand, where it is known as the post-increment operator, it increments its operand, but evaluates to the value of that operand before it was incremented.
For example, the following code sets both i and j to 2:
i = 1;
j = ++i;
But these lines set i to 2 and j to 1:
i = 1;
j = i++;
Similarly, the -- operator decrements its single numeric operand, which must be a variable, an element of an array, or a field of an object, by one. Like the ++ operator, the behavior of -- depends on its position relative to the operand. When used before the operand, it decrements the operand and returns the decremented value. When used after the operand, it decrements the operand, but returns the undecremented value.
The expressions x++ and x-- are equivalent to x=x+1 and x=x-1, respectively, except that when you are using the increment and decrement operators, x is only evaluated once. If x is itself an expression with side effects, this makes a big difference. For example, these two expressions are not equivalent:
a[i++]++;               // Increments an element of an array
// Adds 1 to an array element and stores new value in another element
a[i++] = a[i++] + 1;
These operators, in both prefix and postfix forms, are most commonly used to increment or decrement the counter that controls a loop.
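As a sketch of why the two array expressions above differ, the methods below run each one on the same starting array and report the resulting elements followed by the final value of i. The class and method names are hypothetical.

```java
final class IncrementDemo {
    // a[i++]++ evaluates the index expression once.
    static int[] incrementInPlace() {
        int[] a = { 10, 20, 30 };
        int i = 0;
        a[i++]++;                  // a[0] becomes 11; i becomes 1
        return new int[] { a[0], a[1], a[2], i };
    }

    // a[i++] = a[i++] + 1 evaluates i++ twice, left side first.
    static int[] incrementExpanded() {
        int[] a = { 10, 20, 30 };
        int i = 0;
        a[i++] = a[i++] + 1;       // a[0] = a[1] + 1 = 21; i becomes 2
        return new int[] { a[0], a[1], a[2], i };
    }

    public static void main(String[] args) {
        int i = 1;
        int j = ++i;               // i is 2, j is 2
        System.out.println(i + " " + j);
        i = 1;
        j = i++;                   // i is 2, j is 1
        System.out.println(i + " " + j);
        System.out.println(java.util.Arrays.toString(incrementInPlace()));
        System.out.println(java.util.Arrays.toString(incrementExpanded()));
    }
}
```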
The comparison operators consist of the equality operators that test values for equality or inequality and the relational operators used with ordered types (numbers and characters) to test for greater than and less than relationships. Both types of operators yield a boolean result, so they are typically used with if statements and while and for loops to make branching and looping decisions. For example:
if (o != null) ...;          // The not equals operator
while (i < a.length) ...;    // The less than operator
Java provides the following equality operators:
Equals (==)
The == operator evaluates to true if its two operands are equal and false otherwise. With primitive operands, it tests whether the operand values themselves are identical. For operands of reference types, however, it tests whether the operands refer to the same object or array. In other words, it does not test the equality of two distinct objects or arrays. In particular, note that you cannot test two distinct strings for equality with this operator; use the equals() method for that.
If == is used to compare two numeric or character operands that are not of the same type, the narrower operand is converted to the type of the wider operand before the comparison is done. For example, when you are comparing a short to a float, the short is first converted to a float before the comparison is performed. For floating-point numbers, the special negative zero value tests equal to the regular, positive zero value. Also, the special NaN (Not a Number) value is not equal to any other number, including itself. To test whether a floating-point value is NaN, use the Float.isNaN() or Double.isNaN() method.
Not equals (!=)
The != operator is exactly the opposite of the == operator. It evaluates to true if its two primitive operands have different values or if its two reference operands refer to different objects or arrays. Otherwise, it evaluates to false.
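The special cases of == described above can be demonstrated as follows. This is a minimal sketch with hypothetical class and method names.

```java
final class EqualityDemo {
    // Reference comparison: true only if both refer to the same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    // Content comparison via equals().
    static boolean sameContents(String a, String b) {
        return a != null && a.equals(b);
    }

    public static void main(String[] args) {
        String literal = "hello";
        String copy = new String("hello");   // A distinct object with equal contents
        System.out.println(sameObject(literal, copy));    // false
        System.out.println(sameContents(literal, copy));  // true

        double nan = 0.0 / 0.0;
        System.out.println(nan == nan);         // false: NaN is not equal to itself
        System.out.println(Double.isNaN(nan));  // true
        System.out.println(0.0 == -0.0);        // true

        short s = 3;
        float f = 3.0f;
        System.out.println(s == f);             // true: s is converted to float first
    }
}
```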
The relational operators can be used with numbers and characters, but not with boolean values, objects, or arrays because those types are not ordered.
Java provides the following relational operators:
Less than (<)
Evaluates to true if the first operand is less than the second.
Less than or equal (<=)
Evaluates to true if the first operand is less than or equal to the second.
Greater than (>)
Evaluates to true if the first operand is greater than the second.
Greater than or equal (>=)
Evaluates to true if the first operand is greater than or equal to the second.
As we’ve just seen, the comparison operators compare their operands and yield a boolean result, which is often used in branching and looping statements. In order to make branching and looping decisions based on conditions more interesting than a single comparison, you can use the Boolean (or logical) operators to combine multiple comparison expressions into a single, more complex expression. The Boolean operators require their operands to be boolean values and they evaluate to boolean values. The operators are:
Conditional AND (&&)
This operator performs a Boolean AND operation on its operands. It evaluates to true if and only if both its operands are true. If either or both operands are false, it evaluates to false. For example:
if (x < 10 && y > 3) ...    // If both comparisons are true
This operator (like all the Boolean operators except the unary ! operator) has a lower precedence than the comparison operators. Thus, it is perfectly legal to write a line of code like the one just shown. However, some programmers prefer to use parentheses to make the order of evaluation explicit:
if ((x < 10) && (y > 3)) ...
You should use whichever style you find easier to read.
This operator is called a conditional AND because it conditionally evaluates its second operand. If the first operand evaluates to false, the value of the expression is false, regardless of the value of the second operand. Therefore, to increase efficiency, the Java interpreter takes a shortcut and skips the second operand. The second operand is not guaranteed to be evaluated, so you must use caution when using this operator with expressions that have side effects. On the other hand, the conditional nature of this operator allows us to write Java expressions such as the following:
if (data != null && i < data.length && data[i] != -1)
    ...
The second and third comparisons in this expression would cause errors if the first or second comparisons evaluated to false. Fortunately, we don’t have to worry about this because of the conditional behavior of the && operator.
Conditional OR (||)
This operator performs a Boolean OR operation on its two boolean operands. It evaluates to true if either or both of its operands are true. If both operands are false, it evaluates to false. Like the && operator, || does not always evaluate its second operand. If the first operand evaluates to true, the value of the expression is true, regardless of the value of the second operand. Thus, the operator simply skips the second operand in that case.
Boolean NOT (!)
This unary operator changes the boolean value of its operand. If applied to a true value, it evaluates to false, and if applied to a false value, it evaluates to true. It is useful in expressions like these:
if (!found) ...             // found is a boolean declared somewhere
while (!c.isEmpty()) ...    // The isEmpty() method returns a boolean
Because ! is a unary operator, it has a high precedence and often must be used with parentheses:
if (!(x > y && y > z))
Boolean AND (&)
When used with boolean operands, the & operator behaves like the && operator, except that it always evaluates both operands, regardless of the value of the first operand. This operator is almost always used as a bitwise operator with integer operands, however, and many Java programmers would not even recognize its use with boolean operands as legal Java code.
Boolean OR (|)
This operator performs a Boolean OR operation on its two boolean operands. It is like the || operator, except that it always evaluates both operands, even if the first one is true. The | operator is almost always used as a bitwise operator on integer operands; its use with boolean operands is very rare.
Boolean XOR (^)
When used with boolean operands, this operator computes the exclusive OR (XOR) of its operands. It evaluates to true if exactly one of the two operands is true. In other words, it evaluates to false if both operands are false or if both operands are true. Unlike the && and || operators, this one must always evaluate both operands. The ^ operator is much more commonly used as a bitwise operator on integer operands. With boolean operands, this operator is equivalent to the != operator.
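The difference between the conditional operators and their always-evaluating counterparts in the list above can be observed by counting how many operands are evaluated. The sketch below uses a hypothetical touch helper with a side effect; all names are illustrative.

```java
final class ShortCircuitDemo {
    static int calls;

    // Hypothetical helper with a side effect: counts each evaluation.
    static boolean touch(boolean value) {
        calls++;
        return value;
    }

    // && skips the second operand when the first is false.
    static int callsWithConditionalAnd() {
        calls = 0;
        boolean result = touch(false) && touch(true);
        return calls;
    }

    // & always evaluates both operands.
    static int callsWithBooleanAnd() {
        calls = 0;
        boolean result = touch(false) & touch(true);
        return calls;
    }

    // The guarded array test from the text: later comparisons run only
    // when the earlier ones have succeeded.
    static boolean isUsable(int[] data, int i) {
        return data != null && i < data.length && data[i] != -1;
    }

    public static void main(String[] args) {
        System.out.println(callsWithConditionalAnd());    // 1
        System.out.println(callsWithBooleanAnd());        // 2
        System.out.println(isUsable(null, 0));            // false, no NullPointerException
        System.out.println(isUsable(new int[] { 5 }, 3)); // false, no out-of-bounds access
        System.out.println((true ^ false) == (true != false));  // true
    }
}
```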
The bitwise and shift operators are low-level operators that manipulate the individual bits that make up an integer value. The bitwise operators are not commonly used in modern Java except for low-level work (e.g., network programming). They are used for testing and setting individual flag bits in a value. In order to understand their behavior, you must understand binary (base-2) numbers and the two’s complement format used to represent negative integers.
You cannot use these operators with floating-point, boolean, array, or object operands. When used with boolean operands, the &, |, and ^ operators perform a different operation, as described in the previous section.
If either of the arguments to a bitwise operator is a long, the result is a long. Otherwise, the result is an int. If the left operand of a shift operator is a long, the result is a long; otherwise, the result is an int. The operators are:
Bitwise complement (~)
The unary ~ operator is known as the bitwise complement, or bitwise NOT, operator. It inverts each bit of its single operand, converting 1s to 0s and 0s to 1s. For example:
byte b = ~12;          // ~00001100 ==> 11110011 or -13 decimal
flags = flags & ~f;    // Clear flag f in a set of flags
Bitwise AND (&)
This operator combines its two integer operands by performing a Boolean AND operation on their individual bits. The result has a bit set only if the corresponding bit is set in both operands. For example:
10 & 7                   // 00001010 & 00000111 ==> 00000010 or 2
if ((flags & f) != 0)    // Test whether flag f is set
When used with boolean operands, & is the infrequently used Boolean AND operator described earlier.
Bitwise OR (|)
This operator combines its two integer operands by performing a Boolean OR operation on their individual bits. The result has a bit set if the corresponding bit is set in either or both of the operands. It has a zero bit only where both corresponding operand bits are zero. For example:
10 | 7                // 00001010 | 00000111 ==> 00001111 or 15
flags = flags | f;    // Set flag f
When used with boolean operands, | is the infrequently used Boolean OR operator described earlier.
Bitwise XOR (^)
This operator combines its two integer operands by performing a Boolean XOR (exclusive OR) operation on their individual bits. The result has a bit set if the corresponding bits in the two operands are different. If the corresponding operand bits are both 1s or both 0s, the result bit is a 0. For example:
10 ^ 7    // 00001010 ^ 00000111 ==> 00001101 or 13
When used with boolean operands, ^ is the seldom used Boolean XOR operator.
Left shift (<<)
The << operator shifts the bits of the left operand left by the number of places specified by the right operand. High-order bits of the left operand are lost, and zero bits are shifted in from the right. Shifting an integer left by n places is equivalent to multiplying that number by 2 raised to the power n. For example:
10 << 1    // 0b00001010 << 1 = 00010100 = 20 = 10*2
7 << 3     // 0b00000111 << 3 = 00111000 = 56 = 7*8
-1 << 2    // 0xFFFFFFFF << 2 = 0xFFFFFFFC = -4 = -1*4
           // 0xFFFF_FFFC == 0b1111_1111_1111_1111_1111_1111_1111_1100
If the left operand is a long, the right operand should be between 0 and 63. Otherwise, the left operand is taken to be an int, and the right operand should be between 0 and 31.
Signed right shift (>>)
The >> operator shifts the bits of the left operand to the right by the number of places specified by the right operand. The low-order bits of the left operand are shifted away and are lost. The high-order bits shifted in are the same as the original high-order bit of the left operand. In other words, if the left operand is positive, 0s are shifted into the high-order bits. If the left operand is negative, 1s are shifted in instead. This technique is known as sign extension; it is used to preserve the sign of the left operand. For example:
10 >> 1     // 00001010 >> 1 = 00000101 = 5 = 10/2
27 >> 3     // 00011011 >> 3 = 00000011 = 3 = 27/8
-50 >> 2    // 11001110 >> 2 = 11110011 = -13 != -50/4
If the left operand is positive and the right operand is n, the >> operator is the same as integer division by 2 raised to the power n.
Unsigned right shift (>>>)
This operator is like the >> operator, except that it always shifts zeros into the high-order bits of the result, regardless of the sign of the lefthand operand. This technique is called zero extension; it is appropriate when the left operand is being treated as an unsigned value (despite the fact that Java integer types are all signed). These are examples:
0xff >>> 4    // 11111111 >>> 4 = 00001111 = 15 = 255/16
-50 >>> 2     // 0xFFFFFFCE >>> 2 = 0x3FFFFFF3 = 1073741811
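The flag idioms and shift behavior from the list above can be collected into a small sketch. The flag constants and the class and method names are hypothetical. The last two lines of main rely on the Java rule that the shift distance is reduced to its low five bits for an int left operand (six bits for a long), which is why the text recommends staying within 0 to 31 or 0 to 63.

```java
final class BitwiseDemo {
    // Hypothetical flag constants, one bit each.
    static final int PUBLIC = 0x1;
    static final int PROTECTED = 0x2;
    static final int PRIVATE = 0x4;

    static int set(int flags, int f) {
        return flags | f;
    }

    static int clear(int flags, int f) {
        return flags & ~f;
    }

    static boolean isSet(int flags, int f) {
        return (flags & f) != 0;
    }

    public static void main(String[] args) {
        int flags = set(set(0, PUBLIC), PRIVATE);
        System.out.println(isSet(flags, PUBLIC | PROTECTED));  // true
        flags = clear(flags, PUBLIC);
        System.out.println(isSet(flags, PUBLIC | PROTECTED));  // false

        System.out.println(-50 >> 2);    // -13 (sign extension)
        System.out.println(-50 / 4);     // -12 (division truncates toward zero)
        System.out.println(-50 >>> 2);   // 1073741811 (zero extension)
        System.out.println(1 << 33);     // 2: distance 33 is treated as 1 for an int
        System.out.println(1L << 33);    // 8589934592
    }
}
```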
The assignment operators store, or assign, a value into a piece of the computer’s memory—often referred to as a storage location. The left operand must evaluate to an appropriate local variable, array element, or object field.
The lefthand side of an assignment expression is sometimes called an lvalue. In Java it must refer to some assignable storage (i.e., memory that can be written to). The righthand side (the rvalue) can be any value of a type compatible with the variable. An assignment expression evaluates to the value that is assigned to the variable. More importantly, however, the expression has the side effect of actually performing the assignment: storing the rvalue in the lvalue.
Unlike all other binary operators, the assignment operators are right-associative, which means that the assignments in a=b=c are performed right to left, as follows: a=(b=c).
The basic assignment operator is =. Do not confuse it with the equality operator, ==. In order to keep these two operators distinct, we recommend that you read = as “is assigned the value.”
In addition to this simple assignment operator, Java also defines 11 other operators that combine assignment with the 5 arithmetic operators and the 6 bitwise and shift operators. For example, the += operator reads the value of the left variable, adds the value of the right operand to it, stores the sum back into the left variable as a side effect, and returns the sum as the value of the expression. Thus, the expression x+=2 is almost the same as x=x+2. The difference between these two expressions is that when you use the += operator, the left operand is evaluated only once. This makes a difference when that operand has a side effect. Consider the following two expressions, which are not equivalent:
a[i++] += 2;
a[i++] = a[i++] + 2;
The general form of these combination assignment operators is:
lvalue op= rvalue
This is equivalent (unless there are side effects in lvalue) to:
lvalue = lvalue op rvalue
The available operators are:
+=    -=    *=    /=    %=    // Arithmetic operators plus assignment
&=    |=    ^=                // Bitwise operators plus assignment
<<=   >>=   >>>=              // Shift operators plus assignment
The most commonly used operators are += and -=, although &= and |= can also be useful when you are working with boolean flags. For example:
i += 2;         // Increment a loop counter by 2
c -= 5;         // Decrement a counter by 5
flags |= f;     // Set a flag f in an integer set of flags
flags &= ~f;    // Clear a flag f in an integer set of flags
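The sketch below contrasts a compound assignment with its expanded form when the left operand has a side effect. It also shows one further difference that the text above does not mention: a compound assignment casts its result back to the type of the left operand, so b += n compiles for a byte b where b = b + n would not. The class and method names are hypothetical.

```java
final class CompoundAssignDemo {
    // a[i++] += 2 evaluates i++ once. Returns the elements followed by i.
    static int[] compound() {
        int[] a = { 1, 2, 3 };
        int i = 0;
        a[i++] += 2;               // a[0] becomes 3; i becomes 1
        return new int[] { a[0], a[1], a[2], i };
    }

    // The expanded form evaluates i++ twice.
    static int[] expanded() {
        int[] a = { 1, 2, 3 };
        int i = 0;
        a[i++] = a[i++] + 2;       // a[0] = a[1] + 2 = 4; i becomes 2
        return new int[] { a[0], a[1], a[2], i };
    }

    // Compiles because += includes an implicit narrowing cast to byte.
    static byte addToByte(byte b, int n) {
        b += n;                    // b = b + n would not compile here
        return b;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(compound()));  // [3, 2, 3, 1]
        System.out.println(java.util.Arrays.toString(expanded()));  // [4, 2, 3, 2]
        System.out.println(addToByte((byte) 10, 5));                // 15
        System.out.println(addToByte((byte) 127, 1));               // -128 (wraps around)
    }
}
```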
The conditional operator ?: is a somewhat obscure ternary (three-operand) operator inherited from C. It allows you to embed a conditional within an expression. You can think of it as the operator version of the if/else statement. The first and second operands of the conditional operator are separated by a question mark (?), while the second and third operands are separated by a colon (:). The first operand must evaluate to a boolean value. The second and third operands can be of any type, but they must be convertible to the same type.
The conditional operator starts by evaluating its first operand. If it is true, the operator evaluates its second operand and uses that as the value of the expression. On the other hand, if the first operand is false, the conditional operator evaluates and returns its third operand. The conditional operator never evaluates both its second and third operand, so be careful when using expressions with side effects with this operator. Examples of this operator are:
int max = (x > y) ? x : y;
String name = (name != null) ? name : "unknown";
Note that the ?: operator has lower precedence than all other operators except the assignment operators and the lambda arrow, so parentheses are not usually necessary around the operands of this operator. Many programmers find conditional expressions easier to read if the first operand is placed within parentheses, however. This is especially true because the conditional if statement always has its conditional expression written within parentheses.
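As a sketch of the behavior just described, the class below wraps the two examples in methods and counts evaluations to show that only one of the second and third operands is ever evaluated. The names are hypothetical.

```java
final class ConditionalDemo {
    static int evaluations;

    // Hypothetical helper with a side effect: counts each evaluation.
    static int tracked(int value) {
        evaluations++;
        return value;
    }

    static int max(int x, int y) {
        return (x > y) ? x : y;
    }

    static String nameOrDefault(String name) {
        return (name != null) ? name : "unknown";
    }

    // Returns how many of the two branch operands were evaluated.
    static int branchesEvaluated(boolean condition) {
        evaluations = 0;
        int ignored = condition ? tracked(1) : tracked(2);
        return evaluations;
    }

    // Right-to-left associativity: n > 0 ? ... : (n < 0 ? ... : ...).
    static String sign(int n) {
        return n > 0 ? "positive" : n < 0 ? "negative" : "zero";
    }

    public static void main(String[] args) {
        System.out.println(max(3, 7));               // 7
        System.out.println(nameOrDefault(null));     // unknown
        System.out.println(branchesEvaluated(true)); // 1
        System.out.println(sign(-4));                // negative
    }
}
```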
The instanceof operator is intimately bound up with objects and the operation of the Java type system. If this is your first look at Java, it may be preferable to skim this definition and return to this section after you have a decent grasp of Java’s objects.
instanceof requires an object or array value as its left operand and the name of a reference type as its right operand. It evaluates to true if the object or array is an instance of the specified type; it returns false otherwise. If the left operand is null, instanceof always evaluates to false. If an instanceof expression evaluates to true, it means that you can safely cast and assign the left operand to a variable of the type of the right operand.
The instanceof operator can be used only with reference types and objects, not primitive types and values. Examples of instanceof are:
// True: all strings are instances of String
"string" instanceof String
// True: strings are also instances of Object
"" instanceof Object
// False: null is never an instance of anything
null instanceof String
Object o = new int[] {1, 2, 3};
o instanceof int[]     // True: the array value is an int array
o instanceof byte[]    // False: the array value is not a byte array
o instanceof Object    // True: all arrays are instances of Object
// Use instanceof to make sure that it is safe to cast an object
if (object instanceof Point) {
    Point p = (Point) object;
}
In general, the use of instanceof is discouraged among Java programmers, because it is often a sign of questionable program design. Under normal circumstances its use can be avoided, although there are rare cases where it is genuinely needed.
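For the cases where it is needed, the test-then-cast pattern can be sketched as below. The describe method and class name are hypothetical; the explicit cast after each test is used so that the example does not depend on newer language features.

```java
final class InstanceofDemo {
    // Each cast is safe because the preceding instanceof test succeeded.
    static String describe(Object o) {
        if (o instanceof String) {
            String s = (String) o;
            return "String of length " + s.length();
        }
        if (o instanceof int[]) {
            int[] array = (int[]) o;
            return "int[] of length " + array.length;
        }
        // null fails every instanceof test, so it reaches this point.
        return (o == null) ? "null" : "other";
    }

    public static void main(String[] args) {
        System.out.println(describe("string"));              // String of length 6
        System.out.println(describe(new int[] { 1, 2, 3 })); // int[] of length 3
        System.out.println(describe(null));                  // null
        System.out.println(describe(new byte[0]));           // other
    }
}
```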
Java has six language constructs that are sometimes considered operators and sometimes considered simply part of the basic language syntax. These “operators” were included in Table 2-4 in order to show their precedence relative to the other true operators. The use of these language constructs is detailed elsewhere in this book, but is described briefly here so that you can recognize them in code examples:
.
)An object is a collection of data and methods that operate on that data; the data fields and methods of an object are called its members. The dot (.) operator accesses these members. If o
is an expression that evaluates to an object reference (or a class name), and f
is the name of a field of the class, o.f
evaluates to the value contained in that field. If m
is the name of a method, o.m
refers to that method and allows it to be invoked using the ()
operator shown later.
[]
)An array is a numbered list of values. Each element of an array can be referred to by its number, or index. The [ ]
operator allows you to refer to the individual elements of an array. If a
is an array, and i
is an expression that evaluates to an int
, a[i]
refers to one of the elements of a
. Unlike other operators that work with integer values, this operator restricts array index values to be of type int
or narrower.
()
)A method is a named collection of Java code that can be run, or invoked, by following the name of the method with zero or more comma-separated expressions contained within parentheses. The values of these expressions are the arguments to the method. The method processes the arguments and optionally returns a value that becomes the value of the method invocation expression. If o.m
is a method that expects no arguments, the method can be invoked with o.m()
. If the method expects three arguments, for example, it can be invoked with an expression such as o.m(x,y,z)
. o is referred to as the receiver of the method—if o
is an object, then it is said to be the receiver object. Before the Java interpreter invokes a method, it evaluates each of the arguments to be passed to the method. These expressions are guaranteed to be evaluated in order from left to right (which matters if any of the arguments have side effects).
Lambda expression (->)
A lambda expression is an anonymous collection of executable Java code, essentially a method body. It consists of a parameter list (zero or more comma-separated parameters contained within parentheses) followed by the lambda arrow operator followed by a block of Java code. If the body comprises just a single expression, then the usual curly braces to denote block boundaries can be omitted. If the lambda takes only a single parameter whose type is inferred, the parentheses around the parameter can be omitted.
Object creation (new)
In Java, objects are created with the new operator, which is followed by the type of the object to be created and a parenthesized list of arguments to be passed to the object constructor. A constructor is a special block of code that initializes a newly created object, so the object creation syntax is similar to the Java method invocation syntax. For example:
    new ArrayList<String>();
    new Point(1,2)
Array creation (new)
Arrays are a special case of objects and they too are created with the new operator, with a slightly different syntax. The keyword is followed by the type of the array to be created and the size of the array encased in square brackets, for example, new int[5]. In some circumstances arrays can also be created using the array literal syntax.
Type conversion or casting (())
As we’ve already seen, parentheses can also be used as an operator to perform narrowing type conversions, or casts. The first operand of this operator is the type to be converted to; it is placed between the parentheses. The second operand is the value to be converted; it follows the parentheses. For example:
    (byte) 28          // An integer literal cast to a byte type
    (int) (x + 3.14f)  // A floating-point sum value cast to an integer
    (String) h.get(k)  // A generic object cast to a string
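The constructs described above can be seen working together in a short sketch. The class ConstructsDemo and all of its members are hypothetical names chosen only for illustration; the last part of main() records the order in which method arguments are evaluated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

class ConstructsDemo {
    int field = 42;                            // read below with the . operator

    int twice(int x) { return 2 * x; }         // invoked below with the () operator

    // Records the order in which the arguments of a call are evaluated.
    static List<String> order = new ArrayList<>();

    static int note(String label, int value) {
        order.add(label);                      // side effect makes the order visible
        return value;
    }

    static int sum(int a, int b, int c) { return a + b + c; }

    public static void main(String[] args) {
        ConstructsDemo o = new ConstructsDemo();   // object creation with new
        int[] a = new int[5];                      // array creation with new
        a[2] = o.field;                            // array element access and member access
        IntUnaryOperator inc = x -> x + 1;         // lambda: one parameter, expression body
        byte b = (byte) 300;                       // narrowing cast; the value wraps to 44
        System.out.println(a[2] + " " + o.twice(3) + " " + inc.applyAsInt(1) + " " + b);

        // Arguments are evaluated left to right before sum() itself runs.
        sum(note("first", 1), note("second", 2), note("third", 3));
        System.out.println(order);
    }
}
```

The note() helper exists only to give each argument a visible side effect; without side effects, the evaluation order could not be observed.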
A statement is a basic unit of execution in the Java language—it expresses a single piece of intent by the programmer. Unlike expressions, Java statements do not have a value. Statements also typically contain expressions and operators (especially assignment operators) and are frequently executed for the side effects that they cause.
Many of the statements defined by Java are flow-control statements, such as conditionals and loops, that can alter the default, linear order of execution in well-defined ways. Table 2-5 summarizes the statements defined by Java.
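Statement | Purpose | Syntax |
---|---|---|
expression | side effects | var = expr; expr++; method(); new Type(); |
compound | group statements | { statements } |
empty | do nothing | ; |
labeled | name a statement | label : statement |
variable | declare a variable | [final] type name [= value] [, name [= value]] ...; |
if | conditional | if ( expr ) statement [ else statement ] |
switch | conditional | switch ( expr ) { [ case expr : statements ] ... [ default: statements ] } |
while | loop | while ( expr ) statement |
do | loop | do statement while ( expr ); |
for | simplified loop | for ( init ; test ; increment ) statement |
foreach | collection iteration | for ( variable : iterable ) statement |
break | exit block | break [ label ] ; |
continue | restart loop | continue [ label ] ; |
return | end method | return [ expr ] ; |
synchronized | critical section | synchronized ( expr ) { statements } |
throw | throw exception | throw expr ; |
try | handle exception | try { statements } [ catch ( type name ) { statements } ] ... [ finally { statements } ] |
assert | verify invariant | assert invariant [ : error ] ; |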
As we saw earlier in the chapter, certain types of Java expressions have side effects. In other words, they do not simply evaluate to some value; they also change the program state in some way. You can use any expression with side effects as a statement simply by following it with a semicolon. The legal types of expression statements are assignments, increments and decrements, method calls, and object creation. For example:
    a = 1;                             // Assignment
    x *= 2;                            // Assignment with operation
    i++;                               // Post-increment
    --c;                               // Pre-decrement
    System.out.println("statement");   // Method invocation
A compound statement is any number and kind of statements grouped together within curly braces. You can use a compound statement anywhere a statement is required by Java syntax:
    for(int i = 0; i < 10; i++) {
        a[i]++;           // Body of this loop is a compound statement.
        b[i]--;           // It consists of two expression statements
    }                     // within curly braces.
An empty statement in Java is written as a single semicolon. The
empty statement doesn’t do anything, but the syntax is occasionally
useful. For example, you can use it to indicate an empty loop body in a
for
loop:
    for(int i = 0; i < 10; a[i++]++)  // Increment array elements
        /* empty */;                  // Loop body is empty statement
A labeled statement is simply a statement that you have given a name
by prepending an identifier and a colon to it. Labels are used by the
break
and continue
statements. For example:
    rowLoop: for(int r = 0; r < rows.length; r++) {         // Labeled loop
        colLoop: for(int c = 0; c < columns.length; c++) {  // Another one
            break rowLoop;                                  // Use a label
        }
    }
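As a rough illustration of why labels are useful, the sketch below searches a two-dimensional array and leaves both loops as soon as a match is found. The class LabelDemo and the method findFirst() are hypothetical names, not part of any library.

```java
class LabelDemo {
    // Returns {row, column} of the first occurrence of target, or null if it is absent.
    static int[] findFirst(int[][] grid, int target) {
        int[] found = null;
        search:
        for (int r = 0; r < grid.length; r++) {
            for (int c = 0; c < grid[r].length; c++) {
                if (grid[r][c] == target) {
                    found = new int[] { r, c };
                    break search;   // leaves both loops, not just the inner one
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        int[][] grid = { { 1, 2 }, { 3, 4 }, { 4, 5 } };
        int[] pos = findFirst(grid, 4);
        System.out.println(pos[0] + "," + pos[1]);
    }
}
```

An unlabeled break in the same position would end only the inner loop, and the outer loop would go on to scan the remaining rows.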
A local variable, often simply called a variable, is a symbolic name for a location to store a value that is defined within a method or compound statement. All variables must be declared before they can be used; this is done with a variable declaration statement. Because Java is a statically typed language, a variable declaration specifies the type of the variable, and only values of that type can be stored in the variable.
In its simplest form, a variable declaration specifies a variable’s type and name:
    int counter;
    String s;
A variable declaration can also include an initializer: an expression that specifies an initial value for the variable. For example:
    int i = 0;
    String s = readLine();
    int[] data = {x+1, x+2, x+3}; // Array initializers are discussed later
The Java compiler does not allow you to use a local variable that has not been initialized, so it is usually convenient to combine variable declaration and initialization into a single statement. The initializer expression need not be a literal value or a constant expression that can be evaluated by the compiler; it can be an arbitrarily complex expression whose value is computed when the program is run.
If a variable has an initializer then the programmer can use a special syntax to ask the compiler to automatically work out the type, if it is possible to do so:
    var i = 0;           // type of i inferred as int
    var s = readLine();  // type of s inferred as String
This can be a useful syntax, but when learning the Java language it is probably better to avoid it at first while you become familiar with the Java type system.
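The following sketch shows what the compiler infers in a few common cases, and two declarations it rejects. It assumes Java 10 or later, where var is available for local variables; the class name VarDemo is hypothetical.

```java
import java.util.ArrayList;

class VarDemo {
    static String describe() {
        var count = 0;                         // inferred as int
        var name = "Java";                     // inferred as String
        var items = new ArrayList<String>();   // inferred as ArrayList<String>
        items.add(name);
        count += items.size();
        // var x;          // does not compile: there is no initializer to infer from
        // var y = null;   // does not compile: null alone does not determine a type
        return name + ":" + count;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The variables are still statically typed: after these declarations, assigning a String to count is a compile-time error, exactly as if it had been declared with int.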
A single variable declaration statement can declare and initialize more than one variable, but all variables must be of the same explicitly declared type. Variable names and optional initializers are separated from each other with commas:
    int i, j, k;
    float x = 1.0f, y = 1.0f;
    String question = "Really Quit?", response;
Variable declaration statements can begin with the final
keyword.
This modifier specifies that once an initial value is defined for the
variable, that value is never allowed to change:
    final String greeting = getLocalLanguageGreeting();
We will have more to say about the final
keyword later on, especially
when talking about the immutable style of programming.
Java variable declaration statements can appear anywhere in Java code; they are not restricted to the beginning of a method or block of code. Local variable declarations can also be integrated with the initialize portion of a for loop, as we’ll discuss shortly.
Local variables can be used only within the method or block of code in which they are defined. This is called their scope or lexical scope:
    void method() {             // A method definition
        int i = 0;              // Declare variable i
        while (i < 10) {        // i is in scope here
            int j = 0;          // Declare j; the scope of j begins here
            i++;                // i is in scope here; increment it
        }                       // j is no longer in scope;
        System.out.println(i);  // i is still in scope here
    }                           // The scope of i ends here
The if
statement is a fundamental control statement that allows Java
to make decisions or, more precisely, to execute statements
conditionally. The if
statement has an associated expression and
statement. If the expression evaluates to true
, the interpreter
executes the statement. If the expression evaluates to false
, the
interpreter skips the statement.
Java allows the expression to be of the wrapper type Boolean
instead
of the primitive type boolean
. In this case, the wrapper object is
automatically unboxed.
Here is an example if
statement:
    if (username == null)         // If username is null,
        username = "John Doe";    // use a default value
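The automatic unboxing of a Boolean condition mentioned above carries one pitfall worth seeing in code: a wrapper reference can be null, and unboxing null fails at run time. The sketch below is only an illustration; UnboxDemo and check() are hypothetical names.

```java
class UnboxDemo {
    static String check(Boolean flag) {
        if (flag)              // flag is unboxed to a primitive boolean here
            return "yes";
        return "no";
    }

    public static void main(String[] args) {
        System.out.println(check(Boolean.TRUE));
        try {
            check(null);       // unboxing a null reference throws
        } catch (NullPointerException e) {
            System.out.println("null cannot be unboxed");
        }
    }
}
```

When a Boolean might be null, a test such as Boolean.TRUE.equals(flag) avoids the unboxing step altogether.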
Although they look extraneous, the parentheses around the expression are
a required part of the syntax for the if
statement. As we already
saw, a block of statements enclosed in curly braces is itself a
statement, so we can write if
statements that look like this as well:
    if ((address == null) || (address.equals(""))) {
        address = "[undefined]";
        System.out.println("WARNING: no address specified.");
    }
An if
statement can include an optional else
keyword that is
followed by a second statement. In this form of the statement, the
expression is evaluated, and, if it is true
, the first statement is
executed. Otherwise, the second statement is executed. For example:
    if (username != null)
        System.out.println("Hello " + username);
    else {
        username = askQuestion("What is your name?");
        System.out.println("Hello " + username + ". Welcome!");
    }
When you use nested if/else
statements, some caution is required to
ensure that the else
clause goes with the appropriate if
statement.
Consider the following lines:
    if (i == j)
        if (j == k)
            System.out.println("i equals k");
    else
        System.out.println("i doesn't equal j");    // WRONG!!
In this example, the inner if
statement forms the single statement
allowed by the syntax of the outer if
statement. Unfortunately, it is
not clear (except from the hint given by the indentation) which if
the
else
goes with. And in this example, the indentation hint is wrong.
The rule is that an else
clause like this is associated with the
nearest if
statement. Properly indented, this code looks like this:
    if (i == j)
        if (j == k)
            System.out.println("i equals k");
        else
            System.out.println("i doesn't equal j");    // WRONG!!
This is legal code, but it is clearly not what the programmer had in
mind. When working with nested if
statements, you should use curly
braces to make your code easier to read. Here is a better way to write
the code:
    if (i == j) {
        if (j == k)
            System.out.println("i equals k");
    }
    else {
        System.out.println("i doesn't equal j");
    }
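To make the difference observable, the sketch below wraps the unbraced and braced forms in two methods that return the message instead of printing it. The class and method names are hypothetical, and the "nothing printed" default is added only so that every path yields a result.

```java
class DanglingElseDemo {
    // Without braces, the else pairs with the nearest if, which is (j == k).
    static String unbraced(int i, int j, int k) {
        String result = "nothing printed";
        if (i == j)
            if (j == k)
                result = "i equals k";
            else
                result = "i doesn't equal j";   // runs when i == j but j != k
        return result;
    }

    // With braces, the else pairs with the outer if, which is (i == j).
    static String braced(int i, int j, int k) {
        String result = "nothing printed";
        if (i == j) {
            if (j == k)
                result = "i equals k";
        } else {
            result = "i doesn't equal j";
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(unbraced(1, 1, 2));   // misleading message, since i does equal j
        System.out.println(braced(1, 2, 2));
    }
}
```

With i and j equal and k different, the unbraced version reports that i doesn’t equal j, which is exactly the mistake the braces prevent.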
The if/else
statement is useful for testing a condition and choosing
between two statements or blocks of code to execute. But what about when
you need to choose between several blocks of code? This is typically
done with an else
if
clause, which is not really new syntax, but a
common idiomatic usage of the standard if/else
statement. It looks
like this:
    if (n == 1) {
        // Execute code block #1
    }
    else if (n == 2) {
        // Execute code block #2
    }
    else if (n == 3) {
        // Execute code block #3
    }
    else {
        // If all else fails, execute block #4
    }
There is nothing special about this code. It is just a series of if
statements, where each if
is part of the else
clause of the previous
statement. Using the else
if
idiom is preferable to, and more
legible than, writing these statements out in their fully nested form:
    if (n == 1) {
        // Execute code block #1
    }
    else {
        if (n == 2) {
            // Execute code block #2
        }
        else {
            if (n == 3) {
                // Execute code block #3
            }
            else {
                // If all else fails, execute block #4
            }
        }
    }
An if
statement causes a branch in the flow of a program’s execution.
You can use multiple if
statements, as shown in the previous section,
to perform a multiway branch. This is not always the best solution,
however, especially when all of the branches depend on the value of a
single variable.
In this case, the repeated if
statements may seriously hamper readability, especially if the code has been refactored over time or features multiple levels of nested if
.
A better solution is to use a switch
statement, which is inherited
from the C programming language. Note, however, that the syntax of this statement is not nearly as elegant as other parts of Java,
and the failure to revisit the design of the feature is widely regarded as a mistake.
A switch
statement starts with an expression whose type is an int
,
short
, char
, byte
(or their wrapper type), String
, or an enum
(see Chapter 4 for more on enumerated types).
This expression is followed by a block of code in curly braces that
contains various entry points that correspond to possible values for the
expression. For example, the following switch
statement is equivalent
to the repeated if
and else/if
statements shown in the previous
section:
    switch(n) {
        case 1:                         // Start here if n == 1
            // Execute code block #1
            break;                      // Stop here
        case 2:                         // Start here if n == 2
            // Execute code block #2
            break;                      // Stop here
        case 3:                         // Start here if n == 3
            // Execute code block #3
            break;                      // Stop here
        default:                        // If all else fails...
            // Execute code block #4
            break;                      // Stop here
    }
As you can see from the example, the various entry points into a
switch
statement are labeled either with the keyword case
,
followed by an integer value and a colon, or with the special default
keyword, followed by a colon. When a switch
statement executes, the
interpreter computes the value of the expression in parentheses and then
looks for a case
label that matches that value. If it finds one, the
interpreter starts executing the block of code at the first statement
following the case
label. If it does not find a case
label with a
matching value, the interpreter starts execution at the first statement
following a special-case default
: label. Or, if there is no
default
: label, the interpreter skips the body of the switch
statement altogether.
Note the use of the break
keyword at the end of each case
in the
previous code. The break
statement is described later in this chapter,
but, in this example, it causes the interpreter to exit the body of the
switch
statement. The case
clauses in a switch
statement specify
only the starting point of the desired code. The individual cases
are
not independent blocks of code, and they do not have any implicit ending
point.
You must explicitly specify the end of each case
with a break
or related statement. In the absence of break
statements, a switch
statement begins executing code at the first statement after the matching case
label and continues executing statements until it reaches the end of the block. The control flow will fall through into the next case
label and continue executing, rather than exit the block.
On rare occasions, it is useful to write
code like this that falls through from one case
label to the next, but
99% of the time you should be careful to end every case
and default
section with a statement that causes the switch
statement to stop
executing. Normally you use a break
statement, but return
and
throw
also work.
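Here is a small sketch of one of those rare deliberate uses of fall-through. The scenario (a pipeline with three stages) and the names FallThroughDemo and remainingSteps() are invented for illustration only.

```java
class FallThroughDemo {
    // Collects the steps that still have to run when starting from the given stage.
    static String remainingSteps(int stage) {
        StringBuilder steps = new StringBuilder();
        switch (stage) {
            case 1:
                steps.append("compile ");
                // no break: deliberately falls through to case 2
            case 2:
                steps.append("test ");
                // no break: deliberately falls through to case 3
            case 3:
                steps.append("deploy");
                break;                      // stop before reaching default
            default:
                steps.append("unknown stage");
                break;
        }
        return steps.toString();
    }

    public static void main(String[] args) {
        System.out.println(remainingSteps(1));
        System.out.println(remainingSteps(3));
    }
}
```

Marking each intentional fall-through with a comment, as done here, helps readers distinguish it from a forgotten break.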
As a consequence of this default fall-through, a switch
statement can have more than one case
clause labeling the same statement. Consider the switch
statement in the following method:
    boolean parseYesOrNoResponse(char response) {
        switch(response) {
            case 'y':
            case 'Y': return true;
            case 'n':
            case 'N': return false;
            default:
                throw new IllegalArgumentException("Response must be Y or N");
        }
    }
The switch
statement and its case
labels have some important
restrictions. First, the expression associated with a switch
statement must have an appropriate type—either byte
, char
, short
,
int
(or their wrappers), or an enum type or a String
. The
floating-point and boolean
types are not supported, and neither is
long
, even though long
is an integer type. Second, the value
associated with each case
label must be a constant value or a constant
expression the compiler can evaluate. A case
label cannot contain a
runtime expression involving variables or method calls, for example.
Third, the case
label values must be within the range of the data type
used for the switch
expression. And finally, it is not legal to have
two or more case
labels with the same value or more than one default
label.
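The restrictions on case labels are easier to remember with an example in hand. In the sketch below, all class, method, and constant names are hypothetical; it shows a String selector, a named constant used as a label, a constant expression used as a label, and (commented out) a label the compiler would reject.

```java
class SwitchLabelsDemo {
    static final String YES = "yes";   // compile-time constant, so legal as a case label
    static final int BASE = 10;

    static boolean parse(String answer) {
        switch (answer) {              // String selector
            case YES:
            case "y":
                return true;
            case "no":
            case "n":
                return false;
            default:
                throw new IllegalArgumentException("Unrecognized answer: " + answer);
        }
    }

    static String bucket(int n) {
        switch (n) {
            case BASE:                 // named constant
                return "ten";
            case BASE * 2:             // constant expression, evaluated by the compiler
                return "twenty";
            // case n + 1:             // does not compile: depends on a runtime value
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("y") + " " + bucket(20));
    }
}
```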
The while
statement is a basic statement that allows Java to perform
repetitive actions—or, to put it another way, it is one of Java’s
primary looping constructs. It has the following syntax:
    while (expression)
        statement
The while
statement works by first evaluating the expression
,
which must result in a boolean
or Boolean
value. If the value is
false
, the interpreter skips the statement
associated with the
loop and moves to the next statement in the program. If it is true
,
however, the statement
that forms the body of the loop is executed,
and the expression
is reevaluated. Again, if the value of
expression
is false
, the interpreter moves on to the next
statement in the program; otherwise, it executes the statement
again. This cycle continues while the expression
remains true
(i.e., until it evaluates to false
), at which point the while
statement ends, and the interpreter moves on to the next statement. You
can create an infinite loop with the syntax while(true)
.
Here is an example while
loop that prints the numbers 0 to 9:
    int count = 0;
    while (count < 10) {
        System.out.println(count);
        count++;
    }
As you can see, the variable count
starts off at 0 in this example and
is incremented each time the body of the loop runs. Once the loop has
executed 10 times, the expression becomes false
(i.e., count
is no
longer less than 10), the while
statement finishes, and the Java
interpreter can move to the next statement in the program. Most loops
have a counter variable like count
. The variable names i
, j
, and
k
are commonly used as loop counters, although you should use more
descriptive names if it makes your code easier to understand.
A do
loop is much like a while
loop, except that the loop
expression is tested at the bottom of the loop rather than at the top.
This means that the body of the loop is always executed at least once.
The syntax is:
    do
        statement
    while (expression);
Notice a couple of differences between the do
loop and the more
ordinary while
loop. First, the do
loop requires both the do
keyword to mark the beginning of the loop and the while
keyword to
mark the end and introduce the loop condition. Also, unlike the while
loop, the do
loop is terminated with a semicolon. This is because the
do
loop ends with the loop condition rather than simply ending with a
curly brace that marks the end of the loop body. The following do
loop
prints the same output as the while
loop just discussed:
    int count = 0;
    do {
        System.out.println(count);
        count++;
    } while(count < 10);
The do
loop is much less commonly used than its while
cousin
because, in practice, it is unusual to encounter a situation where you
are sure you always want a loop to execute at least once.
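The practical difference between the two loops shows up only when the condition is false on entry. The sketch below counts how many times each loop body runs for a given starting value; the class and method names are hypothetical.

```java
class DoWhileDemo {
    // Number of times a while loop body runs when counting from start up to 3.
    static int whileRuns(int start) {
        int runs = 0;
        int n = start;
        while (n < 3) {        // tested before the body
            runs++;
            n++;
        }
        return runs;
    }

    // Number of times a do loop body runs for the same condition.
    static int doRuns(int start) {
        int runs = 0;
        int n = start;
        do {                   // body runs first
            runs++;
            n++;
        } while (n < 3);       // tested after the body
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(whileRuns(5) + " " + doRuns(5));
    }
}
```

Starting from 0, both loops run three times; starting from 5, the while loop runs zero times and the do loop runs once.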
The for
statement provides a looping construct that is often more
convenient than the while
and do
loops. The for
statement takes
advantage of a common looping pattern. Most loops have a counter, or
state variable of some kind, that is initialized before the loop starts,
tested to determine whether to execute the loop body, and then
incremented or updated somehow at the end of the loop body before the
test expression is evaluated again. The initialize
, test
, and update
steps are the three crucial manipulations of a loop variable, and the
for
statement makes these three steps an explicit part of the loop
syntax:
    for(initialize; test; update) {
        statement
    }
This for
loop is basically equivalent to the following while
loop:
    initialize;
    while (test) {
        statement;
        update;
    }
Placing the initialize
, test
, and update
expressions at the top of a for
loop makes it especially easy to understand what
the loop is doing, and it prevents mistakes such as forgetting to
initialize or update the loop variable. The interpreter discards the
values of the initialize
and update
expressions, so to be
useful, these expressions must have side effects. initialize
is
typically an assignment expression, while update
is usually an
increment, decrement, or some other assignment.
The following for
loop prints the numbers 0 to 9, just as the previous
while
and do
loops have done:
    int count;
    for(count = 0; count < 10; count++)
        System.out.println(count);
Notice how this syntax places all the important information about the
loop variable on a single line, making it very clear how the loop
executes. Placing the update
expression in the for
statement itself
also simplifies the body of the loop to a single statement; we don’t
even need to use curly braces to produce a statement block.
The for
loop supports some additional syntax that makes it even more
convenient to use. Because many loops use their loop variables only
within the loop, the for
loop allows the initialize
expression to
be a full variable declaration, so that the variable is scoped to the
body of the loop and is not visible outside of it. For example:
    for(int count = 0; count < 10; count++)
        System.out.println(count);
Furthermore, the for
loop syntax does not restrict you to writing
loops that use only a single variable. Both the initialize
and
update
expressions of a for
loop can use a comma to separate
multiple initializations and update expressions. For example:
    for(int i = 0, j = 10; i < 10; i++, j--)
        sum += i * j;
Even though all the examples so far have counted numbers, for
loops
are not restricted to loops that count numbers. For example, you might
use a for
loop to iterate through the elements of a linked list:
    for(Node n = listHead; n != null; n = n.nextNode())
        process(n);
The initialize
, test
, and update
expressions of a for
loop
are all optional; only the semicolons that separate the expressions are
required. If the test
expression is omitted, it is assumed to be
true
. Thus, you can write an infinite loop as for(;;)
.
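Two of the for loop variations described in this section can be sketched as follows: a loop with all three parts omitted that ends with break, and a loop with two variables that move toward each other. The class and method names are hypothetical.

```java
class ForPartsDemo {
    // All three parts omitted: the loop runs until break is reached.
    // Assumes limit is below 2^30 so that p cannot overflow.
    static int firstPowerOfTwoAbove(int limit) {
        int p = 1;
        for (;;) {
            if (p > limit) break;
            p *= 2;
        }
        return p;
    }

    // Two loop variables, initialized and updated with comma-separated expressions.
    static boolean isPalindrome(String s) {
        for (int i = 0, j = s.length() - 1; i < j; i++, j--) {
            if (s.charAt(i) != s.charAt(j)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(firstPowerOfTwoAbove(100));
        System.out.println(isPalindrome("level"));
    }
}
```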
Java’s for
loop works well for primitive types, but it is needlessly
clunky for handling collections of objects. Instead, an alternative
syntax known as a foreach loop is used for handling collections of
objects that need to be looped over.
The foreach loop uses the keyword for
followed by an opening
parenthesis, a variable declaration (without initializer), a colon, an
expression, a closing parenthesis, and finally the statement (or block)
that forms the body of the loop:
    for( declaration : expression )
        statement
Despite its name, the foreach loop does not have a keyword
foreach
—instead, it is common to read the colon as “in”—as in “foreach
name in studentNames.”
For the while
, do
, and for
loops, we’ve shown an example that
prints 10 numbers. The foreach loop can do this too, but it needs a collection to iterate over. In order to loop 10 times (to print out 10
numbers), we need an array or other collection with 10 elements. Here’s
code we can use:
    // These are the numbers we want to print
    int[] primes = new int[] { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 };
    // This is the loop that prints them
    for(int n : primes)
        System.out.println(n);
The foreach is different from the while
, for
, or do
loops, because it
hides the loop counter or Iterator
from you. This is a very powerful
idea, as we’ll see when we discuss lambda expressions, but there are
some algorithms that cannot be expressed very naturally with a foreach
loop.
For example, suppose you want to print the elements of an array as a
comma-separated list. To do this, you need to print a comma after every
element of the array except the last, or equivalently, before every
element of the array except the first. With a traditional for
loop,
the code might look like this:
    for(int i = 0; i < words.length; i++) {
        if (i > 0) System.out.print(", ");
        System.out.print(words[i]);
    }
This is a very straightforward task, but you simply cannot do it with foreach without keeping track of additional state. The problem is that the foreach loop doesn’t give you a loop counter or any other way to tell if you’re on the first iteration, the last iteration, or somewhere in between.
A similar issue exists when you’re using foreach to iterate through the
elements of a collection. Just as a foreach loop over an array has no
way to obtain the array index of the current element, a foreach loop
over a collection has no way to obtain the Iterator
object that is
being used to itemize the elements of the collection.
Here are some other things you cannot do with a foreach-style loop:
Iterate backward through the elements of an array or List.
Use a single loop counter to access the same-numbered elements of two distinct arrays.
Iterate through the elements of a List using calls to its get() method rather than calls to its iterator.
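The limitations above can be worked around, at the cost of extra state or a return to the indexed loop. The sketch below builds the comma-separated list with a foreach loop plus a flag, and walks a List backward with get(). JoinDemo and its methods are hypothetical names; the methods return strings rather than printing so that the results are easy to check.

```java
import java.util.List;

class JoinDemo {
    // foreach version: an extra flag records whether this is the first element.
    static String joinWithForeach(String[] words) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (String w : words) {
            if (!first) sb.append(", ");
            sb.append(w);
            first = false;
        }
        return sb.toString();
    }

    // Indexed version: backward iteration needs the loop counter that foreach hides.
    static String reversed(List<String> words) {
        StringBuilder sb = new StringBuilder();
        for (int i = words.size() - 1; i >= 0; i--) {
            sb.append(words.get(i));
            if (i > 0) sb.append(", ");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] words = { "a", "b", "c" };
        System.out.println(joinWithForeach(words));
        System.out.println(reversed(List.of(words)));
    }
}
```

For the specific task of joining strings, the standard library method String.join(", ", words) does the same job without an explicit loop.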
A break
statement causes the Java interpreter to skip immediately to
the end of a containing statement. We have already seen the break
statement used with the switch
statement. The break
statement is
most often written as simply the keyword break
followed by a
semicolon:
    break;
When used in this form, it causes the Java interpreter to immediately
exit the innermost containing while
, do
, for
, or switch
statement. For example:
    for(int i = 0; i < data.length; i++) {
        if (data[i] == target) {  // When we find what we're looking for,
            index = i;            // remember where we found it
            break;                // and stop looking!
        }
    }   // The Java interpreter goes here after executing break
The break
statement can also be followed by the name of a containing
labeled statement. When used in this form, break
causes the Java
interpreter to immediately exit the named block, which can be any kind
of statement, not just a loop or switch
. For example:
    TESTFORNULL: if (data != null) {
        for(int row = 0; row < numrows; row++) {
            for(int col = 0; col < numcols; col++) {
                if (data[row][col] == null)
                    break TESTFORNULL;           // treat the array as undefined.
            }
        }
    }  // Java interpreter goes here after executing break TESTFORNULL
While a break
statement exits a loop, a continue
statement quits
the current iteration of a loop and starts the next one. continue
, in
both its unlabeled and labeled forms, can be used only within a while
,
do
, or for
loop. When used without a label, continue
causes the
innermost loop to start a new iteration. When used with a label that is
the name of a containing loop, it causes the named loop to start a new
iteration. For example:
    for(int i = 0; i < data.length; i++) {  // Loop through data.
        if (data[i] == -1)                  // If a data value is missing,
            continue;                       // skip to the next iteration.
        process(data[i]);                   // Process the data value.
    }
while
, do
, and for
loops differ slightly in the way that
continue
starts a new iteration:
With a while
loop, the Java interpreter simply returns to the top of
the loop, tests the loop condition again, and, if it evaluates to
true
, executes the body of the loop again.
With a do
loop, the interpreter jumps to the bottom of the loop,
where it tests the loop condition to decide whether to perform another
iteration of the loop.
With a for
loop, the interpreter jumps to the top of the loop,
where it first evaluates the update
expression and then evaluates
the test
expression to decide whether to loop again. As you can see
from the examples, the behavior of a for
loop with a continue
statement is different from the behavior of the “basically equivalent”
while
loop presented earlier; update
gets evaluated in the for
loop but not in the equivalent while
loop.
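The difference between for and while under continue can be sketched with two versions of the same summing loop. The names below are hypothetical, and -1 is treated as a "missing value" marker as in the earlier example.

```java
class ContinueDemo {
    // for loop: continue jumps to the update expression, so i++ still runs.
    static int sumWithFor(int[] data) {
        int sum = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == -1) continue;
            sum += data[i];
        }
        return sum;
    }

    // while loop: continue jumps straight to the test, so the update
    // has to be written out before continue or the loop never advances.
    static int sumWithWhile(int[] data) {
        int sum = 0;
        int i = 0;
        while (i < data.length) {
            if (data[i] == -1) {
                i++;
                continue;
            }
            sum += data[i];
            i++;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = { 3, -1, 4, -1, 5 };
        System.out.println(sumWithFor(data) + " " + sumWithWhile(data));
    }
}
```

Removing the i++ that precedes continue in the while version would leave i stuck on the first -1 and turn the loop into an infinite one.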
A return
statement tells the Java interpreter to stop executing the
current method. If the method is declared to return a value, the
return
statement must be followed by an expression. The value of the
expression becomes the return value of the method. For example, the
following method computes and returns the square of a number:
    double square(double x) {   // A method to compute x squared
        return x * x;           // Compute and return a value
    }
Some methods are declared void
to indicate that they do not return
any value. The Java interpreter runs methods like this by executing
their statements one by one until it reaches the end of the method.
After executing the last statement, the interpreter returns implicitly.
Sometimes, however, a void
method has to return explicitly before
reaching the last statement. In this case, it can use the return
statement by itself, without any expression. For example, the following
method prints, but does not return, the square root of its argument. If
the argument is a negative number, it returns without printing anything:
    // A method to print square root of x
    void printSquareRoot(double x) {
        if (x < 0) return;                 // If x is negative, return
        System.out.println(Math.sqrt(x));  // Print the square root of x
    }                                      // Method end: return implicitly
Java has always provided support for multithreaded programming. We cover this in some detail later on (especially in “Java’s Support for Concurrency”); however, be aware that concurrency is difficult to get right, and has a number of subtleties.
In particular, when working with multiple threads, you must often take
care to prevent multiple threads from modifying an object simultaneously
in a way that might corrupt the object’s state. Java provides the
synchronized
statement to help the programmer prevent corruption. The
syntax is:
    synchronized ( expression ) {
        statements
    }
expression
is an expression that must evaluate to an object (including
arrays). statements
constitute the code of the section that could
cause damage and must be enclosed in curly braces.
In Java, the protection of object state (i.e., data) is the primary concern of the concurrency primitives. This is unlike some other languages, where the exclusion of threads from critical sections (i.e., code) is the main focus.
Before executing the statement block, the Java interpreter first obtains
an exclusive lock on the object or array specified by expression
. It
holds the lock until it is finished running the block, then releases it.
While a thread holds the lock on an object, no other thread can obtain
that lock.
As well as the block form, synchronized
can also be used as a method modifier in Java.
When applied to a method, the keyword indicates that the entire method is treated as synchronized
.
For a synchronized
instance method, Java obtains an exclusive lock on the class instance. (Class and instance methods are discussed in Chapter 3.)
It can be thought of as a synchronized (this) { ... }
block that covers the entire method.
A static synchronized
method (a class method) causes Java to obtain an exclusive lock on the class (technically the class object corresponding to the type) before executing the method.
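A minimal sketch of both forms, the method modifier and the block, protecting the same piece of state. The SyncCounter class is hypothetical, and real concurrent code often has further concerns that this example does not try to address.

```java
class SyncCounter {
    private int count = 0;

    // Modifier form: the lock on this instance is held for the whole method.
    synchronized void increment() {
        count++;
    }

    // Block form: the same lock, written out explicitly.
    void incrementWithBlock() {
        synchronized (this) {
            count++;
        }
    }

    synchronized int get() {
        return count;
    }

    // Runs two threads that each call both increment methods perThread times.
    static int runTwoThreads(int perThread) {
        SyncCounter counter = new SyncCounter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
                counter.incrementWithBlock();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();                       // wait for both threads to finish
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(10_000));
    }
}
```

Because every update of count happens while the lock is held, the final value is four times perThread on every run. Without synchronized, the two threads could interleave inside count++ and lose updates.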
An exception is a signal that indicates some sort of exceptional condition or error has occurred. To throw an exception is to signal an exceptional condition. To catch an exception is to handle it—to take whatever actions are necessary to recover from it.
In Java, the throw
statement is used to throw an exception:
    throw expression;
The expression
must evaluate to an exception object that describes
the exception or error that has occurred. We’ll talk more about types
of exceptions shortly; for now, all you need to know is that an exception:
Is represented by an object
Has a type that is a subclass of Throwable (in application code, usually a subclass of Exception)
Has a slightly specialized role in Java’s syntax
Can be of two different types: checked or unchecked
Here is some example code that throws an exception:
    public static double factorial(int x) {
        if (x < 0)
            throw new IllegalArgumentException("x must be >= 0");
        double fact;
        for(fact=1.0; x > 1; fact *= x, x--)
            /* empty */ ;    // Note use of the empty statement
        return fact;
    }
When the Java interpreter executes a throw statement, it immediately stops normal program execution and starts looking for an exception handler that can catch, or handle, the exception. Exception handlers are written with the try/catch/finally statement, which is described in the next section. The Java interpreter first looks at the enclosing block of code to see if it has an associated exception handler. If so, it exits that block of code and starts running the exception-handling code associated with the block. After running the exception handler, the interpreter continues execution at the statement immediately following the handler code.
If the enclosing block of code does not have an appropriate exception handler, the interpreter checks the next higher enclosing block of code in the method. This continues until a handler is found. If the method does not contain an exception handler that can handle the exception thrown by the throw statement, the interpreter stops running the current method and returns to the caller. Now the interpreter starts looking for an exception handler in the blocks of code of the calling method. In this way, exceptions propagate up through the lexical structure of Java methods, up the call stack of the Java interpreter. If the exception is never caught, it propagates all the way up to the main() method of the program. If it is not handled in that method, the Java interpreter prints an error message, prints a stack trace to indicate where the exception occurred, and then exits.
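The propagation just described can be sketched with a small, hypothetical PropagationDemo class (all names here are ours, chosen for illustration). The exception is thrown in parsePositive(), passes through doubled(), which has no handler, and is finally caught two frames up in describe():

```java
class PropagationDemo {
    // Throws an exception but has no handler of its own.
    static int parsePositive(String s) {
        int n = Integer.parseInt(s);   // may throw NumberFormatException
        if (n <= 0)
            throw new IllegalArgumentException("not positive: " + n);
        return n;
    }

    // No handler here either, so any exception passes straight through.
    static int doubled(String s) {
        return 2 * parsePositive(s);
    }

    // The handler lives two frames above the throw statement.
    static String describe(String s) {
        try {
            return "ok: " + doubled(s);
        } catch (IllegalArgumentException e) {
            return "rejected: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("21"));   // prints "ok: 42"
        System.out.println(describe("-5"));   // prints "rejected: not positive: -5"
        // NumberFormatException is a subclass of IllegalArgumentException,
        // so the same handler catches it as well.
        System.out.println(describe("abc"));
    }
}
```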
Java has two slightly different exception-handling mechanisms. The classic form is the try/catch/finally statement. The try clause of this statement establishes a block of code for exception handling. This try block is followed by zero or more catch clauses, each of which is a block of statements designed to handle specific exceptions. Each catch block can handle more than one different exception: to indicate that a catch block should handle multiple exceptions, we use the | symbol to separate the different exceptions a catch block should handle. The catch clauses are followed by an optional finally block that contains cleanup code guaranteed to be executed regardless of what happens in the try block.
The following code illustrates the syntax and purpose of the try/catch/finally statement:
    try {
        // Normally this code runs from the top of the block to the bottom
        // without problems. But it can sometimes throw an exception,
        // either directly with a throw statement or indirectly by calling
        // a method that throws an exception.
    }
    catch (SomeException e1) {
        // This block contains statements that handle an exception object
        // of type SomeException or a subclass of that type. Statements in
        // this block can refer to that exception object by the name e1.
    }
    catch (AnotherException | YetAnotherException e2) {
        // This block contains statements that handle an exception of
        // type AnotherException or YetAnotherException, or a subclass of
        // either of those types. Statements in this block refer to the
        // exception object they receive by the name e2.
    }
    finally {
        // This block contains statements that are always executed
        // after we leave the try clause, regardless of whether we leave it:
        // 1) normally, after reaching the bottom of the block;
        // 2) because of a break, continue, or return statement;
        // 3) with an exception that is handled by a catch clause above;
        // 4) with an uncaught exception that has not been handled.
        // If the try clause calls System.exit(), however, the interpreter
        // exits before the finally clause can be run.
    }
The try clause simply establishes a block of code that either has its exceptions handled or needs special cleanup code to be run when it terminates for any reason. The try clause by itself doesn’t do anything interesting; it is the catch and finally clauses that do the exception-handling and cleanup operations.
A try block can be followed by zero or more catch clauses that specify code to handle various types of exceptions. Each catch clause is declared with a single argument that specifies the types of exceptions the clause can handle (possibly using the special | syntax to indicate that the catch block can handle more than one type of exception) and also provides a name the clause can use to refer to the exception object it is currently handling. Any type that a catch block wishes to handle must be some subclass of Throwable.
When an exception is thrown, the Java interpreter looks for a catch clause with an argument that matches the same type as the exception object or a superclass of that type. The interpreter invokes the first such catch clause it finds. The code within a catch block should take whatever action is necessary to cope with the exceptional condition. If the exception is a java.io.FileNotFoundException, for example, you might handle it by asking the user to check the spelling of the filename and try again.
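To make the "first matching clause" rule concrete, here is a small sketch. The CatchOrderDemo class and its classify() method are hypothetical names of our own; the exception classes are the standard ones. Note that the more specific FileNotFoundException clause must come before the clause for its superclass IOException:

```java
import java.io.FileNotFoundException;
import java.io.IOException;

class CatchOrderDemo {
    // Throws the given exception and reports which catch clause handled it.
    static String classify(Exception toThrow) {
        try {
            throw toThrow;
        } catch (FileNotFoundException e) {        // most specific type first
            return "file not found";
        } catch (IOException e) {                  // superclass of the above
            return "other I/O problem";
        } catch (IllegalStateException | IllegalArgumentException e) {
            return "bad state or argument";        // multi-catch with |
        } catch (Exception e) {                    // catches everything else
            return "something else";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(new FileNotFoundException("x"))); // file not found
        System.out.println(classify(new IOException("x")));           // other I/O problem
        System.out.println(classify(new NumberFormatException("x"))); // bad state or argument
        System.out.println(classify(new Exception("x")));             // something else
    }
}
```

NumberFormatException lands in the multi-catch clause because it is a subclass of IllegalArgumentException.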
It is not required to have a catch clause for every possible exception; in some cases, the correct response is to allow the exception to propagate up and be caught by the invoking method. In other cases, such as a programming error signaled by NullPointerException, the correct response is probably not to catch the exception at all, but allow it to propagate and have the Java interpreter exit with a stack trace and an error message.
The finally clause is generally used to clean up after the code in the try clause (e.g., close files and shut down network connections). The finally clause is useful because it is guaranteed to be executed if any portion of the try block is executed, regardless of how the code in the try block completes. In fact, the only way a try clause can exit without allowing the finally clause to be executed is by invoking the System.exit() method, which causes the Java interpreter to stop running.
In the normal case, control reaches the end of the try block and then proceeds to the finally block, which performs any necessary cleanup. If control leaves the try block because of a return, continue, or break statement, the finally block is executed before control transfers to its new destination.
If an exception occurs in the try block and there is an associated catch block to handle the exception, control transfers first to the catch block and then to the finally block. If there is no local catch block to handle the exception, control transfers first to the finally block, and then propagates up to the nearest containing catch clause that can handle the exception.
If a finally block itself transfers control with a return, continue, break, or throw statement or by calling a method that throws an exception, the pending control transfer is abandoned, and this new transfer is processed. For example, if a finally clause throws an exception, that exception replaces any exception that was in the process of being thrown. If a finally clause issues a return statement, the method returns normally, even if an exception has been thrown and has not yet been handled.
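These rules are easier to follow with a concrete case in front of you. The following sketch uses a hypothetical FinallyDemo class (the names are ours) to show the order in which the clauses run, and the two ways a finally clause can override a pending exception. The last two methods are written this way only to demonstrate the rule; they are not a recommended style, which is why the compiler lint warning is suppressed:

```java
class FinallyDemo {
    // Records the order in which the clauses run.
    static String order(boolean fail) {
        StringBuilder trace = new StringBuilder();
        try {
            trace.append("try ");
            if (fail) throw new IllegalStateException("boom");
        } catch (IllegalStateException e) {
            trace.append("catch ");
        } finally {
            trace.append("finally");
        }
        return trace.toString();
    }

    // A return in finally abandons the pending exception.
    @SuppressWarnings("finally")
    static int swallow() {
        try {
            throw new IllegalStateException("never seen by the caller");
        } finally {
            return -1;
        }
    }

    // An exception thrown in finally replaces the one thrown in try.
    @SuppressWarnings("finally")
    static String replaced() {
        try {
            try {
                throw new IllegalStateException("first");
            } finally {
                throw new IllegalArgumentException("second");
            }
        } catch (RuntimeException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(order(false));  // prints "try finally"
        System.out.println(order(true));   // prints "try catch finally"
        System.out.println(swallow());     // prints -1
        System.out.println(replaced());    // prints "second"
    }
}
```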
try and finally can be used together without exceptions or any catch clauses. In this case, the finally block is simply cleanup code that is guaranteed to be executed, regardless of any break, continue, or return statements within the try clause.
The standard form of a try block is very general, but there is a common set of circumstances that require developers to be very careful when writing catch and finally blocks.
These circumstances arise when working with resources that need to be cleaned up or closed once they are no longer needed.
Java provides a very useful mechanism for automatically closing resources that require cleanup.
This is known as try-with-resources, or TWR. We discuss TWR in detail in “Classic Java I/O”, but for completeness, let’s introduce the syntax now.
The following example shows how to open a file using the FileInputStream class (which results in an object that will require cleanup):
    try (InputStream is = new FileInputStream("/Users/ben/details.txt")) {
        // ... process the file
    }
This new form of try takes parameters that are all objects that require cleanup. These objects are scoped to this try block, and are then cleaned up automatically no matter how this block is exited. The developer does not need to write any catch or finally blocks: the Java compiler automatically inserts correct cleanup code.
All new code that deals with resources should be written in the TWR style. It is considerably less error prone than manually writing catch blocks, and does not suffer from the problems that plague techniques such as finalization (see “Finalization” for details).
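To see the automatic cleanup in action without touching the filesystem, here is a sketch built around a hypothetical Resource class of our own that implements the standard AutoCloseable interface and simply records what happens to it. It assumes nothing beyond the language rules: resources are closed in the reverse of the order in which they were opened, and they are closed before any catch clause runs:

```java
import java.util.ArrayList;
import java.util.List;

class TwrDemo {
    // Hypothetical resource that logs when it is opened and closed.
    static class Resource implements AutoCloseable {
        private final String name;
        private final List<String> events;

        Resource(String name, List<String> events) {
            this.name = name;
            this.events = events;
            events.add("open " + name);
        }

        @Override
        public void close() {   // declares no checked exception
            events.add("close " + name);
        }
    }

    // Opens two resources, optionally fails in the body, and returns the log.
    static List<String> run(boolean fail) {
        List<String> events = new ArrayList<>();
        try (Resource a = new Resource("a", events);
             Resource b = new Resource("b", events)) {
            events.add("body");
            if (fail) throw new IllegalStateException("boom");
        } catch (IllegalStateException e) {
            events.add("caught");
        }
        return events;
    }

    public static void main(String[] args) {
        // prints [open a, open b, body, close b, close a, caught]
        System.out.println(run(true));
    }
}
```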
An assert statement provides a way to verify design assumptions in Java code. An assertion consists of the assert keyword followed by a boolean expression that the programmer believes should always evaluate to true. By default, assertions are not enabled, and the assert statement does not actually do anything. It is possible to enable assertions as a debugging tool, however; when this is done, the assert statement evaluates the expression. If it is indeed true, assert does nothing. On the other hand, if the expression evaluates to false, the assertion fails, and the assert statement throws a java.lang.AssertionError.
Outside of the core JDK libraries, the assert statement is extremely rarely used. It turns out to be too inflexible for testing most applications and is not often used by ordinary developers. Instead, developers use ordinary testing libraries, such as JUnit.
The assert statement may include an optional second expression, separated from the first by a colon. When assertions are enabled and the first expression evaluates to false, the value of the second expression is taken as an error code or error message and is passed to the AssertionError() constructor. The full syntax of the statement is:
    assert assertion;
or:
    assert assertion : errorcode;
To use assertions effectively, you must also be aware of a couple of fine points. First, remember that your programs will normally run with assertions disabled and only sometimes with assertions enabled. This means that you should be careful not to write assertion expressions that contain side effects.
Second, you should never throw AssertionError from your own code, as it may have unexpected results in future versions of the platform.
If an AssertionError is thrown, it indicates that one of the programmer’s assumptions has not held up. This means that the code is being used outside of the parameters for which it was designed, and it cannot be expected to work correctly. In short, there is no plausible way to recover from an AssertionError, and you should not attempt to catch it (unless you catch it at the top level simply so that you can display the error in a more user-friendly fashion).
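As a small sketch of these points, the hypothetical AssertDemo class below (our own names throughout) uses the two-expression form of assert inside a private helper. The assertion expression has no side effects, so the program behaves identically whether assertions are enabled or not; the check that callers actually depend on is done with an ordinary exception:

```java
class AssertDemo {
    // Internal helper: code in this class is expected to pass a non-empty array.
    private static double mean(int[] values) {
        // Side-effect-free assertion with an error message as second expression.
        assert values.length > 0 : "mean() called with an empty array";
        long sum = 0;
        for (int v : values) sum += v;
        return (double) sum / values.length;
    }

    // Argument validation uses an ordinary exception, so it still
    // works when assertions are disabled (the default).
    static double average(int... values) {
        if (values.length == 0)
            throw new IllegalArgumentException("no values");
        return mean(values);
    }

    public static void main(String[] args) {
        System.out.println(average(2, 4, 9)); // prints 5.0
    }
}
```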
For efficiency, it does not make sense to test assertions each time code is executed, because assert statements encode assumptions that should always be true. Thus, by default, assertions are disabled, and assert statements have no effect. The assertion code remains compiled in the class files, however, so it can always be enabled for diagnostic or debugging purposes. You can enable assertions, either across the board or selectively, with command-line arguments to the Java interpreter.
To enable assertions in all classes except for system classes, use the -ea argument. To enable assertions in system classes, use -esa. To enable assertions within a specific class, use -ea followed by a colon and the class name:
    java -ea:com.example.sorters.MergeSort com.example.sorters.Test
To enable assertions for all classes in a package and in all of its subpackages, follow the -ea argument with a colon, the package name, and three dots:
    java -ea:com.example.sorters... com.example.sorters.Test
You can disable assertions in the same way, using the -da argument. For example, to enable assertions throughout a package and then disable them in a specific class or subpackage, use:
    java -ea:com.example.sorters... -da:com.example.sorters.QuickSort
    java -ea:com.example.sorters... -da:com.example.sorters.plugins...
Finally, it is possible to control whether assertions are enabled at classloading time. If you use a custom classloader (see Chapter 11 for details on custom classloading) in your program and want to turn on assertions, you may be interested in the assertion-control methods of java.lang.ClassLoader, such as setDefaultAssertionStatus(), setPackageAssertionStatus(), and setClassAssertionStatus().
A method is a named sequence of Java statements that can be invoked by other Java code. When a method is invoked, it is passed zero or more values known as arguments. The method performs some computations and, optionally, returns a value. As described earlier in “Expressions and Operators”, a method invocation is an expression that is evaluated by the Java interpreter. Because method invocations can have side effects, however, they can also be used as expression statements. This section does not discuss method invocation, but instead describes how to define methods.
You already know how to define the body of a method; it is simply an arbitrary sequence of statements enclosed within curly braces. What is more interesting about a method is its signature. The signature specifies the following:
The name of the method
The number, order, type, and name of the parameters used by the method
The type of the value returned by the method
The checked exceptions that the method can throw (the signature may also list unchecked exceptions, but these are not required)
Various method modifiers that provide additional information about the method
A method signature defines everything you need to know about a method before calling it. It is the method specification and defines the API for the method. In order to use the Java platform’s online API reference, you need to know how to read a method signature. And, in order to write Java programs, you need to know how to define your own methods, each of which begins with a method signature.
A method signature looks like this:
    modifiers type name (paramlist) [ throws exceptions ]
The signature (the method specification) is followed by the method body (the method implementation), which is simply a sequence of Java statements enclosed in curly braces. If the method is abstract (see Chapter 3), the implementation is omitted, and the method body is replaced with a single semicolon.
The signature of a method may also include type variable declarations—such methods are known as generic methods. Generic methods and type variables are discussed in Chapter 4.
Here are some example method definitions, which begin with the signature and are followed by the method body:
    // This method is passed an array of strings and has no return value.
    // All Java programs have an entry point with this name and signature.
    public static void main(String[] args) {
        if (args.length > 0) System.out.println("Hello " + args[0]);
        else System.out.println("Hello world");
    }

    // This method is passed two double arguments and returns a double.
    static double distanceFromOrigin(double x, double y) {
        return Math.sqrt(x*x + y*y);
    }

    // This method is abstract which means it has no body.
    // Note that it may throw exceptions when invoked.
    protected abstract String readText(File f, String encoding)
        throws FileNotFoundException, UnsupportedEncodingException;
modifiers is zero or more special modifier keywords, separated from each other by spaces. A method might be declared with the public and static modifiers, for example. The allowed modifiers and their meanings are described in the next section.
The type in a method signature specifies the return type of the method. If the method does not return a value, type must be void. If a method is declared with a non-void return type, it must include a return statement that returns a value of (or convertible to) the declared type.
A constructor is a block of code, similar to a method, that is used to initialize newly created objects. As we’ll see in Chapter 3, constructors are defined in a very similar way to methods, except that their signatures do not include this type specification.
The name of a method follows the specification of its modifiers and type. Method names, like variable names, are Java identifiers and, like all Java identifiers, may contain letters in any language represented by the Unicode character set. It is legal, and often quite useful, to define more than one method with the same name, as long as each version of the method has a different parameter list. Defining multiple methods with the same name is called method overloading.
Unlike some other languages, Java does not have anonymous methods. Instead, Java 8 introduces lambda expressions, which are similar to anonymous methods, but which the Java runtime automatically converts to a suitable named method—see “Lambda Expressions” for more details.
For example, the System.out.println() method we’ve seen already is an overloaded method. One method by this name prints a string and other methods by the same name print the values of the various primitive types. The Java compiler decides which method to call based on the type of the argument passed to the method.
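Overloading is easy to try out for yourself. The sketch below defines a hypothetical describe() method four times (the class and method names are ours); the compiler picks a version from the number and static types of the arguments, applying a widening conversion when there is no exact match:

```java
class OverloadDemo {
    static String describe(int n)        { return "int " + n; }
    static String describe(double d)     { return "double " + d; }
    static String describe(String s)     { return "String " + s; }
    static String describe(int a, int b) { return "two ints " + a + "," + b; }

    public static void main(String[] args) {
        System.out.println(describe(3));     // prints "int 3"
        System.out.println(describe(3.0));   // prints "double 3.0"
        System.out.println(describe("3"));   // prints "String 3"
        System.out.println(describe(3, 4));  // prints "two ints 3,4"
        // There is no describe(long), so the long argument is
        // widened to double and the double version is chosen.
        System.out.println(describe(3L));    // prints "double 3.0"
    }
}
```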
When you are defining a method, the name of the method is always followed by the method’s parameter list, which must be enclosed in parentheses. The parameter list defines zero or more arguments that are passed to the method. The parameter specifications, if there are any, each consist of a type and a name and are separated from each other by commas (if there are multiple parameters). When a method is invoked, the argument values it is passed must match the number, type, and order of the parameters specified in this method signature line. The values passed need not have exactly the same type as specified in the signature, but they must be convertible to those types without casting.
When a Java method expects no arguments, its parameter list is simply (), not (void). Java does not regard void as a type; C and C++ programmers in particular should pay heed.
Java allows the programmer to define and invoke methods that accept a variable number of arguments, using a syntax known colloquially as varargs. Varargs are covered in detail later in this chapter.
The final part of a method signature is the throws clause, which is used to list the checked exceptions that a method can throw. Checked exceptions are a category of exception classes that must be listed in the throws clauses of methods that can throw them.
If a method uses the throw statement to throw a checked exception, or if it calls some other method that throws a checked exception, and does not catch or handle that exception, the method must declare that it can throw that exception.
If a method can throw one or more checked exceptions, it specifies this by placing the throws keyword after the argument list and following it by the name of the exception class or classes it can throw.
If a method does not throw any exceptions, it does not use the throws keyword. If a method throws more than one type of exception, separate the names of the exception classes from each other with commas. More on this in a bit.
The modifiers of a method consist of zero or more modifier keywords such as public, static, or abstract. Here is a list of allowed modifiers and their meanings:
abstract
An abstract method is a specification without an implementation. The curly braces and Java statements that would normally comprise the body of the method are replaced with a single semicolon. A class that includes an abstract method must itself be declared abstract. Such a class is incomplete and cannot be instantiated (see Chapter 3).
final
A final method may not be overridden or hidden by a subclass, which makes it amenable to compiler optimizations that are not possible for regular methods. All private methods are implicitly final, as are all methods of any class that is declared final.
native
The native modifier specifies that the method implementation is written in some “native” language such as C and is provided externally to the Java program. Like abstract methods, native methods have no body: the curly braces are replaced with a semicolon.
public, protected, private
These access modifiers specify whether and where a method can be used outside of the class that defines it. These very important modifiers are explained in Chapter 3.
static
A method declared static is a class method associated with the class itself rather than with an instance of the class (we cover this in more detail in Chapter 3).
strictfp
The fp in this awkwardly named, rarely used modifier stands for “floating point.” Java normally takes advantage of any extended precision available to the runtime platform’s floating-point hardware. The use of this keyword forces Java to strictly obey the standard while running the strictfp method and only perform floating-point arithmetic using 32- or 64-bit floating-point formats, even if this makes the results less accurate.
synchronized
The synchronized modifier makes a method threadsafe. Before a thread can invoke a synchronized method, it must obtain a lock on the method’s class (for static methods) or on the relevant instance of the class (for non-static methods). This prevents two threads from executing the method at the same time.
The synchronized modifier is an implementation detail (because methods can make themselves threadsafe in other ways) and is not formally part of the method specification or API. Good documentation specifies explicitly whether a method is threadsafe; you should not rely on the presence or absence of the synchronized keyword when working with multithreaded programs.
Annotations are an interesting special case (see Chapter 4 for more on annotations)—they can be thought of as a halfway house between a method modifier and additional supplementary type information.
The Java exception-handling scheme distinguishes between two types of exceptions, known as checked and unchecked exceptions.
The distinction between checked and unchecked exceptions has to do with the circumstances under which the exceptions could be thrown. Checked exceptions arise in specific, well-defined circumstances, and very often are conditions from which the application may be able to partially or fully recover.
For example, consider some code that might find its configuration file in one of several possible directories. If we attempt to open the file from a directory it isn’t present in, then a FileNotFoundException will be thrown. In our example, we want to catch this exception and move on to try the next possible location for the file. In other words, although the file not being present is an exceptional condition, it is one from which we can recover, and it is an understood and anticipated failure.
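The configuration-file scenario might be sketched as follows. The ConfigLocator class, its openFirst() method, and the file and directory names used in main() are all hypothetical, chosen purely for illustration; the only library behavior relied on is that the FileInputStream constructor throws FileNotFoundException when the file cannot be opened:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.List;

class ConfigLocator {
    // Returns a stream for the first directory that contains fileName,
    // or null if none of the directories does. The caller closes the stream.
    static InputStream openFirst(List<String> directories, String fileName) {
        for (String dir : directories) {
            try {
                return new FileInputStream(new File(dir, fileName));
            } catch (FileNotFoundException e) {
                // Anticipated failure: move on to the next candidate.
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Example search path; these locations are assumptions.
        List<String> candidates =
            Arrays.asList(".", System.getProperty("user.home"), "/etc");
        // A null resource is permitted in try-with-resources and is skipped.
        try (InputStream in = openFirst(candidates, "app.properties")) {
            System.out.println(in == null ? "no configuration found"
                                          : "configuration opened");
        }
    }
}
```

Returning null for "not found" keeps the sketch short; real code might prefer to throw a single, more descriptive exception once every location has been tried.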
On the other hand, in the Java environment there is a set of failures that cannot easily be predicted or anticipated, due to such things as runtime conditions or abuse of library code. There is no good way to predict an OutOfMemoryError, for example, and any method that uses objects or arrays can throw a NullPointerException if it is passed an invalid null argument.
These are the unchecked exceptions—and practically any method can throw an unchecked exception at essentially any time. They are the Java environment’s version of Murphy’s law: “Anything that can go wrong, will go wrong.” Recovery from an unchecked exception is usually very difficult, if not impossible—simply due to their sheer unpredictability.
To figure out whether an exception is checked or unchecked, remember that exceptions are Throwable objects and that these fall into two main categories, specified by the Error and Exception subclasses. Any exception object that is an Error is unchecked. There is also a subclass of Exception called RuntimeException, and any subclass of RuntimeException is also an unchecked exception. All other exceptions are checked exceptions.
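The classification rule can be written down directly in code. This is only a sketch for illustration (the compiler makes this decision statically, from declared types, and the CheckedOrNot class and isChecked() method are our own hypothetical names), but it mirrors the rule exactly:

```java
import java.io.IOException;

class CheckedOrNot {
    // A Throwable is checked unless it is an Error or a RuntimeException
    // (or a subclass of either).
    static boolean isChecked(Throwable t) {
        return !(t instanceof RuntimeException) && !(t instanceof Error);
    }

    public static void main(String[] args) {
        System.out.println(isChecked(new IOException()));          // prints true
        System.out.println(isChecked(new Exception()));            // prints true
        System.out.println(isChecked(new NullPointerException())); // prints false
        System.out.println(isChecked(new OutOfMemoryError()));     // prints false
    }
}
```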
Java has different rules for working with checked and unchecked exceptions. If you write a method that throws a checked exception, you must use a throws clause to declare the exception in the method signature. The Java compiler checks to make sure you have declared them in method signatures and produces a compilation error if you have not (that’s why they’re called “checked exceptions”).
Even if you never throw a checked exception yourself, sometimes you must use a throws clause to declare a checked exception. If your method calls a method that can throw a checked exception, you must either include exception-handling code to handle that exception or use throws to declare that your method can also throw that exception.
For example, the following method tries to estimate the size of a web page. It uses the standard java.net libraries, and the class URL (we’ll meet these in Chapter 10) to contact the web page. It uses methods and constructors that can throw various types of java.io.IOException objects, so it declares this fact with a throws clause:
    public static int estimateHomepageSize(String host) throws IOException {
        URL url = new URL("htp://" + host + "/");
        try (InputStream in = url.openStream()) {
            return in.available();
        }
    }
In fact, the preceding code has a bug: we’ve misspelled the protocol specifier, and there’s no such protocol as htp://. So, the estimateHomepageSize() method will always fail with a MalformedURLException.
How do you know if the method you are calling can throw a checked exception? You can look at its method signature to find out. Or, failing that, the Java compiler will tell you (by reporting a compilation error) if you’ve called a method whose exceptions you must handle or declare.
Methods may be declared to accept, and may be invoked with, variable numbers of arguments. Such methods are commonly known as varargs methods. The “print formatted” method System.out.printf() as well as the related format() methods of String use varargs, as do a number of important methods from the Reflection API of java.lang.reflect.
To declare a variable-length argument list, follow the type of the last argument to the method with an ellipsis (...), indicating that this last argument can be repeated zero or more times. For example:
    public static int max(int first, int... rest) {
        /* body omitted for now */
    }
Varargs methods are handled purely by the compiler. They operate by converting the variable number of arguments into an array. To the Java runtime, the max() method is indistinguishable from this one:
    public static int max(int first, int[] rest) {
        /* body omitted for now */
    }
To convert a varargs signature to the “real” signature, simply replace ... with [ ]. Remember that only one ellipsis can appear in a parameter list, and it may only appear on the last parameter in the list.
Let’s flesh out the max() example a little:
    public static int max(int first, int... rest) {
        int max = first;
        for (int i : rest) {   // legal because rest is actually an array
            if (i > max) max = i;
        }
        return max;
    }
This max() method is declared with two arguments. The first is just a regular int value. The second, however, may be repeated zero or more times. All of the following are legal invocations of max():
    max(0)
    max(1, 2)
    max(16, 8, 4, 2, 1)
Because varargs methods are compiled into methods that expect an array of arguments, invocations of those methods are compiled to include code that creates and initializes such an array. So the call max(1,2,3) is compiled to this:
    max(1, new int[] { 2, 3 })
In fact, if you already have method arguments stored in an array, it is perfectly legal for you to pass them to the method that way, instead of writing them out individually. You can treat any ... argument as if it were declared as an array. The converse is not true, however: you can only use varargs method invocation syntax when the method is actually declared as a varargs method using an ellipsis.
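Here is a short sketch of passing an existing array to a varargs method. The VarargsDemo class and its sum() method are hypothetical examples of our own, separate from the max() method above:

```java
class VarargsDemo {
    // values is an ordinary int[] inside the method body.
    static int sum(int... values) {
        int total = 0;
        for (int v : values) total += v;
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum());          // prints 0 (empty array)
        System.out.println(sum(1, 2, 3));   // prints 6

        // An existing array can be passed directly in place of the list.
        int[] stored = { 4, 5 };
        System.out.println(sum(stored));    // prints 9
    }
}
```

Had sum() been declared as sum(int[] values) instead, the first two calls would not compile; only the call that passes an array would.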
Now that we have introduced operators, expressions, statements, and methods, we can finally talk about classes. A class is a named collection of fields that hold data values and methods that operate on those values. Classes are just one of five reference types supported by Java, but they are the most important type. Classes are thoroughly documented in a chapter of their own (Chapter 3). We introduce them here, however, because they are the next higher level of syntax after methods, and because the rest of this chapter requires a basic familiarity with the concept of a class and the basic syntax for defining a class, instantiating it, and using the resulting object.
The most important thing about classes is that they define new data types. For example, you might define a class named Point to represent a data point in the two-dimensional Cartesian coordinate system. This class would define fields (each of type double) to hold the x and y coordinates of a point and methods to manipulate and operate on the point. The Point class is a new data type.
When discussing data types, it is important to distinguish between the data type itself and the values the data type represents. char is a data type: it represents Unicode characters. But a char value represents a single specific character. A class is a data type; a class value is called an object. We use the name class because each class defines a type (or kind, or species, or class) of objects. The Point class is a data type that represents x,y points, while a Point object represents a single specific x,y point. As you might imagine, classes and their objects are closely linked. In the sections that follow, we will discuss both.
Here is a possible definition of the Point class we have been discussing:
    /** Represents a Cartesian (x,y) point */
    public class Point {
        // The coordinates of the point
        public double x, y;

        public Point(double x, double y) {     // A constructor that
            this.x = x; this.y = y;            // initializes the fields
        }

        public double distanceFromOrigin() {   // A method that operates
            return Math.sqrt(x*x + y*y);       // on the x and y fields
        }
    }
This class definition is stored in a file named Point.java and compiled to a file named Point.class, where it is available for use by Java programs and other classes. This class definition is provided here for completeness and to provide context, but don’t expect to understand all the details just yet; most of Chapter 3 is devoted to the topic of defining classes.
Keep in mind that you don’t have to define every class you want to use in a Java program. The Java platform includes thousands of predefined classes that are guaranteed to be available on every computer that runs Java.
Now that we have defined the Point class as a new data type, we can use the following line to declare a variable that holds a Point object:
    Point p;
Declaring a variable to hold a Point object does not create the object itself, however. To actually create an object, you must use the new operator. This keyword is followed by the object’s class (i.e., its type) and an optional argument list in parentheses. These arguments are passed to the constructor for the class, which initializes internal fields in the new object:
    // Create a Point object representing (2,-3.5).
    // Declare a variable p and store a reference to the new Point object
    Point p = new Point(2.0, -3.5);

    // Create some other objects as well
    // An object that represents the current time (java.util.Date)
    Date d = new Date();
    // A HashSet object to hold a set of strings
    Set<String> words = new HashSet<>();
The new keyword is by far the most common way to create objects in Java. A few other ways are also worth mentioning. First, classes that meet certain criteria are so important that Java defines special literal syntax for creating objects of those types (as we discuss later in this section). Second, Java supports a dynamic loading mechanism that allows programs to load classes and create instances of those classes dynamically. See Chapter 11 for more details. Finally, objects can also be created by deserializing them. An object that has had its state saved, or serialized, usually to a file, can be re-created using the java.io.ObjectInputStream class.
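As a brief sketch of the last of these, the following code saves an object to an in-memory byte array (rather than a file, to keep the example self-contained) and then re-creates it. The SerializationDemo and Coordinates classes and the save() and restore() helpers are hypothetical names of our own; ObjectOutputStream, ObjectInputStream, and Serializable are the standard java.io types:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationDemo {
    // Hypothetical class; it must implement Serializable to be saved.
    static class Coordinates implements Serializable {
        private static final long serialVersionUID = 1L;
        final double x, y;
        Coordinates(double x, double y) { this.x = x; this.y = y; }
    }

    // Serializes an object to a byte array.
    static byte[] save(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    // Re-creates an object from its serialized form, without using new.
    static Object restore(byte[] data)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Coordinates original = new Coordinates(2.0, -3.5);
        Coordinates copy = (Coordinates) restore(save(original));
        System.out.println(copy.x + ", " + copy.y);  // prints "2.0, -3.5"
        System.out.println(copy == original);        // prints false
    }
}
```

The restored object has the same field values as the original but is a distinct object, which is why the final comparison prints false.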
Now that we’ve seen how to define classes and instantiate them by creating objects, we need to look at the Java syntax that allows us to use those objects. Recall that a class defines a collection of fields and methods. Each object has its own copies of those fields and has access to those methods. We use the dot character (.) to access the named fields and methods of an object. For example:
Point p = new Point(2, 3);            // Create an object
double x = p.x;                       // Read a field of the object
p.y = p.x * p.x;                      // Set the value of a field
double d = p.distanceFromOrigin();    // Access a method of the object
This syntax is very common when programming in object-oriented languages, and Java is no exception. Note, in particular, the expression p.distanceFromOrigin(). This tells the Java compiler to look up a method named distanceFromOrigin() (which is defined by the class Point) and use that method to perform a computation on the fields of the object p. We’ll cover the details of this operation in Chapter 3.
In our discussion of primitive types, we saw that each primitive type has a literal syntax for including values of the type literally into the text of a program. Java also defines a literal syntax for a few special reference types, as described next.
The String
class represents text as a string of characters. Because
programs usually communicate with their users through the written word,
the ability to manipulate strings of text is quite important in any
programming language. In Java, strings are objects; the data type used
to represent text is the String
class. Modern Java programs usually
use more string data than anything else.
Accordingly, because strings are such a fundamental data type, Java
allows you to include text literally in programs by placing it between
double-quote ("
) characters. For example:
String name = "David";
System.out.println("Hello, " + name);
Don’t confuse the double-quote characters that surround string literals
with the single-quote (or apostrophe) characters that surround char
literals. String literals can contain any of the escape sequences
char
literals can (see Table 2-2).
Escape sequences are particularly useful for embedding double-quote
characters within double-quoted string literals. For example:
String story = "\t\"How can you stand it?\" he asked sarcastically.\n";
String literals cannot contain comments and may consist of only a single
line. Java does not support any kind of continuation-character syntax
that allows two separate lines to be treated as a single line. If you
need to represent a long string of text that does not fit on a single
line, break it into independent string literals and use the +
operator to concatenate the literals. For example:
// This is illegal; string literals cannot be broken across lines.
String x = "This is a test of the
            emergency broadcast system";

String s = "This is a test of the " +    // Do this instead
           "emergency broadcast system";
The literals are concatenated when your program is compiled, not when it is run, so you do not need to worry about any kind of performance penalty.
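One way to observe the compile-time concatenation is to compare the concatenated literal with the same text written as a single literal. The sketch below is only an illustration (the class and method names are made up for it); it relies on the fact that string-valued constant expressions are interned, so both variables end up referring to the same String object.

```java
class ConstantConcatDemo {
    // Returns true if the concatenated literals and the single literal
    // refer to the very same String object.
    static boolean sameObject() {
        String joined = "This is a test of the " +
                        "emergency broadcast system";
        String whole = "This is a test of the emergency broadcast system";
        // == on references tests for the same object, not the same text
        return joined == whole;
    }

    public static void main(String[] args) {
        System.out.println(sameObject()); // prints "true"
    }
}
```

If either operand were a non-constant variable, the concatenation would happen at runtime and would produce a distinct object.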
The second type that supports its own special object literal syntax is
the class named Class
. Instances of the Class
class represent a
Java data type, and contain metadata about the type that is referred to.
To include a Class
object literally in a Java program, follow the name
of any data type with .class
. For example:
Class<?> typeInt = int.class;
Class<?> typeIntArray = int[].class;
Class<?> typePoint = Point.class;
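To give a feel for the metadata a Class object carries, here is a small sketch that queries a few class literals. The describe() helper is a hypothetical name for this example, and String.class stands in for Point.class so that the code is self-contained.

```java
class ClassLiteralDemo {
    // Builds a short description from the metadata held by a Class object
    static String describe(Class<?> type) {
        String kind = type.isPrimitive() ? "primitive"
                    : type.isArray()     ? "array"
                    :                      "class or interface";
        return type.getSimpleName() + " is a " + kind;
    }

    public static void main(String[] args) {
        System.out.println(describe(int.class));    // int is a primitive
        System.out.println(describe(int[].class));  // int[] is a array
        System.out.println(describe(String.class)); // String is a class or interface
        System.out.println(String.class.getName()); // java.lang.String
    }
}
```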
In Java 8, a major new feature was introduced—lambda expressions. These are a very common programming language construct, and in particular are extremely widely used in the family of languages known as functional programming languages (e.g., Lisp, Haskell, and OCaml). The power and flexibility of lambdas goes far beyond just functional languages, and they can be found in almost all modern programming languages.
The syntax for a lambda expression looks like this:
( paramlist ) -> { statements }
One simple, very traditional example:
Runnable r = () -> System.out.println("Hello World");
When a lambda expression is used as a value, it is automatically converted to a new object of the correct type for the variable that it is being placed into. This auto-conversion and type inference is essential to Java’s approach to lambda expressions. Unfortunately, it relies on a proper understanding of Java’s type system as a whole. “Nested Types” provides a more detailed explanation of lambda expressions—so for now, it suffices to simply recognize the syntax for lambdas.
A slightly more complex example:
ActionListener listener = (e) -> {
  System.out.println("Event fired at: " + e.getWhen());
  System.out.println("Event command: " + e.getActionCommand());
};
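The automatic conversion described above can be seen by assigning lambdas to variables of different types. The following is a minimal sketch rather than anything from the examples above: the class name and the sortByLength() helper are invented for illustration, while Comparator and Supplier are standard interfaces from java.util and java.util.function.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.function.Supplier;

class LambdaTargetTypeDemo {
    // Sorts a copy of the given words by length, using a lambda as the Comparator
    static String[] sortByLength(String... words) {
        String[] copy = words.clone();
        // The lambda is converted to a Comparator<String> because that is
        // the type of the variable it is assigned to.
        Comparator<String> byLength =
            (a, b) -> Integer.compare(a.length(), b.length());
        Arrays.sort(copy, byLength);
        return copy;
    }

    public static void main(String[] args) {
        // The same arrow syntax, converted to a different type this time
        Supplier<String> greeting = () -> "Hello World";
        System.out.println(greeting.get());  // prints "Hello World"

        System.out.println(Arrays.toString(sortByLength("ccc", "a", "bb")));
        // prints "[a, bb, ccc]"
    }
}
```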
An array is a special kind of object that holds zero or more primitive values or references. These values are held in the elements of the array, which are unnamed variables referred to by their position or index. The type of an array is characterized by its element type, and all elements of the array must be of that type.
Array elements are numbered starting with zero, and valid indexes range from zero to the number of elements minus one. The array element with index 1, for example, is the second element in the array. The number of elements in an array is its length. The length of an array is specified when the array is created, and it never changes.
The element type of an array may be any valid Java type, including array types. This means that Java supports arrays of arrays, which provide a kind of multidimensional array capability. Java does not support the matrix-style multidimensional arrays found in some languages.
Array types are reference types, just as classes are. Instances of arrays are objects, just as the instances of a class are. Unlike classes, array types do not have to be defined. Simply place square brackets after the element type. For example, the following code declares three variables of array type:
byte b;                        // byte is a primitive type
byte[] arrayOfBytes;           // byte[] is an array of byte values
byte[][] arrayOfArrayOfBytes;  // byte[][] is an array of byte[]
String[] points;               // String[] is an array of strings
The length of an array is not part of the array type. It is not
possible, for example, to declare a method that expects an array of
exactly four int
values. If a method parameter is of type
int[]
, a caller can pass an array with any number (including zero) of
elements.
Array types are not classes, but array instances are objects. This
means that arrays inherit the methods of java.lang.Object
. Arrays
implement the Cloneable
interface and override the clone()
method
to guarantee that an array can always be cloned and that clone()
never
throws a CloneNotSupportedException
. Arrays also implement
Serializable
so that any array can be serialized if its element type
can be serialized. Finally, all arrays have a public final int
field
named length
that specifies the number of elements in the array.
Because arrays extend Object
and implement the Cloneable
and
Serializable
interfaces, any array type can be widened to any of these
three types. But certain array types can also be widened to other array
types. If the element type of an array is a reference type T
, and T
is assignable to a type S
, the array type T[]
is assignable to the
array type S[]
. Note that there are no widening conversions of this
sort for arrays of a given primitive type. As examples, the following
lines of code show legal array widening conversions:
String[] arrayOfStrings;      // Created elsewhere
int[][] arrayOfArraysOfInt;   // Created elsewhere
// String is assignable to Object,
// so String[] is assignable to Object[]
Object[] oa = arrayOfStrings;
// String implements Comparable, so a String[] can
// be considered a Comparable[]
Comparable[] ca = arrayOfStrings;
// An int[] is an Object, so int[][] is assignable to Object[]
Object[] oa2 = arrayOfArraysOfInt;
// All arrays are cloneable, serializable Objects
Object o = arrayOfStrings;
Cloneable c = arrayOfArraysOfInt;
Serializable s = arrayOfArraysOfInt[0];
This ability to widen an array type to another array type means that the compile-time type of an array is not always the same as its runtime type.
This widening is known as array covariance, and as we shall see in “Bounded Type Parameters”, it is regarded by modern standards as a historical artifact and a misfeature, because of the mismatch between compile and runtime typing that it exposes.
The compiler must usually insert runtime checks before any operation
that stores a reference value into an array element to ensure that the
runtime type of the value matches the runtime type of the array element.
An ArrayStoreException
is thrown if the runtime check fails.
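A short sketch of that runtime check in action follows. The class and method names are illustrative only; the behavior shown (a store that compiles but fails at runtime) is what the language guarantees for covariant arrays.

```java
class ArrayStoreDemo {
    // Returns true if storing an Integer into a String[] (viewed as an
    // Object[]) is rejected at runtime.
    static boolean storeIsRejected() {
        String[] strings = new String[1];
        Object[] objects = strings;            // legal widening: String[] to Object[]
        try {
            objects[0] = Integer.valueOf(42);  // compiles, but the runtime type is String[]
            return false;
        } catch (ArrayStoreException e) {
            return true;                       // the runtime check failed
        }
    }

    public static void main(String[] args) {
        System.out.println(storeIsRejected()); // prints "true"
    }
}
```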
As we’ve seen, you write an array type simply by placing brackets after the element type. For compatibility with C and C++, however, Java supports an alternative syntax in variable declarations: brackets may be placed after the name of the variable instead of, or in addition to, the element type. This applies to local variables, fields, and method parameters. For example:
// This line declares local variables of type int, int[] and int[][]
int justOne, arrayOfThem[], arrayOfArrays[][];

// These three lines declare fields of the same array type:
public String[][] aas1;   // Preferred Java syntax
public String aas2[][];   // C syntax
public String[] aas3[];   // Confusing hybrid syntax

// This method signature includes two parameters with the same type
public static double dotProduct(double[] x, double y[]) { ... }
To create an array value in Java, you use the new
keyword, just as
you do to create an object. Array types don’t have constructors, but you
are required to specify a length whenever you create an array. Specify
the desired size of your array as a nonnegative integer between square
brackets:
// Create a new array to hold 1024 bytes
byte[] buffer = new byte[1024];
// Create an array of 50 references to strings
String[] lines = new String[50];
When you create an array with this syntax, each of the array elements is
automatically initialized to the same default value that is used for the
fields of a class: false
for boolean
elements, \u0000
for char
elements, 0
for integer elements, 0.0
for floating-point elements, and
null
for elements of reference type.
Array creation expressions can also be used to create and initialize a multidimensional array of arrays. This syntax is somewhat more complicated and is explained later in this section.
To create an array and initialize its elements in a single expression, omit the array length and follow the square brackets with a comma-separated list of expressions within curly braces. The type of each expression must be assignable to the element type of the array, of course. The length of the array that is created is equal to the number of expressions. It is legal, but not necessary, to include a trailing comma following the last expression in the list. For example:
String[] greetings = new String[] { "Hello", "Hi", "Howdy" };
int[] smallPrimes = new int[] { 2, 3, 5, 7, 11, 13, 17, 19, };
Note that this syntax allows arrays to be created, initialized, and used without ever being assigned to a variable. In a sense, these array creation expressions are anonymous array literals. Here are examples:
// Call a method, passing an anonymous array literal that
// contains two strings
String response = askQuestion("Do you want to quit?",
                              new String[] {"Yes", "No"});

// Call another method with an anonymous array (of anonymous objects)
double d = computeAreaOfTriangle(new Point[] { new Point(1,2),
                                               new Point(3,4),
                                               new Point(3,2) });
When an array initializer is part of a variable declaration, you may
omit the new
keyword and element type and list the desired array
elements within curly braces:
String[] greetings = { "Hello", "Hi", "Howdy" };
int[] powersOfTwo = {1, 2, 4, 8, 16, 32, 64, 128};
Array literals are created and initialized when the program is run, not when the program is compiled. Consider the following array literal:
int[] perfectNumbers = {6, 28};
This is compiled into Java byte codes that are equivalent to:
int[] perfectNumbers = new int[2];
perfectNumbers[0] = 6;
perfectNumbers[1] = 28;
The fact that Java does all array initialization at runtime has an important corollary. It means that the expressions in an array initializer may be computed at runtime and need not be compile-time constants. For example:
Point[] points = { circle1.getCenterPoint(), circle2.getCenterPoint() };
Once an array has been created, you are ready to start using it. The following sections explain basic access to the elements of an array and cover common idioms of array usage, such as iterating through the elements of an array and copying an array or part of an array.
The elements of an array are variables. When an array element appears
in an expression, it evaluates to the value held in the element. And
when an array element appears on the lefthand side of an assignment
operator, a new value is stored into that element. Unlike a normal
variable, however, an array element has no name, only a number. Array
elements are accessed using a square bracket notation. If a
is an
expression that evaluates to an array reference, you index that array
and refer to a specific element with a[i]
, where i
is an integer
literal or an expression that evaluates to an int
. For example:
// Create an array of two strings
String[] responses = new String[2];
responses[0] = "Yes";  // Set the first element of the array
responses[1] = "No";   // Set the second element of the array

// Now read these array elements
System.out.println(question + " (" + responses[0] + "/" +
                   responses[1] + " ): ");

// Both the array reference and the array index may be more complex
double datum = data.getMatrix()[data.row() * data.numColumns() +
                                data.column()];
The array index expression must be of type int
, or a type that can be
widened to an int
: byte
, short
, or even char
. It is obviously
not legal to index an array with a boolean
, float
, or double
value. Remember that the length
field of an array is an int
and that
arrays may not have more than Integer.MAX_VALUE
elements. Indexing an
array with an expression of type long
generates a compile-time error,
even if the value of that expression at runtime would be within the
range of an int
.
Remember that the first element of an array a
is a[0]
, the second element is a[1]
, and the last is a[a.length-1]
.
A common bug involving arrays is use of an index that is too small (a
negative index) or too large (greater than or equal to the array
length
). In languages like C or C++, accessing elements before the
beginning or after the end of an array yields unpredictable behavior
that can vary from invocation to invocation and platform to platform.
Such bugs may not always be caught, and if a failure occurs, it may be
at some later time. While it is just as easy to write faulty array
indexing code in Java, Java guarantees predictable results by checking
every array access at runtime. If an array index is too small or too
large, Java immediately throws an ArrayIndexOutOfBoundsException
.
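The bounds check can be observed directly. In the sketch below (the names are invented for illustration), the exception is caught only to demonstrate that it is thrown at the moment of the bad access; real code would normally fix the index calculation instead of catching the exception.

```java
class BoundsCheckDemo {
    // Reads a[index], reporting an out-of-range index instead of failing
    static String tryRead(int[] a, int index) {
        try {
            return "value " + a[index];
        } catch (ArrayIndexOutOfBoundsException e) {
            return "index " + index + " is out of bounds";
        }
    }

    public static void main(String[] args) {
        int[] a = { 10, 20, 30 };
        System.out.println(tryRead(a, 2));         // value 30
        System.out.println(tryRead(a, a.length));  // index 3 is out of bounds
        System.out.println(tryRead(a, -1));        // index -1 is out of bounds
    }
}
```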
It is common to write loops that iterate through each of the elements
of an array in order to perform some operation on it. This is typically
done with a for
loop. The following code, for example, computes the
sum of an array of integers:
int[] primes = { 2, 3, 5, 7, 11, 13, 17, 19, 23 };
int sumOfPrimes = 0;
for(int i = 0; i < primes.length; i++)
  sumOfPrimes += primes[i];
The structure of this for
loop is idiomatic, and you’ll see it
frequently. Java also has the foreach syntax that we’ve already met. The summing code could be rewritten succinctly as follows:
for(int p : primes) sumOfPrimes += p;
All array types implement the Cloneable interface, and any array can be copied by invoking its clone() method. No cast is needed: since Java 5, the clone() method of an array type T[] is declared to return T[]. The clone() method of arrays is also guaranteed not to throw CloneNotSupportedException:
int[] data = { 1, 2, 3 };
int[] copy = data.clone();
The clone()
method makes a shallow copy. If the element type of the
array is a reference type, only the references are copied, not the
referenced objects themselves. Because the copy is shallow, any array
can be cloned, even if the element type is not itself Cloneable
.
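The consequences of a shallow copy are easiest to see with an array of arrays. The following sketch (illustrative names only) clones the outer array and then shows which changes are visible through the original.

```java
class ShallowCloneDemo {
    // Clones a two-row table, modifies the clone, and returns the original
    static int[][] modifyCloneOf(int[][] original) {
        int[][] copy = original.clone();  // copies the row references, not the rows
        copy[0][0] = 99;                  // writes into a row shared with the original
        copy[1] = new int[] { 7, 8 };     // replaces a reference in the copy only
        return original;
    }

    public static void main(String[] args) {
        int[][] table = modifyCloneOf(new int[][] { { 1, 2 }, { 3, 4 } });
        System.out.println(table[0][0]);  // 99: the shared row was changed
        System.out.println(table[1][0]);  // 3: the original still holds its own second row
    }
}
```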
Sometimes you simply want to copy elements from one existing array to
another existing array. The System.arraycopy()
method is designed to
do this efficiently, and you can assume that Java VM implementations
perform this method using high-speed block copy operations on the
underlying hardware.
arraycopy()
is a straightforward function that is difficult to use
only because it has five arguments to remember. First, pass the source
array from which elements are to be copied. Second, pass the index of
the start element in that array. Pass the destination array and the
destination index as the third and fourth arguments. Finally, as the
fifth argument, specify the number of elements to be copied.
arraycopy()
works correctly even for overlapping copies within the
same array. For example, if you’ve “deleted” the element at index 0
from array a
and want to shift the elements between indexes 1
and
n
down one so that they occupy indexes 0
through n-1,
you could do
this:
System.arraycopy(a, 1, a, 0, n);
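For a complete, runnable version of this idiom, consider the sketch below. The deleteFirst() helper is a hypothetical name; it treats n as the index of the last element, and it zeroes the slot left vacant at the end, which is one possible convention rather than anything arraycopy() does itself.

```java
import java.util.Arrays;

class ArrayCopyDemo {
    // "Deletes" a[0] by shifting every later element down one position.
    // The array length cannot change, so the last slot is reset to 0.
    static int[] deleteFirst(int[] a) {
        if (a.length == 0) return a;       // nothing to shift
        int n = a.length - 1;              // index of the last element
        System.arraycopy(a, 1, a, 0, n);   // overlapping copy within the same array
        a[n] = 0;
        return a;
    }

    public static void main(String[] args) {
        int[] a = { 10, 20, 30, 40 };
        System.out.println(Arrays.toString(deleteFirst(a))); // [20, 30, 40, 0]
    }
}
```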
The java.util.Arrays
class contains a number of static utility
methods for working with arrays. Most of these methods are heavily
overloaded, with versions for arrays of each primitive type and another
version for arrays of objects. The sort()
and binarySearch()
methods
are particularly useful for sorting and searching arrays. The equals()
method allows you to compare the content of two arrays. The
Arrays.toString()
method is useful when you want to convert array
content to a string, such as for debugging or logging output.
The Arrays
class also includes deepEquals()
, deepHashCode()
, and
deepToString()
methods that work correctly for multidimensional
arrays.
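A brief tour of these utilities might look like the following sketch. Only the class name and the sortedCopy() helper are invented; every Arrays method used here is part of java.util.Arrays.

```java
import java.util.Arrays;

class ArraysUtilDemo {
    // Returns a sorted copy, leaving the argument untouched
    static int[] sortedCopy(int[] data) {
        int[] copy = data.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        int[] sorted = sortedCopy(new int[] { 5, 3, 9, 1 });
        System.out.println(Arrays.toString(sorted));          // [1, 3, 5, 9]
        // binarySearch() requires an array that is already sorted
        System.out.println(Arrays.binarySearch(sorted, 5));   // 2
        System.out.println(Arrays.equals(sorted, new int[] { 1, 3, 5, 9 })); // true

        int[][] grid = { { 1, 2 }, { 3, 4 } };
        int[][] same = { { 1, 2 }, { 3, 4 } };
        System.out.println(Arrays.deepToString(grid));        // [[1, 2], [3, 4]]
        System.out.println(Arrays.equals(grid, same));        // false: rows compared by reference
        System.out.println(Arrays.deepEquals(grid, same));    // true: rows compared by content
    }
}
```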
As we’ve seen, an array type is written as the element type followed
by a pair of square brackets. An array of char
is char[]
, and an
array of arrays of char
is char[][]
. When the elements of an array
are themselves arrays, we say that the array is multidimensional. In
order to work with multidimensional arrays, you need to understand a few
additional details.
Imagine that you want to use a multidimensional array to represent a multiplication table:
int[][] products;      // A multiplication table
Each of the pairs of square brackets represents one dimension, so this
is a two-dimensional array. To access a single int
element of this
two-dimensional array, you must specify two index values, one for each
dimension. Assuming that this array was actually initialized as a
multiplication table, the int
value stored at any given element would
be the product of the two indexes. That is, products[2][4]
would be 8,
and products[3][7]
would be 21.
To create a new multidimensional array, use the new
keyword and
specify the size of both dimensions of the array. For example:
int[][] products = new int[10][10];
In some languages, an array like this would be created as a single block
of 100 int
values. Java does not work this way. This line of code does
three things:
Declares a variable named products
to hold an array of arrays of
int
.
Creates a 10-element array to hold 10 arrays of int
.
Creates 10 more arrays, each of which is a 10-element array of int
.
It assigns each of these 10 new arrays to the elements of the initial
array. The default value of every int
element of each of these 10 new
arrays is 0.
To put this another way, the previous single line of code is equivalent to the following code:
int[][] products = new int[10][];    // An array to hold 10 int[] values
for(int i = 0; i < 10; i++)          // Loop 10 times...
  products[i] = new int[10];         // ...and create 10 arrays
The new
keyword performs this additional initialization automatically
for you. It works with arrays with more than two dimensions as well:
float[][][] globalTemperatureData = new float[360][180][100];
When using new
with multidimensional arrays, you do not have to
specify a size for all dimensions of the array, only the leftmost
dimension or dimensions. For example, the following two lines are legal:
float[][][] globalTemperatureData = new float[360][][];
float[][][] globalTemperatureData = new float[360][180][];
The first line creates a single-dimensional array, where each element of
the array can hold a float[][]
. The second line creates a
two-dimensional array, where each element of the array is a float[]
.
If you specify a size for only some of the dimensions of an array,
however, those dimensions must be the leftmost ones. The following lines
are not legal:
float[][][] globalTemperatureData = new float[360][][100];  // Error!
float[][][] globalTemperatureData = new float[][180][100];  // Error!
Like a one-dimensional array, a multidimensional array can be initialized using an array initializer. Simply use nested sets of curly braces to nest arrays within arrays. For example, we can declare, create, and initialize a 5 × 5 multiplication table like this:
int[][] products = { {0, 0, 0, 0, 0},
                     {0, 1, 2, 3, 4},
                     {0, 2, 4, 6, 8},
                     {0, 3, 6, 9, 12},
                     {0, 4, 8, 12, 16} };
Or, if you want to use a multidimensional array without declaring a variable, you can use the anonymous initializer syntax:
boolean response = bilingualQuestion(question, new String[][] {
                                                   { "Yes", "No" },
                                                   { "Oui", "Non" }});
When you create a multidimensional array using the new keyword, it is usually good practice to use only rectangular arrays: ones in which all the array values for a given dimension have the same size.
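For contrast, the sketch below builds a nonrectangular (triangular) multiplication table by sizing only the leftmost dimension and allocating each row separately. The class and method names are made up for this illustration. It shows why such arrays need extra care: every loop must consult the length of the individual row instead of assuming one shared width.

```java
class RaggedArrayDemo {
    // Builds a triangular table: row i has i+1 elements, and t[i][j] == i*j
    static int[][] triangle(int rows) {
        int[][] t = new int[rows][];          // only the leftmost dimension is sized
        for (int i = 0; i < rows; i++) {
            t[i] = new int[i + 1];            // each row gets its own length
            for (int j = 0; j < t[i].length; j++)
                t[i][j] = i * j;
        }
        return t;
    }

    public static void main(String[] args) {
        int[][] t = triangle(4);
        for (int[] row : t)
            System.out.println(row.length);   // prints 1, 2, 3, 4 on separate lines
        System.out.println(t[3][2]);          // 6
    }
}
```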
Now that we’ve covered arrays and introduced classes and objects, we can turn to a more general description of reference types. Classes and arrays are two of Java’s five kinds of reference types. Classes were introduced earlier and are covered in complete detail, along with interfaces, in Chapter 3. Enumerated types and annotation types are reference types introduced in Chapter 4.
This section does not cover specific syntax for any particular reference type, but instead explains the general behavior of reference types and illustrates how they differ from Java’s primitive types. In this section, the term object refers to a value or instance of any reference type, including arrays.
Reference types and objects differ substantially from primitive types and their primitive values:
Eight primitive types are defined by the Java language, and the
programmer cannot define new primitive types. Reference types are
user-defined, so there is an unlimited number of them. For example, a
program might define a class named Point
and use objects of this newly
defined type to store and manipulate x,y points in a Cartesian
coordinate system.
Primitive types represent single values. Reference types are aggregate
types that hold zero or more primitive values or objects. Our
hypothetical Point
class, for example, might hold two double
values
to represent the x and y coordinates of the points. The char[]
and
Point[]
array types are aggregate types because they hold a sequence
of primitive char
values or Point
objects.
Primitive types require between one and eight bytes of memory. When a primitive value is stored in a variable or passed to a method, the computer makes a copy of the bytes that hold the value. Objects, on the other hand, may require substantially more memory. Memory to store an object is dynamically allocated on the heap when the object is created and this memory is automatically “garbage collected” when the object is no longer needed.
When an object is assigned to a variable or passed to a method, the memory that represents the object is not copied. Instead, only a reference to that memory is stored in the variable or passed to the method.
References are completely opaque in Java and the representation of a reference is an implementation detail of the Java runtime. If you are a C programmer, however, you can safely imagine a reference as a pointer or a memory address. Remember, though, that Java programs cannot manipulate references in any way.
Unlike pointers in C and C++, references cannot be converted to or from
integers, and they cannot be incremented or decremented. C and C++
programmers should also note that Java does not support the &
address-of operator or the *
and ->
dereference operators.
The following code manipulates a primitive int
value:
int x = 42;
int y = x;
After these lines execute, the variable y
contains a copy of the value
held in the variable x
. Inside the Java VM, there are two independent
copies of the 32-bit integer 42.
Now think about what happens if we run the same basic code but use a reference type instead of a primitive type:
Point p = new Point(1.0, 2.0);
Point q = p;
After this code runs, the variable q
holds a copy of the reference
held in the variable p
. There is still only one copy of the Point
object in the VM, but there are now two copies of the reference to that
object. This has some important implications. Suppose the two previous
lines of code are followed by this code:
System.out.println(p.x);  // Print out the x coordinate of p: 1.0
q.x = 13.0;               // Now change the X coordinate of q
System.out.println(p.x);  // Print out p.x again; this time it is 13.0
Because the variables p
and q
hold references to the same object,
either variable can be used to make changes to the object, and those
changes are visible through the other variable as well. As arrays are a
kind of object, the same thing happens with arrays, as illustrated
by the following code:
// greet holds an array reference
char[] greet = { 'h','e','l','l','o' };
char[] cuss = greet;             // cuss holds the same reference
cuss[4] = '!';                   // Use reference to change an element
System.out.println(greet);       // Prints "hell!"
A similar difference in behavior between primitive types and reference types occurs when arguments are passed to methods. Consider the following method:
void changePrimitive(int x) {
  while(x > 0) {
    System.out.println(x--);
  }
}
When this method is invoked, the method is given a copy of the argument
used to invoke the method in the parameter x
. The code in the method
uses x
as a loop counter and decrements it to zero. Because x
is a
primitive type, the method has its own private copy of this value, so
this is a perfectly reasonable thing to do.
On the other hand, consider what happens if we modify the method so that the parameter is a reference type:
void changeReference(Point p) {
  while(p.x > 0) {
    System.out.println(p.x--);
  }
}
When this method is invoked, it is passed a private copy of a reference
to a Point
object and can use this reference to change the Point
object. For example, consider the following:
Point q = new Point(3.0, 4.5);  // A point with an x coordinate of 3
changeReference(q);             // Prints 3.0, 2.0, 1.0 and modifies the Point
System.out.println(q.x);        // The x coordinate of q is now 0!
When the changeReference()
method is invoked, it is passed a copy of
the reference held in variable q
. Now both the variable q
and the
method parameter p
hold references to the same object. The method can
use its reference to change the contents of the object. Note, however,
that it cannot change the contents of the variable q
. In other words,
the method can change the Point
object beyond recognition, but it
cannot change the fact that the variable q
refers to that object.
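The distinction between changing the object and changing the caller's variable can be sketched as follows. MutablePoint is a hypothetical stand-in for the Point class, and the two helper methods are invented for this illustration.

```java
class ReferenceParamDemo {
    // Hypothetical stand-in for the Point class used in the text
    static class MutablePoint {
        double x, y;
        MutablePoint(double x, double y) { this.x = x; this.y = y; }
    }

    // Changes the object that both the caller and the parameter refer to
    static void mutate(MutablePoint p) {
        p.x = 0.0;
    }

    // Changes only the method's private copy of the reference
    static void reassign(MutablePoint p) {
        p = new MutablePoint(-1.0, -1.0);
    }

    public static void main(String[] args) {
        MutablePoint q = new MutablePoint(3.0, 4.5);
        reassign(q);
        System.out.println(q.x);  // 3.0: q still refers to the original object
        mutate(q);
        System.out.println(q.x);  // 0.0: the object itself was changed
    }
}
```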
We’ve seen that primitive types and reference types differ
significantly in the way they are assigned to variables, passed to
methods, and copied. The types also differ in the way they are compared
for equality. When used with primitive values, the equality operator
(==
) simply tests whether two values are identical (i.e., whether
they have exactly the same bits). With reference types, however, ==
compares references, not actual objects. In other words, ==
tests
whether two references refer to the same object; it does not test
whether two objects have the same content. Here’s an example:
String letter = "o";
String s = "hello";              // These two String objects
String t = "hell" + letter;      // contain exactly the same text.
if (s == t) System.out.println("equal"); // But they are not equal!

byte[] a = { 1, 2, 3 };
// A copy with identical content.
byte[] b = a.clone();
if (a == b) System.out.println("equal"); // But they are not equal!
When working with reference types, keep in mind there are two kinds of equality:
equality of reference and equality of object. It is important to
distinguish between these two kinds of equality. One way to do this is
to use the word “identical” when talking about equality of references
and the word “equal” when talking about two distinct objects that have
the same content. To test two nonidentical objects for equality, pass
one of them to the equals()
method of the other:
String letter = "o";
String s = "hello";              // These two String objects
String t = "hell" + letter;      // contain exactly the same text.
if (s.equals(t)) {               // And the equals() method
  System.out.println("equal");   // tells us so.
}
All objects inherit an equals()
method (from Object
), but the
default implementation simply uses ==
to test for identity of
references, not equality of content. A class that wants to allow objects
to be compared for equality can define its own version of the equals()
method. Our Point
class does not do this, but the String
class does,
as indicated in the code example. You can call the equals()
method on
an array, but it is the same as using the ==
operator, because arrays
always inherit the default equals()
method that compares references
rather than array content. You can compare arrays for equality with the java.util.Arrays.equals()
convenience method.
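As a sketch of what defining your own equals() might look like, the example below uses a hypothetical ValuePoint class (not the Point class from this chapter) and contrasts reference comparison with content comparison for both objects and arrays. It also overrides hashCode(), which by convention should be kept consistent with equals().

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical class that compares by content instead of by reference
final class ValuePoint {
    private final double x;
    private final double y;

    ValuePoint(double x, double y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;                 // identical references
        if (!(o instanceof ValuePoint)) return false;
        ValuePoint other = (ValuePoint) o;
        return Double.compare(x, other.x) == 0
            && Double.compare(y, other.y) == 0;
    }

    @Override
    public int hashCode() { return Objects.hash(x, y); }
}

class EqualityDemo {
    public static void main(String[] args) {
        ValuePoint p = new ValuePoint(1.0, 2.0);
        ValuePoint q = new ValuePoint(1.0, 2.0);
        System.out.println(p == q);               // false: two distinct objects
        System.out.println(p.equals(q));          // true: same content

        byte[] a = { 1, 2, 3 };
        byte[] b = a.clone();
        System.out.println(a.equals(b));          // false: arrays inherit Object.equals()
        System.out.println(Arrays.equals(a, b));  // true: element-by-element comparison
    }
}
```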
Primitive types and reference types behave quite differently. It is
sometimes useful to treat primitive values as objects, and for this
reason, the Java platform includes wrapper classes for each of the primitive types. Boolean
, Byte
, Short
, Character
, Integer
,
Long
, Float
, and Double
are immutable, final classes whose
instances each hold a single primitive value. These wrapper classes are
usually used when you want to store primitive values in collections
such as java.util.List
:
// Create a List-of-Integer collection
List<Integer> numbers = new ArrayList<>();
// Store a wrapped primitive
numbers.add(Integer.valueOf(-1));
// Extract the primitive value
int i = numbers.get(0).intValue();
Java allows types of conversions known as boxing and unboxing conversions. Boxing conversions convert a primitive value to its corresponding wrapper object and unboxing conversions do the opposite. You may explicitly specify a boxing or unboxing conversion with a cast, but this is unnecessary, as these conversions are automatically performed when you assign a value to a variable or pass a value to a method. Furthermore, unboxing conversions are also automatic if you use a wrapper object when a Java operator or statement expects a primitive value. Because Java performs boxing and unboxing automatically, this language feature is often known as autoboxing.
Here are some examples of automatic boxing and unboxing conversions:
Integer i = 0;    // int literal 0 boxed to an Integer object
Number n = 0.0f;  // float literal boxed to Float and widened to Number
i = 1;            // this is a boxing conversion
int j = i;        // i is unboxed here
i++;              // i is unboxed, incremented, and then boxed up again
Integer k = i+2;  // i is unboxed and the sum is boxed up again
i = null;
j = i;            // unboxing here throws a NullPointerException
Autoboxing makes dealing with collections much easier as well. Let’s look at an example that uses Java’s generics (a language feature we’ll meet properly in “Java Generics”) that allows us to restrict what types can be put into lists and other collections:
    List<Integer> numbers = new ArrayList<>(); // Create a List of Integer
    numbers.add(-1);                           // Box int to Integer
    int i = numbers.get(0);                    // Unbox Integer to int
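Unboxing also happens when iterating over a collection. The following sketch, which uses a hypothetical class name and helper method, sums a List of Integer with a for-each loop whose loop variable is a primitive int:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: AutoboxingListDemo and sum() are hypothetical names.
class AutoboxingListDemo {
    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {   // each Integer element is unboxed to int
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>();
        for (int n = 1; n <= 4; n++) {
            numbers.add(n);      // each int is boxed to Integer
        }
        System.out.println(sum(numbers));  // prints 10
    }
}
```

As with the earlier examples, a null element in the list would cause the unboxing in the loop to throw a NullPointerException.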
A package is a named collection of classes, interfaces, and other reference types. Packages serve to group related classes and define a namespace for the classes they contain.
The core classes of the Java platform are in packages whose names begin with java. For example, the most fundamental classes of the language are in the package java.lang. Various utility classes are in java.util. Classes for input and output are in java.io, and classes for networking are in java.net. Some of these packages contain subpackages, such as java.lang.reflect and java.util.regex.
Extensions to the Java platform that have been standardized by Oracle (or originally Sun) typically have package names that begin with javax. Some of these extensions, such as javax.swing and its myriad subpackages, were later adopted into the core platform itself. Finally, the Java platform also includes several “endorsed standards,” which have packages named after the standards body that created them, such as org.w3c and org.omg.
Every class has both a simple name, which is the name given to it in its definition, and a fully qualified name, which includes the name of the package of which it is a part. The String class, for example, is part of the java.lang package, so its fully qualified name is java.lang.String.
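The distinction between the two names can be observed at runtime. As a small illustration (the class and method names here are hypothetical, and getPackageName() assumes Java 9 or later), the java.lang.Class methods getSimpleName(), getName(), and getPackageName() report the simple name, the fully qualified name, and the package of a top-level class:

```java
// Sketch only: NameDemo and describe() are hypothetical names.
class NameDemo {
    static String describe(Class<?> type) {
        return type.getSimpleName() + " is " + type.getName()
            + " in package " + type.getPackageName();  // requires Java 9 or later
    }

    public static void main(String[] args) {
        System.out.println(describe(String.class));
        // prints: String is java.lang.String in package java.lang
        System.out.println(describe(java.util.regex.Pattern.class));
        // prints: Pattern is java.util.regex.Pattern in package java.util.regex
    }
}
```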
This section explains how to place your own classes and interfaces into a package and how to choose a package name that won’t conflict with anyone else’s package name. Next, it explains how to selectively import type names or static members into the namespace so that you don’t have to type the package name of every class or interface you use.
To specify the package a class is to be part of, you use a package declaration. The package keyword, if it appears, must be the first token of Java code (i.e., the first thing other than comments and space) in the Java file. The keyword should be followed by the name of the desired package and a semicolon. Consider a Java file that begins with this directive:
    package org.apache.commons.net;
All classes defined by this file are part of the package org.apache.commons.net.
If no package directive appears in a Java file, all classes defined in that file are part of an unnamed default package. In this case, the qualified and unqualified names of a class are the same.
Because of the possibility of naming conflicts, you should not use the default package. As your project grows more complicated, conflicts become almost inevitable, so it is much better to create packages right from the start.
One of the important functions of packages is to partition the Java namespace and prevent name collisions between classes. It is only their package names that keep the java.util.List and java.awt.List classes distinct, for example. In order for this to work, however, package names must themselves be distinct. As the developer of Java, Oracle controls all package names that begin with java, javax, and sun.
One common scheme is to use your domain name, with its elements reversed, as the prefix for all your package names. For example, the Apache Project produces a networking library as part of the Apache Commons project. The Commons project can be found at http://commons.apache.org/ and accordingly, the package name used for the networking library is org.apache.commons.net.
Note that these package-naming rules apply primarily to API developers. If other programmers will be using classes that you develop along with unknown other classes, it is important that your package name be globally unique. On the other hand, if you are developing a Java application and will not be releasing any of the classes for reuse by others, you know the complete set of classes that your application will be deployed with and do not have to worry about unforeseen naming conflicts. In this case, you can choose a package naming scheme for your own convenience rather than for global uniqueness. One common approach is to use the application name as the main package name (it may have subpackages beneath it).
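The reversed-domain convention described above is mechanical enough to sketch in code. The helper below is purely illustrative (the class and method names are hypothetical) and handles only the simple case; it makes no attempt to deal with domain components that are not legal Java identifiers, such as names containing hyphens:

```java
// Sketch only: PackageNames and reverseDomain() are hypothetical names.
class PackageNames {
    // Turns a domain such as "commons.apache.org" into "org.apache.commons".
    static String reverseDomain(String domain) {
        String[] parts = domain.split("\\.");
        StringBuilder prefix = new StringBuilder();
        for (int i = parts.length - 1; i >= 0; i--) {
            prefix.append(parts[i]);
            if (i > 0) {
                prefix.append('.');
            }
        }
        return prefix.toString();
    }

    public static void main(String[] args) {
        String prefix = reverseDomain("commons.apache.org");
        System.out.println(prefix + ".net");  // prints org.apache.commons.net
    }
}
```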
When referring to a class or interface in your Java code, you must, by default, use the fully qualified name of the type, including the package name. If you’re writing code to manipulate a file and need to use the File class of the java.io package, you must type java.io.File. This rule has three exceptions:
Types from the package java.lang are so important and so commonly used that they can always be referred to by their simple names.
The code in a type p.T may refer to other types defined in the package p by their simple names.
Types that have been imported into the namespace with an import declaration may be referred to by their simple names.
The first two exceptions are known as “automatic imports.” The types from java.lang and the current package are “imported” into the namespace so that they can be used without their package name. Typing the package name of commonly used types that are not in java.lang or the current package quickly becomes tedious, and so it is also possible to explicitly import types from other packages into the namespace. This is done with the import declaration.
import declarations must appear at the start of a Java file, immediately after the package declaration, if there is one, and before any type definitions. You may use any number of import declarations in a file. An import declaration applies to all type definitions in the file (but not to any import declarations that follow it).
The import declaration has two forms. To import a single type into the namespace, follow the import keyword with the name of the type and a semicolon:
    import java.io.File; // Now we can type File instead of java.io.File
This is known as the “single type import” declaration.
The other form of import declaration is the “on-demand type import.” In this form, you specify the name of a package followed by the characters .* to indicate that any type from that package may be used without its package name. Thus, if you want to use several other classes from the java.io package in addition to the File class, you can simply import the entire package:
    import java.io.*; // Use simple names for all classes in java.io
This on-demand import syntax does not apply to subpackages. If I import the java.util package, I must still refer to the java.util.zip.ZipInputStream class by its fully qualified name.
Using an on-demand type import declaration is not the same as explicitly writing out a single type import declaration for every type in the package. It is more like an explicit single type import for every type in the package that you actually use in your code. This is the reason it’s called “on demand”; types are imported as you use them.
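To see both forms of import working together, consider the following sketch. The class and method names are hypothetical. It uses a single type import for java.io.File, an on-demand import for java.util, and a fully qualified name for java.util.zip.CRC32, because the on-demand import of java.util does not extend to the java.util.zip subpackage:

```java
import java.io.File;   // single type import
import java.util.*;    // on-demand type import: List and ArrayList are used below

// Sketch only: ImportDemo and its methods are hypothetical names.
class ImportDemo {
    // File, List, and ArrayList can all be written with their simple names.
    static List<String> fileNames(File... files) {
        List<String> names = new ArrayList<>();
        for (File f : files) {
            names.add(f.getName());
        }
        return names;
    }

    // CRC32 lives in java.util.zip, which "import java.util.*" does not cover.
    static long checksum(byte[] data) {
        java.util.zip.CRC32 crc = new java.util.zip.CRC32();
        crc.update(data);
        return crc.getValue();
    }

    public static void main(String[] args) {
        System.out.println(fileNames(new File("notes.txt"), new File("data.bin")));
        // prints [notes.txt, data.bin]
    }
}
```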
import declarations are invaluable to Java programming. They do expose us to the possibility of naming conflicts, however. Consider the packages java.util and java.awt. Both contain types named List. java.util.List is an important and commonly used interface. The java.awt package contains a number of important types that are commonly used in client-side applications, but java.awt.List has been superseded and is not one of these important types. It is illegal to import both java.util.List and java.awt.List in the same Java file. The following single type import declarations produce a compilation error:
    import java.util.List;
    import java.awt.List;
Using on-demand type imports for the two packages is legal:
    import java.util.*; // For collections and other utilities.
    import java.awt.*;  // For fonts, colors, and graphics.
Difficulty arises, however, if you actually try to use the type List. This type can be imported “on demand” from either package, and any attempt to use List as an unqualified type name produces a compilation error. The workaround, in this case, is to explicitly specify the package name you want.
Because java.util.List is much more commonly used than java.awt.List, it is useful to combine the two on-demand type import declarations with a single type import declaration that serves to disambiguate what we mean when we say List:
    import java.util.*;    // For collections and other utilities.
    import java.awt.*;     // For fonts, colors, and graphics.
    import java.util.List; // To disambiguate from java.awt.List
With these import declarations in place, we can use List to mean the java.util.List interface. If we actually need to use the java.awt.List class, we can still do so as long as we include its package name. There are no other naming conflicts between java.util and java.awt, and their types will be imported “on demand” when we use them without a package name.
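The following sketch puts these three declarations to work. The class and method names are hypothetical. Point comes from java.awt and ArrayList from java.util via the on-demand imports, while the single type import makes the simple name List refer to java.util.List:

```java
import java.util.*;     // on-demand: supplies ArrayList
import java.awt.*;      // on-demand: supplies Point
import java.util.List;  // single type import settles what List means

// Sketch only: ListImportDemo and diagonal() are hypothetical names.
class ListImportDemo {
    // Returns the points (0,0), (1,1), ... up to but not including (n,n).
    static List<Point> diagonal(int n) {
        List<Point> points = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            points.add(new Point(i, i));
        }
        return points;
    }

    public static void main(String[] args) {
        System.out.println(diagonal(3).size());  // prints 3
    }
}
```

Removing the third import declaration should make the uses of List in this file ambiguous and so produce a compilation error.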
As well as types, you can import the static members of types using the keywords import static. (Static members are explained in Chapter 3. If you are not already familiar with them, you may want to come back to this section later.) Like type import declarations, these static import declarations come in two forms: single static member import and on-demand static member import. Suppose, for example, that you are writing a text-based program that sends a lot of output to System.out. In this case, you might use this single static member import to save yourself typing:
    import static java.lang.System.out;
You can then use out.println() instead of System.out.println(). Or suppose you are writing a program that uses many of the trigonometric and other functions of the Math class. In a program that is clearly focused on numerical methods like this, having to repeatedly type the class name “Math” does not add clarity to your code; it just gets in the way. In this case, an on-demand static member import may be appropriate:
    import static java.lang.Math.*;
With this import declaration, you are free to write concise expressions like sqrt(abs(sin(x))) without having to prefix the name of each static method with the class name Math.
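As a brief illustration of both forms of static import in one file, consider the following sketch. The class name and the rootAbsSin() method are hypothetical; the imported members are the real static members of java.lang.Math and java.lang.System:

```java
import static java.lang.Math.*;      // on-demand static member import
import static java.lang.System.out;  // single static member import

// Sketch only: StaticImportDemo and rootAbsSin() are hypothetical names.
class StaticImportDemo {
    static double rootAbsSin(double x) {
        return sqrt(abs(sin(x)));    // Math.sqrt, Math.abs, and Math.sin
    }

    public static void main(String[] args) {
        out.println(rootAbsSin(0.0));  // prints 0.0
        out.println(max(2, 3));        // Math.max; prints 3
    }
}
```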
Another important use of import static declarations is to import the names of constants into your code. This works particularly well with enumerated types (see Chapter 4). Suppose, for example, that you want to use the values of this enumerated type in code you are writing:
    package climate.temperate;
    enum Seasons { WINTER, SPRING, SUMMER, AUTUMN };
You could import the type climate.temperate.Seasons and then prefix the constants with the type name: Seasons.SPRING. For more concise code, you could import the enumerated values themselves:
    import static climate.temperate.Seasons.*;
Using static member import declarations for constants is generally a better technique than implementing an interface that defines the constants.
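Because the Seasons type above lives in an illustrative package, the sketch below substitutes the standard java.time.DayOfWeek enumerated type to show the same technique in a form that compiles on its own. The class and method names are hypothetical:

```java
import java.time.DayOfWeek;
import static java.time.DayOfWeek.*;  // import the enumerated values themselves

// Sketch only: WeekendDemo and isWeekend() are hypothetical names.
class WeekendDemo {
    static boolean isWeekend(DayOfWeek day) {
        return day == SATURDAY || day == SUNDAY;  // no DayOfWeek prefix needed
    }

    public static void main(String[] args) {
        System.out.println(isWeekend(MONDAY));  // prints false
        System.out.println(isWeekend(SUNDAY));  // prints true
    }
}
```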
A static import declaration imports a name, not any one specific member with that name. Because Java allows method overloading and allows a type to have fields and methods with the same name, a single static member import declaration may actually import more than one member. Consider this code:
    import static java.util.Arrays.sort;
This declaration imports the name “sort” into the namespace, not any one of the 18 sort() methods defined by java.util.Arrays. If you use the imported name sort to invoke a method, the compiler will look at the types of the method arguments to determine which method you mean.
It is even legal to import static methods with the same name from two or more different types as long as the methods all have different signatures. Here is one natural example:
    import static java.util.Arrays.sort;
    import static java.util.Collections.sort;
You might expect that this code would cause a compilation error. In fact, it does not because the sort() methods defined by the Collections class have different signatures than all of the sort() methods defined by the Arrays class. When you use the name “sort” in your code, the compiler looks at the types of the arguments to determine which of the 20 possible imported methods you mean.
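To make the overload resolution concrete, the following sketch imports both names and calls sort() with two different argument types. The class name and the sortedCopy() helpers are hypothetical:

```java
import static java.util.Arrays.sort;
import static java.util.Collections.sort;

import java.util.ArrayList;
import java.util.List;

// Sketch only: SortImportDemo and sortedCopy() are hypothetical names.
class SortImportDemo {
    static int[] sortedCopy(int[] values) {
        int[] copy = values.clone();
        sort(copy);   // argument is an int[]: resolves to Arrays.sort(int[])
        return copy;
    }

    static List<String> sortedCopy(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        sort(copy);   // argument is a List: resolves to Collections.sort(List)
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(sortedCopy(new int[] {3, 1, 2})));
        // prints [1, 2, 3]
    }
}
```

One caveat with this design: if the class declared its own method named sort, that method would shadow the imported names, so the sketch deliberately uses a different name for its helpers.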
This chapter has taken us from the smallest to the largest elements of Java syntax, from individual characters and tokens to operators, expressions, statements, and methods, and on up to classes and packages. From a practical standpoint, the unit of Java program structure you will be dealing with most often is the Java file. A Java file is the smallest unit of Java code that can be compiled by the Java compiler. A Java file consists of:
An optional package directive
Zero or more import or import static directives
One or more type definitions
These elements can be interspersed with comments, of course, but they must appear in this order. This is all there is to a Java file. All Java statements (except the package and import directives, which are not true statements) must appear within methods, and all methods must appear within a type definition.
Java files have a couple of other important restrictions. First, each file can contain at most one top-level class that is declared public. A public class is one that is designed for use by other classes in other packages. A class can contain any number of nested or inner classes that are public. We’ll see more about the public modifier and nested classes in Chapter 3.
The second restriction concerns the filename of a Java file. If a Java file contains a public class, the name of the file must be the same as the name of the class, with the extension .java appended. Therefore, if Point is defined as a public class, its source code must appear in a file named Point.java. Regardless of whether your classes are public or not, it is good programming practice to define only one per file and to give the file the same name as the class.
When a Java file is compiled, each of the classes it defines is compiled into a separate class file that contains Java bytecode to be executed by the Java Virtual Machine. A class file has the same name as the class it defines, with the extension .class appended. Thus, if the file Point.java defines a class named Point, a Java compiler compiles it to a file named Point.class. On most systems, class files are stored in directories that correspond to their package names. For example, the class com.davidflanagan.examples.Point is defined by the class file com/davidflanagan/examples/Point.class.
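The mapping from a package-qualified class name to a class file path can be sketched in a few lines. The helper below is hypothetical and covers only top-level classes, whose class files are named directly after the class:

```java
// Sketch only: ClassFilePaths and classFilePath() are hypothetical names.
class ClassFilePaths {
    // Maps a fully qualified top-level class name to its relative class file path.
    static String classFilePath(String qualifiedName) {
        return qualifiedName.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        System.out.println(classFilePath("com.davidflanagan.examples.Point"));
        // prints com/davidflanagan/examples/Point.class
    }
}
```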
The Java runtime knows where the class files for the standard system classes are located and can load them as needed. When the interpreter runs a program that wants to use a class named com.davidflanagan.examples.Point, it knows that the code for that class is located in a directory named com/davidflanagan/examples/ and, by default, it “looks” in the current directory for a subdirectory of that name. In order to tell the interpreter to look in locations other than the current directory, you must use the -classpath option when invoking the interpreter or set the CLASSPATH environment variable. For details, see the documentation for the Java executable, java, in Chapter 13.
A Java program consists of a set of interacting class definitions. But not every Java class or Java file defines a program. To create a program, you must define a class that has a special method with the following signature:
    public static void main(String[] args)
This main() method is the main entry point for your program. It is where the Java interpreter starts running. This method is passed an array of strings and returns no value. When main() returns, the Java interpreter exits (unless main() has created separate non-daemon threads, in which case the interpreter waits for all those threads to exit).
To run a Java program, you run the Java executable, java, specifying the fully qualified name of the class that contains the main() method. Note that you specify the name of the class, not the name of the class file that contains the class. Any additional arguments you specify on the command line are passed to the main() method as its String[] parameter. You may also need to specify the -classpath option (or -cp) to tell the interpreter where to look for the classes needed by the program. Consider the following command:
    java -classpath /opt/Jude com.davidflanagan.jude.Jude datafile.jude
java is the command to run the Java interpreter. -classpath /opt/Jude tells the interpreter where to look for .class files. com.davidflanagan.jude.Jude is the name of the program to run (i.e., the name of the class that defines the main() method). Finally, datafile.jude is a string that is passed to that main() method as the single element of an array of String objects.
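To make the argument passing concrete, here is a minimal sketch of a program with a main() method that reports the strings it receives. EchoArgs is a hypothetical class invented for illustration; it is not the Jude program named in the command above:

```java
// Sketch only: EchoArgs and describe() are hypothetical names.
class EchoArgs {
    // Builds a report of the command-line arguments, one per line.
    static String describe(String[] args) {
        StringBuilder report = new StringBuilder();
        report.append(args.length).append(" argument(s)");
        for (int i = 0; i < args.length; i++) {
            report.append('\n').append("args[").append(i).append("] = ").append(args[i]);
        }
        return report.toString();
    }

    // The entry point: the interpreter passes the command-line arguments here.
    public static void main(String[] args) {
        System.out.println(describe(args));
    }
}
```

Assuming the compiled class file is in the current directory, running java EchoArgs datafile.jude would report one argument, with args[0] set to datafile.jude.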
There is an easier way to run programs. If a program and all its auxiliary classes (except those that are part of the Java platform) have been properly bundled in a Java archive (JAR) file, you can run the program simply by specifying the name of the JAR file. In the next example, we show how to start up the Censum garbage collection log analyzer:
    java -jar /usr/local/Censum/censum.jar
Some operating systems make JAR files automatically executable. On those systems, you can simply say:
    % /usr/local/Censum/censum.jar
See Chapter 13 for more details on how to execute Java programs.
In this chapter, we’ve introduced the basic syntax of the Java language. Due to the interlocking nature of the syntax of programming languages, it is perfectly fine if you don’t feel at this point that you have completely grasped all of the syntax of the language. It is by practice that we acquire proficiency in any language, human or computer.
It is also worth observing that some parts of syntax are far more regularly used than others. For example, the strictfp and assert keywords are almost never used. Rather than trying to grasp every aspect of Java’s syntax, it is far better to begin to acquire facility in the core aspects of Java and then return to any details of syntax that may still be troubling you. With this in mind, let’s move to the next chapter and begin to discuss the classes and objects that are so central to Java and the basics of Java’s approach to object-oriented programming.
1 Technically, the minus sign is an operator that operates on the literal, but is not part of the literal itself.
2 Technically, they must all implement the AutoCloseable interface.
3 In the Java Language Specification, the term “signature” has a technical meaning that is slightly different than that used here. This book uses a less formal definition of method signature.
4 There is a terminology difficulty in discussions of arrays. Unlike with classes and their instances, we use the term “array” for both the array type and the array instance. In practice, it is usually clear from context whether a type or a value is being discussed.