Chapter 9. Constructors and Garbage Collection: Life and Death of an Object


Objects are born and objects die. You’re in charge of an object’s lifecycle. You decide when and how to construct it. You decide when to destroy it. Except you don’t actually destroy the object yourself, you simply abandon it. But once it’s abandoned, the heartless Garbage Collector (gc) can vaporize it, reclaiming the memory that object was using. If you’re gonna write Java, you’re gonna create objects. Sooner or later, you’re gonna have to let some of them go, or risk running out of RAM. In this chapter we look at how objects are created, where they live while they’re alive, and how to keep or abandon them efficiently. That means we’ll talk about the heap, the stack, scope, constructors, super constructors, null references, and more. Warning: this chapter contains material about object death that some may find disturbing. Best not to get too attached.

Before we can understand what really happens when you create an object, we have to step back a bit. We need to learn more about where everything lives (and for how long) in Java. That means we need to learn more about the Stack and the Heap. In Java, we (programmers) care about two areas of memory—the one where objects live (the heap), and the one where method invocations and local variables live (the stack). When a JVM starts up, it gets a chunk of memory from the underlying OS, and uses it to run your Java program. How much memory, and whether or not you can tweak it, is dependent on which version of the JVM (and on which platform) you’re running. But usually you won’t have anything to say about it. And with good programming, you probably won’t care (more on that a little later).

The Stack

Where method invocations and local variables live


We know that all objects live on the garbage-collectible heap, but we haven’t yet looked at where variables live. And where a variable lives depends on what kind of variable it is. And by “kind”, we don’t mean type (i.e. primitive or object reference). The two kinds of variables whose lives we care about now are instance variables and local variables. Local variables are also known as stack variables, which is a big clue for where they live.

The Heap

Where ALL objects live


When you call a method, the method lands on the top of a call stack. That new thing that’s actually pushed onto the stack is the stack frame, and it holds the state of the method including which line of code is executing, and the values of all local variables.

The method at the top of the stack is always the currently-running method for that stack (for now, assume there’s only one stack, but in Chapter 14 we’ll add more). A method stays on the stack until the method hits its closing curly brace (which means the method’s done). If method foo() calls method bar(), method bar() is stacked on top of method foo().

A call stack with two methods


The method on the top of the stack is always the currently-executing method.

public void doStuff() {
   boolean b = true;
   go(4);
}
public void go(int x) {
   int z = x + 24;
   crazy();
   // imagine more code here
}
public void crazy() {
   char c = 'a';
}
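
One way to watch the call stack at work is to ask the JVM for a stack trace from inside crazy(). The sketch below reuses the three methods above. The class name CallStackDemo and the seen list are made up for this illustration, and the sketch assumes the JVM reports every frame, which is the usual case.

```java
import java.util.ArrayList;
import java.util.List;

class CallStackDemo {
    // Hypothetical field: method names found on the stack while crazy() runs
    List<String> seen = new ArrayList<>();

    void doStuff() {
        boolean b = true;
        go(4);
    }

    void go(int x) {
        int z = x + 24;
        crazy();
    }

    void crazy() {
        char c = 'a';
        // A Throwable records the call stack at the point where it is created.
        // Element 0 is the currently-running method (the top of the stack).
        for (StackTraceElement frame : new Throwable().getStackTrace()) {
            seen.add(frame.getMethodName());
        }
    }

    public static void main(String[] args) {
        CallStackDemo demo = new CallStackDemo();
        demo.doStuff();
        // The list starts at the top of the stack: crazy, go, doStuff, ...
        System.out.println(demo.seen);
    }
}
```

The first three names show crazy() stacked on go(), which is stacked on doStuff(). Whatever called doStuff() sits below them.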

Remember, a non-primitive variable holds a reference to an object, not the object itself. You already know where objects live—on the heap. It doesn’t matter where they’re declared or created. If the local variable is a reference to an object, only the variable (the reference/remote control) goes on the stack.

The object itself still goes in the heap.

public class StackRef {
   public void foof() {
      barf();
   }

   public void barf() {
      Duck d = new Duck(24);
   }
}

When you say new CellPhone(), Java has to make space on the Heap for that CellPhone. But how much space? Enough for the object, which means enough to house all of the object’s instance variables. That’s right, instance variables live on the Heap, inside the object they belong to.

Remember that the values of an object’s instance variables live inside the object. If the instance variables are all primitives, Java makes space for the instance variables based on the primitive type. An int needs 32 bits, a long 64 bits, etc. Java doesn’t care about the value inside primitive variables; the bit-size of an int variable is the same (32 bits) whether the value of the int is 32,000,000 or 32.

But what if the instance variables are objects? What if CellPhone HAS-A Antenna? In other words, CellPhone has a reference variable of type Antenna.

When the new object has instance variables that are object references rather than primitives, the real question is: does the object need space for all of the objects it holds references to? The answer is, not exactly. No matter what, Java has to make space for the instance variable values. But remember that a reference variable value is not the whole object, but merely a remote control to the object. So if CellPhone has an instance variable declared as the non-primitive type Antenna, Java makes space within the CellPhone object only for the Antenna’s remote control (i.e. reference variable) but not the Antenna object.

public class CellPhone {
  private Antenna ant;
}

Well then when does the Antenna object get space on the Heap? First we have to find out when the Antenna object itself is created. That depends on the instance variable declaration. If the instance variable is declared but no object is assigned to it, then only the space for the reference variable (the remote control) is created.

private Antenna ant;

No actual Antenna object is made on the heap unless or until the reference variable is assigned a new Antenna object.

private Antenna ant = new Antenna();
public class CellPhone {
  private Antenna ant = new Antenna();
}
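
To see the difference between the two declarations, here is a small sketch. BarePhone, ReadyPhone, and hasAntenna() are hypothetical names invented for this example; they stand in for the two versions of CellPhone shown above.

```java
class Antenna { }

class BarePhone {
    // Declared but never assigned: the object holds only the reference,
    // and that reference gets the default value null.
    private Antenna ant;

    boolean hasAntenna() {
        return ant != null;
    }
}

class ReadyPhone {
    // Declared and assigned: an Antenna object is created on the heap
    // each time a ReadyPhone is created.
    private Antenna ant = new Antenna();

    boolean hasAntenna() {
        return ant != null;
    }
}

class AntennaDemo {
    public static void main(String[] args) {
        System.out.println(new BarePhone().hasAntenna());  // false
        System.out.println(new ReadyPhone().hasAntenna()); // true
    }
}
```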

Now that you know where variables and objects live, we can dive into the mysterious world of object creation. Remember the three steps of object declaration and assignment: declare a reference variable, create an object, and assign the object to the reference.

But until now, step two—where a miracle occurs and the new object is “born”—has remained a Big Mystery. Prepare to learn the facts of object life. Hope you’re not squeamish.

Review the 3 steps of object declaration, creation and assignment:
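
As a quick refresher, the three steps can be sketched like this. The empty Duck class and the ThreeSteps wrapper are placeholders so the example compiles on its own.

```java
class Duck { }

class ThreeSteps {
    public static void main(String[] args) {
        Duck myDuck;           // 1. declare a reference variable
        myDuck = new Duck();   // 2. create an object, then 3. assign it to the reference

        // The usual form does all three steps in one statement
        Duck another = new Duck();

        System.out.println(myDuck != another); // true: two separate Duck objects
    }
}
```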


Are we calling a method named Duck()? Because it sure looks like it.


No.

We’re calling the Duck constructor.

A constructor does look and feel a lot like a method, but it’s not a method. It’s got the code that runs when you say new. In other words, the code that runs when you instantiate an object.

The only way to invoke a constructor is with the keyword new followed by the class name. The JVM finds that class and invokes the constructor in that class. (OK, technically this isn’t the only way to invoke a constructor. But it’s the only way to do it from outside a constructor. You can call a constructor from within another constructor, with restrictions, but we’ll get into all that later in the chapter.)

But where is the constructor?

If we didn’t write it, who did?

You can write a constructor for your class (we’re about to do that), but if you don’t, the compiler writes one for you!

Here’s what the compiler’s default constructor looks like:

public Duck() {

}

Notice something missing? How is this different from a method?


The key feature of a constructor is that it runs before the object can be assigned to a reference. That means you get a chance to step in and do things to get the object ready for use. In other words, before anyone can use the remote control for an object, the object has a chance to help construct itself. In our Duck constructor, we’re not doing anything useful, but it still demonstrates the sequence of events.


The constructor gives you a chance to step into the middle of new.
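
A minimal sketch of that sequence, assuming a Duck whose constructor does nothing but announce itself. The constructed counter is a hypothetical addition that makes the constructor call observable.

```java
class Duck {
    // Hypothetical counter: how many times the constructor has run
    static int constructed = 0;

    Duck() {
        // This code runs as part of new, before the caller gets the reference
        System.out.println("Quack");
        constructed++;
    }
}

class UseADuck {
    public static void main(String[] args) {
        Duck d = new Duck();                  // prints "Quack"
        System.out.println(Duck.constructed); // 1
    }
}
```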


Most people use constructors to initialize the state of an object. In other words, to make and assign values to the object’s instance variables.

public Duck() {
  size = 34;
}

That’s all well and good when the Duck class developer knows how big the Duck object should be. But what if we want the programmer who is using Duck to decide how big a particular Duck should be?

Imagine the Duck has a size instance variable, and you want the programmer using your Duck class to set the size of the new Duck. How could you do it?

Well, you could add a setSize() setter method to the class. But that leaves the Duck temporarily without a size[9], and forces the Duck user to write two statements—one to create the Duck, and one to call the setSize() method. The code below uses a setter method to set the initial size of the new Duck.
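
A sketch of the setter-based approach might look like this. The value 42 is arbitrary, and the class layout is an assumption made for illustration.

```java
class Duck {
    int size; // instance variable, defaults to 0

    Duck() {
        System.out.println("Quack");
    }

    void setSize(int newSize) {
        size = newSize;
    }
}

class UseADuck {
    public static void main(String[] args) {
        Duck d = new Duck();
        // Bad spot: the Duck is alive here, but still has the default size of 0
        System.out.println(d.size); // 0
        d.setSize(42);
        System.out.println(d.size); // 42
    }
}
```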


If an object shouldn’t be used until one or more parts of its state (instance variables) have been initialized, don’t let anyone get ahold of a Duck object until you’re finished initializing! It’s usually way too risky to let someone make—and get a reference to—a new Duck object that isn’t quite ready for use until that someone turns around and calls the setSize() method. How will the Duck-user even know that he’s required to call the setter method after making the new Duck?

The best place to put initialization code is in the constructor. And all you need to do is make a constructor with arguments.
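
Here is one way that could look, as a sketch. The parameter name duckSize and the value 42 are illustrative choices, not requirements.

```java
class Duck {
    int size;

    // The caller must supply a size, so a Duck never exists without one
    Duck(int duckSize) {
        System.out.println("Quack");
        size = duckSize;
        System.out.println("size is " + size);
    }
}

class UseADuck {
    public static void main(String[] args) {
        // One statement: the Duck is created and initialized together
        Duck d = new Duck(42);
    }
}
```

With this version, new Duck() no longer compiles, because the only constructor requires an int.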


What happens if the Duck constructor takes an argument? Think about it. On the previous page, there’s only one Duck constructor—and it takes an int argument for the size of the Duck. That might not be a big problem, but it does make it harder for a programmer to create a new Duck object, especially if the programmer doesn’t know what the size of a Duck should be. Wouldn’t it be helpful to have a default size for a Duck, so that if the user doesn’t know an appropriate size, he can still make a Duck that works?

Imagine that you want Duck users to have TWO options for making a Duck—one where they supply the Duck size (as the constructor argument) and one where they don’t specify a size and thus get your default Duck size.

You can’t do this cleanly with just a single constructor. Remember, if a method (or constructor—same rules) has a parameter, you must pass an appropriate argument when you invoke that method or constructor. You can’t just say, “If someone doesn’t pass anything to the constructor, then use the default size”, because they won’t even be able to compile without sending an int argument to the constructor call. You could do something clunky like this:
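
A sketch of the clunky single-constructor version, assuming a made-up default size of 27:

```java
class Duck {
    int size;

    Duck(int newSize) {
        if (newSize == 0) {
            size = 27;      // hypothetical default: zero means "pick for me"
        } else {
            size = newSize;
        }
    }
}

class UseADuck {
    public static void main(String[] args) {
        System.out.println(new Duck(0).size);  // 27
        System.out.println(new Duck(15).size); // 15
    }
}
```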


But that means the programmer making a new Duck object has to know that passing a “0” is the protocol for getting the default Duck size. Pretty ugly. What if the other programmer doesn’t know that? Or what if he really does want a zero-size Duck? (Assuming a zero-sized Duck is allowed. If you don’t want zero-sized Duck objects, put validation code in the constructor to prevent it.) The point is, it might not always be possible to distinguish between a genuine “I want zero for the size” constructor argument and an “I’m sending zero so you’ll give me the default size, whatever that is” constructor argument.

You might think that if you write only a constructor with arguments, the compiler will see that you don’t have a no-arg constructor, and stick one in for you. But that’s not how it works. The compiler gets involved with constructor-making only if you don’t say anything at all about constructors.

If you write a constructor that takes arguments, and you still want a no-arg constructor, you’ll have to build the no-arg constructor yourself!
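
A sketch of a Duck with both options, again assuming a made-up default size of 27:

```java
class Duck {
    int size;

    // No-arg constructor: you write this one yourself once any other constructor exists
    Duck() {
        size = 27; // hypothetical default size
    }

    // Constructor for callers who know the size they want
    Duck(int newSize) {
        size = newSize;
    }
}

class UseADuck {
    public static void main(String[] args) {
        System.out.println(new Duck().size);   // 27
        System.out.println(new Duck(15).size); // 15
    }
}
```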


As soon as you provide a constructor, ANY kind of constructor, the compiler backs off and says, “OK Buddy, looks like you’re in charge of constructors now.”

If you have more than one constructor in a class, the constructors MUST have different argument lists.

The argument list includes the order and types of the arguments. As long as they’re different, you can have more than one constructor. You can do this with methods as well, but we’ll get to that in another chapter.

Overloaded constructors means you have more than one constructor in your class.

To compile, each constructor must have a different argument list!

The class below is legal because all five constructors have different argument lists. If you had two constructors that took only an int, for example, the class wouldn’t compile. What you name the parameter variable doesn’t count. It’s the variable type (int, Dog, etc.) and order that matters. You can have two constructors that have identical types, as long as the order is different. A constructor that takes a String followed by an int, is not the same as one that takes an int followed by a String.
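
A sketch of such a class follows. The Mushroom name, its parameters, and the madeBy field (which records which constructor ran) are all invented for this illustration.

```java
class Mushroom {
    // Hypothetical field: remembers which constructor built this object
    String madeBy;

    Mushroom(int size) { madeBy = "int"; }

    Mushroom() { madeBy = "no-arg"; }

    Mushroom(boolean isMagic) { madeBy = "boolean"; }

    // Same two types as the next constructor, but in a different order
    Mushroom(boolean isMagic, int size) { madeBy = "boolean, int"; }

    Mushroom(int size, boolean isMagic) { madeBy = "int, boolean"; }
}

class MushroomDemo {
    public static void main(String[] args) {
        System.out.println(new Mushroom(true, 3).madeBy); // boolean, int
        System.out.println(new Mushroom(3, true).madeBy); // int, boolean
    }
}
```

The compiler picks the constructor by matching the types and order of the arguments in the call.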


Doing all the Brain Barbells has been shown to produce a 42% increase in neuron size. And you know what they say, “Big neurons...”

Here’s where it gets fun. Remember from the last chapter, the part where we looked at the Snowboard object wrapping around an inner core representing the Object portion of the Snowboard class? The Big Point there was that every object holds not just its own declared instance variables, but also everything from its superclasses (which, at a minimum, means class Object, since every class extends Object).

So when an object is created (because somebody said new; there is no other way to create an object other than someone, somewhere saying new on the class type), the object gets space for all the instance variables, from all the way up the inheritance tree. Think about it for a moment... a superclass might have setter methods encapsulating a private variable. But that variable has to live somewhere. When an object is created, it’s almost as though multiple objects materialize: the object being new’d and one object for each superclass. Conceptually, though, it’s much better to think of it as a single object with layers of itself representing each superclass.


All the constructors in an object’s inheritance tree must run when you make a new object.

Let that sink in.

That means every superclass has a constructor (because every class has a constructor), and each constructor up the hierarchy runs at the time an object of a subclass is created.

Saying new is a Big Deal. It starts the whole constructor chain reaction. And yes, even abstract classes have constructors. Although you can never say new on an abstract class, an abstract class is still a superclass, so its constructor runs when someone makes an instance of a concrete subclass.

The super constructors run to build out the superclass parts of the object. Remember, a subclass might inherit methods that depend on superclass state (in other words, the value of instance variables in the superclass). For an object to be fully-formed, all the superclass parts of itself must be fully-formed, and that’s why the super constructor must run. All instance variables from every class in the inheritance tree have to be declared and initialized. Even if Animal has instance variables that Hippo doesn’t inherit (if the variables are private, for example), the Hippo still depends on the Animal methods that use those variables.


When a constructor runs, it immediately calls its superclass constructor, all the way up the chain until you get to the class Object constructor.

On the next few pages, you’ll learn how superclass constructors are called, and how you can call them yourself. You’ll also learn what to do if your superclass constructor has arguments!

A new Hippo object also IS-A Animal and IS-A Object. If you want to make a Hippo, you must also make the Animal and Object parts of the Hippo.

This all happens in a process called Constructor Chaining.

public class Animal {
   public Animal() {
      System.out.println("Making an Animal");
   }
}
_____________________________________________
public class Hippo extends Animal {
   public Hippo() {
      System.out.println("Making a Hippo");
   }
}
_____________________________________________
public class TestHippo {
   public static void main (String[] args) {
      System.out.println("Starting...");
      Hippo h = new Hippo();
   }
}

Running TestHippo prints “Starting...”, then “Making an Animal”, then “Making a Hippo”. The Hippo() constructor is invoked first, but it’s the Animal constructor that finishes first.

You might think that somewhere in, say, a Duck constructor, if Duck extends Animal you’d call Animal(). But that’s not how it works:


The only way to call a super constructor is by calling super(). That’s right—super() calls the super constructor.
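
A sketch of what that looks like in a Duck that extends Animal. The class bodies are assumptions for illustration. If you leave the super() call out, the compiler inserts a no-arg super() for you.

```java
class Animal {
    Animal() {
        System.out.println("Making an Animal");
    }
}

class Duck extends Animal {
    int size;

    Duck(int newSize) {
        // Animal();   // would NOT compile: this is not how you call the super constructor
        super();       // calls the Animal constructor
        size = newSize;
    }
}

class UseADuck {
    public static void main(String[] args) {
        Duck d = new Duck(5);       // prints "Making an Animal"
        System.out.println(d.size); // 5
    }
}
```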


A call to super() in your constructor puts the superclass constructor on the top of the Stack. And what do you think that superclass constructor does? Calls its superclass constructor. And so it goes until the Object constructor is on the top of the Stack. Once Object() finishes, it’s popped off the Stack and the next thing down the Stack (the subclass constructor that called Object()) is now on top. That constructor finishes and so it goes until the original constructor is on the top of the Stack, where it can now finish.

If you think of a superclass as the parent to the subclass child, you can figure out which has to exist first. The superclass parts of an object have to be fully-formed (completely built) before the subclass parts can be constructed. Remember, the subclass object might depend on things it inherits from the superclass, so it’s important that those inherited things be finished. No way around it. The superclass constructor must finish before its subclass constructor.

Look at the Stack series for making a Hippo again, and you can see that while the Hippo constructor is the first to be invoked (it’s the first thing on the Stack), it’s the last one to complete! Each subclass constructor immediately invokes its own superclass constructor, until the Object constructor is on the top of the Stack. Then Object’s constructor completes and we bounce back down the Stack to Animal’s constructor. Only after Animal’s constructor completes do we finally come back down to finish the rest of the Hippo constructor. For that reason:


The call to super() must be the first statement in each constructor!*

What if the superclass constructor has arguments? Can you pass something in to the super() call? Of course. If you couldn’t, you’d never be able to extend a class that didn’t have a no-arg constructor. Imagine this scenario: all animals have a name. There’s a getName() method in class Animal that returns the value of the name instance variable. The instance variable is marked private, but the subclass (in this case, Hippo) inherits the getName() method. So here’s the problem: Hippo has a getName() method (through inheritance), but does not have the name instance variable. Hippo has to depend on the Animal part of himself to keep the name instance variable, and return it when someone calls getName() on a Hippo object. But... how does the Animal part get the name? The only reference Hippo has to the Animal part of himself is through super(), so that’s the place where Hippo sends the Hippo’s name up to the Animal part of himself, so that the Animal part can store it in the private name instance variable.
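
The scenario above can be sketched as follows. The name "Buffy" and the MakeHippo test class are illustrative; Animal is made abstract here only as an assumption about the design.

```java
abstract class Animal {
    // Private: Hippo cannot touch this variable directly
    private String name;

    Animal(String theName) {
        name = theName;
    }

    String getName() {
        return name;
    }
}

class Hippo extends Animal {
    Hippo(String name) {
        // Send the name up to the Animal part of this object
        super(name);
    }
}

class MakeHippo {
    public static void main(String[] args) {
        Hippo h = new Hippo("Buffy");
        System.out.println(h.getName()); // Buffy
    }
}
```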


What if you have overloaded constructors that, with the exception of handling different argument types, all do the same thing? You know that you don’t want duplicate code sitting in each of the constructors (pain to maintain, etc.), so you’d like to put the bulk of the constructor code (including the call to super()) in only one of the overloaded constructors. You want whichever constructor is first invoked to call The Real Constructor and let The Real Constructor finish the job of construction. It’s simple: just say this(). Or this(aString). Or this(27, x). In other words, just imagine that the keyword this is a reference to the current object.

You can say this() only within a constructor, and it must be the first statement in the constructor!

But that’s a problem, isn’t it? Earlier we said that super() must be the first statement in the constructor. Well, that means you get a choice.

Every constructor can have a call to super() or this(), but never both!
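
A sketch of one constructor handing off to another with this(). The Car and Mini classes, and the default color "Red", are hypothetical.

```java
class Car {
    String model;

    Car(String model) {
        this.model = model;
    }
}

class Mini extends Car {
    String color;

    // No-arg constructor supplies a default and delegates
    Mini() {
        this("Red"); // calls Mini(String); no super() allowed here
    }

    // The Real Constructor: the only one that calls super()
    Mini(String c) {
        super("Mini");
        color = c;
    }
}

class MiniDemo {
    public static void main(String[] args) {
        System.out.println(new Mini().color);       // Red
        System.out.println(new Mini("Blue").color); // Blue
    }
}
```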


An object’s life depends entirely on the life of references referring to it. If the reference is considered “alive”, the object is still alive on the Heap. If the reference dies (and we’ll look at what that means in just a moment), the object will die.

So if an object’s life depends on the reference variable’s life, how long does a variable live?

That depends on whether the variable is a local variable or an instance variable. The code below shows the life of a local variable. In the example, the variable is a primitive, but variable lifetime is the same whether it’s a primitive or reference variable.
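
A sketch of a local variable’s life, assuming a made-up class TestLifeOne. Here read() returns the value only so the result is easy to check.

```java
class TestLifeOne {
    int read() {
        int s = 42;   // s is born here and lives until read() completes
        sleep();      // s is still alive, but out of scope while sleep() runs
        return s;     // back in scope, and it kept its value
    }

    void sleep() {
        // s = 7;     // would NOT compile: sleep() cannot see read()'s local variable
    }
}

class LifeDemo {
    public static void main(String[] args) {
        System.out.println(new TestLifeOne().read()); // 42
    }
}
```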


The difference between life and scope for local variables:

Life

A local variable is alive as long as its Stack frame is on the Stack. In other words, until the method completes.

Scope

A local variable is in scope only within the method in which the variable was declared. When its own method calls another, the variable is alive, but not in scope until its method resumes. You can use a variable only when it is in scope.

public void doStuff() {
   boolean b = true;
   go(4);
}
public void go(int x) {
   int z = x + 24;
   crazy();
   // imagine more code here
}
public void crazy() {
   char c = 'a';
}

While a local variable is alive, its state persists. As long as method doStuff() is on the Stack, for example, the ‘b’ variable keeps its value. But the ‘b’ variable can be used only while doStuff()’s Stack frame is at the top. In other words, you can use a local variable only while that local variable’s method is actually running (as opposed to waiting for higher Stack frames to complete).

The rules are the same for primitives and references. A reference variable can be used only when it’s in scope, which means you can’t use an object’s remote control unless you’ve got a reference variable that’s in scope. The real question is,

“How does variable life affect object life?”

An object is alive as long as there are live references to it. If a reference variable goes out of scope but is still alive, the object it refers to is still alive on the Heap. And then you have to ask... “What happens when the Stack frame holding the reference gets popped off the Stack at the end of the method?”

If that was the only live reference to the object, the object is now abandoned on the Heap. The reference variable disintegrated with the Stack frame, so the abandoned object is now, officially, toast. The trick is to know the point at which an object becomes eligible for garbage collection.

Once an object is eligible for garbage collection (GC), you don’t have to worry about reclaiming the memory that object was using. If your program gets low on memory, GC will destroy some or all of the eligible objects, to keep you from running out of RAM. You can still run out of memory, but not before all eligible objects have been hauled off to the dump. Your job is to make sure that you abandon objects (i.e., make them eligible for GC) when you’re done with them, so that the garbage collector has something to reclaim. If you hang on to objects, GC can’t help you and you run the risk of your program dying a painful out-of-memory death.

An object’s life has no value, no meaning, no point, unless somebody has a reference to it.

If you can’t get to it, you can’t ask it to do anything and it’s just a big fat waste of bits.

But if an object is unreachable, the Garbage Collector will figure that out. Sooner or later, that object’s goin’ down.

Object-killer #1

Reference goes out of scope, permanently.

public class StackRef {
   public void foof() {
      barf();
   }

   public void barf() {
      Duck d = new Duck();
   }
}

Object-killer #2

Assign the reference to another object

public class ReRef {

    Duck d = new Duck();

    public void go() {
      d = new Duck();
    }
}

Object-killer #3

Explicitly set the reference to null

public class ReRef {

    Duck d = new Duck();

    public void go() {
      d = null;
    }
}


[9] Instance variables do have a default value. 0 or 0.0 for numeric primitives, false for booleans, and null for references.

[10] Not to imply that not all Duck state is not unimportant.

[11] Unless the constructor calls another overloaded constructor (you’ll see that in a few pages).