A place to be (re)educated in Newspeak

Sunday, June 24, 2007

Constructors Considered Harmful

In mainstream object oriented programming languages, objects are created by invoking constructors. This is rather ironic, since you can say a lot about constructors, but you cannot honestly say that they are object oriented. Ok, so what? Isn’t “object-oriented” just an old buzzword? If constructors work well, who cares?

Of course, the problem is that constructors don’t work very well all. Understanding why helps to understand what “object oriented” really means, and why it is important.

Constructors come with a host of special rules and regulations: they cannot be overridden like instance methods; they need to call another constructor, or a superclass constructor etc. Try defining mixins in the presence of constructors - it’s tricky because the interface for creating instances gets bundled with the interface of the instances themselves.

The basic issue is a failure of abstraction, as Dave Ungar put it in his OOPSLA keynote in 2003.

Suppose we have a constructor C(x) and code that creates objects by calling it. What happens if we find that we actually need to return an instance of a class other than C? For example, we might want to lazily load data from secondary storage, and need to return some sort of placeholder object that behaves just like a C, but isn’t? What if we want to avoid reallocating a fresh object, and use a cached one instead?

So it’s clear that you don’t want to publicize your constructors to the clients of your API, as all they can do is tie you down.

The standard recommended solution is to use a static factory. This however, does nothing to help the other victims of constructors - their callers. As a consumer of an API, you don’t want to use constructors: they induce a tight coupling between your code and specific implementations. You can’t abstract over static methods, just as you can’t abstract over constructors. In both cases, there is no object that is the target of the operation and a conventional interface declaration cannot describe it. The absence of an object means that constructors don’t have the benefits objects bring - dynamic binding of method calls chief among these. Which is why constructors and static methods don’t work well, and incidentally, aren’t object oriented.

Having dismissed constructors and static factories, it seems we need to define a factory class whose instances will support an interface that includes a method that constructs the desired objects. How will you create the factory object? By calling a constructor? Or by defining a meta-factory? After how many meta-meta-meta- .. meta-factories do you give up and call a constructor?

What about using a dependency injection framework (DIF)? Ignoring the imbecilic name, I think that if you’re stuck with a mainstream language, that may be a reasonable work around. It requires a significant degree of preplanning, and makes your application dependent on one more piece of machinery that has nothing to do with the actual problem the application is trying to solve. On the positive side, it helps guarantee employment for software engineers. That said, it’s important to understand that DIFs are just a work around for a deficiency in the underlying language.

So why not get rid of constructors and have a class declaration create a factory object instead? Well, Smalltalk did just that a generation ago. Every time you define a class, you define the factory object for its instances. I won’t explain the Smalltalk metaclass hierarchy here. Suffice to say that it is a thing of beauty, resolving a potential infinite regress with an elegant circularity.

Despite this, Smalltalk still does not provide an ideal solution for creating and initializing instances. While it preserves abstraction, it does not easily enable the class to ensure that every instance will always be properly initialized, or that initialization will take place only once. To be sure, these are difficult goals, and Java, despite enormous efforts and complexity focused on these goals, does not fully achieve them either. However, it comes close - at the cost of abstraction failure brought about by the use of constructors.

So can we do better? Of course. The solution combines elements of Scala constructors (which are cleaner than Java constructors) with elements of the Smalltalk approach.

Scala defines a class as a parametric entity, with formal parameters that are in scope in the class body. The class name, together with its formal parameters, define the primary constructor of the class. This allows the instance to initialize itself, accessing the parameters to the constructor without exposing an initialization method that can be called multiple times on an instance. The latter is one of the problems in Smalltalk.

We use a similar device to provide access to the constructor parameters from within the instance. However, we require that the class provide a method header for its primary constructor. Instead of creating instances by a special construct (constructor invocation) as in Java or Scala, we create them via method invocation. The class declaration introduces a factory object that supports the primary constructor as an instance method.

Because we create objects only by invoking methods on other objects, we preserve abstraction. We can create objects by invoking the constructor method on a parameter. We can always define alternative factory objects that support the same constructor method with different behavior, and pass them instead of the class. Furthermore, using a message based programming language, references to the class’ name are always virtual, and can be overridden.

Unlike Smalltalk, the factory class is not a subclass of the superclass factory class. This prevents the possibility of calling superclass constructors and thereby creating partially initialized objects (doing this requires special effort in Smalltalk - one has to manually override the superclass constructors so that they fail; this is tedious and error prone, and not done much in practice).

I know I should be writing examples to make this all clear. However, this post is getting long, so that will wait for another post. I’ll be speaking about some of this work next month a the dynamic language workshop at ECOOP. By then, I imagine I’ll put out some examples of what we’ve been doing these past few months.