In mainstream object oriented programming languages, objects are created by invoking constructors. This is rather ironic, since you can say a lot about constructors, but you cannot honestly say that they are object oriented. Ok, so what? Isn’t “object-oriented” just an old buzzword? If constructors work well, who cares?
Of course, the problem is that constructors don’t work very well at all. Understanding why helps to understand what “object oriented” really means, and why it is important.
Constructors come with a host of special rules and regulations: they cannot be overridden like instance methods; they need to call another constructor, or a superclass constructor etc. Try defining mixins in the presence of constructors - it’s tricky because the interface for creating instances gets bundled with the interface of the instances themselves. 
The basic issue is a failure of abstraction, as Dave Ungar put it in his OOPSLA keynote in 2003.
Suppose we have a constructor C(x) and code that creates objects by calling it. What happens if we find that we actually need to return an instance of a class other than C? For example,  we might want to lazily load data from secondary storage, and need to return some sort of placeholder object that behaves just like a C, but isn’t? What if we want to avoid reallocating a fresh object, and use a cached one instead?
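To make the scenario concrete, here is a minimal sketch in Python (chosen only for brevity; it is not the language under discussion). The names `LazyC`, `make_c` and `describe` are hypothetical and invented for illustration. The point is that a caller who goes through `make_c` can be handed a cached placeholder, whereas a caller hard-wired to `C(x)` in a language with Java-style constructors always gets a fresh `C`.

```python
class C:
    """The concrete class that clients might be tempted to instantiate directly."""

    def __init__(self, x):
        self.x = x

    def describe(self):
        return f"C({self.x})"


class LazyC:
    """Hypothetical placeholder: answers like a C, but builds the real C on first use."""

    def __init__(self, x):
        self._x = x
        self._real = None

    def describe(self):
        if self._real is None:
            # Stand-in for loading the data from secondary storage.
            self._real = C(self._x)
        return self._real.describe()


_cache = {}


def make_c(x):
    """Hypothetical factory: free to return a cached placeholder instead of a fresh C."""
    if x not in _cache:
        _cache[x] = LazyC(x)
    return _cache[x]
```

Code that asks `make_c(1)` twice receives the same object, and that object is not an instance of `C` at all, yet it responds to `describe` as a `C` would.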
So it’s clear that you don’t want to publicize your constructors to the clients of your API, as all they can do is tie you down.
 
The standard recommended solution is to use a static factory. This, however, does nothing to help the other victims of constructors - their callers. As a consumer of an API, you don’t want to use constructors: they induce a tight coupling between your code and specific implementations. You can’t abstract over static methods, just as you can’t abstract over constructors. In both cases, there is no object that is the target of the operation and a conventional interface declaration cannot describe it. The absence of an object means that constructors don’t have the benefits objects bring - dynamic binding of method calls chief among these. Which is why constructors and static methods don’t work well, and incidentally, aren’t object oriented.
Having dismissed constructors and static factories, it seems we need to define a factory class whose instances will  support an interface that includes a method that constructs the desired objects. How will you create the factory object? By calling a constructor? Or by defining a meta-factory? After how many meta-meta-meta- .. meta-factories do you give up and call a constructor?
What about using a dependency injection framework (DIF)? Ignoring the imbecilic name, I think that if you’re stuck with a mainstream language, that may be a reasonable workaround. It requires a significant degree of preplanning, and makes your application dependent on one more piece of machinery that has nothing to do with the actual problem the application is trying to solve. On the positive side, it helps guarantee employment for software engineers. That said, it’s important to understand that DIFs are just a workaround for a deficiency in the underlying language.
So why not get rid of constructors and have a class declaration create a factory object instead? Well, Smalltalk did just that a generation ago. Every time you define a class, you define the factory object for its instances. I won’t explain the Smalltalk metaclass hierarchy here. Suffice to say that it is a thing of beauty, resolving a potential infinite regress with an elegant circularity.
Despite this, Smalltalk still does not provide an ideal solution for creating and initializing instances. While it preserves abstraction, it does not easily enable the class to ensure that every instance will always be properly initialized, or that initialization will take place only once. To be sure, these are difficult goals, and Java, despite enormous efforts and complexity focused on these goals, does not fully achieve them either. However, it comes close - at the cost of abstraction failure brought about by the use of constructors.
So can we do better? Of course.  The solution combines elements of Scala constructors (which are cleaner than Java constructors) with elements of the Smalltalk approach.
Scala defines a class as a parametric entity, with formal parameters that are in scope in the class body.  The class name, together with its formal parameters, define the primary constructor of the class. This allows the instance to initialize itself, accessing the parameters to the constructor without exposing an initialization method that can be called multiple times on an instance. The latter is one of the problems in Smalltalk.
We use a similar device to provide access to the constructor parameters from within the instance. However, we require that the class provide a method header for its primary constructor. Instead of creating instances by a special construct (constructor invocation) as in Java or Scala, we create them via method invocation. The class declaration introduces a factory object that supports the primary constructor as an instance method.
Because we create objects only by invoking methods on other objects, we preserve abstraction. We can create objects by invoking the constructor method on a parameter. We can always define alternative factory objects that support the same constructor method with different behavior, and pass them instead of the class. Furthermore, using a message based programming language, references to the class’ name are always virtual, and can be overridden.
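The idea of creating objects by invoking a method on a parameter can be sketched roughly as follows. This is Python, not Newspeak, and it leans on the fact that a Python class is itself a callable object; `Point`, `CountingPointFactory` and `make_square` are hypothetical names used only for illustration.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class CountingPointFactory:
    """Alternative factory object that supports the same creation call as Point."""

    def __init__(self):
        self.created = 0

    def __call__(self, x, y):
        self.created += 1
        return Point(x, y)


def make_square(point_factory, size):
    """Builds four corners using whatever factory object the caller passes in."""
    return [
        point_factory(0, 0),
        point_factory(size, 0),
        point_factory(size, size),
        point_factory(0, size),
    ]
```

`make_square(Point, 2)` and `make_square(CountingPointFactory(), 2)` both work, because `make_square` never names a concrete class; it only sends a creation request to the object it was given.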
Unlike Smalltalk, the factory class is not a subclass of the superclass factory class. This prevents the possibility of calling superclass constructors and thereby creating partially initialized objects (doing this requires special effort in Smalltalk - one has to manually override the superclass constructors so that they fail; this is tedious and error prone, and not done much in practice).
I know I should be writing examples to make this all clear. However, this post is getting long, so that will wait for another post. I’ll be speaking about some of this work next month at the dynamic language workshop at ECOOP. By then, I imagine I’ll put out some examples of what we’ve been doing these past few months.
A place to be (re)educated in Newspeak
Sunday, June 24, 2007
36 comments:
> I’ll be speaking about some of this work next month at the dynamic language workshop at ECOOP. By then, I imagine I’ll put out some examples of what we’ve been doing these past few months.
Well that makes me want to go to ECOOP
(http://2007.ecoop.org/)... no wait, I am going: it's just that I'm going to miss the DL workshop. Botheration!
Perhaps the real problem with the factory is the term 'static method', possibly an oxymoron. Lisp has top-level functions that can be redefined. They're not static as in early-bound, but they are equivalent to static methods in other ways.
If you replaced a constructor with a static method/function, but one that could be redefined, I think that would be reasonable.
In Java, aspects can emulate this, apparently.
@Ricky: You would have the same problems as in Smalltalk, where you have to override all instance creating class methods of all superclasses to keep control over the construction process. Or would you restrain calling only those static methods locally defined?
Most of the factory issues in Java are IMO due to Java's inability wrt. Meta-object programming. I had a short entry in my blog* on Meta-Interfaces for Java to tackle things like static (factory) methods without changing the Java class model, but this would, of course, only be a workaround.
I'm quite nosy about a different approach for creating instances.
> Suppose we have a constructor C(x) and code that creates objects by calling it. What happens if we find that we actually need to return an instance of a class other than C?
Do a global search and replace, if it becomes necessary. Code isn't written in stone, it can be changed easily. Why do you think it's called "software".
> So it’s clear that you don’t want to publicize your constructors to the clients of your API, as all they can do is tie you down.
Yeah, especially if the consumers have the audacity to inherit from one of your classes. Better make all of your methods non-virtual as well just to be on the safe side.
> As a consumer of an API, you don’t want to use constructors: they induce a tight coupling between your code and specific implementations.
So does referencing the library that contains it.
Stefan, I began an experiment in my own code a few years ago, wherein I stopped subclassing. It's ongoing, and has led me to think that subclassing harms reusability (for a start, mixins would do a better job for most of the use cases).
I don't quite understand your question - to 'swap out' the implementation of a function I would do something akin to:
(let ((make-point make-debug-point))
  (test-making-points))
..where make-point is the factory, and make-debug-point is an altered version of that, say, with extra logging. Then, any calls to make-point inside test-making-points will call make-debug-point.
Jonathan,
When you say to do a global search and replace, do you really mean global? Can you search and replace on all your clients' code?
It's much easier to change code after it has been loaded into an environment (depending on the environment), than to change the source code.
"So does referencing the library that contains it"
Not so much. A constructor is more than a binding to a method name, it's a binding to a particular class name, whereas a static factory method is just a method - it can compatibly be changed to return an instance of a subtype.
Why not have a **single** **private** Java like constructor and allow traits to specify static methods that must be present but can be hidden and **must** have a default implementation. If you had a Scala like syntax for the private constructor you could then write code like:
trait MyInt {
  static MyInt instance( Integer i ) {
    return MutableInteger.instance( i );
  }
}
class ImmutableInt( Integer i ) implements MyInt {
  public static Foo instance( Integer i ) {
    return this( i );
  }
}
final MyInt i = MyInt.instance( 1 );
To keep it all nice and OO static could be a field that is a singleton instance automatically created, i.e. static used like Java is just sugar. Like you do in Scala using Object.
Some of your criticisms are specific to Java or some set of languages, but not to all OO languages. Constructors aren't inherited, so of course they can't be overridden. Some example code illustrating the problems would help.
Are these real-world problems or just abstract complaints? Lots of people have used C++, Java, and other OO languages to build real applications, so while constructors may have some annoying properties I don't think they are a fatal flaw, or even harmful.
Constructors are intended to initialize an object after it is created but before it is used by other code. They are not a good place to put a pile of code. When I hear complaints about how limiting constructors are my first reaction is that the programmer is trying to do too much in the constructor.
It seems that the issues you bring up could be solved by writing an empty constructor and adding a method to your class to do the complex initialization. You'd have to call it explicitly after creating the object, but that's an idiom you could get used to, or roll up into a static factory function in the class.
I don't understand why multiple initialization is a problem, even if I understand why non-method constructors may be. If the problem with Smalltalk's #new is that it relies on a public initializer that can be called multiple times, isn't it enough to just follow the Python & Ruby approach?
Just make the allocation (allocate/__new__) and initialization (initialize/__init__) methods not accessible from outside of the object, leaving only the constructor/factory (Class#new/Class.__call__).
Zero complication and solves everything, no?
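One caveat is worth sketching here. In Python, `__init__` stays an ordinary method that can be invoked again on an existing instance, so "initialized only once" is not automatic; a class that wants it needs something like the explicit guard below. `Account` and the `_initialized` flag are hypothetical names used for illustration only.

```python
class Account:
    def __init__(self, balance):
        # Guard: refuse to run initialization a second time on the same instance.
        if getattr(self, "_initialized", False):
            raise RuntimeError("Account is already initialized")
        self.balance = balance
        self._initialized = True
```

Without the guard, calling `account.__init__(0)` on a live object would silently reset its state, which is the multiple-initialization problem the post attributes to Smalltalk.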
gregjor,
"Lots of people have used C++, Java..."
I would hope that C++ and Java aren't the pinnacle (1/2==0 can't be the best we can do). Also, go back 20 or 30 years and you could say "lots of people have used goto liberally and built real software", or "lots of people have killed other people and gone on to live out a full and happy life" and make just as little sense.
"my first reaction is that the programmer is trying to do too much in the constructor"
Look beyond your first reaction - Gilad is not talking about the contents of constructors.
I enjoyed this article, thanks!
My experience in recent years has been with Ruby and, this year, with JavaScript. In Ruby we have a classical model with mixins which work fine, but I've used a custom alternative in Ruby which is not very developed yet.
In JavaScript, I've been using some single inheritance with some custom mixins. For the mainstream browsers it works great, despite being a little verbose as it's not very abstracted because it could cause a performance penalty for many object creations.
Still on JavaScript, it's interesting to use a Hash which passes the parameters to the constructors of objects, so it does not enforce a parameter order or even a minimum set of parameters necessarily. Douglas Crockford promotes this usage with Hashes as parameters when one has multiple parameters or even when one is forward looking, indeed. :-)
Like him, I've come to like it as well, while in Ruby I've tended to use standard parameter filling as Ruby kind of enforces the checking which can come in handy sometimes, while in JavaScript there's even less checking anyway.
I want to tell you that people are having lots of fun with JavaScript using a similar model to mine. People in JavaScript mix "objects" all the time, for instance. And some inheritance frameworks are appearing in JavaScript. I have my own as well which is very practical in nature and at its core has some standardized configuration of classes' objects, as it's a repetitive process. And in every class I have what I called a "gut" variable which holds the inner variables/methods of the class, even though "gut" is public (and has a horrid name to remind me of not relying on it too much on external classes. hehe).
Cheers.
Hi Gilad,
You should know better than speaking to an audience of developers and not showing them any code samples :-)
Please show some code, because as far as I've seen, none of Smalltalk or Scala solve any of the static factory deficiencies that you pointed out. I would love to be proven wrong.
As for the static factory, it's actually quite easy to make it play nice with polymorphism and make it overridable so that it will return subclasses (possibly mocks or specialized versions) of the same class, and I'm quite sure you already know that, so I don't think that your description of the Java weakness is justified.
But let's see some code so we can be sure we are talking about the same thing.
Looking forward to your follow-up.
--
Cedric
Joao,
The problem with Hash-initialized constructors is the loss of static typing, which makes the code more error prone. A better approach is to fully support named parameters so that the compiler can enforce type safety. And by the way, this can be easily emulated in Java with the following, albeit very rarely used in Java, idiom:
new Window().background(RED).width(200).height(100);
@Cedric
I would actually prefer a builder approach, so at initialization time you can be safe to have all the variables needed. E.g.:
WindowBuilder.background(RED).width(200).height(100).build();
The problem I see with the initialization-method idiom and the above code is to not being able to ensure completeness for construction.
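As a rough sketch of the completeness check a builder makes possible (in Python, with hypothetical names `Window`, `WindowBuilder` and `REQUIRED`; the builder is instantiated here rather than used statically as in the one-liner above):

```python
class Window:
    def __init__(self, background, width, height):
        self.background = background
        self.width = width
        self.height = height


class WindowBuilder:
    """Collects settings step by step; build() refuses to produce an incomplete Window."""

    REQUIRED = ("background", "width", "height")

    def __init__(self):
        self._values = {}

    def background(self, color):
        self._values["background"] = color
        return self

    def width(self, pixels):
        self._values["width"] = pixels
        return self

    def height(self, pixels):
        self._values["height"] = pixels
        return self

    def build(self):
        missing = [name for name in self.REQUIRED if name not in self._values]
        if missing:
            raise ValueError("missing settings: " + ", ".join(missing))
        return Window(**self._values)
```

The chained setters alone cannot tell when the caller has finished; the final `build()` call is the one place where all required values can be checked before any `Window` exists.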
With factories, the factory itself usually is a kind of builder class having a no-arg-constructor. The whole factory-class approach would be unnecessary, if one could define classes or class interfaces (which would enable definition of factory methods).
@Ricky
Sorry, a bit late: in Smalltalk, if your superclass defines new: for creating an instance, subclasses would have to override new: to control it being called (otherwise the superclass class-method will be called due to full inheritance). As there is no constraint on how to name a class method that creates an instance, the number of such methods accumulates, and changes to a class in the middle of a hierarchy by adding a new constructor method could cause problems.
If one sees inheritance as being harmful, though, maybe a new language not allowing inheritance but mixins for reuse might work better?
@ricky:
"I would hope that C++ and Java aren't the pinnacle (1/2==0 can't be the best we can do). Also, go back 20 or 30 years and you could say "lots of people have used goto liberally and built real software", or "lots of people have killed other people and gone on to live out a full and happy life" and make just as little sense."
Didn't your mother teach you better manners than that? And didn't you learn the forms of rhetoric in school?
"Look beyond your first reaction - Gilad is not talking about the contents of constructors."
Great, that clears everything up. Thanks!
Define "mainstream object oriented programming". There's C++, Java, C# then ... what? Python? Ruby?
If you consider Python a mainstream language then there are several corrections to your comment.
- the Python constructor does not need to call the superclass constructor.
- "What happens if we find that we actually need to return an instance of a class other than C?" Then either 1) make C be a function. There's no special "new" syntax required so you can't tell if "C" is a class constructor or a factory function. Or 2) change how __new__ works, which is the special method to create an object instance, later followed by __init__, which usually does the initialization. Oops, and I should also mention 3) use a special metaclass.
If the problem is "C++, Java, or C# constructors considered harmful" then say so. Or perhaps it's that "function calls and class instantiation should be the same". But the problem is not general to OO systems, that I can tell.
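Option 2 can be sketched as follows. The class names and the negative-value rule are hypothetical and chosen only to give `__new__` something to decide on. When `__new__` returns an object that is not an instance of the class, Python skips that class's `__init__`.

```python
class Placeholder:
    """Hypothetical stand-in returned in place of a real C."""

    def __init__(self, x):
        self.x = x


class C:
    def __new__(cls, x):
        if x < 0:
            # Not an instance of C, so Python will not call C.__init__ on it.
            return Placeholder(x)
        return super().__new__(cls)

    def __init__(self, x):
        self.x = x
```

Callers keep writing `C(x)` and cannot tell from the call site which kind of object they will get back.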
gregjor, I don't consider it bad manners to point out that something makes no sense.
Gilad said: "Suppose we have a constructor C(x) and code that creates objects by calling it. What happens if we find that we actually need to return an instance of a class other than C?"
I don't see how he's talking about the contents of the constructor. He's talking about the limitation that a constructor cannot control its own return type or whether it actually instantiates anything or not (perhaps it should reuse an existing instance, etc.). In other words, a constructor is too low-level. A static method is more flexible.
I co-lead PicoContainer (Java), and I've just chatted to Rod-Johnson (co-leads Spring Framework) and we've never heard of a 'so-called' DIF as an acronym nor 'Dependency Injection Framework' as a collective term for PicoContainer, Spring, Guice, Hivemind etc. It might be a newer term, coined since 2003 when we met (in small committee) and coined the term 'Dependency Injection' and its relation to Inversion of Control (IoC).
Your article makes good points about Ctors being final, and unspecified contractually, but nobody has a problem with this. They are discerning as to when to use new, a factory, or a DIF (oops!). Outside of the DI world, ServiceLocator and friends can provide alternate impls if that's what you really need. The aforementioned frameworks also provide nice AOP style ways of intercepting instantiation for all sorts of shenanigans.
Paul,
1) "We've never heard of .. DIF as an acronym".
I believe you. I used an acronym, since Dependency Injection is such a mouthful. You yourself noted how you slipped into using the acronym in your post, so I assume you acknowledge the value of a pithy way of referencing the concept.
I believe I've seen "Dependency Injection framework" elsewhere, so I'll just take credit for the acronym.
Since you take credit for the term "Dependency Injection" itself, maybe you can explain what dependencies are being injected? It seems they are being eliminated, not injected. Or maybe you are eliminating them by lethal injection :-)
2. "Nobody has a problem with this". Well, I do. And I think others do as well. The fact that you can get around these problems with enough machinery and planning doesn't mean the situation is ideal, or even good. Maybe you've seen this charming video:
http://www.youtube.com/watch?v=PQbuyKUaKFo
Ruby certainly has its own problems, but clearly *somebody* is less than enamored with the Java way of doing things.
Stefan,
Your meta-interfaces proposal is definitely a step in the right direction. There were several similar suggestions over the years. No doubt you can find them in Sun's bug database - things like "static methods in interfaces" etc. Alas, even though I was pretty positive about some of them, we never pushed them through.
Hi, Gilad. I don't know if you remember me from the closure meetings or not, but I created Guice. We understand and agree fully that DIFs (and most patterns for that matter) are just workarounds for language deficiencies. Such is life, at least until you release your next language. ;)
If you haven't gotten a chance to look at Guice yet, please check out my talk. I caught your "software as a service" talk at Google a few months back and was surprised by how many of the philosophies Guice shares.
Hi Bob,
Nice to hear from you again. I knew you had done Guice (Thanks to Neal Gafter's blog, I think). Now I watched your talk as well. The issues of separate compilation, modularity, loose coupling are indeed themes that come up over and over. My interest in message based programming languages is tied to this - it makes everything loosely coupled. More later.
@ricky
Here's what doesn't make sense:
Gilad said: "Suppose we have a constructor C(x) and code that creates objects by calling it. What happens if we find that we actually need to return an instance of a class other than C?"
Constructors in Java, C++, and most other languages don't return an object or a reference to an object; the return type is void. The new operator (or its equivalent) creates the object, then calls the constructor to initialize the object. The plain fact that constructors are optional clearly shows the difference between creating an object and optionally initializing it.
I don't see how he's talking about the contents of the constructor. He's talking about the limitation that a constructor cannot control its own return type or whether it actually instantiates anything or not (perhaps it should reuse an existing instance, etc.). In other words, a constructor is too low-level. A static method is more flexible.
Another interpretation is that someone is not understanding what constructors are for. Constructors (and destructors) are hooks that the object create/destroy mechanism of the language calls; they are not required. Since constructors don't create objects in the first place, arguing that they aren't very good at it is, as you say, nonsense.
I agree that in some circumstances factory functions that create objects are better than the new/constructor combination. And factories can be implemented as static class functions. No argument there. But pointing out that in some cases factory functions make the most sense doesn't demonstrate that constructors are flawed or "harmful."
gregjor,
You cannot make a call to 'new X()' return a Y, where Y extends X. You can make a call to X.newInstance() return a Y. That's how constructors are inflexible. Put one in your API and you make it hard to change implementation details later.
@Ricky: The problem with using a factory method is that you cannot subclass properly. To create an instance of the super class being properly initialized, the subclass must somehow be able to create an instance of itself where the inherited features are properly initialized, too. And it obviously cannot call the superclass' factory method.
Of course, one could argue that inheritance is considered harmful. But in that case, Constructors wouldn't matter anymore, as it becomes mandatory to use factory classes/methods who "know them all".
One of the reasons that I consider subclassing bad in Java is the dependency on superclass' constructors. I don't know whether you've (Stefan) ever done OO-like code in C. You can subclass by creating a struct (the subclass) that has another struct as its first member (the superclass), directly, not a pointer, and then you can cast pointers to instances of the subclass to be instances of the superclass.
Java and C++ work in a similar way, but you can't see that so easily. In other words, all subclassing is is the embedding of one object in another (not composition, as no pointer is used). If you altered this model somewhat so that subclassing actually did use a pointer, then I don't see a reason why you couldn't 'subclass' an arbitrary object returned from a factory method. I'm not sure that the word subclass would apply there (subobject?).
The superobject could be passed via the constructor, even as part of an anonymous class:
public class Sub extends !Super
{
    public Sub(Super soup)
    {
        new(soup);
    }
}
or
new soup()
{
    blah
}
where soup is a reference to the superobject.
I might experiment with this in Lisp sometime - it's quite a good language for language experimentation.
@ricky:
You cannot make a call to 'new X()' return a Y, where Y extends X.
Right. That's what 'new Y()' is for. None of that has anything to do with constructors per se, though, because 'new X()' can't return an instance of type Y whether class X has a constructor or not.
You can make a call to X.newInstance() return a Y.
Right... we agree that factory functions are the usual solution to the problem you're describing, and that factory functions can be written as static methods of a superclass.
That's how constructors are inflexible. Put one in your API and you make it hard to change implementation details later.
I'm still not persuaded that this has anything to do with constructors. I agree that constructors in most languages create a dependency on the superclass that you may not want, but that's an implementation detail. Not all languages require a call to the superclass constructor or make one for you, like Java does.
The frustration Gilad expressed in the original post seemed to be about inflexibility when instantiating objects. The issues you are raising are more akin to the fragile base class problem. Both are inherent limitations of OO languages.
gregjor,
"Right. That's what 'new Y()' is for."
new Y() doesn't help if your callers are calling new X(), but you want them to get a Y.
"I'm still not persuaded that this has anything to do with constructors."
Gilad seemed to make two main points - that calling constructors is inflexible, and that the model of having to call the superclass constructor is bad. You seem to have misunderstood me on the former, or missed Gilad on it. That's the only point I've been talking about. We probably agree on that one.
This looks to be a problem of class-based OO, not prototype-based. I like the Smalltalk way because it is just objects and messages; the constructor is just a method of an object. It is true that you can't ensure proper initialization (it is up to you) or that initialization occurs only once, but that is because initialization is a method, not some special portion of code. I don't mind living with these problems.
You are right in what you say about the factory class being a subclass of the superclass factory class. I think that they shouldn't be related by inheritance.
Great post!
Seems to me the real problem is where and how should we do composition and at what level. Java constructors and classes seem like assembly language type constructs. If we could have something higher level for composition, then we could do more metaprogramming.
All the above are programming technique specific objections...mine are more general. See
http://existentialprogramming.blogspot.com/2010/04/class-constructors-considered-harmful.html
For the Java/Class vs Javascript/prototype stuff, check out ExistentialProgramming.com
http://existentialprogramming.blogspot.com/search?q=javascript
Hi Bruce,
Yes, my objections are programming technique related. I tend to take a technical view of these issues and always have - even 20 years ago, when I was more enamored of philosophy than I am today.
I find that philosophy is what we use when we haven't a clue. Philosophy is in constant retreat as science expands its reach, and true understanding replaces sophisticated speculation. Physics has pretty much subsumed natural philosophy, for example. Philosophy can lead to real insight, but it happens very rarely.
So, if you can formulate interesting programming constructs, idioms or methodologies based on philosophy, more power to you.
I read your posts with interest, but I don't think you're there yet.
Gilad is making the correct point that in best design practice composable modules (i.e. APIs) should expose abstract interfaces but not their concrete classes. Thus new becomes impossible, because there is no concrete class to instantiate. Apparently Gosling realized this.
Static factories in abstract interfaces can accomplish this with type parametrization.
Gilad Bracha wrote:
The standard recommended solution is to use a static factory [...] You can’t abstract over static methods
Static methods can be abstracted (over inheritance) with type parametrization, e.g.
interface Factory<+Subclass>
{
newInstance() : Subclass
}
where the + declares that Factory can be referenced (assigned) covariantly due to Liskov Substitution Principle. The + is unnecessary in the above example, but exists in the syntax generally to perform checking against LSP.
Thus any API can abstract over any publicly exposed Factory[T] by associating them privately with instances of Factory[InheritsFromT].
On the topic of concrete implementation inheritance and constructors, the Scala-like mixins can handle this, because external parameters for constructors are not allowed in a trait implementation. Thus each mixin is detached from the external inheritance order. I have taken this a step further in my design for Copute, because I force the separation of purely abstract interface and purely concrete mixin implementation, thus every mixin has to inherit from an abstract interface and can only be referenced as a type via that interface, i.e. concrete mixins are not types in any scope other than the mixin declaration.
Very interesting read. I agree with the problems outlined here, but on the other hand I don't get your solution to them. I don't know much about Smalltalk metaclasses but will definitely take a look at them soon.
But based on what I know about Newspeak, we can do the following:
1. Make classes to be a first class values so we can abstract over classes used in particular module.
2. Make the constructor an instance method of the class object instance.
Do I understand you correctly?
Side note: funny to say, but it looks like JavaScript is the only mainstream language where you can implement a similar thing, since every object is created through function invocation and you can easily abstract over these functions.
Q: Does newspeak constructor allow you to return arbitrary object?
Hi Aliaksandr,
Yes, you understand things correctly it seems to me.
The solution described in the post is the Newspeak solution. You don't have to do anything special in Newspeak. Each class has a primary factory which creates properly initialized instances of it.
The key thing is that the factory is always called as an ordinary method on an object which might be the class or it might be something else. You don't know if you are calling the primary factory or another class method or an arbitrary instance method.
Gilad that is the unit function in the Monad or Applicative.
Shelby,
If by "that" you mean that unit acts as a constructor, sure. Indeed, monads can be viewed as objects that implement a particular interface and contract which you might call Monad or Bindable or Comprehensible (as in "usable by comprehensions") or just AbstractCollection.
None of which justifies the religious awe in which they are held.
When my hate mail supply drains out, I might write this up. Or maybe better to have Erik Meijer explain this to people more politely than I do.