Room 101: Object Initialization and Construction Revisited

Wednesday, August 15, 2007

Object Initialization and Construction Revisited

In my last post, which discussed object initialization and construction, I had promised to come back to the topic and clarify it with concrete examples. I've finally found time to do that; hopefully I will dispel some of the misunderstandings that the last post engendered, no doubt replacing them with fresh, deeper misunderstandings.

Below is a standard example - a class representing points in the plane. What’s non-standard is that it is written in Newspeak, an experimental language in the style of Smalltalk, which I and some of my colleagues are working on right now. In cases where the syntax is non-obvious, I’ll use comments (Pascal style, like so: (* this is a comment *)) to show how a similar construct might be written in a more conventional (and less effective) notation.

class Point2D x: i y: j = ( 
(* Javanese version might look like this : 
   class Point2D setXY(i, j) { ...} *)
(* A class representing points in 2-space *)
|
  public x ::= i.
  public y ::= j.
|
) (    (* instance side *)

  public printString = (
    ^ ' x = ', x printString, ' y = ', y printString
  ) 
)

This declaration introduces the class Point2D. The class name is immediately followed by a message pattern (method signature for readers of Javanese) x: i y: j. This pattern describes the primary constructor message for the class. The pattern introduces two formal parameters, i and j, which are in scope in the class body. The result of sending this message to the class is a fresh instance, e.g.:

Point2D x: 42 y: 91 
(* In Javanese, you might write Point2D.setXY(42, 91);  
   But don’t even think of interpreting setXY as a static method!
*)

yields a new instance of Point2D with x = 42 and y = 91. The message causes a new instance to be allocated and executes the slot initializers for that instance, in this case

x ::= i.
y ::= j.

The slots are accessed only through automatically generated getters (x and y) and setters (x: and y:).

How is all this different from mainstream constructors?
Because an instance is created by sending a message to an object, and not by some special construct like a constructor invocation, we can replace the receiver of that message with any object that responds to that message. It can be another class (say, an implementation based on polar coordinates), or it can be a factory object that isn’t a class at all.

Here is a method that takes the class/factory as a parameter:

makePoint: pointFactory = (
(* In Javanese: 
   makePoint(pointFactory) {
     return pointFactory.setXY(100, 200)
   } 
*)
  ^pointFactory x: 100 y: 200
)

We can invoke this so:

makePoint: Point2D

but also so:

makePoint: Polar2D

where Polar2D might be written as follows:

class Polar2D rho: r theta: t = (
(* A class representing points in 2-space *)
|
  public rho ::= r.
  public theta ::= t.
|
) (   (* instance side *)
  public x = ( ^rho * theta cos) (* emulate x/y interface *)
  public y = (^rho * theta sin)
  ...
  public printString  = (
    ^ ' x = ', x printString, ' y = ', y printString
  )
) : (  (* class side begins here*)
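  public x: i y: j = (
    | r t |
    (* Cartesian to polar: r = sqrt(x^2 + y^2), t = atan2(y, x).
       arcTan: is assumed to behave like Squeak's Number>>arcTan: *)
    r := ((i * i) + (j * j)) sqrt.
    t := j arcTan: i.
    ^rho: r theta: t
  )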
)

Here, Polar2D has a secondary constructor, a class method x:y:, which will be invoked by makePoint:.

You cannot do this with constructors or with static factories; you simply cannot abstract over them.

You could use reflection in Java, passing the Class object as a parameter and then searching for a constructor or static method matching the desired signature. Even then, you would have to commit to using a class. Here we can use any object that responds to the message x:y:.
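To make the reflective route concrete, here is a minimal sketch in actual Java rather than Javanese. The names ReflectiveMakePoint and the nested Point2D are hypothetical stand-ins invented for this illustration; only the java.lang.reflect calls are standard API.

```java
import java.lang.reflect.Constructor;

class ReflectiveMakePoint {
    // Hypothetical Java stand-in for the post's Point2D.
    public static class Point2D {
        public final double x;
        public final double y;

        public Point2D(double x, double y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public String toString() {
            return " x = " + x + " y = " + y;
        }
    }

    // The "factory" parameter has to be a Class; an arbitrary object will not do.
    static Object makePoint(Class<?> pointClass) {
        try {
            // Search for a public constructor with the desired signature.
            Constructor<?> ctor = pointClass.getConstructor(double.class, double.class);
            return ctor.newInstance(100.0, 200.0);
        } catch (ReflectiveOperationException e) {
            // Missing constructor, access failure, or an exception thrown by the constructor.
            throw new IllegalStateException("no usable (double, double) constructor", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(makePoint(Point2D.class));
    }
}
```

Nothing here is checked until run time, and a class that offers a static factory instead of a matching constructor would need a second lookup path (for example via getMethod).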

Using Java core reflection in this case is awkward and verbose, and historically hasn’t been available on configurations like JavaME. And it doesn’t work well with proxy objects either (see the OOPSLA 2004 paper we wrote for details). What’s more, you may not have the right security permissions to do it. The situation is not much better with the VM from the makers of Zune (tm) either.

Zune is a trademark of Microsoft Corporation. Microsoft is also a trademark of Microsoft Corporation. But GNU’s not Unix

Alternatively, you could also define the necessary factory interface, implement it with factory classes, create factory objects and only pass those around. You’d have to do this for every class of course, whether you declared it or not. This is tedious, error prone, and very hard to enforce. The language should be doing this for you.
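A rough sketch of that factory-interface approach, written in the explicit style Java offered when this was posted, might look like the following. All names (FactoryBoilerplate, Point, PointFactory, fromXY, and the two factory classes) are hypothetical and chosen only for illustration.

```java
class FactoryBoilerplate {
    // Common interface mirroring the x/y protocol of the post's points.
    interface Point {
        double x();
        double y();
    }

    static final class Point2D implements Point {
        private final double x, y;
        Point2D(double x, double y) { this.x = x; this.y = y; }
        public double x() { return x; }
        public double y() { return y; }
    }

    static final class Polar2D implements Point {
        private final double rho, theta;
        Polar2D(double rho, double theta) { this.rho = rho; this.theta = theta; }
        public double x() { return rho * Math.cos(theta); }
        public double y() { return rho * Math.sin(theta); }
    }

    // The factory interface that has to be declared by hand...
    interface PointFactory {
        Point fromXY(double x, double y);
    }

    // ...plus one factory class per point class.
    static final class Point2DFactory implements PointFactory {
        public Point fromXY(double x, double y) { return new Point2D(x, y); }
    }

    static final class Polar2DFactory implements PointFactory {
        public Point fromXY(double x, double y) {
            // Convert Cartesian coordinates to polar form.
            return new Polar2D(Math.hypot(x, y), Math.atan2(y, x));
        }
    }

    // The analogue of makePoint: it only ever sees factory objects.
    static Point makePoint(PointFactory pointFactory) {
        return pointFactory.fromXY(100, 200);
    }

    public static void main(String[] args) {
        Point a = makePoint(new Point2DFactory());
        Point b = makePoint(new Polar2DFactory());
        System.out.println(a.x() + ", " + a.y());
        System.out.println(b.x() + ", " + b.y());
    }
}
```

Two point classes already require an interface and two extra factory classes, and nothing stops other code from bypassing them with a direct new Point2D(...).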

So far, we’ve shown how to manufacture instances of a class. What about subclassing? This is usually where things get sticky.

Here’s a class of 3D points:

class Point3D x: i y: j z: k = Point2D x: i y: j (
(* A class representing points in 3-space *)
| public z ::= k. |
)  (* end class header *)
(  (*begin instance side *)
   public printString = (
    ^super printString, ' z = ', z printString
  )
)

One detail that’s new here is the superclass clause: Point3D inherits from Point2D, and calls Point2D’s primary constructor. This is a requirement, enforced dynamically at instance creation time. It helps ensure that an object is always completely initialized.

Unlike Smalltalk, one cannot call a superclass’ constructors on a subclass. This prevents you from partially instantiating an object, say by writing:

Point3D x: 1 y: 2  (* illegal! *)

without initializing z as intended. Also, unlike Smalltalk, there’s no instance method that does the initialization on behalf of the class object. So you cannot initialize an object multiple times, unless the designer deliberately creates an API to allow it. The idea is to ensure every object is initialized once and only once, but without the downsides associated with constructors.

Preventing malicious subclasses from undermining the superclass initialization takes care. We’re still considering potential solutions. The situation is no worse than in Java, it seems, and we may be able to make it better.
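For the Java side of that comparison, the following sketch shows how an overriding method can run while the superclass is still being constructed, before the subclass has initialized its own state. The class names (InitOrderDemo, Base, Derived) are hypothetical and exist only for this illustration.

```java
class InitOrderDemo {
    static class Base {
        final String description;

        Base() {
            // Calls an overridable method before any subclass state exists.
            description = describe();
        }

        String describe() {
            return "base";
        }
    }

    static class Derived extends Base {
        private final String label;

        Derived(String label) {
            super();            // Base() runs first and calls describe()
            this.label = label; // assigned only after Base() has finished
        }

        @Override
        String describe() {
            return "label=" + label;
        }
    }

    public static void main(String[] args) {
        Derived d = new Derived("z");
        System.out.println(d.description); // label=null
        System.out.println(d.describe());  // label=z
    }
}
```

The override observes label as null during construction, which is exactly the kind of hole a careless or hostile subclass can exploit.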

A different concern is that the subclass must call the primary constructor of the superclass. So what happens when I want to change the primary constructor? Say I want to change Point2D to use polar representation. Can I make rho:theta: the primary constructor? How can I do this without breaking subclasses of Point2D, such as Point3D? We can't do it directly yet (though we should have a fix for that before too long), but I can redefine Point2D as follows:

class Point2D x: i y: j = Polar2D rho: ... theta: ... ()()
: ( (* class side begins here *)
(* secondary constructor *)
  public rho: r theta: t = (
    ^x: r * t cos y: r * t sin
  )
)

Now anyone who uses a Point2D gets a point in polar representation, and the existing interface is preserved. And anyone who wants to can of course create polar points using the secondary constructor. I can also arrange for that constructor to return instances of Polar2D directly:

public rho: r theta: t = (
  ^Polar2D rho: r theta: t
)

If you find this interesting, you might want to read a short position paper I wrote for the Dynamic Languages Workshop at ECOOP. It only deals with one specific issue regarding the interaction of nested classes and inheritance, and it’s just a position paper describing work in progress, but if you’ve gotten this far, you might take a look.

I still haven’t explained why I see no need for dependency inversion frameworks. The short answer is that because Newspeak classes nest arbitrarily, we can define a whole class library nested inside a class, and parameterize that class with respect to any external classes the library depends on. That probably needs more explanation; indeed, I think there’s a significant academic paper to be written on the subject. Given the length of this post, I won’t expand on the topic just yet.

14 comments:

Unknown said...: Interesting proposal...

You suggested that subclasses would not inherit superclasses constructors. I wonder how does this mix with traditional Smalltalk's metaclasses?; 8/18/2007 2:55 PM
Stefan Schulz said...: Correct me, if I am wrong, but from what I understood, you took the constructor policy from Java and applied it (kind of) to a Smalltalk like idiom. With one exception, that one has to start with a specific "constructor". If that is the case, I am not sure to have understood the relation to your previous post.
The lesser important thing, to me, is that you change the interpretation from calling a method to sending a message, which is not that much different to reflectively calling a method: the receiver has to interpret the message (may be automated by looking up a signature, i.e. reflection) and jump into the according code. Of course, this may be optimized. If I remember correctly, in Smalltalk one can fetch such messages to implement ones own reception mechanism (within metaclasses). Of course, signatures aren't that complicated in Smalltalk but rather symbols. So a hashed lookup is quickly done.; 8/20/2007 9:49 AM
Gilad Bracha said...: Rafael:

The metaclass hierarchy in Newspeak needs to be almost flat. Every class has its own metaclass, as in Smalltalk, because there are class methods. However, all metaclasses are subclasses of a common class.; 8/20/2007 10:02 AM
Gilad Bracha said...: Stefan:

You seem to have missed the point. Both posts are centered around the fact that traditional constructors cannot be properly abstracted over - even at the reflective level, they are special and distinct from methods.

Instead of a constructor we use an ordinary message (or virtual method), which means that all the abstraction mechanisms in the language work with the construction mechanism.

What is different from Smalltalk (and influenced by Java) is that you declare these "constructors" specially, so we can prevent partial or multiple initialization.; 8/20/2007 10:10 AM
Stefan Schulz said...: Well, yes, it may be me.
So, generally speaking, you are introducing a specific modifier for Smalltalk methods to be non-heritable (but neither private nor final), which not necessarily has to be restricted to constructors, in my opinion.; 8/21/2007 5:20 AM
Stefan Schulz said...: Well, yes, it may be me. :)
So, generally speaking, with respect to Smalltalk, you are introducing a specific modifier for methods to be non-heritable (but neither private nor final in the sense of Java), which not necessarily has to be restricted to constructors, in my opinion.; 8/21/2007 5:23 AM
Gilad Bracha said...: Stefan,

There is no such modifier at this time (I rather doubt I'll ever add it, but who knows). In terms of implementation, there are a number of ways of doing this.

One might use a mechanism similar to what you suggest (special methods used only by the implementation), but that depends on your underlying machinery. Newspeak runs on top of Squeak now, but it doesn't have to.; 8/21/2007 8:37 AM
none said...: You always compare with Java, but it seems to me, looking at the code examples that Newspeak is much more dynamic in nature.

In particular, is there a way to ensure at compile-time that the pointFactory argument actually handles x:y: messages? It seems to me this would not be possible, and in that case, Newspeak falls into the dynamically-typed space.

I think I must be missing something here. Is there some kind of type inferencing that can be applied?; 8/21/2007 7:25 PM
Gilad Bracha said...: Bono

Yes, Newspeak is dynamically typed. We plan on adding pluggable types, but it's pretty low priority at the moment.; 8/21/2007 8:49 PM
Unknown said...: Since I was bitten before by Java's approach to constructors, I find this idea very appealing. Does Newspeak allow the constructing code in the super class to call overridden methods? If so, how do you solve the problem of the partially initialised sub class data?

Slightly off topic but still interesting: Why call it Newspeak? Do you plan to have less keywords every year?; 8/24/2007 10:05 AM
Gilad Bracha said...: Michael,

Yes, you can call any method you want in an initializer, and any non-private method can be overridden (and privacy isn't implemented yet, but we'll get there).

This implies that we have no protection against malicious subclasses undermining initialization. The post says that explicitly. One answer is careful coding to use only private methods in initializers. This is recommended practice for security sensitive code in Java.

I plan to return to this problem and investigate alternatives. It seems easy to implement a policy
whereby overriding doesn't take effect during initialization, for example.

The nice thing about Newspeak is that we can change it (and in particular, *shrink* it, as you imply). Newspeak is a work in progress, evolving in versions over time, like its namesake.

Why is it called Newspeak? What else would you discuss in Room 101?; 8/24/2007 8:22 PM
Matt Hellige said...: Regarding the "very nice academic paper" you mention at the end, it sounds very much like the approach described in "Scalable Component Abstractions" by Odersky and Zenger at OOPSLA 05 (which is indeed a very nice academic paper). I agree that this is a great approach when compared to dependency injection (and certainly when compared with traditional static references), but I'm not completely sure whether your idea is the same or different. If you're familiar with their work, could you perhaps clarify?

Thanks!; 9/04/2007 2:38 PM
Gilad Bracha said...: Matt,

Thanks for the excellent comment.

I certainly know of the work on components by Matthias Zenger and Martin Odersky. Yes there is a lot of similarity. I've discussed related issues with Martin many times; occasionally with Matthias, but much less, mainly due to lack of time/opportunity.

Overall I think we agree on most things. There are also many differences:

Their component framework is statically typed and prototype based. Mine is dynamically typed and class based. I don't treat components as a separate construct.
An exhaustive comparison will wait until I have time to write a paper.

What I'm doing is closer to Scala than to Matthias' thesis work. I've had some influence on Scala, and Scala has had some influence on me.

Scala and Newspeak are still significantly different in the handling of object construction, typing and reflection.; 9/05/2007 9:30 PM
Matt Hellige said...: Thanks! That helps, and I'll look forward to seeing a paper. The Scala work is very good, but of course there is much more exploring to do, particularly in the areas of object construction and reflection where, it seems to me, the Scala folks have said very little so far.

I'm excited to follow your progress on Newspeak!; 9/06/2007 10:25 AM
