A place to be (re)educated in Newspeak

Monday, December 31, 2007

More on Modules

My posts seem to raise more questions than they answer. This is as it should be, in accordance with the Computational Theologist Full Employment Act. In this post, I’ll try and answer some of the questions that arose from my last one.

How does one actually hook modules together and get something going? As I mentioned before, module definitions are top level classes - classes that are defined in a namespace, rather than in another class.

Defining a top level class makes its name available in the surrounding namespace. More precisely, it causes the compiler to define a getter method with the name of the class on the namespace; the method will return the class object.

Since a module definition is just a class, one needs to instantiate it, by calling a constructor - which is a class side method. Continuing with the example from the previous post:

MovieLister finderClass: ColonDelimitedMovieFinder

Now this isn’t quite realistic, since ColonDelimitedMovieFinder probably needs access to things like collections and files to do its job. So it’s probable that it takes at least one parameter itself. The typical situation is that a module definition takes a parameter representing the necessary parts of the platform libraries. It might look something like this:

ColonDelimitedMovieFinder usingLib: platform = (
|
OrderedCollection = platform Collections OrderedCollection.
FileStream = platform Streams FileStream.
|
)...



So we’d really create the application this way:

MovieLister finderClass: (ColonDelimitedMovieFinder usingLib: Platform new)

where Platform is a predefined module that provides access to the built-in libraries.

Bob Lee points out that if I change MovieLister so that it takes another parameter, I have to update all the creation sites for MovieLister, whereas using a good DIF I only declare what needs to be injected and where.

In many cases, I could address this issue by declaring a secondary constructor that feeds the second argument to the primary one.

Say we changed MovieLister because it too needed access to some platform library:

class MovieLister usingLib: platform finderClass: MovieFinder = ...


We might be able to define a secondary constructor

class MovieLister usingLib: platform finderClass: MovieFinder = (
...
): (
finderClass: MovieFinder = (
^usingLib: Platform new finderClass: MovieFinder
)
)


There are however two situations where this won’t work in Newspeak.

One is inheritance, because subclasses must call the primary constructor. I showed how to deal with that in one of my August 2007 posts - don’t change the primary constructor - change the superclass.

class MovieLister finderClass: MovieFinder = NewMovieLister usingLib: Platform new finderClass: MovieFinder
(...)


The other problematic case is for module definitions. In most cases, the solutions above won’t help; they won’t be able to provide a good default for the additional parameter, because they won’t have access to the surrounding scope. For this last situation I have no good answer yet. I will say that the public API of a top level module definition should be pretty stable, and the number of calls relatively few.

So overall, I think Bob makes an important point - DIFs give you a declarative way of specifying how objects are to be created. On the other had, it gets a bit complicated when different arguments are needed in different places, or if we don’t want to compute so many things up front at object creation time. Guice has mechanisms to help with that, but I find them a bit rich for my blood. In those cases, I really prefer to specify things naturally in my code.

Another advantage of abstracting freely over classes is that you can inherit from classes that are provided as parameters.

class MyCollections usingLib: platform = (
| ArrayList = platform ArrayList. |
)(
ExtendedArray List = ArrayList (...)
)


Now, depending what library you actually provide to MyCollections as an argument, you can obtain distinct subclasses (in fact, there’s an easier way to do this, but this post is once again getting too long). Correct me if I’m wrong, but I don’t think a DIF helps here.

You can also do class hierarchy inheritance: modify an entire library embedded within a module by subclassing it and changing only what’s needed. This is somewhat less modular (inheritance always causes problems) but the tradeoff is well worth it in my opinion.

I spoke about class hierarch inheritance at JAOO, and will likely speak about it again in one or more of my upcoming talks on Newspeak, at Google in Kirkland on January 8th, at FOOL on January 13th, or at Lang.Net 2008 in Redmond in late January.

I’m trying to make each of these talks somewhat different, but they will necessarily have some overlap. I hope that some of these talks will make it onto the net and make these ideas more accessible.

2 comments:

Paul said...

Hi Gilad,

Thanks for the speedy response. I guessed that the bit about Modules names appearing in the enclosing namespace was a clue :^)

OK, I get how it all hooks-up now. Interesting choice to use nested classes. I always thought they were a poor substitute for closures when you needed call backs in Java :^).

I see that if you can get rid of any lexical ambiguity, that nesting classes could be an ideal way of creating abstractions which are larger then a single class. I like the idea of defining modules, sub-systems, etc.

These ideas are pretty new to me. I went back and read your position paper you linked to in your august post and found it very useful. If I get time I will look up the language Beta also.

No questions at the moment, just a fuzzy head :^). I just wanted to let you know that people are actually reading this stuff.

Cheers,

Paul.

Gilad Bracha said...

Paul,

Thanks for the feedback.

One thing to be clear on is that Newspeak's nested classes are quite different from Java's.

Like Beta (and unlike Java), every instance of an outer class has its own distinct set of nested classes.

Furthermore, nested classes are always "virtual" - that is, they are dynamically bound, exactly like methods.

The idea is to keep the rules very uniform, general and simple.