Wednesday, December 09, 2009

Chased by One’s Own Tail

Functional programming languages generally allow you to program without resorting to explicit loops. You can use recursion instead. It’s more elegant and higher level. In particular, it allows one to code certain nonterminating computations (such as state machines) as (a set of mutually) recursive calls.

In order to reliably support tail calls, implementations convert tail recursion into iteration. As a result, the activation frames for the recursive calls never come into existence, preventing nasty stack overflows.
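
To make the transformation concrete, here is a minimal sketch in Java - a language that, fittingly, does not optimize tail calls. The names are invented for illustration: the tail recursive sumTo will overflow the stack for large n on a JVM, while the loop below it is exactly the iteration a tail-call-optimizing implementation would run in its place.

class TailDemo {
  // Tail recursive: the call is in tail position, but without TCO the
  // JVM allocates a frame per call, and large n overflows the stack.
  static long sumTo(long n, long acc) {
    return n == 0 ? acc : sumTo(n - 1, acc + n);
  }

  // The iteration a tail-call-optimizing implementation effectively runs:
  static long sumToIterative(long n) {
    long acc = 0;
    while (n != 0) { acc = acc + n; n = n - 1; }
    return acc;
  }
}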

I recently came across a truly brilliant post by Guy Steele arguing that tail recursion is actually essential to object oriented programming. Go read it, as well the terrific essay by William Cook that it references.

And yet, object oriented languages usually don’t do tail call optimization (TCO for short). What’s their excuse?

One excuse, especially in the Java world, is security. Specifically, stack frames carry information about protection domains that the classic J2SE model depends upon. Since I am not (and have never been) a fan of that particular model, I’ll dispense with that argument. There are better ways to get security.

A more important excuse is debugging. Stack traces are very useful for debugging of course, and if stack frames are optimized away via tail recursion, key information is lost.
I myself want complete stack traces, but I also want to write tail recursive programs and know that they will reliably work.

One possibility is that upon stack overflow, we compress the stack - eliminating any tail recursive frame that is at its return point. This should be controllable via debugging flags. The excess frames are eliminated, but only when strictly necessary. Programs that work well without tail call optimization won’t be affected. So if you think a reliable stack trace is more important than tail call optimization, you have no need for concern.

Programs that rely on tail calls might suffer performance loss in this mode, but they will still run - and be easier to debug. If this is a problem, one can specify a different configuration which optimizes tail calls away. So if you favor tail call optimization over stack traces, you need not be alarmed either.

Now let us change perspective slightly. Keep in mind that the stack compression suggested above is really just a particular form of garbage collection.

It’s good to recall that the whole notion of the call stack is simply an optimization. One can allocate activation records on the heap and dispense with a call “stack” altogether. If you did that, you could make each activation point at its caller. Tail recursive calls might produce activations that pointed not at the caller, but at the caller of the first activation of the tail recursive function. All intermediate frames would be subject to garbage collection.
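
Here is a sketch of that model, with invented names: each activation record is an ordinary heap object holding a link to the activation to return to. A tail call passes along its caller's return link rather than itself, so the intermediate activation becomes garbage.

class Activation {
  final String method;        // which method this activation runs
  final Activation returnTo;  // where to return when this activation completes

  Activation(String method, Activation returnTo) {
    this.method = method;
    this.returnTo = returnTo;
  }

  // An ordinary call: the new activation returns to this one.
  Activation call(String callee) { return new Activation(callee, this); }

  // A tail call: the new activation returns to this one's caller, so
  // nothing retains this activation and the collector may reclaim it.
  Activation tailCall(String callee) { return new Activation(callee, returnTo); }
}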

In the above model, aggressive garbage collection corresponds to traditional TCO. On the other hand, the idea of "collecting" the stack only when it runs out gives us as much debugging information as a conventional non-TCO implementation. We only lose information in situations that simply don't work without TCO.

Now let's pursue the analogy between stack frames and heap storage further. We’d like to have all our past states in the heap, so we could examine them with a time traveling debugger, but that’s usually too costly. If we are lucky enough to have a time traveling debugger at all, we must configure how much of the past is available.

The past history of the call stack (i.e., what tail recursive calls took place) is not fundamentally different. If we are chasing our own tail, we can go back in time and see it chasing us. So my ideal development environment would allow me to recover stack frames that had been eliminated by TCO if I really wanted to.

As usual, it’s easier to see the common abstraction if one considers everything, including activations, as an object.

I therefore argue that:

  1. Support for tail recursion should be required by the language specification. This is essential: if tail recursive programs only run on certain implementations, one cannot write portable code that relies on tail recursion.
  2. How tail recursion is supported is an implementation detail. Whether tail recursive activations never come into being, or exist on the stack or the heap and get collected by one policy or another, is up to the implementation.
  3. During development, the IDE should let you choose policy on retaining computation history. In particular, it should allow you to retain as much of the history of your computation as possible - on the stack and the heap (e.g., time traveling debugging, full stack traces etc.).
I’ve endeavored to support this outlook in Newspeak - but only partially so far.
The Newspeak spec states that method or closure activations are objects, and a called activation is passed a reference to its continuation activation (not its continuation!), which is either its calling activation, or the continuation activation of the caller if there is no further code to execute in the caller (i.e., the call is a tail call).

This meets requirements (1) and (2). Alas, the implementation does not yet comply (so forget about (3)). I hope we can correct this in time.

We need not compromise on either debugging or tail recursion. We do need to seek out common abstractions (like objects) and insist on better tools and languages rather than settling for the dubious services of the mainstream.

Saturday, November 21, 2009

Objects are not Hash Tables

Hash tables are objects; objects are not hash tables, even though many popular scripting languages model them that way.

Ok, so objects aren't hash tables. They aren't cats either. What are they then? What would be a good definition of the term object as used in modern programming languages? I’d start with saying that an object is a self-referential record:

point = record {
  rho = 0;
  theta = 0;
  x = fun(){return point.rho * cos(point.theta)};
  y = fun(){return point.rho * sin(point.theta)}
}

Notice how the self reference is explicit, via the name point. The example does not rely on static typing, classes, prototypes, inheritance, delegation or on whether the language is imperative or not (the example above could easily be in either). So we can discuss them without getting into all those things.

Self reference, on the other hand, is absolutely essential - a record (or struct, for those who’ve used C for too long) is not in itself an object.

Popular scripting languages introduce a variation on this theme: they replace records with hash tables. The hash tables must still be self referential of course.

Hash tables have huge advantages. The first is that they are indexed by first class values, which means you can abstract over their keys. Hence you can access an object member without statically knowing its name - the equivalent of Smalltalk’s perform:, and a close cousin of Method.invoke() and Field.get()/set() in Java. You can also iterate over the keys (getMethods() etc.). You get reflection for free.

In an imperative language, hash tables are a mutable data structure. You can add, remove or change elements. So you can do schema changes on your objects and modify their behavior. Again, all this reflection comes for free - you need not even design an API for it.
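
To see why reflection falls out for free, here is a toy version of the hash table model in Java; all the names are invented. Members live in an ordinary map, so looking up a member by a computed name, iterating over member names, and changing the schema are just map operations.

import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class HashObject {
  // Members are just entries in a map from names to behaviors.
  final Map<String, Function<Object[], Object>> members = new HashMap<>();

  // The analogue of Smalltalk's perform: - invoke a member by name.
  Object perform(String selector, Object... args) {
    return members.get(selector).apply(args);
  }

  public static void main(String[] args) {
    HashObject point = new HashObject();
    point.members.put("x", a -> 3);                // add a member at runtime
    System.out.println(point.perform("x"));        // dynamic lookup: prints 3
    point.members.remove("x");                     // a schema change
    System.out.println(point.members.keySet());    // iterate the member names
  }
}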

Therein lies the rub of course. The trick of exposing the data structures of your implementation at the language level is immensely powerful - but it does expose them, and that has very real disadvantages. It is very hard to typecheck, optimize, or (especially) make any kind of security guarantees, without losing or restricting this great power.

Hence the title of this post. There is more to objects than hash tables (even with self reference). We should not confuse objects with the data structures that might implement them.

Let’s revisit the definition at the top of this post. It has the advantage that it is general enough to fit anything that people might call an object, but it isn’t good enough. For me, an object is an encapsulated self referential record. Encapsulation is open to interpretation: some people take it to mean “bundled together” while others take it to imply some sort of data abstraction/information hiding. In the above context, I would hope it is clear that I mean the latter, as the word record already implies bundling/aggregation.

This definition excludes the objects we find in common scripting languages; these rely on closures to encapsulate data. The definition also excludes mainstream statically typed languages, where encapsulation is enforced by the type system:

class C {
  private Object internals; // so delicate, so secret
  public void slashAndBurn(C victim){ victim.internals = "rubbish"; }
}

As the above code illustrates, class or type based encapsulation does not protect one object from another. However, if your type system is strictly based on interfaces, then you do get object based encapsulation.
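
A sketch of the difference, using invented names: once the victim is typed by an interface rather than a class, there is simply no way to name its internals, so the attack above cannot even be expressed.

interface Account {
  void deposit(int amount);
}

class SavingsAccount implements Account {
  private int balance; // encapsulated per object, not just per class
  public void deposit(int amount) { balance += amount; }
}

class Vandal {
  // void slashAndBurn(Account victim) { victim.balance = 0; }
  // ...does not compile: Account exposes no internals to burn.
}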

I don’t advocate relying on a type system to ensure encapsulation: I’ve argued elsewhere against relying on mandatory type systems for security or any other crucial property.

Instead, I view encapsulation as inherent to objects themselves. Objects expose a procedural interface, independent of any type system, and explicitly define which members are public and which are hidden.

This is of course the model of objects used in the Smalltalk family of languages: Smalltalk, Self and now Newspeak. Self and Newspeak go further, and require that even the hidden members be accessed via a procedural interface.

Needless to say, this doesn’t imply losing the reflective power of the “hash table languages”. It does, however, force us to come up with a reflective API. Having a reflective API imposes structure that is lacking in typical scripting languages; this structure is a good thing. Making the reflective API secure is an open problem, but the fundamental approach of using mirrors means it is at least possible - but that is for another post.

Saturday, October 31, 2009

Atomic Install

A few months ago, I wrote a post about the become: operation in the Smalltalk family of languages. The post elicited a good discussion. One of the interesting points that came up, was that you could implement become: by changing the class of objects, provided all the changes were atomic - changing the class, modifying the object’s schema accordingly, and copying the data between the objects.

Setting the discussion of become: aside, I observe that in general, reflective change should be atomic. What I mean by this, is that one should be able to group an arbitrary number of distinct reflective changes to the program, and install them into the running program all at once, in one atomic operation. I will use the term atomic install to refer to this ability.

Oddly enough, this point is not well understood. Most languages that support reflective change do not provide a way for atomically performing a set of changes.

Smalltalk provides a reflective API centered around classes. You can ask a class to modify itself in a variety of ways - adding or removing variables, methods or changing its superclass. There is, however, no way to change several classes at once.

The CLOS APIs are similar in this respect.

Scripting languages don’t really have a reflective API as such. Instead, the reflective changes may come about as the result of executing some code (e.g., an assignment may add a variable). In other cases, the reflective capacity of the language comes about directly from exposing the data structures of the language implementation back to the language.

In all these cases, one cannot perform a set of program changes as an atomic unit.

Why do you need to atomically apply a set of changes instead of sequentially applying one change after another?

One reason is that you’d like to apply the changes as a transaction. If a change fails (say, because you created a cycle in the class graph, or duplicated an instance variable) you don’t want any further changes to take place. It’s easier if you don’t have to catch exceptions etc. after each step.

Another reason is that the changes are often dependent on each other. Applying one change without the other leaves your program in a broken state. The fact that intermediate states may be inconsistent even though the overall set of changes is correct means that it isn't sufficient to wrap a series of reflective changes in a transaction.
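
Here is a minimal sketch of what I mean, in Java; ProgramState and Change are invented stand-ins for real reflective state. All validation and construction happens off to the side, and a single reference swap publishes either all of the changes or none of them.

import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

interface ProgramState {
  ProgramState apply(Change c); // throws if the change is inconsistent
}

interface Change {}

final class Installer {
  private final AtomicReference<ProgramState> current;

  Installer(ProgramState initial) { current = new AtomicReference<>(initial); }

  // Either every change is installed, or none are: the new state is
  // built privately and becomes visible only via the final swap.
  void atomicInstall(List<Change> changes) {
    ProgramState base = current.get();
    ProgramState next = base;
    for (Change c : changes) next = next.apply(c); // validate as we build
    if (!current.compareAndSet(base, next))
      throw new IllegalStateException("concurrent install; retry");
  }
}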

Of course, most people don’t rely on reflective modification to develop their programs. Rather, they suffer through the classic edit-compile-run-debug cycle. In the absence of anything better, you typically edit source code in files. You then compile these files and load the resulting program. This actually has one big advantage: the load is atomic - all the changes that resulted from your edits are loaded as a unit.

Evolving your program reflectively has the advantage that you can make corrections to the running program. Often, this is done during fix-and-continue debugging. Even then, in most cases, the “program” in question is an application that is stopped in the debugger, and you can apply fixes sequentially. But as noted above, it’s still easier if you can apply the changes as a transaction.

More interesting cases are when you are modifying a program, and need to run it between the individual modifications. Unlike most people, I run into this often, when modifying the IDE I am using. However, there are additional situations of this nature.

Long lived applications that must provide continuous service have this flavor. Erlang allows you to replace an entire module as a unit in these situations. In Java, you use class loaders to get the desired effect; it’s complicated, but it’s your only way out. A scenario I'm especially interested in is software services: the service updates applications on the fly without shutting them down - and in particular, I may want to update the update mechanism itself.

If you can’t make all the changes in one go, you find that you have to break the transition into a series of steps, each of which leaves the system in a consistent state while leading to the desired final program. This is tricky and error prone: you need to apply the right changes in just the right order.

Scripting languages like Ruby often go through a series of program changes during program start up. Different modules are loaded, modifying the program in the process. This process is also order dependent, and therefore brittle. In many cases, this sort of reflective change isn’t actually essential; rather it’s an artifact of the language semantics. However, I suspect there are situations where it is used to real advantage.

Overall, one can live for a long time without the ability to atomically apply a set of program changes. And yet it seems that there are some situations where atomic install seems to be very useful. It also has another advantage: batching the changes is more performant. Often one doesn’t care, but it doesn’t hurt to be faster, and on some occasions it actually can matter. One might as well see if one can come up with a reflective API that supports atomic install.

In Strongtalk, the VM supported atomic install as a primitive operation. More recently, in the latest Newspeak update release, I added an atomic install facility written entirely in Newspeak. Exactly 372 lines of code (whitespace and copious comments included). It is a tribute to the Squeak design that this is doable without any privileged access to the VM.

Tangent1: Thanks to Peter Ahe and Eliot Miranda for the discussions that led to this scheme; and to Lars Bak, for the discussions that led to the original notion of atomic install.


Tangent2: Squeak’s existing mechanism for changing classes, the ClassBuilder, is, on the other hand, rather unattractive. It’s three times as long, vastly more complicated, and provides only a subset of the functionality. It shows how tricky this kind of reflective change can get if you don’t conceptualize it the right way.

Naturally, the actual atomic step here is done using a variant of become:. Specifically, it’s a one-way become: on an array of objects. Given two object arrays a and b, both of size n, all references to a[i] are changed to refer to b[i], for i = 1 .. n.

The atomic installation process is nice and simple. The input is a list of mirrors describing mixins, and a namespace listing which of those mixins already exist in the system. The namespace parameter is crucial BTW, as there is no global namespace in Newspeak.

We then produce a fresh set of mixins based on the input list. For any existing mixins, we locate all classes that are invocations of those mixins, and all subclasses thereof. We make fresh versions of these as well, reflecting the changed mixins involved. For each such class, we locate all its instances (does your system have allInstances?) and produce new instances based on the new descriptions. Each new object is associated with the old object by keeping them in parallel arrays. Then we just do the become: and we’re done.

Unlike Smalltalk, in Newspeak we never have to recompile code that hasn’t been modified at the source level - because Newspeak code is representation independent.

It’s clearly harder to do atomic install in a system that does sophisticated JIT compilation and inlining - but as I said, Strongtalk supported it (albeit at the VM level, largely because of the extra complexity involved) in the mid-90s.

How do we get this kind of reflective power on mainstream platforms? It is harder, because the widely used platforms don’t support the needed abstractions as well as Smalltalk VMs do.

At least on the web browser, I expect to be able to get similar effects with Javascript, albeit with a very different implementation.

In Java, the absence of the necessary primitives tends to force one to build one’s own custom object representation, which is costly in both developer time and machine time.

Ironically (but not coincidentally), the bulk of the necessary machinery already exists in the Hotspot JVM, which is capable of changing at least method implementations on the fly (via JVMDI), including deoptimizing compiled code that may have inlined a method that has been modified. The problem is exposing it to the user - and especially exposing it securely. Mirrors can help here - but that is for a future post.

On .Net, the DLR helps one construct one’s own custom representation. Conversely, there’s little support for deoptimization etc. in the CLR itself.

Of course, one goal of this post is to encourage implementors to add such support, and do it in the right way.

Tuesday, October 06, 2009

An Image Problem

Imagine a library, say in Java, that provided a universal facility for saving the exact runtime state and configuration of any application. It would essentially capture the heap, all threads and their stacks, and save them to a file.

You could then load this file from disk and reconstitute your application just as it was when it was saved: For example, windows would re-open just where they were (adjusting their location if the screen size differed). The system would cope with open file handles and sockets in a reasonable fashion, re-opening them if they were available. All this would work irrespective of the underlying platform, and it would be fast as well.

The facility would be transparent to the programmer - you wouldn’t need to change your program in any way, beyond calling the library function. Pretty neat.

I don’t know of such a facility in Java or .Net. However, Smalltalk has had such a beast for over 30 years. In Smalltalk parlance, it’s called an image. Most Smalltalks are inseparably coupled to the notion of such an image.

Tangent: There are exceptions of course, notably Strongtalk.

The idea hasn’t caught on. Why? Should it catch on, and if so, how?

To be sure, the image is a very powerful mechanism, and has significant advantages. For example, in the context of the IDE, it gives the ability to save and share debugging sessions.

One advantage of images is rapid start-up. If your program builds up a lot of state as it starts, it can be intolerably slow coming up. This is one of the problems that killed Java on the client. One should avoid computing that state at startup; one easy way to do this is to precompute most of it and store it in an image. It is typically much faster to read in an image than to try and compute its contents from scratch.
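
A crude sketch of that idea in plain Java, with invented names: compute the expensive startup state once, snapshot it with ordinary serialization, and on subsequent runs read the snapshot instead of recomputing. An image generalizes this to the whole heap, all threads included.

import java.io.*;

class Snapshot {
  static Serializable loadOrCompute(File image) throws Exception {
    if (image.exists()) {
      try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(image))) {
        return (Serializable) in.readObject();     // fast path: read the snapshot
      }
    }
    Serializable state = computeExpensiveState();  // slow path: first run only
    try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(image))) {
      out.writeObject(state);
    }
    return state;
  }

  static Serializable computeExpensiveState() {
    return new java.util.ArrayList<String>();      // stand-in for real startup state
  }
}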

The problem begins when one relies exclusively on the image mechanism. It becomes very difficult to disentangle one’s program from the state of the ongoing computation.

If your program misbehaves (and they always do) and corrupts the state of the process, things get tedious. Say you have a test that somehow failed to clean up properly. In a conventional setting, you fix the problem with your code, and run the test again. With an image, you have the added burden of cleaning up the mess your program made.

In general, traditional environments tend to force you to start over with each run of the program: the edit-compile-link-load-throw-the-application-off-the-cliff-let-it-crash-and-start-all-over-again cycle, in the words of the original Java whitepaper.

Tangent: That whitepaper is a masterpiece of technical rhetoric, and was visionary in its day. Alas, Java never fully realized that vision.

The thing is, sometimes it’s good to be able to start afresh. It may be easier to start from scratch than to mutate your existing process.

Mega-tangent: incidentally, this is an argument for sexual reproduction as well.

Of course sometimes starting anew isn’t so nice: think about fix-and-continue debugging.

In some cases it is even more critical to separate your code from the computation. You often save your image just to save your program. It may take you a while to find out that your image has been corrupted. Now you need to go back to a correct image, and yet you need to extract your code safely from the corrupt image. To be sure, Smalltalk IDEs provide a variety of tools that can help you with that, but I have never been really happy with them.

Tangent: this is where irate Smalltalkers berate me about change sets and logs and Envy and Monticello etc. etc. Sorry, I don’t think it’s good enough.


In general, Smalltalk makes it hard to modularize one's code, and especially to separate the application from the IDE. The exclusive reliance on the image model greatly aggravates these difficulties.

Traditional development tools, primitive as they often are, naturally provide a persistent, stateless representation of the program. In fact they provide two: the source code, in a text file, and a binary.

Semi-tangent: source code seems the most obvious thing in the world; but traditional Smalltalks have no real syntax above the method level! Classes are defined via the evaluation of reflective expressions, which rely on the reflective API. This is very problematic: the API often varies from one implementation to another. By the way, this is one of the ways Newspeak differs from almost every Smalltalk (the late, great Resilient being the only exception I can recall). Newspeak has a true syntax. Furthermore, because Newspeak module declarations are fully parametric in all their external dependencies, they can be compiled at any time in any order - unlike code in most languages (say Java packages) where there are numerous constraints on compilation order (e.g., imports must be defined).

A binary is a stateless representation of the program code, but one that does not require a compiler to decode it. Smalltalk doesn’t usually have a binary form. Code is embedded in the image. There are exceptions, and some Smalltalk flavors have ways of producing executables, but the classic approach ties code and computation together in the image and makes it very hard to pry them apart.

None of this means you can’t have an image as well as a binary format. What is important is that you do not have just an image. Ideally, you have images and a binary format. This is one of my goals with Newspeak, and we are pretty close.

In Newspeak, serialized top level classes can serve as a binary format. I will expand on how serialization can serve as a binary format in an upcoming post. At the same time, we continue to use images, though I hope they will become much less central to our practice as time goes by.

Thursday, September 03, 2009

Systemic Overload

Type-based method overloading is a (mis)feature of mainstream statically typed languages. I am very much against it - I’ve spoken about it more than once in the past. I decided I should put my views on the matter in writing, just to get it out of my system. I’ll start with the better known problems, and work my way up to more exotic issues.

Throughout this post, I’ll use the term overloading to denote type based overloading. Arity based overloading, as in e.g., Erlang, is pretty harmless.

When introducing a feature into a language, one has to consider the cost/benefit ratio. You might argue that overloading lets you separate the handling of different types into different methods, instead of doing the type dispatch inside one method, which is tiresome and costly. A classic example is mathematical operators - things like +.

This argument would have merit, if the overloading was dynamic, as in multi-methods. Since it isn’t, overloading doesn’t solve this kind of problem. Not that I’m advocating multi-methods here - they have their own problems - but at least they are based on accurate type information, whereas overloading is based on crude static approximations.

Consider this code (loosely based on an example from Dave Ungar’s OOPSLA 2003 keynote).

class VehicleUtilities {
  int numberOfAxles(Vehicle v) { return 2; } // a plausible default
  int numberOfAxles(Truck t) { return 3; }
}

Vehicle v = new Truck();
VehicleUtilities u = new VehicleUtilities();
u.numberOfAxles(v); // returns 2

This simple example illustrates the dangerous disconnect between static and dynamic information engendered by overloading.

What exactly is the benefit of type based overloading? It saves you some trouble inventing names. That’s it. This may seem important, but the only case where it is actually needed is when writing constructors - and only because the language doesn’t allow you to give them distinct names.

Constructors are a bad idea, as I’ve already explained, so let’s assume we don’t have them. For methods, type based overloading provides a trivial rewrite and that is all. I don’t think it is that hard to give the different operations different names. Sometimes, the language syntax can make it easier for you (like keyword based method names in Smalltalk), but even in conventional syntax, it isn’t that difficult.

You pay for this convenience in myriad ways. The code above exemplifies one set of issues.

Another problem is the risk of ambiguity. In most overloading schemes, you can create situations where it is impossible to decide which method to call, and the call is therefore declared illegal. Unfortunately, as the type hierarchy evolves, legal code can become illegal, or simply change its meaning.

This means that existing code breaks when you recompile, or does the wrong thing if you don’t.

Overloading is open to abuse: it allows you to give different operations the same name. Altogether, you need style guides like Effective Java to warn you to use overloading as little as possible. Language constructs that require style guides to tell you not to use them are a bit suspect.

Ok, you probably know all this. So what contribution does this post make? Well, the systemic costs of overloading in terms of designing and engineering languages are less widely appreciated.

Overloading makes it hard to interoperate with other languages. It’s harder to call your methods from another language. Such a language may have a different type system and/or different overload rules. Or it may be dynamically typed.

You often find a dynamic language implementing multi-method dispatch to approximate the behavior of overloaded methods it needs to call. This is costly at run time, and is a burden on the language implementor.

Scala supports overloading primarily so it can call Java; you might say overloading is a contagious disease, transmitted from one language to another through close contact.

In general, overloading adds complexity to the language; it tends to interact with all sorts of other features, making those features harder to learn, harder to use, and harder to implement. In particular, any change to the type system is almost certain to interact with type based overloading.

Here are some examples. Answers at the end of this post.

Exhibit 1: auto-boxing

void foo(Number n) { ... }
void foo(int i) { ... }

foo(new Integer(3)); // quick, what does this do?

Exhibit 2: the var args feature in Java

void foo(String s, Number... n) {...}
void foo(String s, int i, Integer... n) {...}
void foo(String s, int i, int... n) {...}
void foo(String s, Integer i, Integer... n) {...}
void foo(String s, Integer i, int... n) {...}

foo("What does this do?", 1, 2);



Exhibit 3: Generics

void foo(Collection c) {...}
void foo(Collection<String> c) {...}
void foo(Collection<Boolean> c) {...}
void foo(Collection<? extends String> c) {...}
void foo(Collection<? super String> c) {...}

Collection<String> cs;
foo(cs);

/* Which of these is legal? What would happen if we didn’t use erasure? You have 3 seconds. */


Each of the above exhibits shows a specific type system extension which gets entangled with overloading.

You might say you don’t care; these are pretty sick examples, and the language designers sorted out the rules. What is their suffering to you? Well, these complications all have cost, and since resources are finite, they come at the expense of other, more beneficial things.

Tangent: People tend to think that the language design and implementation teams at large companies like Sun or Microsoft command vast resources. In reality, the resources are spread pretty thin considering the scope of the task at hand.

The more time is spent chasing down these issues, the less is spent focusing on doing stuff that actually buys you something. Products ship later and have more bugs and less desirable features, because people spent time worrying about gratuitous complications.

This is not hypothetical. I know the poor soul who wrote the spec for this might have spent his time doing something else, like day dreaming or adding closures. The compiler writers could have fixed more outstanding bugs, perhaps even reaching a state where there were no open bugs. I know they tried. The testers could have tested for other, more pressing issues and discovered more bugs to open, before shipping. These are the wages of sin - and the sin is unjustified complexity.

Now, for the sake of balance, I should say that overloading, and language complexity in general, do have one advantage I haven’t yet mentioned. They open up great opportunities for training, support and consulting. You can even write some really cool books full of language puzzlers.

It’s worth noting that this kind of overloading is only a concern in languages with a mandatory type system. If you use optional typing (or just dynamic typing), you aren’t allowed to let the static types of the arguments change the meaning of the method invocation. This keeps you honest.

Will future language designs avoid the problems of overloading? I wish I was confident that was the case, but I realize overloading is an entrenched tradition, and thus not easily eradicated.

However, the overarching point I want to make is that the costs of complexity are insidious - unobvious in the short term but pervasive over the longer haul. KISS (Keep It Simple, S - where the binding of S is open to interpretation).


Answers:

  1. The issue here is whether we auto-unbox Integer(3) first to produce an int (and call foo(int)) or resolve overloading in favor of foo(Number) and don’t unbox at all. Java does the latter. The reason is to remain compatible with older versions.
  2. This is ambiguous. Except for the first declaration, no method is more specific than the others.
  3. They all have the same erasure, and so the example is illegal. If we did not use erasure, then foo(Collection<String>) would be the most specific method.

Thursday, July 30, 2009

The Miracle of become:

One of Smalltalk’s most unique and powerful features is also one of the least known outside the Smalltalk community. It’s a little method called become: .

What become: does is swap the identities of its receiver and its argument. That is, after

a become: b

all references to the object denoted by a before the call point refer to the object that was denoted by b, and vice versa.

Take a minute to internalize this; you might misunderstand it as something trivial. This is not about swapping two variables - it is literally about one object becoming another. I am not aware of any other language that has this feature. It is a feature of enormous power - and danger.
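
Since nothing like become: exists in Java, the closest one can get is a sketch with explicit indirection - in effect a tiny object table. The names here are invented: if every reference is routed through a handle, swapping the two targets swaps the objects’ identities for all holders. The point of real become: is that the VM does this for direct references too, with no handles in sight.

final class Ref<T> {
  private T target;
  Ref(T target) { this.target = target; }
  T get() { return target; }

  // Two-way become: every holder of a now sees what b held, and vice
  // versa - but only because all access goes through the handles.
  static <T> void become(Ref<T> a, Ref<T> b) {
    T t = a.target;
    a.target = b.target;
    b.target = t;
  }
}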

Consider the task of extending your language to support persistent objects. Say you want to load an object from disk, but don’t want to load all the objects it refers to transitively (otherwise, it’s just plain object deserialization). So you load the object itself, but instead of loading its direct references, you replace them with husk objects.

The husks stand in for the real data on secondary storage. That data is loaded lazily. When you actually need to invoke a method on a husk, its doesNotUnderstand: method loads the corresponding data object from disk (but again, not transitively).

Then, it does a become:, replacing all references to the husk with references to the newly loaded object, and retries the call.

Some persistence engines have done this sort of thing for decades - but they usually relied on low level access to the representation. Become: lets you do this at the source code level.

Now go do this in Java. Or even in another dynamic language. You will recognize that you can do a general form of futures this way, and hence laziness. All without privileged access to the workings of the implementation. It’s also useful for schema evolution - when you add an instance variable to a class, for example. You can “reshape” all the instances as needed.
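
For the record, here is about as far as one gets in Java, using a dynamic proxy as the husk; the Supplier is a stand-in for loading from disk. The first call loads the real object and the call is retried against it. What’s missing is precisely become:: references to the proxy remain references to the proxy forever, so every later call keeps paying for the indirection.

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

final class Husk {
  @SuppressWarnings("unchecked")
  static <T> T lazy(Class<T> iface, Supplier<T> loader) {
    return (T) Proxy.newProxyInstance(
        iface.getClassLoader(), new Class<?>[] { iface },
        new InvocationHandler() {
          private T real; // loaded on first use
          public Object invoke(Object proxy, Method m, Object[] args) throws Exception {
            if (real == null) real = loader.get(); // the doesNotUnderstand: moment
            return m.invoke(real, args);           // retry the call on the real object
          }
        });
  }
}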

Of course, you shouldn’t use become: casually. It comes at a cost, which may be prohibitive in many implementations. In early Smalltalks, become: was cheap, because all objects were referenced indirectly by means of an object table. In the absence of an object table, become: traverses the heap in a manner similar to a garbage collector. The more memory you have, the more expensive become: becomes.

Having an object table takes up storage and slows down access; but it does buy you a great deal of flexibility. Hardware support could ease the performance penalty. The advantage is that many hard problems become quite tractable if you are willing to pay the cost of indirection via an object table up front. Remember: every problem in computer science can be solved with extra levels of indirection. Alex Warth has some very interesting work that fits in this category, for example.

Become: has several variations - one way become: changes the identity of an object A to that of another object B, so that references to A now point at B; references to B remain unchanged. It is often useful to do become: in bulk - transmuting the identities of all objects in an array (either unidirectionally or bidirectionally). A group become: which does its magic atomically is great for implementing reflective updates to a system, for example. You can change a whole set of classes and their instances in one go.

You can even conceive of type safe become: . Two way become: is only type safe if the type of A is identical to that of B, but one way become: only requires that the new object be a subtype of the old one.

It may be time to reconsider whether having an object table is actually a good thing.

Saturday, July 11, 2009

A Ban on Imports (continued)

In my previous post, I characterized imports as evil, and promised to expand upon non-evil (yea verily, even good) alternatives. First, to recap:

Imports are used for linking modules together. Unfortunately, they are embedded within the modules they link instead of being external to them. This embedding makes the modules containing the imports dependent on the specific linkage configuration the imports represent.

Workarounds like dependency injection are just that: workarounds (e.g., see this post). They are complex, cumbersome, heavyweight. OSGi even more so. Above all, they are unnecessary - provided the language has adequate modularity constructs.

So, which languages have sufficient modularity support? I know of only two such languages: Newspeak and PLT Scheme. ML has a very elaborate module system, but ultimately it does not meet my requirements.

Modules and their definitions (these are two distinct things) should be first class and support mutual recursion. This isn’t the case in ML, though some dialects do support mutual recursion.

Tangent: The difficulty in ML, incidentally, is rooted in the type system. It is very hard to typecheck the kind of abstractions we are talking about. Worse, if you want to make your type declarations modular, your modules end up having types as members. This can lead you into deep water with types of types (making your type system undecidable). To avoid that trap, ML opts to stratify the system, so that modules (that contain types) are not values (that have types).

Not surprisingly then, progress on these issues comes from the dynamically typed world. Over a decade ago, the Schemers introduced Units.

The biggest difference between Newspeak modularity constructs and units is probably the treatment of inheritance. In Newspeak our module definitions are exactly top level classes, which reduces the number of concepts while allowing module definitions to benefit from inheritance.

There are strong arguments against inheritance of module definitions. For example, you cannot reliably add members to a module definition, because they might conflict with identically named members in the heirs of that definition. Specifying a superclass (or super module definition) looks like a hardwired dependency as well.

On the other hand, being able to reuse module definitions via inheritance is very attractive. Especially if you can mix them in freely.

Ultimately, we decided that the benefits of unifying classes and module definitions outweighed the costs.

Take the argument above regarding extending module definitions with new members. Newspeak was designed with an eye toward a completely networked world, where software is a service, not an artifact. In such a world, you can find all your heirs - just as if you were working on your own private application in your IDE. So if you need to add a member to a module definition, you should be able to check who is mixing it in and what names they have added.

Tangent: This may still sound radical today, but this world is moving into place as we speak:
V8 gives the web browser the performance needed to be a platform for serious client software.
HTML 5, Gears etc. provide such software with persistent storage on the client
Chrome OS makes it obvious (as if it wasn’t clear enough before) that this in turn commoditizes the OS, and that the missing pieces will keep coming.

Likewise, in the absence of a global namespace, top level classes do not inherit from any specific superclass (and nested classes don’t either, because all names are late bound). Overall, the downside of allowing inheritance on module definitions doesn't apply in Newspeak.

The upside compared to conventional constructs is huge. It means you can easily take entire libraries, create multiple instances of them (each with its own configuration), mix them into new definitions, write polymorphic code that can work simultaneously with different instances or even different implementations of the API etc. You can store the libraries and their instances in variables, pass them as parameters, return them from computations, hold them in data structures, serialize them to disk or over the wire - all with the same mechanisms you use for ordinary classes and objects.

This economy of mechanism is important. It means you don’t have to learn a variety of specialized and complex tools to build modular systems. The same basic tools you use to implement basic CS101 examples will serve across the board. This will carry through to other areas like tooling: an object inspector can be used to inspect a “package”, for example. Altogether, your system can be much smaller - which makes it easier to learn, faster to load, likelier to fit on small devices etc. Simplicity is an advantage in itself.

As I explained in the first half of this series, the only need for a global namespace is for configuration: linking the pieces of an application together. There are several ways you can deal with the configuration/linkage issue. It’s a tooling issue. We use the IDE, as I described in an older post. So using the running example from part 1, we can write:

class SoundSystem usingPlatform: platform andPlayer: player = {
|
(* dependencies on platform might include things like the following: *)
List = platform Collections List. (* You can see how this replaces an import *)
mp3Player = player usingPlatform: platform withDock: self.
|
}{ ... }


class IPhone usingPlatform: platform withDock: dock = {
|
(* dependencies on platform elided *)
myDock = dock.
|
}{ ... }


class Zune usingPlatform: platform withDock: dock = {
|
(* dependencies on platform elided *)
theirDock = dock.
|
}{ ... }

and then create instances in the IDE (which provides us with a namespace where SoundSystem, IPhone(tm) and Zune(tm) are all bound to the classes defined above):

sys1:: SoundSystem usingPlatform: Platform new andPlayer: IPhone.

sys2:: SoundSystem usingPlatform: Platform new andPlayer: Zune.

tm: Did you know? iPhone is a trademark of Apple; Zune is a trademark of Microsoft.

Variations on the above are possible; hopefully, you get the idea. If not - well, don’t worry, I probably won’t explain it again.

The absence of a global namespace has additional advantages of course: there’s no static state, and it’s good for security (but that is for another day).

Tuesday, June 30, 2009

A Ban on Imports

This is not, of course, an essay on restricting free trade. Rather, this post is about the evils of the import clause, which occurs in one form or another across a wide array of programming languages, from Ada, Oberon and the various Modulas, through Java and C#, and on to F# and Haskell.

The import clause should be banned because it undermines modularity in a deep and insidious way. This is a point I’ve attempted to convey time and time again, with only limited success. I will now try to illustrate the problem via a hardware inspired example.

Consider the not-so-humble MP3 player. An MP3 player is a hardware module. The market is full of them, as well as other hardware modules they can plug in to. For example, sound systems where one can dock an MP3 player and have it play on stereo speakers.

Let’s try and describe the analog of such a sound system using programming language modularity constructs that rely on imports:

module SoundSystem
import MP3Player;
... wonderful functionality elided ...
end


I want to describe how my sound system works, separately from the description of how an MP3 player works. I would like to later plug in a particular MP3 player, say a Zune(tm) or an iPod(tm).

tm: Zune and iPod are trademarks of Microsoft and Apple respectively, two companies with armies of lawyers who might harass me if I do not state the obvious.

Now the first problem is that neither Zune or iPod are named MP3Player. If I want to connect my sound system to a Zune, I will have to edit the definition of SoundSystem to name the specific module I want to import.

If you’re very petty, you might say that Zune and iPod do not share a common interface and cannot be docked into the same sound system. In that case, imagine that we wish to use our sound system with an iPhone (tm) and an iPod Touch (tm) of some compatible generation.

tm: iPhone and iPod Touch are trademarks of Apple.

Say I decide to go with a Zune.

module SoundSystem
import Zune;
... wonderful functionality elided ...
end


Later I change my mind for some reason, and want to hook up my system to an iPod. It’s easy: I just edit the definition of my system again, to import iPod:

module SoundSystem
import iPod;
... wonderful functionality elided ...
end


The question you should be asking is: Why should I edit the definition of my system each time I change the configuration? In reality, it is unlikely that I am actually the designer of SoundSystem. I probably don’t even have access to its definition. I just want to configure it to work with my MP3 player.

The problem is that import confounds module definition and module configuration. Module definition describes the design of a module; module configuration describes how one hooks up different modules. The former has to do with module internals; the latter should be done externally to the modules involved, to allow them to be used in any context where they could function.

We clearly want our sound system to abstract over the specific player being plugged in to it. Any player with a compatible interface will do. A well known mechanism for abstracting things is parameterization. We might be happier if we defined our sound system parametrically with respect to the MP3 player:

module SoundSystem(anMP3Player){
... great wonders using anMP3Player ...
}


We could then configure our system to use an iPod:

SoundSystem(iPod);

or a Zune

SoundSystem(Zune);

without having to modify (or even have access to) the source code for the definition of SoundSystem. Hurray!

The module definition looks a lot like a function, and the configuration code looks like a function application. This is very suggestive. Indeed, ML introduced a module system based on function-like things called functors a quarter century ago. But there’s a bit more to this.
These hardware pieces tend to plug in to each other. For example, the definition of IPod is parametric too:

IPod(dockingStation){... even greater wonders ...}

Our sound system does its thing by behaving like a docking station. It and the MP3 player are mutually recursive modules. Configuration therefore requires support for mutual recursion (which is not allowed in Standard ML):

letrec {
dock = SoundSystem(mp3Player);
mp3Player = IPod(dock);
} in dock;


If this notation is unfamiliar, please brush up on your functional programming skills before you become unemployable. Basically, ignore the first and last line, and treat the two lines involving = as equations.

So the module definitions are a lot like functions that yield modules. You could also think of module definitions as classes yielding instances. The instances are like physical hardware modules.

Now we can add another sound system and use our old Zune:

letrec {
dock2 = SoundSystem(oldMP3);
oldMP3 = Zune(dock2);
} in dock2;

This is what is often called side by side deployment - multiple instances of the same design, configured differently.

Tangent: Yes, Virginia, you can achieve that sort of thing in Java despite imports, using class loaders. Imports hardwire names into your code, and class loaders can counteract that by letting you define multiple namespaces. These can have multiple copies of your code, potentially hardwired to different things (even though they all have the same name). If you think class loaders offer a simple, clean way of doing things that is easy to learn, use, understand and debug, this post is not for you. Nor will any amount of OSGi magic on top fundamentally change things.

We might also choose to define things differently:

module SoundSystem(MP3Player) {
player = MP3Player(self);
...
}


Here, we are passing module definitions as parameters. We are also referring to SoundSystem’s current instance from within itself - a lot like classes, no? We might configure things thusly:

SoundSystem(IPod);

So it looks like mutual recursion and first class module definitions are very natural things to have. And yet traditional languages do not support this - even though many languages have constructs like classes and functions that are first class values and can be defined in a mutually recursive fashion.
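
To make the contrast concrete, here is roughly how far one gets simulating this in Java with nothing but classes and interfaces; all the names are invented. The "module definition" SoundSystem takes the player’s definition as a constructor argument, and the mutual recursion is wired up by handing out this.

import java.util.function.Function;

interface MP3Player { void play(); }
interface Dock { void output(String sound); }

final class SoundSystem implements Dock {
  final MP3Player player;

  // The player's *definition* is a parameter; no import names Zune or IPod.
  SoundSystem(Function<Dock, MP3Player> playerDefinition) {
    this.player = playerDefinition.apply(this); // mutual recursion: the player gets its dock
  }

  public void output(String sound) { System.out.println(sound); }
}

final class IPod implements MP3Player {
  final Dock dock;
  IPod(Dock dock) { this.dock = dock; }
  public void play() { dock.output("music"); }
}

// Configuration, external to both definitions:
//   SoundSystem sys = new SoundSystem(IPod::new);

Note that even this sketch relies on the classes being found somewhere - which is where the global namespace sneaks back in.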

One problem with using these constructs to define modules is that they are usually able to access anything in the global namespace. This makes it very hard to avoid implicit dependencies.

Interestingly, the global namespace is exactly what import requires. Since we don’t need or want import, let’s do away with it and the global namespace. We clearly will get a much more modular system without it; but wait - there seems to be one place where we really want the global namespace. That is when we write our configuration code, the code that wires our modules together.

That’s fine - there are a number of solutions for that. It isn’t always clear that our configuration language is the same language as the programming language(s) that define our modules, for example. If you write a makefile, the global namespace is defined by your file system and accessible within the makefile. Not that I really want to recommend make and its ilk.

I think we do want to code our configuration in a nice general purpose high level programming language. One solution is to have our IDE provide us with an object representing the known global namespace, and write our configuration code with respect to that namespace object. This is essentially what we do in Newspeak.

In the next post, I’ll discuss more of the advantages of this approach, contrast how Newspeak handles things with other languages with powerful module systems, like Scheme (which for the past decade or so has had a system called Units that is quite close to what I’ve discussed so far) and ML, and show once more how one actually does configuration in Newspeak.

To be continued.

Monday, May 25, 2009

Original Sin

I’ve often said that Java’s original sin was not being a pure object oriented language - a language where everything is an object.

As one example, consider type char. When Java was introduced, the Unicode standard required 16 bits. This later changed, as 16 bits were inadequate to describe the world’s characters.

Tangent: Not really surprising if you think about it. For example, Apple’s internationalization support allocated 24 bits per character in the early 1990s. However, engineering shortcuts are endemic.

In the meantime, Java had committed to a 16 bit character type. Now, if characters were objects, their representation would be encapsulated, and nobody would be much affected by how many bits are needed. A primitive type like char, however, advertises its representation to the world. Consequently, people dealing with Unicode in Java have to deal with encoding code points themselves.
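
A small illustration of what that means in practice: U+1D11E (a musical G clef) lies outside the Basic Multilingual Plane, so it takes two chars, and the programmer has to know that.

public class CodePoints {
  public static void main(String[] args) {
    String s = "\uD834\uDD1E"; // one character (U+1D11E), two chars
    System.out.println(s.length());                      // 2 - the representation leaks
    System.out.println(s.codePointCount(0, s.length())); // 1 - the answer you wanted
    System.out.println(Character.charCount(s.codePointAt(0))); // 2
  }
}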

I was having this conversation for the umpteenth time last week. My interlocutor asked whether it would be possible for Java to be as efficient as it was without primitive types. The answer is yes, and prompted this post.

So how would we go about getting rid of primitive types without incurring a significant performance penalty?

Java has a mandatory static type system; it is compiled into a statically typed assembly language (Java byte codes, aka JVML). It supports final classes. I do not favor any of these features, but we will take them as a given. The only changes we will propose are those necessary to eradicate primitive types.

Assume that we have a final class Int representing 32 bit integers. The compiler can translate occurrences of this type into type int. Hence, we can generate the same code for scalars as Java does today with no penalty whatsoever.

To make Int a suitable replacement for int, we would like the syntactic convenience of using operators:

Int twice(Int i) { return i + i;}

This is easy enough. We don’t want operator overloading of course. What we want is simply the ability to define methods whose names are operators. The language can support the same set of binary operators it does today, with the same fixed precedence. No need for special syntax and rules to define the precedence of operators etc. I leave that sort of thing to people who love complexity.

It is crucial that Ints behave as values. This requires that the == operator on Ints behaves the same as equality. There should be no way to detect if we’ve allocated one copy of the number 3 or a million of them. Since we can define == as a method, we can define it in Object to work the way it does for most objects, and override it with a method in Int and other value classes to work differently. We’ll want to take care of identityHash as well of course.

At this point, we can simply say that int stands for Int. Except for locking. What would it mean to synchronize on an instance of Int? To avoid this nastiness, we’ll just say that all instances of Int are locked when they are created. Hence no user code can ever synchronize on them.

In this hypothetical dialect, literal integers such as 3 are considered instances of Int. They, and all other integer results, can be stored in collections, passed to polymorphic code that requires type Object etc. The compiler sees to it that they are usually represented as the primitive integer type of the JVM. When they are used as objects, they get boxed into real instances of Int. When an object gets cast to an Int, it gets unboxed.

This is similar to what happens today, except that it is completely transparent. Because these objects have true value semantics for identity, you can never tell if a new instance was allocated by the boxing or not; indeed, you cannot tell if boxing or unboxing occur at all.

So far I haven’t described anything really new. A detailed proposal along the lines above existed years ago. It would allow adding new user defined value types like Complex as well.

A more restricted, but somewhat similar proposal was considered as part of JSR201 - the idea being that boxing would create value objects, without actually replacing the existing primitive types. It was rejected by the expert group. As I recall, only myself and Corky Cartwright supported it. There were concerns as to what would become of the existing wrapper classes (Integer and friends); those are used by reflection etc., and are hopelessly broken because they have public constructors that allocate new instances for all the world to see.

Tangent: I don’t know who came up with the idea of an Integer class that could have multiple distinct instances of 3. I assume it was another engineering shortcut (see above) by someone really clever.

To be clear - I am not suggesting that this discussion should be reopened. This is just a mental exercise.

There is only one place where this proposal introduces a performance penalty: polymorphic code on type Object that tests for identity. Now that identity testing is a method, it will be slower. Not by all that much, and only for type Object.

Why? Because we preclude overriding == in non-value classes. There are several ways to arrange this. Either by fiat, or by rearranging the class hierarchy with types Value (for value types) and Reference (for regular objects) as the only subtypes of Object, making == final in those two types. Or some other variation.

What has not been seriously discussed, to my knowledge, is what to do with arrays. As indicated above, scalar uses of primitive types don’t pose much of a problem.

However, what of types like int[]? Again, we can allocate these as arrays of 32 bit integers just as we do today. No memory penalty, no speed penalty when writing in ints or reading them out.

What makes things complicated is this: If int is a subtype of Object (as it is in the above) we’d expect int[] to be a subtype of Object[], because in Java we expect covariant subtyping among arrays.

Of course that isn’t type safe, and one could certainly argue that this is our chance to correct that problem. But I won’t. Instead, assume we want to preserve Java’s covariant array subtyping.

The thing is, we do not want to box all the elements of an int[] when we assign it to an Object[].

The zeroth order solution is to leave the existing subtype relations among arrays of the predefined types unchanged. It’s ugly, but we are no worse off than we are today - and have a bunch of advantages because we are free of primitive types. But we can do better.

Suppose Object[] has methods getObjectElement and setObjectElement. These (respectively) retrieve and insert elements into the array.

We can equip int[] with versions of these methods that box/unbox elements as they are retrieved or inserted. We ensure that a[i] and a[i] = e are compiled differently based upon the static type of a. If a is Object[], we use getObjectElement and setObjectElement. If a is of type int[], we use methods that access the integer representation directly, say, getIntElement and setIntElement.

At this point, we can even introduce meaningful relations among array types like int[] and short[] using the same technique.

In essence, Object[] provides an interface that different array types may implement differently. We just provide sugar for accessing this interface, selected magically based on static type. On a modern implementation like HotSpot, the overhead for this would be minimal.
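For concreteness, here is a rough Java sketch of the scheme. The accessor names follow the text; IntArray merely stands in for int[], which of course cannot be redefined in real Java:

// Object[] as an interface: each array type implements element access its own way.
interface ObjectArray {
  Object getObjectElement(int index);
  void setObjectElement(int index, Object element);
}

// Stands in for int[]: a flat array of 32-bit ints, exactly as today.
final class IntArray implements ObjectArray {
  private final int[] elements;
  IntArray(int size) { elements = new int[size]; }

  // fast path, used when the static type is int[] - no boxing at all
  int getIntElement(int index) { return elements[index]; }
  void setIntElement(int index, int element) { elements[index] = element; }

  // boxing path, used when the static type is Object[]
  public Object getObjectElement(int index) {
    return elements[index]; // autoboxes
  }
  public void setObjectElement(int index, Object element) {
    // unboxing cast; a non-Integer fails here, the analog of ArrayStoreException
    elements[index] = (Integer) element;
  }
}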

Of course, if you make arrays invariant, the whole issue goes away. That just makes things too easy. Besides, I’ve come to the conclusion that Java made the right call on array covariance in the first place.

All in all - Java could have been purely object oriented with no significant performance hit. But it wasn’t, isn’t, and likely won’t be. Sic Transit Gloria Mundi.

Thursday, April 30, 2009

The Need for More Lack of Understanding

People are always claiming that if only there was more understanding in the world, it would be a better place. This post will argue that less is more: we need less understanding - specifically more not understanding.

A couple of weeks ago I gave a talk at DSL Dev Con. One of the encouraging things that was evident there was the increased understanding that not understanding is important.

Tangent: While I'm advertising this talk, I might as well advertise my interview on Microsoft's channel 9 which explains the motivation for Newspeak and its relation to cloud computing.


Several programming languages support a mechanism by which a class or object can declare a general-purpose handler for method invocations it does not explicitly support.

Smalltalk was, AFAIK, the first language to introduce this idea. You do it by declaring a method called doesNotUnderstand:. The method takes a single argument that represents a reification of the call. The argument tells us the name of the method that was invoked, and the actual arguments passed. If a method m is invoked on an object that does not have a member method m (that is, m is not declared by the class of the object or any of its superclasses), then the object’s doesNotUnderstand: method is invoked. The default implementation of doesNotUnderstand:, declared in class Object, is to throw an exception. By overriding doesNotUnderstand: one can control the system’s behavior when such calls are made. Similar mechanisms exist in several other dynamic languages (e.g., method_missing in Ruby, methodMissing in Groovy, and __noSuchMethod__ in some dialects of Javascript).
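Mainstream Java has nothing equivalent, but its dynamic proxies give a taste of the reified-call style: every invocation on the proxy arrives as a Method object plus an argument array, much as doesNotUnderstand: receives a message object. A minimal sketch (the Greeter interface is invented for illustration):

import java.lang.reflect.Proxy;
import java.util.Arrays;

interface Greeter { String greet(String name); }

class ProxyDemo {
  public static void main(String[] args) {
    Greeter g = (Greeter) Proxy.newProxyInstance(
        Greeter.class.getClassLoader(),
        new Class<?>[] { Greeter.class },
        // the handler plays the role of doesNotUnderstand: -
        // it sees the method's name and the actual arguments
        (proxy, method, actuals) ->
            "handled " + method.getName() + " with " + Arrays.toString(actuals));
    System.out.println(g.greet("world")); // handled greet with [world]
  }
}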

Aficionados of these languages know that this is an extremely useful mechanism. However, users of mainstream object-oriented languages typically lack an appreciation of the power this mechanism can provide. I hope this post can be a small step in rectifying that situation.

DoesNotUnderstand: helps implement orthogonal persistence, lazy loading, futures, and remote proxies, to name a few. Recently, there’s been a surge of interest in domain-specific languages, and doesNotUnderstand: can help there as well.

We’ll use an example from my talk at DSL Dev Con. Consider how to interact with an OS shell like bash or csh from within a general purpose programming language. We’ll use Newspeak as our general purpose language (what were you expecting?), because it works best (in my unbiased opinion).

Suppose you want a listing of the files in the current directory. You could view ls as a method on a shell object, and write: shell ls. Of course, we won’t do something like the following Java code:

class Shell {
  public Collection ls() {...}
  // ... an infinity of other stuff
}

There are any number of commands that a shell can understand, depending on the current path and the executables in the directories on that path. We cannot plausibly enumerate them all as a fixed set of methods in Shell.

Instead, we can define a class NewShell with a doesNotUnderstand: method that looks up the name of the message in the shell’s path and executes it. Then we can just write:

shell ls

If we write this code in the context of a subclass of NewShell, we can take advantage of Newspeak’s implicit receiver sends and just write

ls

Nice, but not quite good enough.

ls aFileName

doesn’t work at all. We don’t want to invoke ls immediately here - we need to gather its arguments in some way. One way to do this is to have doesNotUnderstand: return a function object that can be fed its arguments. This is in fact what we do in our implementation. We call this object a CommandSession. To get a CommandSession to actually run the command, you call one of its value methods, with the desired arguments:

ls value: aFileName

This is less convenient for the simple case, where we need to write

ls value

to get ls to do something - but it is much more general.
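For the curious, here is a very rough Java analog of CommandSession, built on ProcessBuilder. The names merely echo the Newspeak design described above; this is a sketch, not our implementation:

import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CommandSession {
  private final String command;
  CommandSession(String command) { this.command = command; }

  // analog of the value methods: actually run the command with the given arguments
  String value(String... args) throws IOException, InterruptedException {
    List<String> argv = new ArrayList<>();
    argv.add(command);
    argv.addAll(Arrays.asList(args));
    Process p = new ProcessBuilder(argv).redirectErrorStream(true).start();
    String output = new String(p.getInputStream().readAllBytes()); // read before waiting, to avoid deadlock
    p.waitFor();
    return output;
  }
}

// new CommandSession("ls").value();     // like: ls value
// new CommandSession("ls").value("-l"); // like: ls value: '-l'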

What about modifiers, as in ls -l? We can make simple cases work slightly better by defining - as a method on CommandSession:

ls -'l'


This is what the current implementation does.

The most general approach is to treat them as arguments:

ls value: '-l'
ls value: '-l' value: aFileName

An alternative might be to leave ls as it was originally, but allow

ls: aFileName

as well. In this version, doesNotUnderstand: checks to see if the message takes an argument (i.e., its name ends with a colon). If so, it strips the colon off the message name, creates a CommandSession for the result, and calls its value: method with the argument. This handles modifiers pretty well:

ls: '-l'

If there are multiple arguments, we can pass a tuple as the argument, and doesNotUnderstand: will unpack it as needed.

ls: {'-l'. aFileName}

Now how about pipes?

We could introduce pipeValue methods that produce an object responding to the pipe operator. Or we could say that everything produces a CommandSession (and these understand “|”), and that a special action is needed to get a result (sending it an evaluate or end message). This action is the analog of the newline that tells the shell to go ahead and evaluate; it could be dispensed with in a lazy setting.

Combining our second proposal above with this, we could say that value was used to derive a result. Then we can view the shell as a combinator library for CommandSessions. This does conflate two issues - the use of CommandSession to delay evaluation until a result is needed (the shell parses the input as a unit ensuring laziness) and the use of real combinators on byte streams.

We use NewShell in our IDE - for example, to manipulate subversion commands in the source control browser. It would be nice to refine it further, perhaps along the lines suggested above, but even in its current simplistic incarnation, it is quite useful.

As I noted at the beginning of this post, there are a host of other cool uses for doesNotUnderstand:. I may return to those in another post.

Of course, if you are a fan of mandatory static typing, you aren’t allowed to use doesNotUnderstand: in your language. In the general case, it simply cannot be statically typed - which is an argument against mandatory typing, not against doesNotUnderstand:.

Just as switch statements, catch clauses and regular expressions all need defaults/catch-alls/wildcards, so does method dispatch. There are situations where you cannot avoid uncertainty. Reality is dynamic.

Saturday, April 18, 2009

The Language Designer’s Dilemma

Last week I was at lang.net 09 and DSL Dev Con. It was great fun, as always. The best talk was by Erik Meijer. Use the link to see it - but without live video of Erik in action, you can't really understand what it was like. Such is performance art.

I also gave a talk, and there were many others, but that is not the point of this post. Rather, this post was prompted by one specific, and excellent, talk - Lars Bak’s presentation on V8. Lars clearly enjoyed his visit to the lion’s den; more importantly, perhaps Microsoft will finally shake off their apparent paralysis and produce a competitive Javascript engine.

That, in turn, will make it possible to distribute serious applications targeting the web browser, with the knowledge that all major browsers have performant Javascript engines that can carry the load.

It’s all part of the evolution of the web browser into an OS. This was what Netscape foresaw back in 1995. And it is a perfect example of innovator’s dilemma.

Innovator’s dilemma applies very directly to programming languages, and Todd Proebsting already discussed this back in 2002.

To recap, the basic idea is that established players in a market support sustaining innovation but not disruptive innovation. Examples of sustaining innovation would be the difference between Windows 2000 and Windows XP, or the progression from Java 1.0 through Java 6.
Sustaining innovations are gradual improvements and refinements - some improvement in performance, or some small new feature etc.

Disruptive innovation tends to appear “off the radar” of the established players. It is often inferior to the established product in many ways. It tends to start in some small niche where it happens to work better than the established technology. The majors ignore it, as it clearly can’t handle the “real” problems they are focused on. Over time, the new technology grows more competent and eats its way upward, consuming the previous market leader’s lunch.

We’ve seen this happen many times before. Remember Sun workstations? PCs came in and took over that market. Sun retreated to the server business, and PCs chased it up toward the high end of that market, until there was nowhere left to run.

In the programming language space, take Ruby as an example. Ruby is slow and, until quite recently, lacked IDE support, etc. It’s easy for the Javanese to dismiss it. Over time, such technology can evolve to be much faster, have better tooling, etc. And so it may grow into Java’s main market.

Don’t believe me? Java in 1995 was just a language for writing applets. Its performance was poorer than Smalltalk’s, and nowhere near that of C++. There were no IDEs. Who's laughing now?

Javascript is an even clearer case in point. It was just a scripting language for web browsers. Incredibly slow implementations, restricted to interacting with just the browser. No debuggers, no tools, no software engineering support to speak of.

Then V8 came along and showed people that it can be much faster. Lars has doubled performance since the release in September, and expects to double it again within a year, and again 2-3 years after that.

Javascript security and modularity are evolving as well. By the time that’s done, I suspect it will far outstrip clunky packaging mechanisms like OSGi. This doesn’t mean you have to write everything in Javascript - I don’t believe in a monolingual world.

Tangent: Yes, I did spend ten years at Sun. A lot of it was spent arguing that Java was not the final step in the evolution of programming languages. I guess I’m just not very persuasive.

The web browser itself is one of the most disruptive technologies in computing. It is poised to become the OS. Javascript performance is an essential ingredient but there are others. SVG provides a graphics model. HTML 5 brings with it persistent storage - a file system, as it were - only better. You may recall that Vista was supposed to have a database-like file system. Vista didn’t deliver, but web browsers will.

This trend seems to make the traditional OS irrelevant. Who cares what’s managing the disk drive? It’s just some commoditized component, like a power supply. We can already see how netbooks are prospering. This isn’t good news for Windows.

Of course, you might ask why Microsoft would go along with this. Is IE’s Javascript so inadequate on purpose? Maybe it is part of the master plan? Well, I don’t believe in conspiracy theories.

It does look as if Microsoft is facing a lose-lose proposition. Making IE competitive supports the movement toward a web-browser-as-OS world. Conversely, if they let IE languish, as they have, they can expect its market share to continue to drop.

However, you cannot stop the trend toward the Web OS by making your tools inferior. You can only compete by innovating, not by standing still.

I wouldn’t count Redmond out just yet. They have huge assets - a terrific research lab, armies of smart people in product land as well, market dominance, and vast amounts of money. They also have huge liabilities of course - and those add up to innovator’s dilemma.

In any case, for language implementors, it’s clear that one needs to be able to compile to the internet platform and Javascript is its assembly language. Web programming will evolve into general purpose programming, using the persistent storage provided by the browser to function off line as well as online.

As many readers of this blog will recognize, this sets the stage for the brave new world of objects as software services. I hope to bring Newspeak to the point where it is an ideal language for this coming world. It’s a bit difficult with the very limited resources at our disposal, but we are making progress.

The entire progression from conventional computing to the world of the Web OS is taking longer than one expects, but is happening. The time will come when we hit the tipping point, and things will change very rapidly.

Sunday, March 29, 2009

Subsuming Packages and other Stories

Barbara Liskov recently won the Turing award. It’s great to see the importance of programming language work recognized once again. Even if you aren’t into the academic literature, you have probably heard her name, for example, when people mention the Liskov substitution principle (LSP). The LSP states that if a property P is provable for an object of type S, and T is a subtype of S, then P is provable for an object of type T.

This is behavioral subtyping - instances of subtypes are semantically substitutable where supertypes are expected. This is a very strong requirement, but not something a practical programming language can enforce.

Closely related (and enforceable) is the subsumption rule of type systems for OO languages. This rule states that if an expression e has type T, and T is a subtype of S, then e has type S as well. This post is about the importance of subsumption in programming language design.

The subsumption rule is a cornerstone of formal type systems for OO. More importantly, it captures a deep intuition that programmers have about subtyping. After all, what could be more natural: if e: T and T <: S, then e: S. I hope readers agree with me that this property is intuitive, perhaps even so obvious that you wonder why anyone would bother belaboring the point.

So it is most unfortunate that mainstream programming languages violate subsumption left and right. Below, we’ll look at some examples, and draw some conclusions about language design.

Case 1: Hiding fields. In Java (and C++) you can define a field in a subclass with type X, even though the superclass has a field of the same name, with unrelated type Y. The subclass’ field hides the superclass field.

class S { final int f = 42; }

class T extends S { final String f = "!"; }

new T().f; // "!"
((S) new T()).f; // 42

In case it isn’t obvious, any object of type S has the property that its f field has type int. An instance of T should also have the same property, but doesn’t. To get at the field declared by S, we have to coerce the T value into an S.

Tangent: It would be much better to ensure that fields are always private, or better yet, avoid field references altogether, as in Self or Newspeak.

A typical programmer in the Java/C++ tradition may ask what’s the harm in the rules as they are.

The harm is precisely in having rules that contradict strong and valid intuition. The contradiction will bite you in the end, because you will end up thinking some program property holds based on your intuition, while the rules of the language contradict that intuition. Years of exposure to these misfeatures may have inured the mind to their toxicity, but they are toxic all the same.

However, there is much more harm, as we’ll see as this post progresses.

Case 2: Hiding static methods. The same sort of situation occurs with static methods. Static methods aren’t really methods at all, as no run time object is involved in determining the meaning of their invocations. No overriding can occur between static methods. The meaning of a static method is statically bound: static method is a misnomer for subroutine (as in FORTRAN subroutine circa 1955). And so Java allows static methods to hide each other much like fields. You can easily recreate the above example with static methods.
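A sketch (class names mine); one nuance is that, unlike fields, a hiding static method must keep a compatible return type, so both versions return String here:

class Sup { static String f() { return "42"; } }
class Sub extends Sup { static String f() { return "!"; } }

class StaticHiding {
  public static void main(String[] args) {
    System.out.println(Sub.f());               // !
    System.out.println(((Sup) new Sub()).f()); // 42 - bound by static type alone
  }
}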

I’ve written elsewhere about the problems of all things static; add this to the list. Maybe you’re beginning to get the sense that violating subsumption correlates with iffy language constructs.

Case 3: Package privacy. This brings us back to Barbara Liskov. Her work on CLU in the mid 70s brought David Parnas’ notion of information hiding into programming languages, with support for abstract datatypes (ADTs). Packages are one of the mechanisms that Java uses to support ADTs. Packages have had all kinds of problems; lack of subsumption is at the root of many of them.

For example, recall that in Java, protected access implies package access. Package access is always relative to the current package. So when a protected member is overridden by a subclass in another package, the override eliminates access from the superclass’s package (while granting access to its own package):

package P1;

public class A { protected int foo() { return 91; } }

class C { static int i = (new P2.B()).foo(); }

// in a separate file:

package P2;

public class B extends P1.A {}

This code is legal. If we then change B such that

public class B extends P1.A {protected int foo() {return 42;}}

The call in C is no longer allowed! We have not added or removed members in the subclass, or changed their accessibility (or have we? Did we have a choice?) and yet we’ve broken client code.

I don’t want to go into the details of an even more bizarre example (due, as far as I can recall, to David Chase) - but others have. Suffice to say that there was a huge bug tail around such examples; it took years to resolve them. Compiler engineers did not correctly diagnose the problems, because their intuition led them to try and preserve subsumption, to no avail. I have several patents to my name that deal with the mechanisms of implementing the desired behavior.

And why should you care? Because these issues led to subtle incompatibilities between compilers and VMs, across vendors and releases; these led to subtle program bugs, to jokes about write once debug everywhere etc.

Packages are one way in which Java supports ADTs. It turns out that you cannot have ADTs, inheritance and subsumption in one language. You must make a choice - and I argue that subsumption must always be preserved.

Case 4: Class-based Encapsulation.
Another mechanism that supports ADTs in all the mainstream OO languages is class-based encapsulation. This is the idea that privacy is per class, not per object. This makes it possible to write code like

class C {
  private int f() { return 42; }
  public int foo(C c) { return c.f(); } // I can get at c’s f(), because we’re both C’s
}

It also makes it possible to violate subsumption:

public class A {
  void demo() {
    ((A) new B()).f(); // 42
    new B().f();       // "!"
  }
  private int f() { return 42; } // private, so B’s f does not override it
}

public class B extends A {
  public String f() { return "!"; }
}

An alternative to class-based encapsulation is object-based encapsulation. Privacy is per object. You can enforce it via context-free syntax. For example, you can arrange that only method invocations on self (aka this) attempt to look up private methods. You don’t need a typechecker (aka verifier) to prevent malicious parties from accessing private methods or fields. Consequently, you don’t have to rely on a complex byte code verifier to prevent such access, as one does in Java and .Net.

This fits perfectly with the object-capability security model. It is essentially what we do in Newspeak; but the ideas are universal.

Oh, and object-based encapsulation works with both subsumption and inheritance.

Case 5: Privileged access within “nests” of classes. Giving enclosing classes privileged access to their nested classes, and giving sibling classes privileged access to each other, is another violation of subsumption (but giving a nested class free access to its surrounding scope, including private elements of its enclosing class, is fine).

Another way to look at all this is that object-based encapsulation falls out naturally from the use of interfaces. Consider what would happen if the type C above were, by language definition, the public interface of C (as it is in Strongtalk, for example). Subsumption would fall out automatically.

Subsumption is a very good litmus test for language constructs; if a construct violates subsumption, it will cause grief. Ditch it.

Strict adherence to subsumption reduces cognitive dissonance for programmers, compiler writers, VM implementors and language designers. It helps avoid bugs in user code, compilers and VMs, enhancing reliability and portability. It makes it easier to build secure systems.

So if you design a type system for an OO language, preserve subsumption at all costs. Do not try and combine ADTs with inheritance (because then you lose subsumption).

Now, if we can’t have both ADTs and inheritance, we have to choose between them. I believe the benefits of inheritance far outweigh the costs. Those who feel differently are unlikely to do better than Luca Cardelli’s sublimely beautiful Quest. On the other hand, if one chooses inheritance, one should go with object-based encapsulation.

Put another way, if you take the adage program to an interface, not an implementation seriously, it will prevent a host of problems. This is what we are doing with Newspeak.

At a meta-level, the lesson is: pay attention to programming language theory; occasionally, you can learn from it.

Update: Yardena pointed out this discussion of Java packages. It goes through the problems in detail. I've added a link to it in the main post as well.