A place to be (re)educated in Newspeak

Saturday, June 07, 2008

Incremental Development Environments

Back in 1997, when I started working at Sun, I did not expect to do much programming language design. After all, Java was completely specified in the original JLS, or so it was thought.

What I actually expected to do was to work on a Java IDE. Given my Smalltalk background, I was of course very much aware of how valuable a good IDE is for programmers. The Java IDE I had in mind never materialized. Management at the time thought this was an issue to be left to other vendors. If this seems a little strange today - well, it seemed strange back then too. In any case, the Java world has since learned that IDEs are very important and strategic.

That said, todays Java IDEs are still very different from what I envisioned based on my Smalltalk background.

Java IDEs have yet to fully embrace the idea of incremental development. Look in any such system, and you’ll find a button or menu labeled Build. The underlying idea is that a program is assembled from scratch every time you change it.

Of course, that is how the real world works, right? You find a broken window on the Empire State building, so you tear it down and rebuild it with the fixed window. If you're clever, you might be able to do it pretty fast. Ultimately, it doesn't scale.

The build button comes along with an obsession with files. In C and its ilk, the notion of files is part of the language, because of #include. Fortunately, Java did away with that legacy. Java programmers can think in terms of packages, independent of their representation on disk. Why spend time worrying about class paths and directory structures? What do these have to do with the structure of your program and the problem you’re trying to solve?

The only point where files are useful in this context is as a medium for sharing source code or distributing binary code. Files are genuinely useful for those purposes, and Smalltalk IDEs have generally gone overboard in the opposite direction; but I digress.

A consequence of the file/build oriented view is that the smallest unit of incremental change is the file - something that is often too big (if you ever notice the time it takes to compile a change, that’s too long) and moreover, not even a concept in the language.

More fundamentally, what’s being changed is a static, external representation of the program code; there is no support for changing the live process derived from the code so that it matches up with the code that describes it. It’s like having a set of blueprints for a building (the code in the file) which doesn’t match the building (the process generated from the code).

For example, once you add an instance variable to a class, what happens to all the existing instances on the heap? In Smalltalk, they all have the new instance variable. In Java - well, nothing happens.

In general, any change you make to the code should be reflected immediately in the entire environment. This implies a very powerful form of fix-and-continue debugging (I note in amazement that after all these years, Eclipse still doesn’t support even the most basic form of fix-and-continue).

All this is of course a very tall order for a language with a mandatory static type system.
I’m not aware of a JVM implementation that can begin to deal with class schema changes (that is, changing the shapes of objects because their class has lost or acquired fields). It’s not impossible, but it is hard.

Consider that removing a field requires invalidating any code that refers to it. In a language where fields are private, the amount of code to invalidate is nicely bounded to the class (good design decisions tend to simplify life). Public fields, apart from their nasty software engineering properties, add complexity to the underlying system.

This isn’t just a headache for the IDE. The VM has to provide support for changing the classes of existing instances. However, in the absence of VM support there all kinds of tricks one can play. If you compile all code yourself, you can ensure that no one accesses fields directly - everything goes through accessors. You can even rewrite imported binaries. With enough effort, I believe you can make it all work on an existing JVM with good JVMDI support.

Changing code in a method is supported by JVMDI (well, the JVMDI spec allows for supporting schema changes as well - it’s just that it isn’t required and no one ever implemented it). However, what happens if you change the signature of a method? Any number of existing callers may be invalid due to type errors. The IDE needs to tell you about this pretty much instantaneously, invalidating all these callers. Most of this worked in Trellis/Owl back in the late 80s. The presence of the byte code verifier means that this applies to binary code as well.

Achieving true incremental development is very hard. Still, given the amount of people working on Java, you’d think it would have happened after all these years. It hasn’t, and I don’t expect it to.

Someone will rightly make the point that mandatory typing can be very helpful in an IDE - its easier to find callers of methods, implementors, references to fields or classes, as well as refactoring (though, oddly, all these features originated in IDEs for dynamic languages - Smalltalk or Lisp; speculating why that is would make for another controversial post). This post isn’t really about static/dynamic typing - it’s about incrementality in the IDE.

Of course, mainstream IDEs annoy me for other reasons: the bloat, the slow startup, and most of all the B-52 bomber/Apollo space capsule style UI. That probably deserves another post.

In the meantime, I can go back to Vassili’s fabulous Hopscotch browsers and leave the mainstream to cope with all the docking bars, tabs and panes too small to print their name. You, dear reader, if you’re using a mainstream IDE, may not realize what you’re missing. To an extent, these things have to be experienced to be appreciated. Still I encourage you to demand better - demand true incremental development support, in whatever language you use. Because in the end, there are only two kinds of development - incremental, and excremental.