A place to be (re)educated in Newspeak

Saturday, June 07, 2008

Incremental Development Environments

Back in 1997, when I started working at Sun, I did not expect to do much programming language design. After all, Java was completely specified in the original JLS, or so it was thought.

What I actually expected to do was to work on a Java IDE. Given my Smalltalk background, I was of course very much aware of how valuable a good IDE is for programmers. The Java IDE I had in mind never materialized. Management at the time thought this was an issue to be left to other vendors. If this seems a little strange today - well, it seemed strange back then too. In any case, the Java world has since learned that IDEs are very important and strategic.

That said, today's Java IDEs are still very different from what I envisioned based on my Smalltalk background.

Java IDEs have yet to fully embrace the idea of incremental development. Look in any such system, and you’ll find a button or menu labeled Build. The underlying idea is that a program is assembled from scratch every time you change it.

Of course, that is how the real world works, right? You find a broken window on the Empire State Building, so you tear the building down and rebuild it with the fixed window. If you're clever, you might be able to do it pretty fast. Ultimately, it doesn't scale.

The build button comes along with an obsession with files. In C and its ilk, the notion of files is part of the language, because of #include. Fortunately, Java did away with that legacy. Java programmers can think in terms of packages, independent of their representation on disk. Why spend time worrying about class paths and directory structures? What do these have to do with the structure of your program and the problem you’re trying to solve?

The only point where files are useful in this context is as a medium for sharing source code or distributing binary code. Files are genuinely useful for those purposes, and Smalltalk IDEs have generally gone overboard in the opposite direction; but I digress.

A consequence of the file/build oriented view is that the smallest unit of incremental change is the file - something that is often too big (if you ever notice the time it takes to compile a change, that’s too long) and moreover, not even a concept in the language.

More fundamentally, what’s being changed is a static, external representation of the program code; there is no support for changing the live process derived from the code so that it matches up with the code that describes it. It’s like having a set of blueprints for a building (the code in the file) which doesn’t match the building (the process generated from the code).

For example, once you add an instance variable to a class, what happens to all the existing instances on the heap? In Smalltalk, they all have the new instance variable. In Java - well, nothing happens.

In general, any change you make to the code should be reflected immediately in the entire environment. This implies a very powerful form of fix-and-continue debugging (I note in amazement that after all these years, Eclipse still doesn’t support even the most basic form of fix-and-continue).

All this is of course a very tall order for a language with a mandatory static type system. I’m not aware of a JVM implementation that can begin to deal with class schema changes (that is, changing the shapes of objects because their class has lost or acquired fields). It’s not impossible, but it is hard.

Consider that removing a field requires invalidating any code that refers to it. In a language where fields are private, the amount of code to invalidate is nicely bounded to the class (good design decisions tend to simplify life). Public fields, apart from their nasty software engineering properties, add complexity to the underlying system.

This isn’t just a headache for the IDE. The VM has to provide support for changing the classes of existing instances. However, in the absence of VM support there are all kinds of tricks one can play. If you compile all code yourself, you can ensure that no one accesses fields directly - everything goes through accessors. You can even rewrite imported binaries. With enough effort, I believe you can make it all work on an existing JVM with good JVMDI support.
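The accessor trick can be sketched in plain Java. This is only an illustration under stated assumptions, not how any IDE or JVM actually does it: `ClassShape` and `FlexibleObject` are hypothetical names, and field storage is a per-instance map keyed by field name, so the set of fields can change while instances are live.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a class's schema: the field names it currently declares.
final class ClassShape {
    private final Set<String> fields = new LinkedHashSet<>();

    void addField(String name)    { fields.add(name); }
    void removeField(String name) { fields.remove(name); }
    boolean hasField(String name) { return fields.contains(name); }
}

// Instances never expose storage; every read and write goes through get/set,
// which consult the shape as it is right now.
final class FlexibleObject {
    private final ClassShape shape;
    private final Map<String, Object> slots = new HashMap<>();

    FlexibleObject(ClassShape shape) { this.shape = shape; }

    Object get(String field) {
        check(field);
        return slots.get(field); // null until first assignment
    }

    void set(String field, Object value) {
        check(field);
        slots.put(field, value);
    }

    private void check(String field) {
        if (!shape.hasField(field)) {
            slots.remove(field); // discard stale storage for a removed field
            throw new IllegalStateException("no such field: " + field);
        }
    }
}

final class AccessorDemo {
    public static void main(String[] args) {
        ClassShape point = new ClassShape();
        point.addField("x");
        FlexibleObject p = new FlexibleObject(point);
        p.set("x", 1);

        point.addField("y"); // schema change while p is live
        p.set("y", 2);       // the existing instance accepts the new field
        System.out.println(p.get("x") + ", " + p.get("y"));
    }
}
```

The cost is an indirection on every field access, which is exactly what a VM with real schema-change support would avoid.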

Changing code in a method is supported by JVMDI (well, the JVMDI spec allows for supporting schema changes as well - it’s just that it isn’t required and no one ever implemented it). However, what happens if you change the signature of a method? Any number of existing callers may be invalid due to type errors. The IDE needs to tell you about this pretty much instantaneously, invalidating all these callers. Most of this worked in Trellis/Owl back in the late 80s. The presence of the byte code verifier means that this applies to binary code as well.
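The bookkeeping behind "invalidating all these callers" can be sketched as a reverse dependency index. This is a minimal sketch under assumptions: `CallIndex` is a hypothetical name, methods are identified by plain signature strings, and a real IDE would work from the compiler's resolved references and re-typecheck the affected callers rather than just flag them.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class CallIndex {
    // callee signature -> callers that were compiled against it
    private final Map<String, Set<String>> callersOf = new HashMap<>();
    // callers that must be rechecked before they can be trusted again
    private final Set<String> invalid = new TreeSet<>();

    void recordCall(String caller, String callee) {
        callersOf.computeIfAbsent(callee, k -> new TreeSet<>()).add(caller);
    }

    // A signature changed or disappeared: flag everything compiled against it.
    Set<String> signatureChanged(String oldSignature) {
        Set<String> affected = callersOf.remove(oldSignature);
        if (affected == null) {
            return Collections.emptySet();
        }
        invalid.addAll(affected);
        return Collections.unmodifiableSet(affected);
    }

    // A caller was fixed and recompiled: replace its old edges with the new ones.
    void recompiled(String caller, Set<String> callees) {
        invalid.remove(caller);
        for (Set<String> callers : callersOf.values()) {
            callers.remove(caller);
        }
        for (String callee : callees) {
            recordCall(caller, callee);
        }
    }

    Set<String> invalidMethods() {
        return Collections.unmodifiableSet(invalid);
    }
}

final class CallIndexDemo {
    public static void main(String[] args) {
        CallIndex index = new CallIndex();
        index.recordCall("Report.print()", "Account.balance()");
        index.recordCall("Audit.check()", "Account.balance()");
        // Changing Account.balance() flags both callers at once.
        System.out.println(index.signatureChanged("Account.balance()"));
    }
}
```

Because the index is keyed by callee, the lookup on a signature change touches only the affected callers, which is what makes an instantaneous response plausible.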

Achieving true incremental development is very hard. Still, given the number of people working on Java, you’d think it would have happened after all these years. It hasn’t, and I don’t expect it to.

Someone will rightly make the point that mandatory typing can be very helpful in an IDE - it's easier to find callers of methods, implementors, references to fields or classes, and to refactor (though, oddly, all these features originated in IDEs for dynamic languages - Smalltalk or Lisp; speculating why that is would make for another controversial post). This post isn’t really about static/dynamic typing - it’s about incrementality in the IDE.

Of course, mainstream IDEs annoy me for other reasons: the bloat, the slow startup, and most of all the B-52 bomber/Apollo space capsule style UI. That probably deserves another post.

In the meantime, I can go back to Vassili’s fabulous Hopscotch browsers and leave the mainstream to cope with all the docking bars, tabs and panes too small to print their name. You, dear reader, if you’re using a mainstream IDE, may not realize what you’re missing. To an extent, these things have to be experienced to be appreciated. Still, I encourage you to demand better - demand true incremental development support, in whatever language you use. Because in the end, there are only two kinds of development - incremental, and excremental.

14 comments:

Matthew Kanwisher said...

I like the content but that contrast of the colors has to go LOL !

Stefan Schulz said...

While I agree that current IDEs do not offer the same level of convenience when changing a program as Smalltalk's environment did/does, there are some major differences in my opinion. For one, there is no separate Smalltalk IDE but actually Smalltalk. It's the philosophy of the language to develop applications in its very own environment, so, actually, developing an application is extending/modifying the environment itself.
Second, in current Java IDEs like eclipse and IntelliJ, there is more than just the core language being part of an application, actually with a tendency to "polyglot" programming. This is simply not possible to wrap up and cover in a single language environment.
And, surely, Java IDEs do nowadays support at least some kind of hot-code replacement, so it's not the whole application that has to be rebuilt each time. It could be better wrt. structural changes, on that I agree.

Bob said...

I like your analogies. FWIW, the Java community recognizes the importance of real reloading. It's the #6 most popular request for enhancement: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4910812 (in part due to our campaigning: http://crazybob.org/2005/10/vote-for-real-class-reloading.html)

Sun's VM has supported changing method bodies for years now.

Eclipse has also supported "fake reloading" for as long as I can remember. They work around the VM by loading a second (third, fourth...) copy of the class. Unfortunately, this doesn't affect existing instances, but you can get pretty far if you work with it (always creating new instances, etc.).

There's also a new tool named JavaRebel which is supposed to support real reloading. It's closed source and they don't tell anyone how it works though, so it's tough to say how well it works. I'm skeptical.

It's also worth noting that even if we do get full class reloading in Java, we'll also need some sort of callback mechanism so the VM can tell frameworks when classes reload so they can update their own caches, etc.

Gilad Bracha said...

Bob,

I'm glad to hear that the need is getting more and more recognized. Since I started arguing for class reloading in 1997, I can't say I'm thrilled with the rate of progress so far. But then, I was also arguing for closures back then as well ...

Cedric said...

Hi Gilad,

Can you expand on how Eclipse doesn't allow fix-and-continue? Not only does Eclipse build incrementally all the time (you hardly ever use that Build button), it also allows you to refactor your code even if it has errors in it (could the Smalltalk IDE do this?). Other than that, you are right that IDEA and NetBeans are still way behind since they don't even build incrementally.

As for your other points, it seems to me that Smalltalk can get away with it because it basically requires you to ship the IDE as part of your applications, which is neither practical nor realistic.

-- ..
Cedric

Gilad Bracha said...

Cedric,

It's good that Eclipse builds incrementally (at the file level, which may still be inadequate for large files). It may be better than some other Java IDEs in that respect. I have no interest in the competition between Java IDEs. My points apply to any IDE in any modern language.

In any case, fix-and-continue doesn't work in Eclipse 3.0. Of course, you have to be clear what fix-and-continue means.

Write a method. Set a breakpoint in it. When it stops, edit it. Keep executing. Did the changes have any effect on the current execution? In Eclipse, they don't.

This is the most basic form of fix and continue. It doesn't involve types changing, objects mutating etc. It's not enough, but it is an important and useful step forward.

So please don't get up in arms about Eclipse vs. Netbeans. If you're a proponent of Eclipse, just fix it so it really is as good or better than its competition.

As for Smalltalk. It's easier in Smalltalk, for many reasons, as I mention in my post. However, the fact that Smalltalk IDEs are not traditionally separated from the application has absolutely nothing to do with it.

That lack of separation is a major flaw, which can be corrected by using a mirror based reflective API.
We are doing that in Newspeak right now.

This goes to Stefan's point that in Smalltalk you often build applications that are part of the environment. That is an option you have because of the ability to incrementally modify the running process - it is an effect of incrementality, not a cause.

Dan Fabulich said...

"Write a method. Set a breakpoint in it. When it stops, edit it. Keep executing. Did the changes have any effect on the current execution? In Eclipse, they don't."

With all due respect, did you actually try that example, Gilad? I just did and it works for me in Eclipse 3. I use it all the time.

There's a catch, of course... You can only do this with instance methods (modifying a static method will give you an error). Also, modifying the function will reset the PC to the top of the method (as if you'd used Eclipse's "Drop to Frame" function).

Here's my example:

---
public class Experiment {
    public static void main(String[] args) {
        new Experiment().main();
    }

    void main() {
        System.out.println("hello1");
        System.out.println("hello2");
    }
}
---

Set a breakpoint on "hello2". Modify the line to say "hello3", save. The PC will be back at the top of the method, but will then run "hello1" and "hello3". Your console will say:

hello1
hello1
hello3

On the other hand, if you just put the println lines in the static "main" method, and try to modify them there, Eclipse will give you a nasty warning about obsolete stack frames and won't replace the method.

It's not perfect, but it's usually good enough for my purposes. (We all avoid static code anyway for loose coupling reasons, right?)

Gilad Bracha said...

Dan,

Yes, I did. Repeatedly. In an instance method (and while static methods are a bad idea, as long as they exist, it should work there just as well). In my case, I believe I added a local variable, changed some assignments to existing locals etc. No effect.

When I have a moment, I'll try your example, and more interesting variations. I'm glad some of it works.

The real point is that this is just the beginning. The principle is that you should be able to modify your program on the fly. Always.

Bob said...

It's probably not worthwhile to debate the nuances of Java class reloading. The fact is that it does suck and is pretty useless in its current state. Changing a method body constitutes a tiny percentage of the types of changes I'm likely to make. Changing my development process for the unlikely scenario where I only change a method body isn't really worth it, IMO. I don't want the debugger's limitations to drive my design decisions ("if I can just change this method body instead of making the change that actually makes better sense, I don't have to restart my app!").

I forgot to mention that John Rose is back working on reloading. He's trying to support "dynamic languages" but will support Java, too, if he can.

Paul said...

Hi Gilad,

I find it strange that people choose to point to flaws in Smalltalk, when it is clear (to me at least) that Smalltalk got the most important things right, and the rest is fixable like you say.

Today it is still light years ahead of languages like Java and C#. The moment I tried Smalltalk, coming from a C/C++ background, it just felt right and I knew this was the future of programming.

We just seem to be spending a long time getting there.. :(

Like you point out, incremental development is how the real world works, and OO is all about modeling the real world. There are no edit/build/run modes in real life. Our world is modeless, running all the time. So why should software be any different?

Philippe said...

Well, fix-and-continue doesn't really work in Smalltalk either, at least in Squeak.

An example: open a debugger (you need to change the source for this, but that's another issue) in a block of a method that is no longer active. Now type something, now accept it. BOOM! If you're lucky, the change got saved and the only thing you have to do is restart. If you're not lucky, the last quarter of an hour of your work just vaporized. And no, the changes file won't help you here.

Another problem is that active contexts are not migrated. An example of this is if you have a long-running block, let's say one used to fork a Process, that accesses an instance variable. Now you change the class and remove some instance variables. The context is not migrated. The next time the block accesses the instance variable, the VM segmentation faults.

Eliot Miranda said...

Hi Gilad,

nice post! Do you really think Smalltalk IDEs have gone overboard in the other direction? Personally (having some stake in this) I think some Smalltalk IDEs have done a reasonable job at least with binary files. I think for example that the VisualWorks parcel system is better than Java class files. Parcel files are a fast-load binary format for code. They come with source, load fast, support class shape change on load (doing a merge of the class definition in image and that in the file) and support overrides (a necessary evil) and partial loading.

Overrides are the ability to redefine an arbitrary existing method and remember that it was overridden, such that if the parcel that overrode it is ever unloaded the overridden version is restored. I can understand that this can be seen as hackery in the extreme, but it does allow one to extend or patch an arbitrary running system.
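The override-and-restore behavior described above can be sketched in Java as a per-selector stack of definitions, each tagged with the parcel that supplied it. This is a toy model under assumptions, not the VisualWorks implementation: `MethodTable` is a hypothetical name, and method bodies are reduced to `Supplier<String>`.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

final class MethodTable {
    private static final class Definition {
        final String parcel;
        final Supplier<String> body;

        Definition(String parcel, Supplier<String> body) {
            this.parcel = parcel;
            this.body = body;
        }
    }

    // selector -> stack of definitions, most recently loaded on top
    private final Map<String, Deque<Definition>> methods = new HashMap<>();

    // Loading a definition for an existing selector overrides it without forgetting it.
    void define(String parcel, String selector, Supplier<String> body) {
        methods.computeIfAbsent(selector, k -> new ArrayDeque<>())
               .push(new Definition(parcel, body));
    }

    // Unloading a parcel removes its definitions; whatever was overridden resurfaces.
    void unload(String parcel) {
        methods.values().removeIf(stack -> {
            stack.removeIf(d -> d.parcel.equals(parcel));
            return stack.isEmpty();
        });
    }

    String invoke(String selector) {
        Deque<Definition> stack = methods.get(selector);
        if (stack == null) {
            throw new IllegalArgumentException("not understood: " + selector);
        }
        return stack.peek().body.get();
    }
}

final class OverrideDemo {
    public static void main(String[] args) {
        MethodTable table = new MethodTable();
        table.define("Base", "greet", () -> "hello");
        table.define("Patch", "greet", () -> "hello, patched");
        System.out.println(table.invoke("greet")); // the Patch version is active
        table.unload("Patch");
        System.out.println(table.invoke("greet")); // the Base version is restored
    }
}
```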

Partial loading is the ability to load only the code that "fits". So that if the parcel contains methods or subclasses of a class not present in the image loading will not fail. Instead the unloadable code will be stored in the image until a subsequent parcel load makes it possible to attach the unloadable code.
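Partial loading can be sketched the same way: code that does not fit yet is parked, and every later load retries the parked items. Again this is a toy model under assumptions, not the parcel format itself: `ToyImage` and `Extension` are hypothetical names, and a "method" is just a selector string attached to a class name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ToyImage {
    // An extension method destined for a class that may not be present yet.
    private static final class Extension {
        final String targetClass;
        final String selector;

        Extension(String targetClass, String selector) {
            this.targetClass = targetClass;
            this.selector = selector;
        }
    }

    private final Map<String, Set<String>> classes = new HashMap<>(); // class -> selectors
    private final List<Extension> pending = new ArrayList<>();        // code that does not fit yet

    void loadClass(String name) {
        classes.computeIfAbsent(name, k -> new TreeSet<>());
        attachPending(); // a new class may make parked code loadable
    }

    // Never fails: the extension is attached now if possible, otherwise parked.
    void loadExtension(String targetClass, String selector) {
        pending.add(new Extension(targetClass, selector));
        attachPending();
    }

    private void attachPending() {
        Iterator<Extension> it = pending.iterator();
        while (it.hasNext()) {
            Extension e = it.next();
            Set<String> selectors = classes.get(e.targetClass);
            if (selectors != null) {
                selectors.add(e.selector);
                it.remove();
            }
        }
    }

    boolean understands(String className, String selector) {
        Set<String> selectors = classes.get(className);
        return selectors != null && selectors.contains(selector);
    }

    int pendingCount() {
        return pending.size();
    }
}

final class PartialLoadDemo {
    public static void main(String[] args) {
        ToyImage image = new ToyImage();
        image.loadExtension("Matrix", "determinant"); // Matrix is absent: parked, no failure
        System.out.println(image.pendingCount());
        image.loadClass("Matrix");                    // a later load attaches the parked code
        System.out.println(image.understands("Matrix", "determinant"));
    }
}
```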

This means that one can design a parcel as a logical component, not a physical component. One can keep all the code together in a single "file" (actually a pair of code and source) and load it in any context. If a system into which one might load some parcel is decomposed into more subcomponents, there is no need to revisit the parcel and cut it into bits that would fit. So partial loading avoids a nasty maintenance combinatorial explosion one usually has to suffer in complex systems.

Gilad Bracha said...

Hi Eliot,

You ask: what's wrong with parcels?
Well, if you insist:

As you know, I am a man of very simple tastes. I find parcels too complex. Parcels attempt to bring order to the chaotic world of monkey-patching by specifying who can do what, when and how. They are inherently imperative.

For a system patch management utility, this may be fine. Indeed, parcels deserve much more study and recognition. They are light years ahead of other mechanisms (like OSGi).

However, to me this seems to be an issue that requires language level support. Language mechanisms should not, if at all possible, be defined in terms of algorithms and procedures, but declaratively.

This is what Newspeak's modularity constructs are trying to address. They don't necessarily address all of the same stuff, but there is an overlap.

helium said...

Visual Studio supports fix-and-continue, at least for C#. You can stop at a breakpoint, change some lines of code, continue stepping through the code or move the execution point a few lines forward/back, jump into another if/else branch, and set variables to different objects using the watch window (yes, you can assign there, call methods and stuff like that).