This is not, of course, an essay on restricting free trade. Rather, this post is about the evils of the import clause, which occurs in one form or another across a wide array of programming languages, from Ada, Oberon and the various Modulas, through Java and C#, and on to F# and Haskell.
The import clause should be banned because it undermines modularity in a deep and insidious way. This is a point I’ve attempted to convey time and time again, with only limited success. I will now try to illustrate the problem via a hardware inspired example.
Consider the not-so-humble MP3 player. An MP3 player is a hardware module. The market is full of them, as well as other hardware modules they can plug in to. For example, sound systems where on can dock an MP3 player and have it play on stereo speakers.
Let’s try and describe the analog of such a sound system using programming language modularity constructs that rely on imports:
module SoundSystem
import MP3Player;
... wonderful functionality elided ...
end
I want to describe how my sound system works, separately from the description of how an MP3 player works. I would like to later plug in a particular MP3 player, say a Zune(tm) or an iPod(tm)
tm: Zune and iPod are trademarks of Microsoft and Apple respectively, two companies with armies of lawyers who might harass me if I do not state the obvious.
Now the first problem is that neither Zune or iPod are named MP3Player. If I want to connect my sound system to a Zune, I will have to edit the definition of SoundSystem to name the specific module I want to import.
If you’re very petty, you might say that Zune and iPod do not share a common interface and cannot be docked into the same sound system. Imagine that we wish to use our sound system with an iPhone (tm) and an iPod Touch (tm) of some compatible generation.
tm: iPhone and iPod Touch are trademarks of Apple.
Say I decide to go with a Zune.
module SoundSystem
import Zune;
... wonderful functionality elided ...
end
Later I change my mind for some reason, and want to hook up my system to an iPod. It’s easy: I just edit the definition of my system again, to import iPod:
module SoundSystem
import iPod;
... wonderful functionality elided ...
end
The question you should be asking is: Why should I edit the definition of my system each time I change the configuration? In reality, it is unlikely that I am actually the designer of SoundSystem. I probably don’t even have access to its definition. I just want to configure it to work with my MP3 player.
The problem is that import confounds module definition and module configuration. Module definition describes the design of a module; module configuration describes how one hooks up different modules. The former has to do with module internals; the latter should be done externally to the modules involved, to allow them to be used in any context where they could function.
We clearly want our sound system to abstract over the specific player being plugged in to it. Any player with a compatible interface will do. A well known mechanism for abstracting things is parameterization. We might be happier if we defined our sound system parametrically with respect to the MP3 player
module SoundSystem(anMP3Player){
... great wonders using anMP3Player ...
}
We could them configure our system to use an iPod:
SoundSystem(iPod);
or a Zune
SoundSystem(Zune);
without having to modify (or even have access to) the source code for the definition of SoundSystem. Hurray!
The module definition looks a lot like a function, and the configuration code looks like a function application. This is very suggestive. Indeed, ML introduced a module system based on function-like things called functors a quarter century ago. But there’s a bit more to this.
These hardware pieces tend to plug in to each other. For example, the definition of IPod is parametric too:
IPod(dockingStation){... even greater wonders ...}
Our sound system does its thing by behaving like a docking station. It and the MP3 player are mutually recursive modules. Configuration therefore requires support for mutual recursion (which is not allowed in Standard ML):
letrec {
dock = SoundSystem(mp3Player);
mp3Player = IPod(dock);
} in dock;
If this notation is unfamiliar, please brush up on your functional programming skills before you become unemployable. Basically, ignore the first and last line, and treat the two lines involving = as equations.
So the module definitions are a lot like functions that yield modules. You could also think of module definitions as classes yielding instances. The instances are like physical hardware modules.
Now we can add another sound system and use our old Zune
letrec {
dock2 = SoundSystem(oldMP3);
oldMP3 = Zune(dock2);
} in dock2;
This is what is often called side by side deployment - multiple instances of the same design, configured differently.
Tangent: Yes, Virginia, you can achieve that sort of thing in Java despite imports, using class loaders. Imports hardwire names into your code, and class loaders can counteract that by letting you define multiple namespaces. These can have multiple copies of your code, potentially hardwired to different things (even though they all have the same name). If you think class loaders offer a simple, clean way of doing things that is easy to learn, use, understand and debug, this post is not for you. Nor will any amount of OSGi magic on top fundamentally change things.
We might also choose to define things differently
module SoundSystem(MP3Player) {
player = MP3Player(self);
...
}
Here, we are passing module definitions as parameters. We are also referring to SoundSystem’s current instance from within itself - a lot like classes, no? We might configure things thusly
SoundSystem(iPod);
So it looks like mutual recursion and first class module definitions are very natural things to have. And yet traditional languages do not support this - even though many languages have constructs like classes and functions that are first class values and can be defined in a mutually recursive fashion.
One problem with using these constructs to define modules is that they are usually able to access anything in the global namespace. This makes it very hard to avoid implicit dependencies.
Interestingly, the global namespace is exactly what import requires. Since we don’t need or want import, let’s do away with it and the global namespace. We clearly will get a much more modular system without it; but wait - there seems to be one place where we really want the global namespace. That is when we write our configuration code, the code that wires our modules together.
That’s fine - there are a number of solutions for that. It isn’t always clear that our configuration language is the same language as the programming language(s) that define our modules, for example. If you write a makefile, the global namespace is defined by your file system and accessible within the makefile. Not that I really want to recommend make and its ilk.
I think we do want to code our configuration in a nice general purpose high level programming language. One solution is to have our IDE provide us with an object representing the known global namespace, and write our configuration code with respect to that namespace object. This is essentially what we do in Newspeak.
In the next post, I’ll discuss more of the advantages of this approach, contrast how Newspeak handles things with other languages with powerful module systems, like Scheme (which for the past decade or so has had a system called Units that is quite close to what I’ve discussed so far) and ML, and show once more how one actually does configuration in Newspeak.
To be continued.
Room 101
A place to be (re)educated in Newspeak
Tuesday, June 30, 2009
Monday, May 25, 2009
Original Sin
I’ve often said that Java’s original sin was not being a pure object oriented language - a language where everything is an object.
As one example, consider type char. When Java was introduced, the Unicode standard required 16 bits. This later changed, as 16 bits were inadequate to describe the world’s characters.
Tangent: Not really surprising if you think about it. For example, Apple’s internationalization support allocated 24 bits per character in the early 1990s. However, engineering shortcuts are endemic.
In the meantime, Java had committed to a 16 bit character type. Now, if characters were objects, their representation would be encapsulated, and nobody would very much affected how many bits are needed. A primitive type like char, however, advertises its representation to the world. Consequently, people dealing with unicode in Java have to deal with encoding code points themselves.
I was having this conversation for the umpteenth time last week. My interlocutor asked whether it would be possible for Java to be as efficient as it was without primitive types. The answer is yes, and prompted this post.
So how would we go about getting rid of primitive types without incurring a significant performance penalty?
Java has a mandatory static type system; it is compiled into a statically typed assembly language (Java byte codes, aka JVML). It supports final classes. I do not favor any of these features, but we will take them as a given. The only changes we will propose are those necessary to eradicate primitive types.
Assume that the we have a final class Int representing 32 bit integers. The compiler can translate occurrences of this type into type int. Hence, we can generate the same code for scalars as Java does today with no penalty whatsoever.
To make Int a suitable replacement for int, we would like the syntactic convenience of using operators:
Int twice(Int i) { return i + i;}
This is easy enough. We don’t want operator overloading of course. What we want is simply the ability to define methods whose names are operators. The language can support the same set of binary operators it does today, with the same fixed precedence. No need for special syntax and rules to define the precedence of operators etc. I leave that sort of thing to people who love complexity.
It is crucial that Ints behave as values. This requires that the == operator on Ints behaves the same as equality. There should be no way to detect if we’ve allocated one copy of the number 3 or a million of them. Since we can define == as a method, we can define it in Object to work the way it does for most objects, and override it with a method in Int and other value classes to work differently. We’ll want to take care of identityHash as well of course.
At this point, we can simply say that int stands for Int. Except for locking. What would it mean to synchronize on an instance of Int? To avoid this nastiness, we’ll just say that all instances of Int are locked when they are created. Hence no user code can ever synchronize on them.
In this hypothetical dialect, literal integers such as 3 are considered instances of Int. They, and all other integer results, can be stored in collections, passed to polymorphic code that requires type Object etc. The compiler sees to it that they are usually represented as the primitive integer type of the JVM. When they are used as objects, they get boxed into real instances of Int. When an object gets cast to an Int, it gets unboxed.
This is similar to what happens today, except that it is completely transparent. Because these objects have true value semantics for identity, you can never tell if a new instance was allocated by the boxing or not; indeed, you cannot tell if boxing or unboxing occur at all.
So far I haven’t described anything really new. A detailed proposal along the lines above existed years ago. It would allow adding new user defined value types like Complex as well.
A more restricted, but somewhat similar proposal was considered as part of JSR201 - the idea being that boxing would create value objects, without actually replacing the existing primitive types. It was rejected by the expert group. As I recall, only myself and Corky Cartwright supported it. There were concerns as to what would become of the existing wrapper classes (Integer and friends); those are used by reflection etc., and are hopelessly broken because they have public constructors that allocate new instances for all the world to see.
Tangent: I don’t know who came up with the idea of an Integer class that could have multiple distinct instances of 3. I assume it was another engineering shortcut (see above) by someone really clever.
To be clear - I am not suggesting that this discussion should be reopened. This is just a mental exercise.
There is only one place where this proposal introduces a performance penalty: polymorphic code on type Object that tests for identity. Now that identity testing is method, it will be slower. Not by all that much, and only for type Object.
Why? Because we preclude overriding == in non-value classes. There are several ways to arrange this. Either by fiat, or by rearranging the class hierarchy with types Value (for value types) and Reference (for regular objects) as the only subtypes of Object, making == final in those two types. Or some other variation
What has not been seriously discussed, to my knowledge, is what to do with arrays. As indicated above, scalar uses of primitive types don’t pose much of a problem.
However, what of types like int[]? Again, we can allocate these as arrays of 32 bit integers just as we do today. No memory penalty, no speed penalty when writing in ints or reading them out.
What makes things complicated is this: If int is a subtype of Object (as it is in the above) we’d expect int[] to be a subtype of Object[], because in Java we expect covariant subtyping among arrays.
Of course that isn’t type safe, and one could certainly argue that this our chance to correct that problem. But I won’t. Instead, assume we want to preserve Java’s covariant array subtyping.
The thing is, we do not want to box all the elements of an int[] when we assign it to an Object[].
The zeroth order solution is to leave the existing subtype relations among arrays of the predefined types unchanged. It’s ugly, but we are no worse off than we are today - and have a bunch of advantages because we are free of primitive types. But we can do better.
Suppose Object[] has methods getObjectElement and setObjectElement. These (respectively) retrieve and insert elements into the array.
We can equip int[] with versions of these methods that box/unbox elements that are being retrieved or inserted. We ensure that a[i] and a[i] = e are compiled differenty based upon the static type of a. If a is Object[], we use the getObjectElement and setObjectElement. If a is of type int[], we use methods that access the integer representation directly, say, getIntElement and setIntElement.
At this point, we can even introduce meaningful relations among array types like int[] and short[] using the same technique.
In essence, Object[] provides an interface that different array types may implement differently. We just provide a sugar for accessing this interface magically based on type. On a modern implementation like Hotspot, the overhead for this would be minimal.
Of course, if you make arrays invariant, the whole issue goes away. That just makes things too easy. Besides, I’ve come to the conclusion that Java made the right call on array covariance in the first place.
All in all - Java could have been purely object oriented with no significant performance hit. But it wasn’t, isn’t and likely won’t. Sic Transit Gloria Mundi.
As one example, consider type char. When Java was introduced, the Unicode standard required 16 bits. This later changed, as 16 bits were inadequate to describe the world’s characters.
Tangent: Not really surprising if you think about it. For example, Apple’s internationalization support allocated 24 bits per character in the early 1990s. However, engineering shortcuts are endemic.
In the meantime, Java had committed to a 16 bit character type. Now, if characters were objects, their representation would be encapsulated, and nobody would very much affected how many bits are needed. A primitive type like char, however, advertises its representation to the world. Consequently, people dealing with unicode in Java have to deal with encoding code points themselves.
I was having this conversation for the umpteenth time last week. My interlocutor asked whether it would be possible for Java to be as efficient as it was without primitive types. The answer is yes, and prompted this post.
So how would we go about getting rid of primitive types without incurring a significant performance penalty?
Java has a mandatory static type system; it is compiled into a statically typed assembly language (Java byte codes, aka JVML). It supports final classes. I do not favor any of these features, but we will take them as a given. The only changes we will propose are those necessary to eradicate primitive types.
Assume that the we have a final class Int representing 32 bit integers. The compiler can translate occurrences of this type into type int. Hence, we can generate the same code for scalars as Java does today with no penalty whatsoever.
To make Int a suitable replacement for int, we would like the syntactic convenience of using operators:
Int twice(Int i) { return i + i;}
This is easy enough. We don’t want operator overloading of course. What we want is simply the ability to define methods whose names are operators. The language can support the same set of binary operators it does today, with the same fixed precedence. No need for special syntax and rules to define the precedence of operators etc. I leave that sort of thing to people who love complexity.
It is crucial that Ints behave as values. This requires that the == operator on Ints behaves the same as equality. There should be no way to detect if we’ve allocated one copy of the number 3 or a million of them. Since we can define == as a method, we can define it in Object to work the way it does for most objects, and override it with a method in Int and other value classes to work differently. We’ll want to take care of identityHash as well of course.
At this point, we can simply say that int stands for Int. Except for locking. What would it mean to synchronize on an instance of Int? To avoid this nastiness, we’ll just say that all instances of Int are locked when they are created. Hence no user code can ever synchronize on them.
In this hypothetical dialect, literal integers such as 3 are considered instances of Int. They, and all other integer results, can be stored in collections, passed to polymorphic code that requires type Object etc. The compiler sees to it that they are usually represented as the primitive integer type of the JVM. When they are used as objects, they get boxed into real instances of Int. When an object gets cast to an Int, it gets unboxed.
This is similar to what happens today, except that it is completely transparent. Because these objects have true value semantics for identity, you can never tell if a new instance was allocated by the boxing or not; indeed, you cannot tell if boxing or unboxing occur at all.
So far I haven’t described anything really new. A detailed proposal along the lines above existed years ago. It would allow adding new user defined value types like Complex as well.
A more restricted, but somewhat similar proposal was considered as part of JSR201 - the idea being that boxing would create value objects, without actually replacing the existing primitive types. It was rejected by the expert group. As I recall, only myself and Corky Cartwright supported it. There were concerns as to what would become of the existing wrapper classes (Integer and friends); those are used by reflection etc., and are hopelessly broken because they have public constructors that allocate new instances for all the world to see.
Tangent: I don’t know who came up with the idea of an Integer class that could have multiple distinct instances of 3. I assume it was another engineering shortcut (see above) by someone really clever.
To be clear - I am not suggesting that this discussion should be reopened. This is just a mental exercise.
There is only one place where this proposal introduces a performance penalty: polymorphic code on type Object that tests for identity. Now that identity testing is method, it will be slower. Not by all that much, and only for type Object.
Why? Because we preclude overriding == in non-value classes. There are several ways to arrange this. Either by fiat, or by rearranging the class hierarchy with types Value (for value types) and Reference (for regular objects) as the only subtypes of Object, making == final in those two types. Or some other variation
What has not been seriously discussed, to my knowledge, is what to do with arrays. As indicated above, scalar uses of primitive types don’t pose much of a problem.
However, what of types like int[]? Again, we can allocate these as arrays of 32 bit integers just as we do today. No memory penalty, no speed penalty when writing in ints or reading them out.
What makes things complicated is this: If int is a subtype of Object (as it is in the above) we’d expect int[] to be a subtype of Object[], because in Java we expect covariant subtyping among arrays.
Of course that isn’t type safe, and one could certainly argue that this our chance to correct that problem. But I won’t. Instead, assume we want to preserve Java’s covariant array subtyping.
The thing is, we do not want to box all the elements of an int[] when we assign it to an Object[].
The zeroth order solution is to leave the existing subtype relations among arrays of the predefined types unchanged. It’s ugly, but we are no worse off than we are today - and have a bunch of advantages because we are free of primitive types. But we can do better.
Suppose Object[] has methods getObjectElement and setObjectElement. These (respectively) retrieve and insert elements into the array.
We can equip int[] with versions of these methods that box/unbox elements that are being retrieved or inserted. We ensure that a[i] and a[i] = e are compiled differenty based upon the static type of a. If a is Object[], we use the getObjectElement and setObjectElement. If a is of type int[], we use methods that access the integer representation directly, say, getIntElement and setIntElement.
At this point, we can even introduce meaningful relations among array types like int[] and short[] using the same technique.
In essence, Object[] provides an interface that different array types may implement differently. We just provide a sugar for accessing this interface magically based on type. On a modern implementation like Hotspot, the overhead for this would be minimal.
Of course, if you make arrays invariant, the whole issue goes away. That just makes things too easy. Besides, I’ve come to the conclusion that Java made the right call on array covariance in the first place.
All in all - Java could have been purely object oriented with no significant performance hit. But it wasn’t, isn’t and likely won’t. Sic Transit Gloria Mundi.
Thursday, April 30, 2009
The Need for More Lack of Understanding
People are always claiming that if only there was more understanding in the world, it would be a better place. This post will argue that less is more: we need less understanding - specifically more not understanding.
A couple of weeks ago I gave a talk at DSL Dev Con. One of the encouraging things that was evident there was the increased understanding that not understanding is important.
Tangent: While I'm advertising this talk, I might as well advertise my interview on Microsoft's channel 9 which explains the motivation for Newspeak and its relation to cloud computing.
Several programming languages support a mechanism by which a class or object can declare a general-purpose handler for method invocations it does not explicitly support.
Smalltalk was, AFIK, the first language to introduce this idea. You do it by declaring a method called doesNotUnderstand: . The method takes a single argument, that represents a reification of the call. The argument tells us the name of the method that was invoked, and the actual arguments passed. If a method m is invoked on an object that does not have a member method m (that is, m is not declared by the class of the object or any of its superclasses), then the object’s doesNotUnderstand: method is invoked. The default implementation of doesNotUnderstand:, declared in class Object, is to throw an exception. By overriding doesNotUnderstand: one can control the system’s behavior when such calls are made. Similar mechanisms exist in several other dynamic languages (e.g., missingMethod in Ruby and Groovy, _noSuchMethod_ in some dialects of Javascript).
Aficionados of these languages know that this is an extremely useful mechanism. However, users of mainstream object-oriented languages typically lack an appreciation of the power this mechanism can provide. I hope this post can be a small step in rectifying that situation.
DoesNotUnderstand: helps implement orthogonal persistence, lazy loading, futures, and remote proxies, to name a few. Recently, there’s been a surge of interest in domain-specific languages, and doesNotUnderstand: can help there as well.
We’ll use an example from my talk at DSL Dev Con. Consider how to interact with an OS shell like bash or csh from within a general purpose programming language. We’ll use Newspeak as our general purpose language (what were you expecting?), because it works best (in my unbiased opinion).
Suppose you want a listing of the files in the current directory. You could view ls as a method on a shell object, and write: shell ls. Of course, we won’t do something like the following Java code:
class Shell {
public Collection ls() {...}
... an infinity of other stuff
}
There are any number of commands that a shell can understand, depending on the current path and the executables in the directories on that path. We cannot plausibly enumerate them all as a fixed set of methods in Shell.
Instead, in we can define a class NewShell with a doesNotUnderstand: method to look up the name of the message in the shell’s path and execute it.
shell ls
If we write this code in the context of a subclass of NewShell, we can take advantage of Newspeak’s implicit receiver sends and just write
ls
Nice, but not quite good enough.
ls aFilename
doesn’t work at all. We don’t want to invoke ls immediately here - we need to gather its arguments in some way. One way to do this is to have doesNotUnderstand: return a function object, that can be fed its arguments. This is in fact what we do in our implementation. We call this object a CommandSession. To get a CommandSession to actually run the command, you call one of is value methods, with the desired arguments:
ls value: aFileName
This is less convenient for the simple case, where we need to write
ls value
to get ls to do something - but it is much more general.
What about modifiers, as in ls -l ? We can make simple cases work slightly better by defining - as a method on CommandSession :
ls -’l’
This is what the current implementation does.
The most general approach is to treat them as arguments
ls value: ‘-l’
ls value: ‘-l’ value: aFileName
An alternative might be to leave ls as it was originally, but allow
ls: aFileName
as well. In this version, doesNotUnderstand: checks to see if the message takes an argument (i.e., it ends with a colon). If so it strips the colon off the message name, creates CommandSession for the result, and calls its value: method with the argument. This handles modifiers pretty well
ls: ‘-l’
If there are multiple arguments, we can pass a tuple as the argument, and doesNotUnderstand: will unpack it as needed.
ls: {‘-l’. aFileName}
Now how about pipes?
We could introduce pipeValue methods, that produced an object that responded to the pipe operator. Or we could say that everything produced a CommandSession (and these understood “|”) and a special action is needed to get a result (sending it an evaluate or end message). This action is the analog of the newline that tells the shell to go ahead and evaluate. This could be dispensed with in a lazy setting.
Combining our second proposal above with this, we could say that value was used to derive a result. Then we can view the shell as a combinator library for CommandSessions. This does conflate two issues - the use of CommandSession to delay evaluation until a result is needed (the shell parses the input as a unit ensuring laziness) and the use of real combinators on byte streams.
We use NewShell in our IDE - for example, to manipulate subversion commands in the source control browser. It would be nice to refine it further, perhaps along the lines suggested above, but even in its current simplistic incarnation, it is quite useful.
As I noted at the beginning of this post, there a host of other cool uses for doesNotUnderstand:. I may return to those in another post.
Of course, if you are a fan of mandatory static typing, you aren’t allowed to use doesNotUnderstand: in your language. n the general case, it simply cannot be statically typed - which is an argument against mandatory typing, not against doesNotUnderstand:.
Just as switch statements, catch clauses and regular expressions all need defaults/catch-alls/wildcards, so does method dispatch. There are situations where you cannot avoid uncertainty. Reality is dynamic.
A couple of weeks ago I gave a talk at DSL Dev Con. One of the encouraging things that was evident there was the increased understanding that not understanding is important.
Tangent: While I'm advertising this talk, I might as well advertise my interview on Microsoft's channel 9 which explains the motivation for Newspeak and its relation to cloud computing.
Several programming languages support a mechanism by which a class or object can declare a general-purpose handler for method invocations it does not explicitly support.
Smalltalk was, AFIK, the first language to introduce this idea. You do it by declaring a method called doesNotUnderstand: . The method takes a single argument, that represents a reification of the call. The argument tells us the name of the method that was invoked, and the actual arguments passed. If a method m is invoked on an object that does not have a member method m (that is, m is not declared by the class of the object or any of its superclasses), then the object’s doesNotUnderstand: method is invoked. The default implementation of doesNotUnderstand:, declared in class Object, is to throw an exception. By overriding doesNotUnderstand: one can control the system’s behavior when such calls are made. Similar mechanisms exist in several other dynamic languages (e.g., missingMethod in Ruby and Groovy, _noSuchMethod_ in some dialects of Javascript).
Aficionados of these languages know that this is an extremely useful mechanism. However, users of mainstream object-oriented languages typically lack an appreciation of the power this mechanism can provide. I hope this post can be a small step in rectifying that situation.
DoesNotUnderstand: helps implement orthogonal persistence, lazy loading, futures, and remote proxies, to name a few. Recently, there’s been a surge of interest in domain-specific languages, and doesNotUnderstand: can help there as well.
We’ll use an example from my talk at DSL Dev Con. Consider how to interact with an OS shell like bash or csh from within a general purpose programming language. We’ll use Newspeak as our general purpose language (what were you expecting?), because it works best (in my unbiased opinion).
Suppose you want a listing of the files in the current directory. You could view ls as a method on a shell object, and write: shell ls. Of course, we won’t do something like the following Java code:
class Shell {
public Collection
... an infinity of other stuff
}
There are any number of commands that a shell can understand, depending on the current path and the executables in the directories on that path. We cannot plausibly enumerate them all as a fixed set of methods in Shell.
Instead, in we can define a class NewShell with a doesNotUnderstand: method to look up the name of the message in the shell’s path and execute it.
shell ls
If we write this code in the context of a subclass of NewShell, we can take advantage of Newspeak’s implicit receiver sends and just write
ls
Nice, but not quite good enough.
ls aFilename
doesn’t work at all. We don’t want to invoke ls immediately here - we need to gather its arguments in some way. One way to do this is to have doesNotUnderstand: return a function object, that can be fed its arguments. This is in fact what we do in our implementation. We call this object a CommandSession. To get a CommandSession to actually run the command, you call one of is value methods, with the desired arguments:
ls value: aFileName
This is less convenient for the simple case, where we need to write
ls value
to get ls to do something - but it is much more general.
What about modifiers, as in ls -l ? We can make simple cases work slightly better by defining - as a method on CommandSession :
ls -’l’
This is what the current implementation does.
The most general approach is to treat them as arguments
ls value: ‘-l’
ls value: ‘-l’ value: aFileName
An alternative might be to leave ls as it was originally, but allow
ls: aFileName
as well. In this version, doesNotUnderstand: checks to see if the message takes an argument (i.e., it ends with a colon). If so it strips the colon off the message name, creates CommandSession for the result, and calls its value: method with the argument. This handles modifiers pretty well
ls: ‘-l’
If there are multiple arguments, we can pass a tuple as the argument, and doesNotUnderstand: will unpack it as needed.
ls: {‘-l’. aFileName}
Now how about pipes?
We could introduce pipeValue methods, that produced an object that responded to the pipe operator. Or we could say that everything produced a CommandSession (and these understood “|”) and a special action is needed to get a result (sending it an evaluate or end message). This action is the analog of the newline that tells the shell to go ahead and evaluate. This could be dispensed with in a lazy setting.
Combining our second proposal above with this, we could say that value was used to derive a result. Then we can view the shell as a combinator library for CommandSessions. This does conflate two issues - the use of CommandSession to delay evaluation until a result is needed (the shell parses the input as a unit ensuring laziness) and the use of real combinators on byte streams.
We use NewShell in our IDE - for example, to manipulate subversion commands in the source control browser. It would be nice to refine it further, perhaps along the lines suggested above, but even in its current simplistic incarnation, it is quite useful.
As I noted at the beginning of this post, there a host of other cool uses for doesNotUnderstand:. I may return to those in another post.
Of course, if you are a fan of mandatory static typing, you aren’t allowed to use doesNotUnderstand: in your language. n the general case, it simply cannot be statically typed - which is an argument against mandatory typing, not against doesNotUnderstand:.
Just as switch statements, catch clauses and regular expressions all need defaults/catch-alls/wildcards, so does method dispatch. There are situations where you cannot avoid uncertainty. Reality is dynamic.
Saturday, April 18, 2009
The Language Designer’s Dilemma
Last week I was at lang.net 09 and DSL Dev Con. It was great fun, as always. The best talk was by Erik Meijer. Use the link to see it - but without live video of Erik in action, you can't really understand what it was like. Such is performance art.
I also gave a talk, and there were many others, but that is not the point of this post. Rather, this post was prompted by one specific, and excellent, talk - Lars Bak’s presentation on V8. Lars clearly enjoyed his visit to the lion’s den; more importantly, perhaps Microsoft will finally shake off their apparent paralysis and produce a competitive Javascript engine.
That, in turn, will make it possible to distribute serious applications targeting the web browser, with the knowledge that all major browsers have performant Javascript engines that can carry the load.
It’s all part of the evolution of the web browser into an OS. This was what Netscape foresaw back in 1995. And it is a perfect example of innovator’s dilemma.
Innovator’s dilemma applies very directly to programming languages, and Todd Proebsting already discussed this back in 2002.
To recap, the basic idea is that established players in a market support sustaining innovation but not disruptive innovation. Examples of sustaining innovation would be the difference between Windows 2000 and Windows XP, or between Java 1.0 thru Java 6.
Sustaining innovations are gradual improvements and refinements - some improvement in performance, or some small new feature etc.
Disruptive innovation tends to appear “off the radar” of the established players. It is often inferior to the established product in many ways. It tends to start in some small niche where it happens to work better than the established technology. The majors ignore it, as it clearly can’t handle the “real” problems they are focused on. Over time, the new technology grows more competent and eats its way upward, consuming the previous market leader’s lunch.
We’ve seen this happen many times before. Remember Sun workstations? PCs came in and took over that market. Sun retreated to the server business, and PC’s chase it up towards the high end of that market, until there’s nowhere left to run.
In the programming language space, take Ruby as an example. Ruby is slow, until quite recently lacked IDE support etc. It’s easy for the Javanese to dismiss it. Over time, such technology can evolve to be much faster, have better tooling etc. And so it may grow into Java’s main market.
Don’t believe me? Java in 1995 was just a language for writing applets. It’s performance was poorer than Smalltalk’s, and nowhere near that of C++. There were no IDEs. Who's laughing now?
Javascript is an even clearer case in point. It was just a scripting language for web browsers. Incredibly slow implementations, restricted to interacting with the just the browser. No debuggers, no tools, no software engineering support to speak of.
Then V8 came along and showed people that it can be much faster. Lars’ has doubled performance since the release in September, and expects to double it again within a year, and again 2-3 years after that.
Javascript security and modularity are evolving as well. By the time that’s done, I suspect it will far outstrip clunky packaging mechanisms like OSGi. This doesn’t mean you have to write everything in Javascript - I don’t believe in a monolingual world.
Tangent: Yes, I did spend ten years at Sun. A lot of it was spent arguing that Java was not the final step in the evolution of programming languages. I guess I’m just not very persuasive.
The web browser itself is one of the most disruptive technologies in computing. It is poised to become the OS. Javascript performance is an essential ingredient but there are others. SVG provides a graphics model. HTML 5 brings with it persistent storage - a file system, as it were - only better. You may recall that Vista was supposed to have a database like file system. Vista didn’t deliver, but web browsers will.
This trend seems to make the traditional OS irrelevant. Who cares what’s managing the disk drive? It’s just some commoditized component, like a power supply. We can already see how netbooks are prospering. This isn’t good news for Windows.
Of course, you might ask why Microsoft would go along with this? Is IE’s Javascript so inadequate on purpose? Maybe it is part of the master plan? Well, I don’t believe in conspiracy theories.
It does look as if Microsoft is facing a lose-lose proposition. Making IE competitive supports the movement toward a web-browser-as-OS world. Conversely, if they let IE languish, as they have, they can expect its market share to continue to drop.
However, you cannot stop the trend toward the Web OS by making your tools inferior. You can only compete by innovating, not by standing still.
I wouldn’t count Redmond out just yet. They have huge assets - a terrific research lab, armies of smart people in product land as well, market dominance, and vast amounts of money. They also have huge liabilities of course - and those add up to innovator’s dilemma.
In any case, for language implementors, it’s clear that one needs to be able to compile to the internet platform and Javascript is its assembly language. Web programming will evolve into general purpose programming, using the persistent storage provided by the browser to function off line as well as online.
As many readers of this blog will recognize, this sets the stage for the brave new world of objects as software services. I hope to bring Newspeak to the point where it is an ideal language for this coming world. It’s a bit difficult with the very limited resources at our disposal, but we are making progress.
The entire progression from conventional computing to the world of the Web OS is taking longer than one expects, but is happening. The time will come when we hit the tipping point, and things will change very rapidly.
I also gave a talk, and there were many others, but that is not the point of this post. Rather, this post was prompted by one specific, and excellent, talk - Lars Bak’s presentation on V8. Lars clearly enjoyed his visit to the lion’s den; more importantly, perhaps Microsoft will finally shake off their apparent paralysis and produce a competitive Javascript engine.
That, in turn, will make it possible to distribute serious applications targeting the web browser, with the knowledge that all major browsers have performant Javascript engines that can carry the load.
It’s all part of the evolution of the web browser into an OS. This was what Netscape foresaw back in 1995. And it is a perfect example of innovator’s dilemma.
Innovator’s dilemma applies very directly to programming languages, and Todd Proebsting already discussed this back in 2002.
To recap, the basic idea is that established players in a market support sustaining innovation but not disruptive innovation. Examples of sustaining innovation would be the difference between Windows 2000 and Windows XP, or between Java 1.0 thru Java 6.
Sustaining innovations are gradual improvements and refinements - some improvement in performance, or some small new feature etc.
Disruptive innovation tends to appear “off the radar” of the established players. It is often inferior to the established product in many ways. It tends to start in some small niche where it happens to work better than the established technology. The majors ignore it, as it clearly can’t handle the “real” problems they are focused on. Over time, the new technology grows more competent and eats its way upward, consuming the previous market leader’s lunch.
We’ve seen this happen many times before. Remember Sun workstations? PCs came in and took over that market. Sun retreated to the server business, and PC’s chase it up towards the high end of that market, until there’s nowhere left to run.
In the programming language space, take Ruby as an example. Ruby is slow, until quite recently lacked IDE support etc. It’s easy for the Javanese to dismiss it. Over time, such technology can evolve to be much faster, have better tooling etc. And so it may grow into Java’s main market.
Don’t believe me? Java in 1995 was just a language for writing applets. It’s performance was poorer than Smalltalk’s, and nowhere near that of C++. There were no IDEs. Who's laughing now?
Javascript is an even clearer case in point. It was just a scripting language for web browsers. Incredibly slow implementations, restricted to interacting with the just the browser. No debuggers, no tools, no software engineering support to speak of.
Then V8 came along and showed people that it can be much faster. Lars’ has doubled performance since the release in September, and expects to double it again within a year, and again 2-3 years after that.
Javascript security and modularity are evolving as well. By the time that’s done, I suspect it will far outstrip clunky packaging mechanisms like OSGi. This doesn’t mean you have to write everything in Javascript - I don’t believe in a monolingual world.
Tangent: Yes, I did spend ten years at Sun. A lot of it was spent arguing that Java was not the final step in the evolution of programming languages. I guess I’m just not very persuasive.
The web browser itself is one of the most disruptive technologies in computing. It is poised to become the OS. Javascript performance is an essential ingredient but there are others. SVG provides a graphics model. HTML 5 brings with it persistent storage - a file system, as it were - only better. You may recall that Vista was supposed to have a database like file system. Vista didn’t deliver, but web browsers will.
This trend seems to make the traditional OS irrelevant. Who cares what’s managing the disk drive? It’s just some commoditized component, like a power supply. We can already see how netbooks are prospering. This isn’t good news for Windows.
Of course, you might ask why Microsoft would go along with this? Is IE’s Javascript so inadequate on purpose? Maybe it is part of the master plan? Well, I don’t believe in conspiracy theories.
It does look as if Microsoft is facing a lose-lose proposition. Making IE competitive supports the movement toward a web-browser-as-OS world. Conversely, if they let IE languish, as they have, they can expect its market share to continue to drop.
However, you cannot stop the trend toward the Web OS by making your tools inferior. You can only compete by innovating, not by standing still.
I wouldn’t count Redmond out just yet. They have huge assets - a terrific research lab, armies of smart people in product land as well, market dominance, and vast amounts of money. They also have huge liabilities of course - and those add up to innovator’s dilemma.
In any case, for language implementors, it’s clear that one needs to be able to compile to the internet platform and Javascript is its assembly language. Web programming will evolve into general purpose programming, using the persistent storage provided by the browser to function off line as well as online.
As many readers of this blog will recognize, this sets the stage for the brave new world of objects as software services. I hope to bring Newspeak to the point where it is an ideal language for this coming world. It’s a bit difficult with the very limited resources at our disposal, but we are making progress.
The entire progression from conventional computing to the world of the Web OS is taking longer than one expects, but is happening. The time will come when we hit the tipping point, and things will change very rapidly.
Sunday, March 29, 2009
Subsuming Packages and other Stories
Barbara Liskov recently won the Turing award. It’s great to see the importance of programming language work recognized once again. Even if you aren’t into the academic literature, you have probably heard her name, for example, when people mention the Liskov substitution principle (LSP). The LSP states that if a property P is provable for an object of type S, and T is a subtype of S, then P is provable for an object of type T.
This is behavioral subtyping - instances of subtypes are semantically substitutable where supertypes are expected. This is a very strong requirement, but not something a practical programming language can enforce.
Closely related (and enforceable) is the subsumption rule of type systems for OO languages. This rule states that if an expression e has type T, and T is a subtype of S, then e has type S as well. This post is about the importance of subsumption in programming language design.
The subsumption rule is a cornerstone of formal type systems for OO. More importantly, it captures a deep intuition that programmers have about subtyping. After all, what could be more natural: if e: T and T <: S, e: S. I hope readers agree with me that this property is intuitive, perhaps even so obvious that you wonder why anyone would bother belaboring the point. *** So it is most unfortunate that mainstream programming languages violate subsumption left and right. Below, we’ll look at some examples, and draw some conclusions about language design. Case 1: Hiding fields. In Java (and C++) you can define a field in a subclass with type X, even though the superclass has a field of the same name, with unrelated type Y. The subclass’ field hides the superclass field.
class S { final int f = 42;}
class T extends S { final String f = “!”;}
new T().f; // “!”
((S) new T()).f; // 42
In case it isn’t obvious, any object of type S has the property that it’s f field has type int. An instance of T should also have the same property, but doesn’t. To get at the field defined by S, we have to coerce the T value into an S.
Tangent: It would be much better to ensure that fields are always private, or better yet, avoid field references altogether, as in Self or Newspeak.
A typical programmer in the Java/C++ tradition may ask what’s the harm in the rules as they are.
The harm is precisely in having rules that contradict strong and valid intuition. The contradiction will bite you in the end, because you will end up thinking some program property holds based on your intuition, while the rules of the language contradict that intuition. Years of exposure to these misfeatures may have inured the mind to their toxicity, but they are toxic all the same.
However, there is much more harm, as we’ll see as this post progresses.
Case 2: Hiding static methods. The same sort of situation occurs with static methods. Static methods aren’t really methods at all, as no run time object is involved in determining the meaning of their invocations. No overriding can occur between static methods. The meaning of a static method is statically bound: static method is a misnomer for subroutine (as in FORTRAN subroutine circa 1955). And so Java allows static methods to hide each other much like fields. You can easily recreate the above example with static methods.
I’ve written elsewhere about the problems of all things static; add this to the list. Maybe you’re beginning to get the sense that violating subsumption correlates with iffy language constructs.
Case 3: Package privacy. This brings us back to Barbara Liskov. Her work on CLU in the mid 70s brought David Parnas’ notion of information hiding into programming languages, with support for abstract datatypes (ADTs). Packages are one of the mechanisms that Java uses to support ADTs. Packages have had all kinds of problems; lack of subsumption is at the root of many of them.
For example, recall that in Java, protected access implies package access. Package access is always relative the the current package. So when a protected member is inherited by a subclass in another package, it eliminates the access from the superclass package (while granting access to its own package):
package P1;
public class A {protected int foo(){return 91;}}
class C {static int i = (new P2.B()).foo();}
**
package P2;
public class B extends P1.A {}
This code is legal. If we then change B such that
public class B extends P1.A {protected int foo() {return 42;}}
The call in C is no longer allowed! We have not added or removed members in the subclass, or changed their accessibility (or have we? Did we have a choice?) and yet we’ve broken client code.
I don’t want to go into the details of an even more bizarre example (due, as far as I can recall, to David Chase) - but others have. Suffice to say that there was a huge bug tail around such examples; it took years to resolve them. Compiler engineers did not correctly diagnose the problems, because their intuition led them to try and preserve subsumption, to no avail. I have several patents to my name that deal with the mechanisms of implementing the desired behavior.
And why should you care? Because these issues led to subtle incompatibilities between compilers and VMs, across vendors and releases; these led to subtle program bugs, to jokes about write once debug everywhere etc.
Packages are one way in which Java supports ADTs. It turns out that you cannot have ADTs, inheritance and subsumption in one language. You must make a choice - and I argue that subsumption must always be preserved.
Case 4: Class-based Encapsulation.
Another mechanism that supports ADTs in all the mainstream OO languages is class based encapsulation. This the idea that privacy is per class type, not per object. This makes it possible to write code like
class C {
private int f(){return 42;}
public int foo(C c){return c.f();} // I can get at c’s f(), because we’re both C’s
}
It also makes it possible to violate subsumption:
public class A {
{
((A) new B()).f(); // 42
new B().f(); // “!”
}
private int f(){return 42;} // package private
}
public class B extends A {
public String f(){return “!”;}
}
An alternative to class-based encapsulation is object-based encapsulation. Privacy is per object. You can enforce it via context free syntax. For example, you can arrange that only method invocations on self (aka this) attempt to lookup private methods. You don’t need a typechecker (aka verifier) to prevent malicious parties accessing private methods or fields. Consequently, you don’t have to rely on a complex byte code verifier to prevent such access as one does in Java and .Net.
This fits perfectly with the object-capability security model. It is essentially what we do in Newspeak; but the ideas are universal.
Oh, and object-based encapsulation works with both subsumption and inheritance.
Case 5: Privileged access within “nests” of classes. Giving enclosing classes privileged access to their nested classes, and giving sibling classes privileged access to each other, is another violation of subsumption (but giving a nested class free access to its surrounding scope, including private elements of its enclosing class, is fine).
Another way to look at all this is that object-based encapsulation falls out naturally from the use of interfaces. Consider what would happen if the type C above was, by language definition, the public interface of C (as it is in Strongtalk, for example). Subsumption would fall out automatically.
Subsumption is a very good litmus test for language constructs; if a construct violates subsumption, it will cause grief. Ditch it.
Strict adherence to subsumption reduces cognitive dissonance for programmers, compiler writers, VM implementors and language designers. It helps avoid bugs in user code, compilers and VMs, enhancing reliability and portability. It makes it easier to build secure systems.
So if you design a type system for an OO language, preserve subsumption at all costs. Do not try and combine ADTs with inheritance (because then you lose subsumption).
Now if we can’t have ADTs and inheritance, we have to chose between them. I believe the benefits of inheritance far outweigh the costs. Those who feel differently are unlikely to do better than Luca Cardelli‘s sublimely beautiful Quest. On the other hand, if one choses inheritance, one should go with object-based encapsulation.
Put another way, if you take the adage program to an interface, not an implementation seriously, it will prevent a host of problems. This is what we are doing with Newspeak.
At a meta-level, the lesson is: pay attention to programming language theory; occasionally, you can learn from it.
Update: Yardena pointed out this discussion of Java packages. It goes through the problems in detail. I've added a link to it in the main post as well.
This is behavioral subtyping - instances of subtypes are semantically substitutable where supertypes are expected. This is a very strong requirement, but not something a practical programming language can enforce.
Closely related (and enforceable) is the subsumption rule of type systems for OO languages. This rule states that if an expression e has type T, and T is a subtype of S, then e has type S as well. This post is about the importance of subsumption in programming language design.
The subsumption rule is a cornerstone of formal type systems for OO. More importantly, it captures a deep intuition that programmers have about subtyping. After all, what could be more natural: if e: T and T <: S, e: S. I hope readers agree with me that this property is intuitive, perhaps even so obvious that you wonder why anyone would bother belaboring the point. *** So it is most unfortunate that mainstream programming languages violate subsumption left and right. Below, we’ll look at some examples, and draw some conclusions about language design. Case 1: Hiding fields. In Java (and C++) you can define a field in a subclass with type X, even though the superclass has a field of the same name, with unrelated type Y. The subclass’ field hides the superclass field.
class S { final int f = 42;}
class T extends S { final String f = “!”;}
new T().f; // “!”
((S) new T()).f; // 42
In case it isn’t obvious, any object of type S has the property that it’s f field has type int. An instance of T should also have the same property, but doesn’t. To get at the field defined by S, we have to coerce the T value into an S.
Tangent: It would be much better to ensure that fields are always private, or better yet, avoid field references altogether, as in Self or Newspeak.
A typical programmer in the Java/C++ tradition may ask what’s the harm in the rules as they are.
The harm is precisely in having rules that contradict strong and valid intuition. The contradiction will bite you in the end, because you will end up thinking some program property holds based on your intuition, while the rules of the language contradict that intuition. Years of exposure to these misfeatures may have inured the mind to their toxicity, but they are toxic all the same.
However, there is much more harm, as we’ll see as this post progresses.
Case 2: Hiding static methods. The same sort of situation occurs with static methods. Static methods aren’t really methods at all, as no run time object is involved in determining the meaning of their invocations. No overriding can occur between static methods. The meaning of a static method is statically bound: static method is a misnomer for subroutine (as in FORTRAN subroutine circa 1955). And so Java allows static methods to hide each other much like fields. You can easily recreate the above example with static methods.
I’ve written elsewhere about the problems of all things static; add this to the list. Maybe you’re beginning to get the sense that violating subsumption correlates with iffy language constructs.
Case 3: Package privacy. This brings us back to Barbara Liskov. Her work on CLU in the mid 70s brought David Parnas’ notion of information hiding into programming languages, with support for abstract datatypes (ADTs). Packages are one of the mechanisms that Java uses to support ADTs. Packages have had all kinds of problems; lack of subsumption is at the root of many of them.
For example, recall that in Java, protected access implies package access. Package access is always relative the the current package. So when a protected member is inherited by a subclass in another package, it eliminates the access from the superclass package (while granting access to its own package):
package P1;
public class A {protected int foo(){return 91;}}
class C {static int i = (new P2.B()).foo();}
**
package P2;
public class B extends P1.A {}
This code is legal. If we then change B such that
public class B extends P1.A {protected int foo() {return 42;}}
The call in C is no longer allowed! We have not added or removed members in the subclass, or changed their accessibility (or have we? Did we have a choice?) and yet we’ve broken client code.
I don’t want to go into the details of an even more bizarre example (due, as far as I can recall, to David Chase) - but others have. Suffice to say that there was a huge bug tail around such examples; it took years to resolve them. Compiler engineers did not correctly diagnose the problems, because their intuition led them to try and preserve subsumption, to no avail. I have several patents to my name that deal with the mechanisms of implementing the desired behavior.
And why should you care? Because these issues led to subtle incompatibilities between compilers and VMs, across vendors and releases; these led to subtle program bugs, to jokes about write once debug everywhere etc.
Packages are one way in which Java supports ADTs. It turns out that you cannot have ADTs, inheritance and subsumption in one language. You must make a choice - and I argue that subsumption must always be preserved.
Case 4: Class-based Encapsulation.
Another mechanism that supports ADTs in all the mainstream OO languages is class based encapsulation. This the idea that privacy is per class type, not per object. This makes it possible to write code like
class C {
private int f(){return 42;}
public int foo(C c){return c.f();} // I can get at c’s f(), because we’re both C’s
}
It also makes it possible to violate subsumption:
public class A {
{
((A) new B()).f(); // 42
new B().f(); // “!”
}
private int f(){return 42;} // package private
}
public class B extends A {
public String f(){return “!”;}
}
An alternative to class-based encapsulation is object-based encapsulation. Privacy is per object. You can enforce it via context free syntax. For example, you can arrange that only method invocations on self (aka this) attempt to lookup private methods. You don’t need a typechecker (aka verifier) to prevent malicious parties accessing private methods or fields. Consequently, you don’t have to rely on a complex byte code verifier to prevent such access as one does in Java and .Net.
This fits perfectly with the object-capability security model. It is essentially what we do in Newspeak; but the ideas are universal.
Oh, and object-based encapsulation works with both subsumption and inheritance.
Case 5: Privileged access within “nests” of classes. Giving enclosing classes privileged access to their nested classes, and giving sibling classes privileged access to each other, is another violation of subsumption (but giving a nested class free access to its surrounding scope, including private elements of its enclosing class, is fine).
Another way to look at all this is that object-based encapsulation falls out naturally from the use of interfaces. Consider what would happen if the type C above was, by language definition, the public interface of C (as it is in Strongtalk, for example). Subsumption would fall out automatically.
Subsumption is a very good litmus test for language constructs; if a construct violates subsumption, it will cause grief. Ditch it.
Strict adherence to subsumption reduces cognitive dissonance for programmers, compiler writers, VM implementors and language designers. It helps avoid bugs in user code, compilers and VMs, enhancing reliability and portability. It makes it easier to build secure systems.
So if you design a type system for an OO language, preserve subsumption at all costs. Do not try and combine ADTs with inheritance (because then you lose subsumption).
Now if we can’t have ADTs and inheritance, we have to chose between them. I believe the benefits of inheritance far outweigh the costs. Those who feel differently are unlikely to do better than Luca Cardelli‘s sublimely beautiful Quest. On the other hand, if one choses inheritance, one should go with object-based encapsulation.
Put another way, if you take the adage program to an interface, not an implementation seriously, it will prevent a host of problems. This is what we are doing with Newspeak.
At a meta-level, the lesson is: pay attention to programming language theory; occasionally, you can learn from it.
Update: Yardena pointed out this discussion of Java packages. It goes through the problems in detail. I've added a link to it in the main post as well.
Friday, February 27, 2009
Newspeak Prototype Escapes into the Wild
The Newspeak prototype is now available at http://newspeaklanguage.org/downloads/ . We had planned to release it in early January, but decided to complete a few more things - like better source control support, a fully functional Hopscotch based debugger and a new GUI for unit testing (shown below).

Newspeak is a long way from being finished - we are less than half way to realizing the vision of a robust, high performance, secure, network aware, high productivity platform. We can only get there if people care enough to use it and contribute to it. I hope you'll check it out.

Newspeak is a long way from being finished - we are less than half way to realizing the vision of a robust, high performance, secure, network aware, high productivity platform. We can only get there if people care enough to use it and contribute to it. I hope you'll check it out.
Tuesday, January 13, 2009
Apologia
Here’s a quick update on the status of the Newspeak prototype release. If anyone is keeping score - I had said we’d put something out in the first week of January 2009. So we’re behind schedule. I apologize, but what are you going to do - cut my funding?
Nevertheless, we will be putting a prototype out soon. I’m holding off so that a few small things can be added.
In the meantime, I wanted to take the opportunity to say a bit about the prototype. Obviously, due to our reduced circumstances, it is a lot less ambitious than I had once hoped.
We do have a native GUI binding on Windows; I had hoped to have native GUI bindings for Mac and Linux, but that will depend on future open source contributions. You can of course run the system on those platforms, but you’ll have to live with the Morphic binding for now.
The language itself remains incomplete as well - but the key features are there. I expect to rev the syntax and add the missing features (like object literals) over time. One of the reasons these features have been delayed is that I want to add them in the context of a completely new compiler. Such a compiler will also make it easier to do ports - e.g., Newspeak on V8.
One of the biggest deficits is the lack of libraries written in Newspeak. We continue to rely heavily on the existing Squeak libraries. We have made some progress on this front, largely through the efforts of some volunteers who got early access to the system. I want to take this opportunity to thank Yardena Meymann, David Pennell and Stephen Pair for their work on the library port. They are all busy professionals who were willing to take the time to do something concrete to help move Newspeak forward. I hope that we’ll see a lot more of that after the release!
What we’ve been doing is porting some of the core libraries from Strongtalk to Newspeak. These libraries have several important advantages: they are small, yet complete enough to run a real system; they are blue book compatible (give or take), so we have a good chance of replacing our uses of Squeak code with them, without excessive disruption; they are quite cleanly written; they have type annotations; and, it so happens, they have a liberal license.
On the other hand, these libraries were not purpose built for Newspeak; no thought has been given to security, the designs do not leverage features like mixins as much as one might etc.. Still, they provide us the with the most realistic path to getting a small stand alone system in the foreseeable future.
This library code will not be anywhere near complete and integrated when we put out the prototype. But it whatever we have will be available, and we’ll move on from there.
There are any number of other things lacking, and no doubt many will be eager to point them out. However, that scarcely matters. What matters is the potential this system has. It’s being put out, so those with the imagination, sophistication and, above all, taste to appreciate it can start using it and contributing to it - eventually realizing that potential.
So for that small elite that has shown interest and appreciation for Newspeak so far - thanks, and hang in there. It will be available, Real Soon Now :-)
Nevertheless, we will be putting a prototype out soon. I’m holding off so that a few small things can be added.
In the meantime, I wanted to take the opportunity to say a bit about the prototype. Obviously, due to our reduced circumstances, it is a lot less ambitious than I had once hoped.
We do have a native GUI binding on Windows; I had hoped to have native GUI bindings for Mac and Linux, but that will depend on future open source contributions. You can of course run the system on those platforms, but you’ll have to live with the Morphic binding for now.
The language itself remains incomplete as well - but the key features are there. I expect to rev the syntax and add the missing features (like object literals) over time. One of the reasons these features have been delayed is that I want to add them in the context of a completely new compiler. Such a compiler will also make it easier to do ports - e.g., Newspeak on V8.
One of the biggest deficits is the lack of libraries written in Newspeak. We continue to rely heavily on the existing Squeak libraries. We have made some progress on this front, largely through the efforts of some volunteers who got early access to the system. I want to take this opportunity to thank Yardena Meymann, David Pennell and Stephen Pair for their work on the library port. They are all busy professionals who were willing to take the time to do something concrete to help move Newspeak forward. I hope that we’ll see a lot more of that after the release!
What we’ve been doing is porting some of the core libraries from Strongtalk to Newspeak. These libraries have several important advantages: they are small, yet complete enough to run a real system; they are blue book compatible (give or take), so we have a good chance of replacing our uses of Squeak code with them, without excessive disruption; they are quite cleanly written; they have type annotations; and, it so happens, they have a liberal license.
On the other hand, these libraries were not purpose built for Newspeak; no thought has been given to security, the designs do not leverage features like mixins as much as one might etc.. Still, they provide us the with the most realistic path to getting a small stand alone system in the foreseeable future.
This library code will not be anywhere near complete and integrated when we put out the prototype. But it whatever we have will be available, and we’ll move on from there.
There are any number of other things lacking, and no doubt many will be eager to point them out. However, that scarcely matters. What matters is the potential this system has. It’s being put out, so those with the imagination, sophistication and, above all, taste to appreciate it can start using it and contributing to it - eventually realizing that potential.
So for that small elite that has shown interest and appreciation for Newspeak so far - thanks, and hang in there. It will be available, Real Soon Now :-)
Subscribe to:
Posts (Atom)