Saturday, September 20, 2008

Skinning Newspeak?

Whenever it comes to discussing language syntax, Parkinson’s law of triviality comes to mind.

Incidentally, the book is back in print! If you haven’t read it, check out this priceless classic.

Newspeak’s syntax currently resembles Smalltalk and Self. For the most part, this is a fine thing, but I recognize that it can be a barrier to adoption. So, as we get closer to releasing Newspeak we may have to consider, with heavy heart, changes that could ease the learning curve.

One approach is the idea of syntactic skins. You keep the same abstract language underneath, but adjust its concrete syntax. In theory, you can have a skin that looks Smalltalkish, and one that looks Javanese, and another sort of like Visual Basic etc.
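To make the idea concrete, here is a minimal sketch in Python (not Newspeak) of one abstract program rendered through two skins. The node classes `Name` and `UnarySend` and the two `render_*` functions are hypothetical names invented for illustration, and the sketch covers only messages without arguments.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Name:
    """A reference to a variable, e.g. aList."""
    text: str


@dataclass(frozen=True)
class UnarySend:
    """A message with no arguments sent to a receiver, e.g. aList size."""
    receiver: "Node"
    selector: str


Node = Union[Name, UnarySend]


def render_smalltalkish(node: Node) -> str:
    # Receiver and selector are separated by a space.
    if isinstance(node, Name):
        return node.text
    return f"{render_smalltalkish(node.receiver)} {node.selector}"


def render_javanese(node: Node) -> str:
    # A dot precedes the selector and empty parentheses follow it.
    if isinstance(node, Name):
        return node.text
    return f"{render_javanese(node.receiver)}.{node.selector}()"


# The same abstract program, shown in two concrete syntaxes.
program = UnarySend(UnarySend(Name("aList"), "reversed"), "first")
print(render_smalltalkish(program))  # aList reversed first
print(render_javanese(program))      # aList.reversed().first()
```

Because both renderers read the same tree, switching skins never changes what the program means, only how it is displayed.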

The whole idea of skins is closely related to Charles Simonyi’s notion of intentional programming. Cutting through the vapor, one of the few concrete things one can extract is the idea (not new or original with Simonyi) of an IDE that can present programs in a rich variety of skins, some of which are graphical. That, and support for defining DSLs for the domain experts to program in. This is all a fine thing, as long as you understand that’s all it is. This is still a pretty tall order.
In any case, Magnus Christerson is doing a superb job of making that vision a reality.

It is of course crucial that any program can be automatically displayed in any skin in the IDE. And designing skins requires thought, and is prone to abuse, which makes me hesitate.

Naming conventions that may make sense in one syntax may not really work in another, for example. Take maps (dictionaries in Smalltalk). In Smalltalk, the number of arguments to a method is encoded in the method name. So class Dictionary has a method called at:put:

aMap at: 3 put: 9.

In a javanese language, you’d tend to use a different name, say, put

aMap.put(3, 9);

However, with skins, you need to either use the very same name

aMap.at:put:(3, 9);

which looks weird and may even conflict with other parts of the syntax, or have some automated transformation

aMap.atPut(3, 9);

All of which looks odd and may have issues (after all at:Put: would be a distinct, legal Newspeak method name which would also map to atPut). And what happens if you start writing your code in the javanese syntax? How do I map put into a 2 argument Newspeak method name? p:ut:? pu:t:? Maybe in this case, it takes a single tuple as its argument:

aMap put: {3. 9}.
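Both directions of the naming problem can be sketched in a few lines of Python. The functions `mangle` and `candidate_selectors` are hypothetical helpers written for this illustration; the camel-case scheme is just one possible automated transformation, not one Newspeak prescribes.

```python
from itertools import combinations


def mangle(selector: str) -> str:
    """Turn a keyword selector such as 'at:put:' into a camel-case name."""
    parts = [part for part in selector.split(":") if part]
    first, rest = parts[0], parts[1:]
    return first + "".join(part[0].upper() + part[1:] for part in rest)


def candidate_selectors(name: str, arity: int) -> list:
    """List every way to cut a plain name into `arity` keyword parts."""
    result = []
    for cuts in combinations(range(1, len(name)), arity - 1):
        bounds = (0, *cuts, len(name))
        pieces = [name[a:b] for a, b in zip(bounds, bounds[1:])]
        result.append("".join(piece + ":" for piece in pieces))
    return result


# Two distinct selectors collapse onto one mangled name.
print(mangle("at:put:"))  # atPut
print(mangle("at:Put:"))  # atPut

# Going the other way, nothing singles out one split of 'put'.
print(candidate_selectors("put", 2))  # ['p:ut:', 'pu:t:']
```

The forward mapping loses information and the reverse mapping has no canonical answer, which is why neither direction can be fully automatic.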

There may be a creative way out; Mads Torgersen once suggested a syntax like

aMap.at(3) put(9)

Or maybe we map all names into ratHole.

The standard procedure call syntax also has more substantial difficulties. Without a typechecker, it’s hard to get the number of arguments right. The Smalltalk keyword syntax, while unfamiliar to many, has a huge advantage in a dynamically typed setting - no arity errors (if you’ve written any JavaScript, you probably know what I mean).
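A rough way to see why keyword syntax rules out arity errors: the selector is assembled from the keywords themselves, so the number of arguments always matches it by construction. The sketch below uses Python, and `parse_keyword_message` is a hypothetical toy parser that assumes whitespace-separated tokens and no nested expressions. It is contrasted with an ordinary positional call, where a miscount surfaces only when the call runs.

```python
def parse_keyword_message(text: str):
    """Split 'at: 3 put: 9' into the selector 'at:put:' and its arguments."""
    tokens = text.split()
    if len(tokens) % 2 != 0:
        raise SyntaxError("every keyword must be followed by exactly one argument")
    selector = ""
    args = []
    for keyword, arg in zip(tokens[0::2], tokens[1::2]):
        if not keyword.endswith(":"):
            raise SyntaxError(f"expected a keyword ending in ':', got {keyword!r}")
        selector += keyword
        args.append(arg)
    return selector, args


selector, args = parse_keyword_message("at: 3 put: 9")
print(selector, args)  # at:put: ['3', '9']


# Positional style: the name says nothing about how many arguments to pass.
def put(key, value):
    return (key, value)


try:
    put(3)  # one argument short; detected only at run time
except TypeError as error:
    print("run-time arity error:", error)
```

Dropping `put: 9` from the keyword form does not produce a miscounted call to `at:put:`. It names a different message, `at:`, with exactly one argument.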

In addition, the Smalltalk keyword notation is really nice for defining embedded DSLs, as Vassili notes in a recent post. This is a point that I want to expand upon at some future time.

So I’m pretty sure that regardless of whether we use skins or not, we’ll retain the keyword message send syntax across all skins. It’s just a good idea for this sort of language.

There are syntactic elements that are easy to tweak so that they are more familiar to a wider audience. In some cases, there are standard syntactic conventions that work well, and we can just adopt them. For example, using curly braces as delimiters for class and method bodies (and also closures), or using return: instead of ^. If these were the only issues, one might not really consider skins, since the differences are minor. The current draft spec mentions some of these.

Skins may be most valuable for issues in between the two extremes cited above. One of the most obvious is operator precedence. People have been taught a set of precedences in elementary school, and most have never recovered. Programmers who have also learned C or something similar have even more expectations in this regard.

Newspeak, like Smalltalk, gives all binary operators equal precedence, evaluating them in order from left to right.

5 + 4 * 3 evaluates to 27 in Smalltalk, not to 17.
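The arithmetic is easy to check with a small Python sketch. `eval_left_to_right` is a hypothetical helper that assumes space-separated integer operands and only the three operators listed; Python's own expression evaluation stands in for the conventional precedence rules.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def eval_left_to_right(expression: str) -> int:
    """Apply each binary operator in the order written, with no precedence."""
    tokens = expression.split()
    result = int(tokens[0])
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        result = OPS[op](result, int(operand))
    return result


print(eval_left_to_right("5 + 4 * 3"))  # 27, i.e. (5 + 4) * 3
print(5 + 4 * 3)                        # 17, i.e. 5 + (4 * 3)
```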

Now I have never, ever had a bug due to this, but many people get all worked up over this issue. Why not just give in, and follow standard precedence rules? Well, there is the question of whose rules - C, Java, Perl? What about operators those languages don’t have (ok, so Perl probably has all operators in the known universe and some to spare)?

Another issue is that Newspeak is designed to let you embed domain specific languages as libraries. Then the standard choices don’t always make sense. Allow people to set precedence explicitly you say? This is problematic. Newspeak aims to stay simple. This is a matter of taste and judgement. If you like an endless supply of bells and whistles, look elsewhere.

Skins might give us an out. Some skins would dictate the precedence of popular operators (leaving the rest left-to-right, as in Scala for example). This means your DSL may look odd in another skin, but maybe that’s tolerable.
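One way to picture this is as a printer that takes the abstract expression tree plus a per-skin precedence table and inserts only the parentheses that skin needs. This is a sketch in Python under stated assumptions: the tables `FLAT` and `C_LIKE`, the function `show`, and the choice to give unlisted operators the lowest level are all invented for illustration.

```python
FLAT = {}                                    # Smalltalk-like: every operator at one level
C_LIKE = {"*": 2, "/": 2, "+": 1, "-": 1}    # assumed levels; unlisted operators get 0


def show(tree, precedence) -> str:
    """Render a tree of (op, left, right) tuples and integer leaves."""
    if not isinstance(tree, tuple):
        return str(tree)
    op, left, right = tree
    level = precedence.get(op, 0)

    def operand(child, is_right: bool) -> str:
        text = show(child, precedence)
        if isinstance(child, tuple):
            child_level = precedence.get(child[0], 0)
            # Parenthesize a child that binds more loosely than its parent,
            # or equally tightly on the right (operators associate left).
            if child_level < level or (is_right and child_level <= level):
                return f"({text})"
        return text

    return f"{operand(left, False)} {op} {operand(right, True)}"


sum_then_product = ("*", ("+", 5, 4), 3)     # evaluates to 27
product_then_sum = ("+", 5, ("*", 4, 3))     # evaluates to 17

print(show(sum_then_product, FLAT))    # 5 + 4 * 3
print(show(sum_then_product, C_LIKE))  # (5 + 4) * 3
print(show(product_then_sum, FLAT))    # 5 + (4 * 3)
print(show(product_then_sum, C_LIKE))  # 5 + 4 * 3
```

Note that the text `5 + 4 * 3` appears under both skins but stands for different trees, which is exactly the kind of confusion a reader moving between skins would have to watch for.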

Once you have skins, you can also address issues that otherwise aren’t worth dealing with - like dots. If you really feel the need to write aList.get instead of aList get, a suitable skin could be made available.

It looks like language skins can be used to bridge over minor syntactic differences, but not much more. On the other hand, if you don’t have skins, you have a better chance of establishing a shared lingua franca amongst a programming community.

Overall, my sense is that such skins are more trouble than they’re worth.

19 comments:

  1. I've been thinking a bit about mapping Smalltalk-ish message calls to C-ish ones.

    My best idea so far is a bit of an odd one, but I do think it has its charms.

    For at:put:, define that as

    put(at, value)

    Then allow the names of the arguments also implicitly double as the names for named parameters.

    You are then allowed to use it either with or without named parameters.

    so aMap.put(1, 3) is ok, and so is aMap.put(at: 1, value: 3)

    Obviously any overriding definition would be required to use the same name for the parameters though.

  2. I'm a bit undecided about that skinning idea. Back in the nineties, I once wrote a macro collection to make C code look like Modula-2. It was great to work with, well, for me at least. So making the skin a user (i.e. developer) option sounds mandatory to me.
    Some time ago, I blogged about multipiece method names for Java, which would be a great addition (being a Smalltalk fetishist myself ;)). But it also shows a major Java pitfall: VarArgs. Does Newspeak have such a thing at all? If not, would it make sense to introduce them? If so, don't do it Java-ish ;)
    Another point to consider is developers talking to each other. This may be much more difficult, if the common language being used has a different style wrt. each developer's preferences. So it may be nice to have it skinnable (and certainly something unique in computer languages), but may not be practical in use.

  3. I'm with Christoffer on this one. I'm from a pure C/Javanese background, and admit that the sight of method calls in Smalltalk or Objective-C scares me a bit. As do the alternatives you suggest.

    Only something like aMap.put(at: 1, value: 3) or aMap.put(at=1, value=3) feels natural to my somewhat blinkered intuition. And only then if I can drop them when there's no ambiguity.

  4. I don't think one can come up with a name mangling scheme that will allow an isomorphic transformation between e.g. Smalltalkish and Javanese, that also doesn't look totally weird and/or contain name clashes. The syntaxes are too far apart...

    Maybe you can use characters disallowed in Smalltalk names but present in Java (maybe '$' for example?) and the other way around. But that will probably look horrible in both languages.

    Maybe it's ok to settle on a partial mapping, with some cruft and crudity? Then you could allow users to write parts of the system in another language, like Java, which is then called to. This is of course a lot less nice, but it probably works. When users want to alter or understand the core code, they will need to learn the other language.

    I think the underlying runtime concepts of Newspeak (no global state etc.) are so attractive that this could be good enough.

  5. Just don't do it. Have more spine.

    Newspeak has a lot of good new ideas, and I think that comes across well. This is a good chance to make people reconsider their prejudices about syntax.

    (I would prefer Lisp syntax, by the way. Who would have thought? ;)

  6. My reaction is that syntax skins as a bridge to learning the true syntax of Newspeak is a good thing, while a skin as the final development language is not. To elaborate... my team has been learning more and more Groovy over the last year as a replacement for Java. As newcomers are introduced to the language their code looks exactly like Java, but as they mature it starts to look more and more different from Java. And eventually (months?) their mind converts from trying to map Java into Groovy and they start complaining about why Java can't be more Groovy.

    The .NET space perhaps offers a good counter example of not migrating a syntax skin into the base, best syntax. VB, C#, and F# are all arguably different skins over the same VM concepts and capabilities. But migrating from one to the other is a conscious decision that must happen dramatically. You don't decide to make your VB code more C#ish (to the best of my knowledge).

    F# offers an interesting capability. By default, the syntax is very close to OCaml. But there is a source code flag called #light (hash-light) that puts your code into 'clean mode' that eliminates many of the OCaml quirks (it makes whitespace matter and eliminates all those 'in' keywords, for one). I assume that the core is the OCaml syntax and #light is a layer on top of it but I really have no idea. The point is that I don't see anyone migrating from #light syntax to OCaml syntax. In fact, many F# examples I see that use #light look much more like C# than OCaml. Use of mutable data and CLR datatypes are prime offenders. But I don't see a migration path from #light to OCaml core that I can tell.

    Anyway, those are a few examples to consider. Good luck!

  7. I agree with Pascal,

    Just don't do it!

    Newspeak is a language that challenges in many ways besides syntax. If people aren't willing to look past trivial syntax differences, then what hope is there of them coming to grips with many of the advanced features in Newspeak? I thought that the whole idea was to challenge and promote something different, not to allow people to carry on creating more of the same undisturbed. This is the mistake Java made IMO, and why OO is still not fully grasped by the majority of Java programmers even today.

    How about building the language you want? Being opinionated is all the rage right now, just talk to the Ruby crowd :)

    If people are willing to look seriously at Erlang then Newspeak with its Smalltalk roots has little to worry about.

    BTW. Being mega-popular is overrated. Just talk to the Java guys who are watching their jobs go to Mumbai. All Newspeak needs to do is be viable.

    I say stick to your guns, you'll attract the right crowd that way! As for the rest, who needs them? Besides, they are happy where they are now :)

    Paul.

  8. I'm with you. Simple skins seem more trouble than they're worth.

    Same with operator precedence. Just stonewall it.

    As to keywords instead of symbols, avoid the temptation to provide alternatives with the same meaning. Pick the best choice and move on.

  9. hi,

    I've been following Newspeak for a few months now. It has changed my outlook on programming languages. Thanks for that.

    Syntax is not the place where you want to offer options to programmers.

    The more syntax options you provide, the more programmers will have to spend time deciding on completely irrelevant issues ("Are curly braces on a new line?" "Spaces after opening braces or not?" etc) Then people start making "style checkers". IDEs offer "style options". A huge waste of time.

    I like the idea of F#'s light option, or Haskell, where whitespace and indentation matter. Take it to the extreme: decide on as much of syntax as possible. Every character matters. If code doesn't match the expected syntax, let the compiler reject it. I'm talking about cold, hard errors here, not warnings. You've put a space too many between arguments? Sorry, can't be parsed. You've indented with 3 spaces instead of 4? Sorry, can't parse that.

    There's a usability credo: offering options to users is cowardly design. Newspeak has guts. Keep it that way.

    Kurt

  10. Syntax skins can also let us use another natural language (instead of English) in names of identifiers, methods, etc, in our programs, via a simple lookup table.

  11. Echoing what others have said: Don't do it. It would further divide an already small community. I cannot imagine anyone who would not be able to intellectually make the switch from Java's syntax to Smalltalk's. A comparable switch did not stop people from adopting Ruby. Some of Smalltalk's *concepts* I found a lot harder to swallow, e.g. a lack of modules.

    If something has to give, opt for a single Python-like or Dylan-like syntax.

  12. PyObjC changes the objective-c message name at:put: to at_put_, so a.at_put_(1, 2). It's a little awkward at first, but it's an established convention you might want to look into.

  13. What's wrong with the (Curryfied)
    aMap.at(3).put(9) ?
    It is a bit like Mads' proposal, but parses just fine in the Java world, and is kinda readable, sort of ...

  14. Oscar,

    I'm not clear how this works. Who does the currying? And how do I distinguish

    aMap at:3 put: 9

    from

    (aMap at:3) put: 9

    both of which seem to translate into the same Javanese text:

    aMap.at(3).put(9);

    in your proposal (most likely I've misunderstood it).

    Please feel free to send me a note off line if you really care to discuss this.

  15. Gavin gets it! The big opportunity is not Javanese, it is Japanese (or Hindi, or whatever non-English language works for you).

    I didn't get it until the other day listening to Robert Lefkowitz's brilliant PyCon 2007 keynote on "the importance of programming literacy".

    We have to get over this, "you can have any language you like, as long as it is English" thing. It really does make curly braces and no keywords seem insignificant.

    Pete F

  16. Gilad,

    You mentioned that syntax/program skinning within an IDE is not new. Any references? I would be interested in reading about it.

    - Geoff

  17. Geoff,

    Here's a reference from at least 12 years ago:

    http://www.dreamsongs.com/Files/PatternsOfSoftware.pdf#page=45

    You have to read between the lines a little, but it's right there in the paragraph starting "The first problem goes away.."

  18. I would like
    aMap at(3)put(10)
    note no space here between at() and put().

    If we'd write it in Chinese it looks like:
    序列 於(3)置(10)

    It is natural, since it feels like a fill-in-the-blank exercise.

    aMap at(3)put(Map new(10)withAll(5))

    某序列 於(3)置(序列 新(10)皆(5))

    For a better feel, I'd prefer:
    aMap at<2>put<10>.

    What they have in common is that they all close themselves, which is better than the open at:put: form.

  19. Thanks for the useful feedback!
    Of course, this still may strike programmers as unfamiliar. For the time being, it's on the back burner.
