Room 101

A place to be (re)educated in Newspeak

Monday, January 06, 2020

The Build is Always Broken

Programmers are always talking about broken builds: "The build is broken", "I broke the build" etc. However, the real problem is that the very concept of the build is broken. The idea that every time an application is modified, it needs to be reconstructed from scratch is fundamentally flawed.

A practical problem with the concept is that it induces a very long and painful feedback loop during development. Some systems address this by being wicked fast. They argue that even if they compile the world when you make a change, it's not an issue since it'll be done before you know it. One problem is that some systems get so large that this doesn't work anymore. A deeper problem is that even if you rebuild instantly, you find that you need to restart your application on every change, and somehow get it back to the stage where you were when you found a problem and decided to make a change.

In other words, builds are antithetical to live programming. The feedback loop will always be too long. Fundamentally, one does not recreate the universe every time one changes something. You don't tear down and reconstruct a skyscraper every time you need to replace a light bulb. A build, no matter how optimized, will never give us true liveness.

It follows that tools like make and its ilk can never provide a solution. Besides, these tools have a host of other problems. For example, they force us to replicate dependency information that is already embedded in our code: things like imports, includes, uses or extern declarations give us the same file/module level information that we manually enter into build tools. This replication is tedious and error prone. It is also too coarse grained, done at the granularity of files. A compiler can manage these dependencies more precisely, tracking what functions are used where, for example.

Caveat: Some tools, like GN, can be fed dependency files created by cooperating compilers. That is still too coarse grained, though.
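To make the point about replicated dependency information concrete, here is a minimal Python sketch that recovers module-level dependencies directly from source text using the standard ast module. The function name module_dependencies and the sample source are illustrative assumptions, not part of any build tool.

```python
import ast

def module_dependencies(source):
    """Return the set of module names that `source` imports."""
    deps = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b" records "a.b"
            deps.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # "from c import d" records "c"; bare relative imports are skipped
            deps.add(node.module)
    return deps

# Hypothetical source file, used only for illustration.
example = """
import os
from collections import OrderedDict

def f():
    import json
    return json.dumps(OrderedDict())
"""

print(sorted(module_dependencies(example)))  # ['collections', 'json', 'os']
```

Everything a build file would restate by hand is already present in the code. The sketch still works at module granularity; tracking which functions are used where would need a deeper analysis.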
In addition, the languages these tools provide have poor abstraction mechanisms (compare make to your favorite programming language) and tooling support (what kind of debugger does your build tool provide?).

The traditional response to the ills of make is to introduce additional layers of tooling of a similar nature, like CMake. Enough!

A better response is to produce a better DSL for builds. Internal DSLs, based on a real programming language, are one way to improve matters. Examples are rake and scons, which use Ruby and Python respectively. These tools make defining builds easier - but they are still defining builds, which is the root problem I am concerned with here.

So, if we aren't going to use traditional build systems to manage our dependencies, what are we to do? We can start by realizing that many of our dependencies are not fundamental; things like executables, shared libraries, object files and binaries of whatever kind. The only thing one really needs to "build" is source code. After all, when you use an interpreter, you can create only the source you need to get started, and then incrementally edit/grow the source. Using interpreters allows us to avoid the problems of building binary artifacts.

The cost is performance. Compilation is an optimization, albeit an important, often essential, one. Compilation relies on a more global analysis than an interpreter, and on pre-computing the conclusions so we need not repeat work during execution. In a sense, the compiler is memoizing some of the work of the interpreter. This is literally the case for many dynamic JITs, but is fundamentally true for static compilation as well - you just memoize in advance. Seen in this light, builds are a form of staged execution, and the binary artifacts that we are constantly building are just caches.

One can address the performance difficulties of interpreters by mixing interpretation with compilation. Many systems with JIT compilers do exactly that. One advantage is that we don't have to wait for the optimization before starting our application. Another is that we can make changes, and have them take effect immediately by reverting to interpretation, while re-optimizing. Of course, not all JITs do that; but it has been done for decades, in, e.g., Smalltalk VMs.
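The view of compiled artifacts as caches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Python's built-in compile and exec stand in for a real compiler and runtime, and the names run and _code_cache are hypothetical.

```python
import hashlib

_code_cache = {}  # source digest -> compiled code object (the "binary artifact")

def run(source, env):
    """Execute `source` in `env`, compiling only on a cache miss.

    Returns True when a previously compiled code object was reused."""
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    hit = key in _code_cache
    if not hit:
        _code_cache[key] = compile(source, "<live>", "exec")
    exec(_code_cache[key], env)
    return hit

env = {}  # the running program's state, which survives every edit
run("def area(w, h):\n    return w * h\n", env)      # first run: compiles
run("def area(w, h):\n    return w * h\n", env)      # unchanged: cache reused
run("def area(w, h):\n    return w * h / 2\n", env)  # edited: only this is recompiled
```

The cache is purely an optimization: deleting it changes how long things take, not what the program does, and the environment is never torn down when the source changes.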
One of the many beauties of working in Smalltalk is that you rarely confront the ugliness of builds. And yet, even assuming you have an engine with a JIT that incrementally (re)optimizes code as it evolves, you may still be confronted with barriers to live development, barriers that seem to require a build.

Types. What if your code is inconsistent, say, due to type errors? Again, there is no need for a build step to detect this. Incremental typecheckers should catch these problems the moment inconsistent code is saved. Of course, incremental typecheckers have traditionally been very rare; it is not a coincidence that live systems have historically been developed using dynamically typed languages. Nevertheless, there is no fundamental reason why statically typed languages cannot support incremental development. The techniques go back at least as far as Cecil; see this paper on Scala.js for an excellent discussion of incremental compilation in a statically typed language.

Tests. A lot of times, the build process incorporates tests, and the broken build is due to a logical error in the application detected by the tests. However, tests are not themselves part of the build, and need not rely on one - the build is just one way to obtain an updated application. In a live system, the updated application immediately reflects the source code. In such an environment, tests can be run on each update, but the developer need not wait for them.

Resources. An application may incorporate resources of various kinds - media, documentation, data of varying kinds (source files or binaries, or tables or machine learning models etc.). Some of these resources may require computation of their own (say, producing PDF or HTML from documentation sources like TeX or markdown), adding stages that are seldom live or incremental. Even if the resources are ready to consume, we can induce problems through gratuitous reliance on file system structure.
The resources are typically represented as files. The deployed structure may differ from the source repository. Editing components in the source repo won't change them in the built structure.

It isn't easy to correct these problems, and software engineers usually don't even try. Instead, they lean on the build process more and more.

It doesn't have to be that way. We can treat the resources as cached objects and generate them on demand. When we deploy the application, we ensure that all the resources are precomputed and cached at locations that are fixed relative to the application - and these should be the same relative locations where the application will place them during development in case of a cache miss. The software should always be able to tell where it was installed, and therefore where cached resources stored at application-relative locations can be found.

The line of reasoning above makes sense when the resource is accessed via application logic. What about resources that are not used by the application, but made available to the user? In some cases, documentation and sample code and attached resources might fall under this category. The handling of such resources is not part of the application proper, and so it is not a build issue, but a deployment issue. That said, deployment is simply computation of a suitable object to be serialized to a given location, and should be viewed in much the same way as the build; maybe I'll elaborate on that in a separate post.

Dealing with Multiple Languages. Once we are dealing with multiple languages, we may be pushed into using a build system because some of the languages do not support incremental development. Assuming that the heart of our application is in a live language, we should treat other languages as resources; their binaries are resources to be dynamically computed during development and cached.
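The idea of treating resources as cached objects generated on demand can be sketched as follows. This is a rough Python illustration, not a prescribed design: the function cached_resource, the "cache" directory name, the ".sha256" stamp files and the toy to_html generator are all assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path

def cached_resource(app_root, name, source, generate):
    """Return resource `name`, regenerating it only when the cache entry
    is missing or was produced from a different `source`."""
    cache_dir = Path(app_root) / "cache"   # fixed location relative to the application
    cache_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    data_file = cache_dir / name
    stamp_file = cache_dir / (name + ".sha256")  # data needed to judge validity
    if data_file.exists() and stamp_file.exists() and stamp_file.read_text() == digest:
        return data_file.read_text()             # cache hit
    result = generate(source)                    # cache miss: compute on demand
    data_file.write_text(result)
    stamp_file.write_text(digest)
    return result

generated = []  # records each time the generator actually runs

def to_html(text):
    """Toy stand-in for a real documentation generator."""
    generated.append(text)
    return "<p>" + text + "</p>"

# A temporary directory stands in for the application's install location.
with tempfile.TemporaryDirectory() as root:
    cached_resource(root, "readme.html", "hello", to_html)   # miss: generates
    cached_resource(root, "readme.html", "hello", to_html)   # hit: reads the cache
    cached_resource(root, "readme.html", "hello!", to_html)  # source changed: regenerates
```

Deployment, in this picture, amounts to calling the same function for every resource ahead of time, so that the shipped application only ever sees cache hits at the same relative paths it uses during development.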

Summary


  • Builds kill liveness.
  • Compilation artifacts are a form of cached resource, the result of staged execution.
  • To achieve liveness in industrial settings, we need to structure our development environments so that any staging is strictly an optimization.
    • Staged results should be cached and invalidated automatically when the underlying basis for the cached value is out of date.
    • This applies regardless of whether the staged value is a resource, a shared library/binary or anything else. 
    • The data necessary to compute the cached value, and to determine the cache's validity, must be kept at a fixed location, relative to the application. 

It's high time we build a new, brave, build-free world.


Saturday, January 12, 2019

Much Ado About Nothing

What sweet nothing does the title refer to? It could be about null, but in fact it will say nothing about that. The nothing in question is whitespace in program text. Specifically, whether whitespace should be significant in a programming language.

My instinct has always been that it should not. Sadly, there are always foolish souls who will not accept my instinct as definitive evidence, and so one must stoop to logical arguments instead.

Significant whitespace, by definition, places the burden of formatting on the programmer. In return, it can be leveraged to reduce syntactic noise such as semicolons and matching braces. The alleged benefit is that in practice, programmers often deal with both formatting and syntactic noise, so eliminating one of the two is a win.

However, this only holds in a world without civilized tooling, which in turn may explain the fondness for significant whitespace, as civilized tooling (and anything civilized, really) is scarce. Once you assume proper tooling support, a live pretty printer can deal with formatting as you type, so there is no reason for you to be troubled by formatting. So now you have a choice between two inconveniences. Either:

  • You use classical syntax, and programmers learn where to put the semicolons and braces, and stop worrying about formatting; or
  • You make whitespace significant, clean up the syntax, and have programmers take care of the formatting.

At this point, you might say this is a matter of personal preference, and can devolve into the kind of religious argument we all know and love. To tip the scales, pray consider the line of reasoning below. I don’t recall encountering it before, which is what motivated this post.

In the absence of significant whitespace, pretty printing (aka code formatting) is an orthogonal concern. We can choose whatever pretty printing style we like and implement a tool to enforce it. Such a pretty-printer/code-formatter can be freely composed with any code source we have - a programmer typing into an editor, an old repository, and most importantly, other tools that spit out code - whether they transpile into our language or generate code in some other way.

Once whitespace is significant, all those code sources have to be cognizant of formatting.  The tool writer has to be worried about both syntax and formatting, whereas before only syntax was a concern.

You might argue that the whitespace is just another form of syntax; the problem is that it is not always context-free syntax. For example, using indentation to nest constructs is context sensitive, as the number of spaces/tabs (or backspaces/backtabs) depends on context.
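The difference shows up even in a toy code generator. The Python sketch below emits the same nested structure twice; the tree shape (a head string plus a list of children) and the names emit_braces and emit_indented are assumptions made purely for illustration.

```python
def emit_braces(node):
    """Brace-delimited output: a node is emitted without knowing its depth."""
    head, children = node
    if not children:
        return head + ";"
    body = " ".join(emit_braces(child) for child in children)
    return head + " { " + body + " }"

def emit_indented(node, depth=0):
    """Indentation-delimited output: depth must be threaded through every call."""
    head, children = node
    pad = "    " * depth
    if not children:
        return pad + head
    lines = [pad + head + ":"]
    lines += [emit_indented(child, depth + 1) for child in children]
    return "\n".join(lines)

tree = ("if a", [("f()", []), ("if b", [("g()", [])])])

print(emit_braces(tree))    # if a { f(); if b { g(); } }
print(emit_indented(tree))
```

A fragment produced by emit_braces can be spliced into any enclosing block unchanged and handed to a formatter afterwards. The text produced by emit_indented for the same fragment depends on where it will end up, so every tool in the pipeline has to carry that context along.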

In short, significant whitespace (or at least significant indentation) is a tax on tooling. Taxing tooling not only wastes the time and energy of tool builders - it discourages tooling altogether. And so, rather than foist significant whitespace on a language, think in terms of a broader system which includes tools. Provide a pretty printer with your language (like in Go).  Ideally, there's a version of the pretty printer that live edits your code as you type.

As a bonus,  all the endless discussions about formatting Go away, as the designers of Go have noted.  Sometimes the best way to address a problem is to define it away.

There. That was mercifully brief, right? Nothing to it.

Saturday, October 06, 2018

Reified Generics: The Search for the Cure

Many have argued that run time access to generic type information is very important. A very bitter debate about this ensued when we added generics to Java. The topic recurs whenever one designs a statically typed object oriented language. Should one reify generic types, or erase them? Java chose erasure, .Net and Dart chose reification, and all three solutions are in my mind unsatisfactory for various reasons, including but not limited to the handling of erasure or its presumed alter ego, reification.

Pedantic note 1: Throughout this post, I will use the terms erasure and reification as shorthand for erasure and reification of generic type information.

In a well designed object-oriented language, erasure and reification are not contradictory at all. This statement might bear some explanation, so here we go ...

A while back, I discussed the problem of shadow language constructs. I gave examples of shadow constructs such as Standard ML modules, traditional imports etc. Here is another: reified generics.

Generics introduce a form of shadow parameterization. Programming languages all have a perfectly good mechanism for declaring parameterized constructs and invoking them. You may have heard of it - it is widely known by the name function, and it goes back to the 17th century.

Pedantic note 2: Yes, programming language functions are usually not mathematical functions. The parameterization mechanism is, however, essentially the same.

Generics introduce a different form of formal and actual parameters. There is a purpose to that: static analysis. However, when languages try to provide run time access to these parameters (i.e., reification of generics), we are creating a lobotomized twin of the existing runtime parameter passing system. A new, redundant, confusing and costly set of mechanisms is added to the run time in order to declare, pass, store and access these parameters.

The first guiding principle of any solution is to avoid shadow constructs. We already have parameterization support, let's use it.


Generics are functions from types to types, typically classes to classes.

Pedantic note 3: If your language is prototype based, generics might be considered functions between prototypes. If your language has primitive types - well, you're up the creek without a paddle anyway. There is no justification for primitive types in an object oriented language.

If classes are expressions, we can write reified generics as ordinary functions. Here's some sample pseudo-code. It's given in a quasi-standard syntax, so I don't waste time explaining Newspeak syntax.

public var List = (function(T) {
   return class {
       var hd, tl;
       class Link {
         public datum;
         public next;
       }
       public elementType() { return T; }
       public add(e) {
           var tmp := Link.new();
           tmp.datum := e;
           tmp.next := hd;
           tl := hd;
           hd := tmp;
           return e;
       }
   }
}).memoize();

Here's a summary of what the above means:
  • We declare a variable named List, initialized to a closure.
  • The closure takes a class T as a parameter and returns a class as its result.
  • The result class is specified via a class expression, which implements a linked list.
  • The class expression includes a nested class declaration, Link.
  • The method memoize() is called on the closure to, well, memoize it. memoize() returns a memoized version of its receiver.

Each call to List() returns a list class specialized to the actual parameter of List(). We can create a list of integers by saying

var lst := List(Integer).new();

and we can dynamically check what type of elements lst holds

lst.elementType(); // returns Integer, the class object reifying the integer type.

The reified element type is shared among all instances of a given list class, because it is stored in the closure surrounding the class. We avoid duplicating classes with the same parameters - this is just function memoization (and I assume a memoize() method on closures for this purpose). All this works independent of any static types. We are just using standard runtime mechanisms like closures, memoized functions, objects representing classes and yes, class expressions. Smalltalk had these, in essence, over 40 years ago.
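For readers who want to run the idea, here is a rough Python rendering of the pseudo-code. It is a sketch under assumptions, not a translation of any Newspeak library: functools.lru_cache plays the role of memoize(), a class statement inside a function stands in for a class expression, Link lives in the closure rather than being nested in the list class, and the tl field is omitted for brevity.

```python
from functools import lru_cache

@lru_cache(maxsize=None)   # stands in for memoize()
def List(T):
    """An ordinary function from classes to classes."""

    class Link:
        def __init__(self, datum, next):
            self.datum = datum
            self.next = next

    class ListOfT:
        def __init__(self):
            self.hd = None

        def element_type(self):
            return T   # T is captured by the enclosing closure

        def add(self, e):
            self.hd = Link(e, self.hd)
            return e

    return ListOfT

lst = List(int)()   # a list of integers
lst.add(3)
print(lst.element_type())         # <class 'int'>
print(List(int) is List(int))     # True: same arguments, same class
print(List(int) is List(str))     # False: a distinct class per element type
```

Nothing here relies on a static type system or on special runtime support for generics; the element type is just a value held in a closure.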

What if I don't have class expressions? Well, don't you know that everything should be an expression? Besides, this works fine if you have the ability to define classes reflectively like Smalltalk, or have properly defined nested classes like Newspeak, though it may be a bit more verbose and require more library support to be palatable.

Now let's add types. In the code below, type annotations are completely optional, and have absolutely no runtime effect. They are shown in blue.

public var List = (function(T : Class) {
   return class {
       var hd, tl : Link;
       class Link {
         public datum : T;
         public next : Link;
       }
       public elementType() : Class { return T; }
       public add(e : T) : T {
           var tmp : Link := Link.new();
           tmp.datum := e;
           tmp.next := hd;
           tl := hd;
           hd := tmp;
           return e;
       }
   }
}).memoize();

You may notice one odd thing - we use the name of the formal parameter to the closure, T, as a type. This is justified by the following rule.

Rule 1: In any method or closure m, a formal parameter T of type Class (or any subtype thereof) implicitly defines a type variable T which is in scope in type expressions from the point T is declared to the end of the method/closure.

Next, we need to be able to use the information given by the declaration of List when we write types like List[Integer]. We use the following rule.

Rule 2: If e is an expression of function type with parameter(s) of type Class and return type Class, e's name can be used inside a type expression as a type function; an invocation of the type function e[T1, ..., Tn] denotes the type of the instances of the class returned by the value expression e(T1, ..., Tn).

We can then write, and typecheck

var lst : List[Integer] := List(Integer).new();
var i : Integer := lst.add(3) + 4;

Oh, and we can still do this:

lst.elementType(); // returns Integer, the class object reifying the integer type.

Similarly, if you wish to create new instances of the incoming type parameters, you should be able to do that in the above regime - though you will have to confront the fact that different subtypes may have different constructors and plan around that explicitly - say, by defining a common construction interface for these types.

The beauty of this scheme is that no runtimes were harmed in the making of this reified generic type system. The type system is completely optional. And this is my point: reification was there all along. The typechecker simply needs to understand this fact and leverage it. The basic approach would work with any language with types reified as values, regardless of whether it has generics.

Interestingly, we now have reification of generics, and erasure, at the same time. The two are not in conflict. Reification is supported by the normal runtime mechanisms, independent of types, which are optional and always erased, carrying no runtime cost or semantics.


Reification of generics is now a choice for library implementers. If they think it is worthwhile to pay the costs, so that, for example, someone can cheaply test if a collection is a collection of integers or a collection of strings, they are free to do so.

If they don't want to pay a price for reification but still want to typecheck generics, they can do that too. Nothing prevents one from explicitly declaring type parameters (as opposed to the implicit ones derived from the class-valued value parameters used for reification).

Tangent, TL;DR: Now is the time to mention that traditional reification of generics - that is, runtime support for a shadow parameterization mechanism - is a disaster. It hurts performance in both space and time; just ask the brave VM engineers who struggled with these issues on the Dart VM. Mitigating that introduces enormous complexity into the runtime and requires a huge effort, which would be better spent doing something good and useful instead.

In systems designed to support multiple programming languages, reification brings a different problem. All languages must deal with the complexity of reification; worse they must conform to the expectations of the reified generic type system of the "master language" (C# or Java, for example).

Consider .Net, the poster child of generic reification. Originally, .Net was intended to be a multi-language system, but dynamic language support there has suffered, in no small part due to reification. Visual Basic was a huge success until .Net came along and made it conform to C#. And what Iron Ruby/Python programmer ever enjoyed being forced to feed type arguments (whatever those might be) into a collection they are creating?

In contrast, the JVM was conceived as a monolingual system. Sun management deluded themselves that Java was the ultimate programming language (though yours truly did try to hint, ever so gently, that further progress in PL was at least conceivable). And yet, the JVM has become home to a wide variety of languages. This is due to multiple factors, invokedynamic not the least among them. But erasure plays a crucial and underappreciated role here as well. If not for erasure, the JVM would have the above-mentioned problems wrt dynamic languages, just like .Net.

Of course, generics have many issues that are independent of reification. The great difficulties with generics come up when they interact with subtyping. All the problems of variance, as well as inference, are rooted in that interaction. If you are happy with any existing approach, you should be able to incorporate it into the above reification strategy - but I am not aware of any pre-existing generic design that I would consider satisfactory.

I think I may now have a plausible approach to the typing issues, but the margins of this blog are too narrow to contain it. A follow-up post will either make it all clear, or confess that it hasn't worked out. The above comments on reification stand on their own in any case.

Monday, May 29, 2017

Dead Program's Society

In my last post I discussed live literate programming. I concluded the post by noting that the approach I had discussed had one glaringly obvious flaw. No one seems to have pointed out that flaw, so I was forced to point it out myself, in my Programming 17 keynote. The recording of the talk was a bit deficient, in that the camera operator focused on me rather than on the screen. Alas, I am not nearly as interesting as the screen; the screen was where all the action, demos etc. took place. To rectify that, this post will include a few video snippets that should be very close to the demos given in the talk.

But first, what of the glaring flaw? The flaw is that the mechanism I described for creating literate programs using Madoko and Ampleforth was not compositional. It allowed for embedding live widgets inside rich text, but only at one level. The widgets themselves would contain text, but there wasn't a way to include widgets in that embedded text recursively. If only the rich text editor could treat itself as a widget that it could self-embed, the system would compose to arbitrary depth.

Sadly, non-compositional text editors are the norm. Most widget sets have some sort of text widget, but that widget typically traffics in text only. Over twenty years ago, the Strongtalk system addressed this problem (and we were by no means the first to do so). I demo'ed this in my talk: here's a brief video recreating the main points of that demo.

The demo shows the embedding of live IDE components in both the ordinary text editors used to edit code, and in rich text generated from markup. One open question is when to use either approach, or how to integrate them. Another point I highlight is the failure of liveness in instance methods, something I've discussed in previous posts. Having raised that problem, I moved on to showing my approach to solving it by demoing Newspeak's exemplar mode.

The Strongtalk demo emphasized literate programming issues; the Newspeak demo focused on liveness. Ampleforth is aimed at addressing both of these, and so I went on to demonstrate Ampleforth, the system I described in my prior post. In the talk, I showed Ampleforth embedded in the presentation itself. Here's a recording of essentially the same thing.

Obviously, one cannot embed live programs in conventional presentation tools like PowerPoint or Keynote (or Prezi, for that matter). Instead, the presentation was built using Lounge, a system being developed by Bill Burdick. Bill shares a very similar vision for live literate programming, which he calls Illuminated programming. He uses Lounge to run Leisure, a purely functional, lazy language. Lounge certainly lacks the visual polish of commercial tools, but unlike those tools, its fundamental architecture is sound. A Lounge document (such as a presentation) is defined using emacs's Org mode format. Some adjustments were necessary to embed Ampleforth into Lounge - basically setting things up to run within an IFrame; these tweaks should make it easier to embed Ampleforth into other web based tools as well.

What of our glaring flaw? The flaw is not in Ampleforth itself. Rather, it lies in the text editors in which it is embedded and those which are embedded in the widgets we use. Fortunately, the DOM actually does better in this fundamental respect, and so on the web, this is fixable. Unfortunately, the web's basic text editing facility, content-editable text, is all but unusable. Lounge itself has a text editor that doesn't suffer from this weakness. In principle, we could access the Lounge editor from Newspeak, but in this case we embedded Ampleforth with its rudimentary editor (based on content-editable text). I'm planning to modify Newspeak's web environment to use CodeMirror as its default editor, which should help as well.

An important question that came up in the Q&A was whether liveness was actually desirable. After all, the code is supposed to be correct for all possible inputs, and relying on examples rather than abstract reasoning and proofs might be dangerous. My response was that liveness takes nothing away - you can go prove invariants to your heart's content. Liveness can be abused, but overall it makes people more productive by reducing the cycle between specifying intent and measuring actual results.

Dave Ungar later gave another, even better answer: that invariants themselves should be reified in the environment, so we can remember them all, communicate them to others, ask them to monitor the program to see if they are violated etc.

To Life, Literacy and the Pursuit of Happiness!


Sunday, November 20, 2016

Illiterate Programming

I have long been a fan of literate programming, especially live literate programming. I wrote a brief note about the topic a while ago, but for various reasons did not distribute it. Recently, the early release of Eve (very nice work) has injected some new life in this area.  So I decided to belatedly post my musings on the subject.

Ironically, posting live programming content is difficult on many web publishing venues, such as this blog, or Medium.  So if you actually want the substance of this post, you'll have to follow this link.

Friday, November 28, 2014

A DSL with a View

In a previous post, I promised to explain how one might define UIs using an internal DSL. Using an internal DSL would allow us to capitalize on the full power of a general purpose programming language and avoid having to reinvent everything from if-statements to imports to inheritance.

Hopscotch, Newspeak's UI framework, employs such an internal DSL.  Hopscotch has been discussed in a paper and in a talk.  It will take more than one post to describe Hopscotch; here we will focus on its DSL, which is based on the notion of fragment combinators.

Fragments describe pieces of the UI; they may describe individual widgets, or views constructed from multiple pieces, each of which is in turn a fragment. A fragment combinator is then a method that produces a fragment, possibly from other fragments.

One of the simplest fragment combinators would be label:, which takes a string. The expression

label: 'Hello Brave New World' 

would be used to put up the string "Hello Brave New World" on the screen.  Other examples might be

button: 'Press me' action: [shrink]

which will display a button that will call the method shrink when invoked. The combinator button:action: takes two arguments - the first being a string that serves as the label of the button, and the second being a closure that defines the action taken when the button is pressed. Closures in Newspeak are delimited with square brackets, and need not provide a parameter list if no parameters are required. This is the most lightweight syntax for literal functions you will find. Along with the method invocation syntax, where method names embed colons to indicate where arguments should be placed, this gives a very readable notation for many DSLs.

Further examples:

row: {
  button: 'Press me' action: [shrink]. 
  button: 'No, press me' action: [grow].
}

The row: combinator takes a tuple of fragments (tuples in Newspeak are delimited by curly braces, and their elements are separated by dots) as its argument and lays out the elements of the tuple horizontally.

The column: combinator is similar, except that it lays things out vertically.

column: {
  button: 'Press me' action: [shrink]. 
  button: 'No, press me' action: [grow]
}

produces the two buttons stacked one above the other.

In mainstream syntax (Dart, in this case) the example could be written as

column(
  [button('Press me', () => shrink()),
   button('No, press me', () => grow())]
 )

The Newspeak syntax is remarkably readable though, and its advantage over mainstream notation becomes more pronounced as examples grow. Of course none of this works at all if your language doesn't support both closures and literal lists/arrays.

So far, this is very standard stuff, much like building a tree of views in most systems.  In most UI frameworks, we'd write something like

new Column([new Button('Press me', () => shrink()),
            new Button('No, press me', () => grow())]
    )

which is less readable and more verbose.  Since allocating an instance is more verbose than calling a method in most languages, the fact that fragment combinators are represented via methods, which act as factories for various kinds of views, helps make things more concise. It's a simple trick, but worth noting.
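The factory-method trick can be tried out in a few lines of Python. The sketch below is an illustration only and says nothing about how Hopscotch is implemented: the Fragment record, the text-based render function and the size counter are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class Fragment:
    """A piece of UI: a widget, or a view assembled from other fragments."""
    kind: str
    text: str = ""
    action: Optional[Callable[[], None]] = None
    children: List["Fragment"] = field(default_factory=list)

# Combinators: plain functions that act as factories for fragments.
def label(text):
    return Fragment("label", text=text)

def button(text, action):
    return Fragment("button", text=text, action=action)

def row(fragments):
    return Fragment("row", children=list(fragments))

def column(fragments):
    return Fragment("column", children=list(fragments))

def render(fragment, depth=0):
    """Render a fragment tree as indented text (a stand-in for real drawing)."""
    line = "  " * depth + fragment.kind
    if fragment.text:
        line += " '" + fragment.text + "'"
    return "\n".join([line] + [render(c, depth + 1) for c in fragment.children])

size = [10]  # toy application state

def shrink():
    size[0] -= 1

def grow():
    size[0] += 1

ui = column([button("Press me", shrink), button("No, press me", grow)])
print(render(ui))
```

Because each combinator returns a fragment, the result of one call can be passed straight to another, which is what lets row and column nest to any depth.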

The advantage of thinking in terms of fragments becomes clearer once you consider  less obvious fragments such as draggable:subject:image:, which takes a fragment and allows it to be dragged and dropped. Along with the fragment, it takes a subject (what you might call a controller) and an image to use during the drag. Making drag-and-drop a combinator means everything is potentially draggable. Conventional designs would make this a special attribute of certain things only, losing the compositionality that combinators provide.

Presenters are a specific kind of fragment that is especially important. Presenters provide user-defined views in the Model-View-Controller sense. To define your own view, you subclass Presenter. Because presenters are fragments, any user defined view can be part of a predefined compound fragment like column: or draggable:subject:image:.

A presenter has a method, definition, which computes the fragment tree used to render the presenter. The fragment DSL we discussed is used inside presenters. All the combinators are methods of Presenter, so they are inherited by any class implementing a view and are therefore in scope inside a presenter class. In particular, combinators are available in the definition method.
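The arrangement can be sketched in Python as follows. This is only an analogy under stated assumptions: the Fragment, Label, and Column classes and the two presenter subclasses are hypothetical, and Python needs an explicit self where Newspeak has the combinators directly in scope.

```python
class Fragment:
    """Base class for everything that can appear in a view tree."""


class Label(Fragment):
    def __init__(self, text):
        self.text = text


class Column(Fragment):
    def __init__(self, children):
        self.children = list(children)


class Presenter(Fragment):
    # Combinators are methods, so every subclass inherits them.
    def label(self, text):
        return Label(text)

    def column(self, children):
        return Column(children)

    def definition(self):
        raise NotImplementedError("subclasses describe their view here")


class GreetingPresenter(Presenter):  # hypothetical user-defined view
    def __init__(self, name):
        self.name = name

    def definition(self):
        return self.column([self.label("Hello"), self.label(self.name)])


class PagePresenter(Presenter):  # hypothetical user-defined view
    def definition(self):
        # A presenter is itself a fragment, so it can sit inside a column.
        return self.column([self.label("Guests"), GreetingPresenter("Ada")])


tree = PagePresenter().definition()
nested = tree.children[1].definition()  # render the embedded presenter
```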

To see how all this works, imagine implementing the well-known todoMVC example. We'll define a subclass of Presenter called todoMVCPresenter to represent the todoMVC UI. The UI is supposed to present a list of todo items. It consists of a column with:


  1. A header in large text saying "todos".
  2. An input zone where new todos are added.
  3. A list of todos, which can be filtered using the controls given in (4).
  4. A footer that is empty if there are no todos at all. It materializes as a set of controls once there are todos.

We can translate these requirements directly:

definition = (
     ^column: {
      (label: 'todos') hugeFont.
      inputZone.
      todoList.
      footer.
     }
)

More notes on syntax: methods are defined by following their header with an equal sign and a body delimited by parentheses; ^ signifies return; method invocations that take no parameters list no arguments, e.g., inputZone, not inputZone(); chained method invocation does not require a dot, so it's (label: 'todos') hugeFont rather than label('todos').hugeFont.

We haven't yet specified what inputZone, todoList and footer do. They are all going to be defined as methods of todoMVCPresenter. We can define the UI in such a top-down fashion because we are working with a language that supports procedural abstraction. You get it for free in an internal DSL.

We can then define the pieces, such as

footer = (
 ^subject todos isEmpty 
 ifTrue: [nothing]
 ifFalse: [controls]
)

Here, we use conditionals to determine what view to produce, depending on the state of the application.  The application logic is embodied in the controller, subject, which we query for the todos list. The nothing combinator does exactly what it says; controls is a method we would have to define in todoMVCPresenter, detailing what should appear in the footer if it is visible.  Again, the code corresponds closely to the natural language description in bullet (4) above.
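For readers more at home in mainstream languages, the footer logic can be sketched in Python. Everything here is a stand-in: the TodoSubject class is a hypothetical controller, fragments are modeled as plain tuples, and the contents of controls (a count of todos) are invented, since the post leaves that method undefined.

```python
class TodoSubject:
    """Hypothetical controller holding the application state."""

    def __init__(self, todos):
        self.todos = list(todos)


class TodoMVCPresenter:
    def __init__(self, subject):
        self.subject = subject

    def nothing(self):
        # The empty fragment: renders as nothing at all.
        return ("nothing",)

    def controls(self):
        # Invented placeholder content for the footer controls.
        return ("controls", len(self.subject.todos))

    def footer(self):
        # An ordinary conditional picks the fragment from the current state.
        if not self.subject.todos:
            return self.nothing()
        return self.controls()


empty_footer = TodoMVCPresenter(TodoSubject([])).footer()
full_footer = TodoMVCPresenter(TodoSubject(["Buy milk", "Walk dog"])).footer()
```

No special template directive is needed for the conditional; the host language's own control flow does the job.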

To elaborate todoList we'll need a loop or recursion or something of that nature; in fact, we'll use the higher order method collect:, which is Newspeak's version of map:

todoList = (^list:[subject todos collect: [:todo | todo presenter]])

The list: combinator packages a list of fragments into a list view. We pass list: a closure that computes the list of todo items.

Aside: We could have passed it the list itself, computed eagerly. Often, fragment combinators take either suitable fragment(s) or a closure that would compute them.

To compute the fragment list we compute a presenter for each individual todo item by mapping over the original list of todos.
The closure we pass to collect: takes a single parameter, todo. Formal parameters to closures are introduced prefixed by a colon, and separated from the closure body by a vertical bar.
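The difference between passing a list and passing a closure can be sketched in Python. The ListView class, the list_of function, and present are hypothetical, and the idea that the closure is re-run each time the view asks for its fragments is an assumption of this sketch, not a statement about how Hopscotch's list: behaves.

```python
class ListView:
    """Hypothetical list view that accepts a list or a closure."""

    def __init__(self, source):
        self._source = source

    def fragments(self):
        # Call the closure if we were given one; otherwise use the list as is.
        if callable(self._source):
            return list(self._source())
        return list(self._source)


def list_of(source):
    return ListView(source)


def present(todo):
    # Stand-in for asking a todo item for its presenter.
    return f"<{todo}>"


todos = ["Buy milk"]

eager = list_of([present(t) for t in todos])         # computed right now
lazy = list_of(lambda: [present(t) for t in todos])  # computed on demand

todos.append("Walk dog")  # the model changes after the views were built

eager_result = eager.fragments()
lazy_result = lazy.fragments()
```

In this sketch the eager view still shows only the first item, while the closure-based view picks up the new todo the next time it is asked for its fragments.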

What are the odds that higher order functions (HOFs) were part of your external DSL? Even if they were, one would have to define a suite of useful  HOFs. One should factor the cost of defining useful libraries into any comparison of internal and external DSLs.

The Hopscotch DSL has other potential advantages. Because fragment combinators are methods, you can override them to customize their behavior.
We believe we can leverage this to customize the appearance of things, a bit like CSS. To make this systematic, we expect to define whole groups of overrides in mixins. I'm not showing examples where Hopscotch is used this way because we have done very little in that space (and this post is already too long anyway). And we haven't spoken about the other advantages of Hopscotch, such as its navigation model, lack of modality and very clean embodiment of MVC.

Ok, now it's time for the caveats.


  1. First and foremost,  Hopscotch currently lacks a good story for reactive binding.  In our example, that means you'd have to put explicit logic to refresh the display in some of the controls.  This makes things less declarative and harder to use. We always planned to solve that problem; I hope to address it in a later post.  But the high order bit is that we have code in a general purpose language that gives a very readable, declarative description of the UI. It corresponds directly to the natural language description of the requirements.
  2. Hopscotch lacks the functionality needed to support richer UIs, but the design is naturally extensible: one adds more fragment combinators.
  3. We also want more ports, especially to mobile/touch platforms. However, Hopscotch has already proven quite portable: it runs on native Win32, on Squeak's Morphic and on HTML (the latter port is still partial, but that is just an issue of engineering resources).   More ports would help us deal with another controversial goal - defining a UI platform that works well across OS's and devices.


Regardless of the current engineering limitations, the point here is simply to show the advantages of a well-designed internal DSL for UI.  The lessons of Newspeak and Hopscotch apply to other languages and systems, albeit in an attenuated fashion.