<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-2447174102813539049</id><updated>2011-12-16T22:52:03.392-08:00</updated><category term='Reflection'/><category term='Aspects Modularity'/><category term='Web Platform and Objects as Software Services'/><category term='Newspeak'/><category term='constructors'/><category term='Modularity'/><title type='text'>Room 101</title><subtitle type='html'>A place to be (re)educated in Newspeak</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>56</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-5432238908376918714</id><published>2011-06-05T04:33:00.000-07:00</published><updated>2011-06-05T04:47:25.207-07:00</updated><title type='text'>Types are Anti-Modular</title><content type='html'>Last week I 
attended a workshop on language design. I made the off-the-cuff remark that types are actually anti-modular, and that comment resonated enough that I decided to tweet it. This prompted some questions, tweets being  a less than perfect format for elaborate explanation of ideas (tweets are anti-communicative?). And so, I decided to expand on this in a blog post.&lt;br /&gt;&lt;br /&gt;Saying that types are anti-modular doesn’t mean that types are bad (though it certainly isn’t a good thing). Types have pros and cons, and this is one of the cons.  Anyway, I should explain what I mean and how I justify it.&lt;br /&gt;&lt;br /&gt;The specific point I was discussing when I made this comment was the distinction between &lt;span style="font-style:italic;"&gt;separate compilation&lt;/span&gt; and &lt;span style="font-style:italic;"&gt;independent compilation&lt;/span&gt;.  Separate compilation allows you to compile parts of a program separately from other parts. I would say it was a necessary, but not sufficient, requirement for modularity. &lt;br /&gt;&lt;br /&gt;In a typed language, you will find that the compiler  needs some information about the types your compilation unit is using. Typically, some of these types originate outside the compilation unit. Even if your program is just: &lt;span style="font-weight:bold;"&gt;print(“Hello World”)&lt;/span&gt;, one may need to know that string literals have a type &lt;span style="font-weight:bold;"&gt;String&lt;/span&gt;, and that the argument type of &lt;span style="font-weight:bold;"&gt;print&lt;/span&gt; is &lt;span style="font-weight:bold;"&gt;String&lt;/span&gt;.  The definition of &lt;span style="font-weight:bold;"&gt;String&lt;/span&gt; comes from outside the compilation unit.  This is a trivial example, because it is common for &lt;span style="font-weight:bold;"&gt;String&lt;/span&gt; to be part of the language definition. 
However, substantial programs will tend to involve user-defined types at the boundaries of compilation units (or of module declarations, which may or may not be the same thing).&lt;br /&gt;&lt;br /&gt;A consequence of the above is that you need some extra information to actually compile.  This could come in the form of interface/signature declaration(s) for any type(s) not defined within your compilation unit/module, or as binary file(s) holding the compiled representation of the code where the type(s) originated. Java class files are an example of the latter. &lt;br /&gt;&lt;br /&gt;Whatever the form of the aforementioned type information, you depend on it to compile your code - you cannot compile without it. In some languages, this introduces ordering dependencies among compilation units. For example, if you have a Java package &lt;span style="font-weight:bold;"&gt;P1&lt;/span&gt; that depends on another package &lt;span style="font-weight:bold;"&gt;P2&lt;/span&gt;, you cannot compile &lt;span style="font-weight:bold;"&gt;P1&lt;/span&gt; before compiling &lt;span style="font-weight:bold;"&gt;P2&lt;/span&gt;. You either compile them simultaneously (giving up on even separate compilation) or you must compile &lt;span style="font-weight:bold;"&gt;P2&lt;/span&gt; first so that its class files are available. The situation is better if the language supports separate signature declarations (like Modula-3 or ML) - but you still have to have these around before you compile. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;Semi-tangent: Of course, you can fake signature declarations by dummy package declarations. Java chose to avoid the conceptual overhead of separate signature declarations, on the assumption that pragmatically, one could get by without them.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Contrast this with independent compilation, where you compile your module/compilation-unit independently of anything else. 
The code that describes the values (and types) that your module requires may not even exist yet. Obviously, independent compilation is more modular than separate compilation. How do you achieve this in the presence of types? The short answer is you don’t. &lt;br /&gt;&lt;br /&gt;Wait: we are blessed, and live in a world where the gods have bestowed upon us such gifts as type inference. What if I don’t have to write my types at all, and have the compiler just figure out what types I need? The problem is that inference only works well within a module (if that). If you rely on inference at module boundaries, your module leaks implementation information. Why? Because if you infer types from your implementation, those types may vary when you modify your implementation. That is why, even in ML, signatures are declared explicitly.&lt;br /&gt;&lt;br /&gt;Wait, wait: Surely optional types avoid this problem? Not exactly. With an optional type system you can compile independently - but you cannot typecheck independently. Indeed, this is the point: there is no such thing as modular typechecking. If you want typechecking across modules, you need to use some of the same types across modules. You can either replicate the types or place them in some specific module(s). The former clearly isn’t very modular. The latter makes some modules dependent on declarations defined elsewhere, which means they cannot be typechecked independently. In the common case where types are mandatory, modules cannot be compiled independently.&lt;br /&gt;&lt;br /&gt;Now, there is an argument to be made that modules have dependencies regardless, and that we cannot reason about them without being aware of these dependencies. Ergo, the types do not change things fundamentally.  All true. Even in a dynamic language we have some notion of type or signature in our head. Formalizing that notion can be helpful in some ways, but it has downsides.  
One such downside is that formalizing the types reduces our ability to manage things in a perfectly modular way.  You cannot typecheck modules in isolation (except in trivial cases) because types capture cross-module knowledge.&lt;br /&gt;&lt;br /&gt;One often hears the claim that types are in fact valuable (or even essential) to modularity because they can describe the interface(s) between modules.  There lies the problem: the type cannot serve this purpose unless it is available in more than one module. Types are inherently non-local - they describe data that flows &lt;span style="font-style:italic;"&gt;across&lt;span style="font-weight:bold;"&gt;&lt;/span&gt;&lt;/span&gt; module boundaries. The absence of types won’t buy you modularity on its own though. Far from it. But types and typechecking act as an inhibitor - a countervailing force to modularity.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-5432238908376918714?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/5432238908376918714/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=5432238908376918714' title='68 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5432238908376918714'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5432238908376918714'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2011/06/types-are-anti-modular.html' title='Types are Anti-Modular'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' 
width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>68</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-4744769120922988271</id><published>2011-03-20T10:05:00.000-07:00</published><updated>2011-03-20T10:28:03.746-07:00</updated><title type='text'>The Truthiness Is Out There</title><content type='html'>For the past 5 years or so, I (like many others) have argued that Javascript is the assembly language of the internet platform.  Over this period, some of the obstacles that limit the applicability of said platform have been slowly pushed aside. Things like client side storage, or decent performance.&lt;br /&gt;&lt;br /&gt;However, Javascript remains a seriously limited language for platform implementation. Here are some of the problems.&lt;br /&gt;&lt;br /&gt;Concurrency primitives. There aren’t any. Now I really should be thankful for that, as the last thing I want is another shared-state concurrency threading model a la Java.  And yet, ingrate that I am, I remain dissatisfied.  Yes, I can write my own scheduler to provide pseudo-concurrency, but there are no primitives that let me find out how much true concurrency is available and to let me use it.  Nor is there any efficient way for me to preempt an activity. &lt;br /&gt;&lt;br /&gt;This lack of appropriate primitives for platform construction is a recurring theme. Take serialization for example. If I need to write a serializer that can incrementally store and retrieve arbitrary objects (say, because I want to implement orthogonal persistence) , I hit difficulties with things like closures. A closure in Javascript is a black box. This makes sense most of the time - but not for the system designer. One wants mechanisms that permit manipulation of the structure of all program elements - closures, prototypes, what have you. 
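&lt;br /&gt;&lt;br /&gt;To make the black-box point concrete, here is a minimal sketch in plain Javascript. The names &lt;span style="font-weight:bold;"&gt;makeCounter&lt;/span&gt; and &lt;span style="font-weight:bold;"&gt;counter&lt;/span&gt; are made up for illustration and are not taken from any real serializer; the sketch only assumes the standard JSON.stringify, Object.keys and Function.prototype.toString.
&lt;pre&gt;
```javascript
// Hypothetical helper: builds a counter whose state lives only in a closure.
function makeCounter(start) {
  var count = start;
  return function next() {
    count = count + 1;
    return count;
  };
}

var counter = makeCounter(41);
counter(); // advances the hidden state to 42

// JSON.stringify silently omits function-valued properties,
// so the closure is simply missing from the serialized form.
var serialized = JSON.stringify({ label: 'c', next: counter });

// The source text of the function is available, but it only mentions
// the variable `count` by name; its current value is not exposed.
var source = Function.prototype.toString.call(counter);

// The function object has no enumerable properties that reveal
// the captured environment either.
var visibleKeys = Object.keys(counter);

console.log(serialized);         // {"label":"c"}
console.log(visibleKeys.length); // 0
console.log(source.indexOf('42')); // -1
```
&lt;/pre&gt;
The only way to learn the value of the captured variable here is to call the closure, which also changes it. A serializer written in the language therefore has nothing to walk.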
&lt;br /&gt;&lt;br /&gt;Of course, the challenge is to do this while preserving security. Not everyone should be able to do this - but a program should be able to do it on its own objects, for example.&lt;br /&gt;&lt;br /&gt;Another, somewhat related, problem area is stack manipulation. I want to implement an efficient debugger with fix-and-continue debugging, for example.  Or resumable exceptions. &lt;br /&gt;&lt;br /&gt;Weak pointers are another problem. For example, Newspeak mixins need to track all their invocations, so that when a mixin definition is modified, all classes derived from it can be updated. You’d like to use a weak collection of these mixin invocations for that purpose.&lt;br /&gt;&lt;br /&gt;I’ve never been happy with the approach that says that the only true encapsulation mechanism in the language is closures. I find that very low level. I want objects that can hide their internals directly (private members) - and of course, I want a mechanism to get around that in some ways, so I can program the system in itself (and write things like serialization).&lt;br /&gt;&lt;br /&gt;I miss &lt;span style="font-weight:bold;"&gt;doesNotUnderstand:&lt;span style="font-style:italic;"&gt;&lt;/span&gt;&lt;/span&gt;, which I can emulate by going through certain hoops. There is work going on to alleviate this with proxies, but I don’t see them doing what I really want.  I can, however, use them to implement a mechanism that does.&lt;br /&gt;&lt;br /&gt;All of this may be too much to ask of a language where false can sometimes be interpreted as true, and where equality is non-transitive.  But it isn’t too much to ask for the backbone of internet programming. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;&lt;span style="font-weight:bold;"&gt;Tangent:&lt;/span&gt; the occasional truthiness of false is a case study in the pitfalls of language design. It stems from the interaction of two bad decisions. 
First, we have the implicit coercion of any type to a boolean - a nasty C legacy. Then we have primitive types, which lead to (non-transparent) autoboxing. Since any object is truthy, and autoboxing false creates an object, you can end up with an automatic, hidden conversion that interprets false as true.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I know that there is a lot of ongoing work to resolve this on the ECMAScript standards committee, whose members seem well aware of many of these issues. The timeline for addressing these problems is, however, rather depressing. Between the time it takes to revise a standard, and the time it takes for it to be implemented and widely adopted (so you can actually rely on it), we may see these things fixed by the early 2020s (I kid you not).&lt;br /&gt;&lt;br /&gt;Will that make Javascript a language a human should program in? I doubt it, but that shouldn’t be the goal.   The goal should be to provide a foundation that will help in building more attractive languages on top of Javascript and the browser.&lt;br /&gt;&lt;br /&gt;In this vein, work continues on Newspeak for the browser. We have a pretty solid Newspeak-to-Javascript compiler, though we still need to improve performance and add key platform functionality. At some point, I hope we can release this.&lt;br /&gt;&lt;br /&gt;The vision for the Ministry of Truthiness goes beyond just a compiler, of course - we want Hopscotch as well. Calling the DOM API from Newspeak is possible, but not really attractive. We also want the IDE in the browser. 
At least as much of it as possible - debugging might require using a browser extension or something due to the difficulties cited above.&lt;br /&gt;&lt;br /&gt;Doing all this on top of Javascript has proven tedious and frustrating, and I hope things improve more quickly in the future; but we will get there in time.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-4744769120922988271?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/4744769120922988271/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=4744769120922988271' title='14 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/4744769120922988271'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/4744769120922988271'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2011/03/truthiness-is-out-there.html' title='The Truthiness Is Out There'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>14</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-1347145938334382360</id><published>2011-02-28T18:49:00.000-08:00</published><updated>2011-02-28T21:16:57.601-08:00</updated><title type='text'>The Ministry of Nesting &amp; Testing</title><content type='html'>Unit testing was introduced to the OO world by Kent Beck, in his &lt;a 
href="http://www.xprogramming.com/testfram.htm"&gt;seminal work on SUnit&lt;/a&gt;, the Smalltalk unit testing framework. Other languages have introduced their own unit testing frameworks following SUnit’s lead.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Tangent:&lt;/span&gt;&lt;span style="font-style: italic;"&gt; Unit testing was part of the overall introduction of extreme programming/agile development, which is just one of the major trends Smalltalk has brought to the world. Along with refactoring (which we all know can’t be done without static types, which is why it was invented in a dynamically typed language), IDEs, reflective OO APIs, GUI builders, pop-up menus and bitmapped GUIs in general. Smalltalk is the veritable Prometheus of OO, and its destiny seems not dissimilar.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Newspeak started out with an adaptation of SUnit, NSUnit, which is what you’ll find in the public release.  It has a rather nice Hopscotch based GUI integrated into the IDE, but we always felt we could improve upon it.&lt;br /&gt;&lt;br /&gt;Minitest is our revised unit testing framework, which we’ve been using since, oh, mid-2010 or so.  Minitest takes the opportunity to rationalize the way we structure unit tests and takes advantage of Newspeak’s support for nesting to make things simpler and easier to use.&lt;br /&gt;&lt;br /&gt;Minitest was designed by Vassili Bykov, and the examples below are shamelessly lifted from the superb documentation Vassili wrote for the Minitest class.&lt;br /&gt;&lt;br /&gt;In Minitest, you define a &lt;span style="font-style: italic;"&gt;testing module&lt;/span&gt;, that is designed to test a particular interface (not a particular implementation).  To run tests, one needs to feed the testing module with the particular implementation(s) that one wishes to test. A &lt;span style="font-style: italic;"&gt;test configuration&lt;/span&gt; module does just that. 
Newspeak naturally enforces this separation of interface and implementation.&lt;br /&gt;&lt;br /&gt;The testing module class’s factory typically takes three arguments: the Newspeak platform, the testing framework (a Minitest instance) and a factory for the object under test.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;class ListTesting usingPlatform: platform minitest: minitest listClass: listClass = (&lt;br /&gt;&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;|&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;    &lt;br /&gt; private TestContext = minitest TestContext.&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;    &lt;br /&gt; private List = listClass.&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;&lt;br /&gt;|&lt;br /&gt;)(&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;      class ListTests = TestContext (&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;&lt;br /&gt;      | list = List new. 
|&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt; &lt;br /&gt;     ) (&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;    &lt;br /&gt;          testAddition = (&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;        &lt;br /&gt;             list add: 1.&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;        &lt;br /&gt;             assert: (list includes: 1)&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;    &lt;br /&gt;          )&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;    &lt;br /&gt;          testRemoval = (&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;        &lt;br /&gt;             list add: 1; remove: 1.&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;        &lt;br /&gt;             deny: (list includes: 1)&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;    &lt;br /&gt;          )&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt; &lt;br /&gt;    ) : (&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;TEST_CONTEXT = ()&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;)&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;&lt;br /&gt;)&lt;/span&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The example shows a hypothetical (and rather simplistic) module definition for testing lists.  
I’m sure all readers of this blog are fluent in Newspeak, but just in case, the module definition has a factory method that takes the 3 parameters mentioned above: &lt;span style="font-style: italic; font-weight: bold;"&gt;platform&lt;/span&gt; (the Newspeak platform, from which all kinds of generally useful libraries might be obtained), &lt;span style="font-style: italic; font-weight: bold;"&gt;minitest&lt;/span&gt; (an instance of Minitest, naturally) and &lt;span style="font-style: italic; font-weight: bold;"&gt;listClass&lt;/span&gt;, a factory that will produce lists for us to test.&lt;br /&gt;&lt;br /&gt;Nested inside the testing module is a &lt;span style="font-style: italic;"&gt;test context&lt;/span&gt; (aka &lt;span style="font-style: italic;"&gt;test fixture&lt;/span&gt;) class &lt;span style="font-style: italic; font-weight: bold;"&gt;ListTests&lt;/span&gt;, inside of which you write your tests. Test methods are identified by the convention that their names begin with &lt;span style="font-weight: bold;"&gt;test&lt;/span&gt;. Each test will be executed in a test context; that is, for each test method being run, Minitest will instantiate a fresh &lt;span style="font-weight: bold; font-style: italic;"&gt;ListTests&lt;/span&gt;  object.  That is why &lt;span style="font-style: italic; font-weight: bold;"&gt;ListTests&lt;/span&gt; is called a test context - it provides a context for a single test.&lt;br /&gt;&lt;br /&gt;It is common to define test context classes like &lt;span style="font-style: italic; font-weight: bold;"&gt;ListTests&lt;/span&gt; as subclasses of the class &lt;span style="font-weight: bold; font-style: italic;"&gt;TestContext&lt;/span&gt; defined by the Minitest framework, but that is not essential.  &lt;span style="font-style: italic; font-weight: bold;"&gt;TestContext&lt;/span&gt; provides useful methods like &lt;span style="font-weight: bold; font-style: italic;"&gt;deny:&lt;/span&gt;, so it is convenient to use it.  
However, what identifies &lt;span style="font-style: italic; font-weight: bold;"&gt;ListTests&lt;/span&gt; as a test context is the marker class method &lt;span style="font-weight: bold; font-style: italic;"&gt;TEST_CONTEXT&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;Minitest will do its work by examining the nested classes of the test module and seeing which are test contexts (that is, which have a class method named &lt;span style="font-style: italic; font-weight: bold;"&gt;TEST_CONTEXT&lt;/span&gt;). For each test context tc, Minitest will list all its test methods (the ones with names beginning with &lt;span style="font-weight: bold;"&gt;test&lt;/span&gt;) and for each of those, it will instantiate tc and call the selected method on it, gathering data on success or failure.&lt;br /&gt;&lt;br /&gt;Minitest does away with concepts like &lt;span style="font-style: italic; font-weight: bold;"&gt;TestResource&lt;/span&gt; that are typically used to hold data for tests.&lt;br /&gt;&lt;br /&gt;In the simple case above, the data for the test gets created by the instance initializer of &lt;span style="font-style: italic; font-weight: bold;"&gt;ListTests&lt;/span&gt;. However, what if the data for the test needs to be shared among multiple tests (say, because it is expensive to create)?&lt;br /&gt;&lt;br /&gt;As an example, suppose we want to test a compiler, and setting up the compiler is relatively costly.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;class CompilerTesting usingPlatform: platform&lt;br /&gt;                   minitest: minitest&lt;br /&gt;                   compilerClass: compilerClass = (&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;&lt;br /&gt;| Compiler = compilerClass. 
|&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt; )&lt;br /&gt;(&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;    class CompilerHolder = (&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;    &lt;br /&gt;      | compiler = Compiler configuredInAParticularWay. |&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;   &lt;br /&gt; )(&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;&lt;br /&gt;    class StatementsTests ( ...) (....): ( TEST_CONTEXT = ())&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;&lt;br /&gt;  )&lt;br /&gt;)&lt;/span&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Minitest leverages Newspeak’s nested structure in these cases. A test context (&lt;span style="font-weight: bold; font-style: italic;"&gt;StatementsTests&lt;/span&gt; above) does not have to be a direct nested class of the test module. Instead, we can nest it more deeply inside another nested class (&lt;span style="font-weight: bold;"&gt;&lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;CompilerHolder&lt;/span&gt;).  That nested class will serve to hold any state that we want to share among multiple tests - in our case, an instance of the compiler, which it will create and store as part of its initialization.&lt;br /&gt;&lt;br /&gt;As you can see, there is no need for a special &lt;span style="font-style: italic; font-weight: bold;"&gt;setUp&lt;/span&gt; method or a test resource class. Newspeak’s nesting structure and built-in instance initializers take care of all that. If the shared resource is just an object in memory, then it will also be disposed of via garbage collection after the test is run. Of course, some resources cannot simply be garbage collected. 
In that case, one should define a method named  &lt;span style="font-weight: bold; font-style: italic;"&gt;cleanUp&lt;/span&gt; in the test context class.&lt;br /&gt;&lt;br /&gt;As mentioned in the beginning of the post, we need a test configuration to run the tests, as the test module definition is always parametric with respect to any implementation that we would actually test.&lt;br /&gt;&lt;br /&gt;A test configuration module is defined by a top level class with the factory method&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;packageTestsUsing: ideNamespace&lt;/span&gt;&lt;br /&gt;The factory takes a namespace object that should provide access to the testing module declaration and to any concrete classes or objects we want to test. This arrangement is very similar to how we package applications from within the IDE.&lt;br /&gt;&lt;pre&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;    &lt;/span&gt;&lt;blockquote&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;class ListTestingConfiguration packageTestsUsing: ideNamespace = (&lt;/span&gt;   &lt;span style="font-weight: bold; font-style: italic;"&gt;&lt;br /&gt;|&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;    &lt;br /&gt;private ListTesting = ideNamespace ListTesting.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;private Collections = ideNamespace Collections.&lt;/span&gt;  &lt;span style="font-weight: bold; font-style: italic;"&gt;&lt;br /&gt;|&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt; )( ‘required’&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;    testModulesUsingPlatform: platform minitest: minitest = (&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;      &lt;br /&gt;     ^{&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;ListTesting usingPlatform: platform&lt;br 
/&gt;                    minitest: minitest&lt;br /&gt;                    listClass: (Collections usingPlatform: platform) LinkedList.&lt;/span&gt;  &lt;span style="font-weight: bold; font-style: italic;"&gt;    &lt;br /&gt;      }&lt;br /&gt; )&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;)&lt;/span&gt;&lt;/blockquote&gt;&lt;/pre&gt;&lt;br /&gt;The method &lt;span style="font-style: italic; font-weight: bold;"&gt;testModulesUsingPlatform:minitest:&lt;/span&gt;  must be provided by the configuration. It will be called by Minitest to produce a set of testing modules, each of which will be processed by the framework as outlined above (i.e., searched for test contexts to be run). In the example, only one test module is returned, but if we wanted to process multiple &lt;span style="font-style: italic; font-weight: bold;"&gt;List&lt;/span&gt; implementations (say &lt;span style="font-weight: bold; font-style: italic;"&gt;ArrayList&lt;/span&gt; as well as &lt;span style="font-weight: bold; font-style: italic;"&gt;LinkedList&lt;/span&gt;) we could write:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;    &lt;/span&gt;&lt;blockquote&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;class ListTestingConfiguration packageTestsUsing: ideNamespace = (&lt;/span&gt;   &lt;span style="font-weight: bold; font-style: italic;"&gt;&lt;br /&gt;|&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;    &lt;br /&gt;private ListTesting = ideNamespace ListTesting.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;private Collections = ideNamespace Collections.&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;&lt;br /&gt;|&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt; )( ‘required’&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;    testModulesUsingPlatform: 
platform minitest: minitest = (&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;&lt;br /&gt;     | collections = Collections usingPlatform: platform. |      &lt;br /&gt;     ^{&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;ListTesting usingPlatform: platform&lt;br /&gt;                    minitest: minitest&lt;br /&gt;                    listClass: collections LinkedList.&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;        &lt;br /&gt;        ListTesting usingPlatform: platform&lt;br /&gt;                    minitest: minitest&lt;br /&gt;                    listClass: collections ArrayList.&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;    &lt;br /&gt;      }&lt;br /&gt; )&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;)&lt;/span&gt;&lt;/blockquote&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The IDE recognizes test configurations based on the name of the factory method - that is, a class with a class method &lt;span style="font-style: italic; font-weight: bold;"&gt;packageTestsUsing:&lt;/span&gt; is considered a test configuration, and the IDE will provide a &lt;span style="font-weight: bold; color: rgb(51, 102, 255);"&gt;run tests&lt;/span&gt; link in the upper right hand corner of the class browser in that case, as shown in the screenshot below (click on it to enlarge).&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/-CPuX26CGxgI/TWxlAm_fn6I/AAAAAAAAAJ0/y0ia_B0Of_M/s1600/CollectionsTestConfiguration.png"&gt;&lt;img style="display: block; margin: 0px auto 10px; text-align: center; cursor: pointer; width: 400px; height: 223px;" src="http://4.bp.blogspot.com/-CPuX26CGxgI/TWxlAm_fn6I/AAAAAAAAAJ0/y0ia_B0Of_M/s400/CollectionsTestConfiguration.png" alt="" id="BLOGGER_PHOTO_ID_5578945099546468258" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Clicking on the link will 
call the &lt;span style="font-style: italic; font-weight: bold;"&gt;packageTestsUsing:&lt;/span&gt; method on the class with an argument representing the IDE’s namespace, and feed the results into Minitest.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/-2F-lo81mT78/TWxja680fvI/AAAAAAAAAJs/WdVbxTIirlA/s1600/CollectionsTestResults.png"&gt;&lt;img style="display: block; margin: 0px auto 10px; text-align: center; cursor: pointer; width: 400px; height: 185px;" src="http://2.bp.blogspot.com/-2F-lo81mT78/TWxja680fvI/AAAAAAAAAJs/WdVbxTIirlA/s400/CollectionsTestResults.png" alt="" id="BLOGGER_PHOTO_ID_5578943352557305586" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This is all you need to know to use Minitest. Actually, it’s considerably more than you need to know, as I’ve also explained a bit about how the framework goes about its business.&lt;br /&gt;&lt;br /&gt;It is worth noting how Minitest cleanly breaks down the multiple roles an SUnit &lt;span style="font-weight: bold; font-style: italic;"&gt;TestCase&lt;/span&gt; has.  The definition of a set of tests is done by a test context. The actual configuration is done by a test configuration.  And the actual command to run a specific test (the thing that should be called &lt;span style="font-weight: bold; font-style: italic;"&gt;TestCase&lt;/span&gt;) is not the user’s concern anymore - the test framework handles it but need not expose it.  In SUnit these three roles are conjoined. Perhaps this is why I never really felt comfortable with SUnit.&lt;br /&gt;&lt;br /&gt;Likewise, there is no need to worry over test resources and special setup methods. The net result is a framework that is very easy to use and simple to understand.&lt;br /&gt;&lt;br /&gt;It’s intriguing to note that one could actually structure a Java unit testing framework this way; we rely on introspection, interfaces and nested (inner) classes. 
However, it is not natural to do so in Java. Nested classes in Java are usually (and often rightly) regarded as a trap to be avoided. A design like Minitest is much more likely to crop up in a setting where nesting is idiomatic, like Newspeak. Language influences thought - or lack of thought, as the case may be.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-1347145938334382360?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/1347145938334382360/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=1347145938334382360' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/1347145938334382360'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/1347145938334382360'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2011/02/ministry-of-nesting-testing.html' title='The Ministry of Nesting &amp; Testing'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-CPuX26CGxgI/TWxlAm_fn6I/AAAAAAAAAJ0/y0ia_B0Of_M/s72-c/CollectionsTestConfiguration.png' height='72' 
width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-4046839287275397177</id><published>2011-01-23T14:31:00.000-08:00</published><updated>2011-01-23T15:08:07.483-08:00</updated><title type='text'>Maybe Monads Might Not Matter</title><content type='html'>This post isn’t really about the Maybe Monad of course. It is more focused on the State Monad, but I have a weakness for alliteration.&lt;br /&gt;&lt;br /&gt;What do&lt;br /&gt;&lt;br /&gt;space suits&lt;br /&gt;nuclear waste containers&lt;br /&gt;romantic conquests&lt;br /&gt;monsters&lt;br /&gt;macros&lt;br /&gt;containers&lt;br /&gt;conversations&lt;br /&gt;&lt;br /&gt;have in common? They’ve all been used as metaphors for monads.&lt;br /&gt;&lt;br /&gt;Last time I looked, the &lt;a href="http://www.haskell.org/haskellwiki/Monad_tutorials_timeline"&gt;Haskell wiki&lt;/a&gt; listed 29 tutorials on the subject, and that is where all these allusions come from.&lt;br /&gt;&lt;br /&gt;Such a wealth of explanatory fauna demands its own (meta-)explanation. Maybe monads are so wildly popular that there is a monadic gold rush to cash in on the monad education and training market. And yet, the long-awaited landmark tome, “Category Theory for Dummies in 21 days and 1001 nights” is nowhere to be found.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Tangent: There is, of course, &lt;a href="http://www.cis.upenn.edu/%7Ebcpierce/"&gt;Benjamin Pierce&lt;/a&gt;'s  &lt;a href="http://www.amazon.com/Category-Computer-Scientists-Foundations-Computing/dp/0262660717"&gt;delightfully slim book&lt;/a&gt; on the topic, which is as close to a gentle introduction as one can come.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Could it be that people just have a hard time understanding monads? If so,  what are the prospects of mass adoption? Or making Just(something) out of Nothing am I?&lt;br /&gt;&lt;br /&gt;By now you realize that if monads were a stock, I’d be shorting it.  
I’m going to go get myself in a huge amount of trouble now, just as I did when I took a hideously pragmatic tack on continuations some years ago.&lt;br /&gt;&lt;br /&gt;The most important practical contribution of monads in programming is, I believe, the fact that they provide a mechanism to interface pure functional programming to the impure dysfunctional world.&lt;br /&gt;&lt;br /&gt;The thing is, you don’t really need them for that. Just use actors. Purely functional actors can interact with the stateful world, and this has been known since before Haskell was even conceived.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Tangent: Before you crucify me for being so narrow-minded, pray consider the mitigating circumstance that I have used the words "practical" and "pure functional programming" in the same sentence. There are many who regard that, rather than my disrespectful attitude toward monads, as grounds for my institutionalization.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Some kind soul will doubtless point out to me how you can view actors as monads or some such. Be that as it may, it is beside the point. You can invent, build and most importantly, use, actors without ever mentioning monads. &lt;a href="http://knol.google.com/k/carl-hewitt-s-homepage-http-carlhewitt-info#"&gt;Carl Hewitt&lt;/a&gt; and his students did that decades ago.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Tangent: I have to say how amazing that is. Actors were first conceived by Hewitt  in 1973(!), and &lt;a href="http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&amp;amp;tid=9756"&gt;Gul Agha's thesis&lt;/a&gt; has been around for 25 years.  I firmly believe actors are the best answer to our concurrency problems, but that is for another post. &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;You can write an actor in a purely functional language, and have it send messages to file systems, databases or any other stateful actor. 
Because the messages are sent asynchronously, you never see the answer in the same activation (aka &lt;span style="font-style: italic;"&gt;turn&lt;/span&gt;) of the actor, so the fact that these actors are stateful and may give different answers to the same question at different times does not stain your precious snow white referential transparency with its vulgar impurity. This is pretty much what you do with a monad as well - you bury the stateful filth in a well marked shallow grave and whistle past it.&lt;br /&gt;&lt;br /&gt;Of course, your troubles are by no means over. Actors or monads, the state is out there and you will have to reason about it somewhere. But better you reason about it in a well bounded shallow grave than in C.&lt;br /&gt;&lt;br /&gt;What is important to me is that the notion of actors is intuitive (a pesky property of Dijkstra’s hated anthropomorphisms, like self) for many people. Yes, there are many varieties of actors and I have my preferences - but I’ll take any one of them over a sheaf of categories.&lt;br /&gt;&lt;br /&gt;Speaking of those preferences, look at the &lt;a href="http://erights.org/"&gt;E&lt;/a&gt; programming language (I often point at &lt;a href="http://research.google.com/pubs/author35958.html"&gt;Mark Miller&lt;/a&gt;’s &lt;a href="http://erights.org/talks/thesis/index.html"&gt;PhD thesis&lt;/a&gt;) or at &lt;a href="http://soft.vub.ac.be/amop/"&gt;AmbientTalk&lt;/a&gt;. I would like to have something similar in Newspeak (and in its hypothetical functional subsets, &lt;a href="http://gbracha.blogspot.com/2010/01/avarice-and-sloth.html"&gt;Avarice and Sloth&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;Of course, there is much to be said for a programming culture that excludes anyone without at least the potential of a PhD. Indeed, if you can surround yourself with such people, you can do amazing things with just Java, C++ and Python (though they will still be more productive if they have the good taste to use something nicer).  
So perhaps the true value of monads lies in their exclusionary nature.&lt;br /&gt;&lt;br /&gt;Nevertheless, there is more work to be done than some small, celebrated priesthood can or will do all by itself. There is real value in functional programming in some contexts, and it needs to integrate with stateful programming. Actors provide a model that is much easier for most humans to relate to.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-4046839287275397177?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/4046839287275397177/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=4046839287275397177' title='47 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/4046839287275397177'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/4046839287275397177'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2011/01/maybe-monads-might-not-matter.html' title='Maybe Monads Might Not Matter'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>47</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-6183408182308465326</id><published>2010-12-11T17:57:00.000-08:00</published><updated>2010-12-11T19:05:36.733-08:00</updated><title type='text'>Reflecting on Functional Programming</title><content type='html'>In this post, I wanted to make 
a case for reflection in the context of pure functional programming. I don’t know that pure functional languages should be different than other languages in this regard, but in practice they are: they generally do not have reflection support.&lt;br /&gt;&lt;br /&gt;To demonstrate the utility of reflection, I’m going to revisit one of my favorite examples, parser combinators. In particular, we’ll consider how to implement executable grammars. Executable grammars are a special flavor of a parser combinator library that allows semantic actions to be completely separated from the actual grammar.  I introduced executable grammars as part of the Newspeak project.&lt;br /&gt;&lt;br /&gt;Consider the following grammar:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;statement -&gt; ifStatement | returnStatement&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;ifStatement -&gt; ‘if’ expression ‘then’ expression ‘else’ expression&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;returnStatement -&gt; ‘return’ expression&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;expression -&gt; identifier | number&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;In Newspeak, we’d write:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;class&lt;/span&gt;&lt;span style="font-style: italic;"&gt; G = ExecutableGrammar ( |&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;(* lexical rules for identifier, number, keywords elided *)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;(* The actual syntactic grammar *)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    statement = ifStatement | returnStatement.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    ifStatement = if, expression, then, expression, else, expression.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    returnStatement = 
returnSymbol, expression.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    expression = identifier | number.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;|)()&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Now let’s define some semantic action, say, creating an AST.  The Newspeak library lets me do this in a subclass, by overriding the code for the production thus:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;class&lt;/span&gt;&lt;span style="font-style: italic;"&gt; P = G ()(&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    ifStatement = (&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;        super ifStatement wrap:[:if :e1 :then :e2 :else :e3 | &lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;            IfStatementAST if: e1 then: e2  else: e3&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;            ].&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    )&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    returnStatement = (&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;        super returnStatement wrap:[:return :e | ReturnStatementAST return: e].&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    )&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;)&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;No prior parser combinator library  allowed me to achieve a similar separation of grammar and semantic action. In particular, I don’t quite see how to accomplish this in a functional language. &lt;br /&gt;&lt;br /&gt;In the functional world, I would expect one function would define the actual grammar, and another would  perform the semantic actions (in our example, build the AST).  The latter function would transform the result of basic parsing as defined by the grammar, producing an AST as the result.  
We’d use pattern matching to define this function. I’d want to write something like:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;makeAST = &lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    &lt;span style="font-weight: bold;"&gt;fun&lt;/span&gt;  ifStatement(ifKw, e1, thenKw, e2, elseKw, e3) = &lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;        IfStatementAST(makeAST(e1), makeAST(e2), makeAST(e3)) |&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    &lt;span style="font-weight: bold;"&gt;fun&lt;/span&gt; returnStatement(returnKw, e) = ReturnStatementAST(makeAST(e)) |&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    &lt;span style="font-weight: bold;"&gt;fun&lt;/span&gt; identifier(id) = IdentifierAST(id)&lt;/span&gt; |&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    &lt;span style="font-weight: bold;"&gt;fun&lt;/span&gt; number(n) = NumberAST(n)&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;where &lt;span style="font-style: italic;"&gt;makeAST&lt;/span&gt; maps a concrete parse tree into an abstract one. Which in this case looks pretty easy.&lt;br /&gt;&lt;br /&gt;The question arises: where did the patterns &lt;span style="font-style: italic;"&gt;ifStatement&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;returnStatement&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;number&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;identifier&lt;/span&gt; come from?&lt;br /&gt;&lt;br /&gt;Presumably, our parser combinator library defined them based on our input grammar.  The thing is, the library does not know the specifics of our grammar in advance. It cannot predefine data constructors for each conceivable production. Instead, it should create these data constructors dynamically each time it processes a specific grammar.&lt;br /&gt;&lt;br /&gt;How does one create datatypes dynamically in a traditional functional language? 
I leave that as an exercise for the reader.&lt;br /&gt;&lt;br /&gt;Ok, so while it is clear that creating datatypes on the fly would be very helpful here, it is also clear that it isn’t easy to do in the context of such languages. How would you describe the type of the library? The datatype it returns is created per grammar, and depends on the names of the grammar production functions. Not easy to characterize via Hindley-Milner. And yet, once the library created the datatype, we actually could utilize it in writing  type safe clients.&lt;br /&gt;&lt;br /&gt;Instead, our library will probably generate values of some generic datatype for parse trees. A possible representation is a pair, consisting of a tag of type string representing the name of the production used to compute the tree, and a list consisting of the elements of the tree, including vital information such as where in the input stream a given token was found and what string exactly represented it.  We cannot elide such lexical information, because some users of our library will need it (say, pretty printers). 
Then I can write:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;makeAST = &lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;fun&lt;/span&gt;&lt;span style="font-style: italic;"&gt;  parsetree(“if”, [ifKw, e1, thenKw, e2, elseKw, e3]) = &lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;        IfStatementAST(makeAST(e1), makeAST(e2), makeAST(e3)) |&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;fun&lt;/span&gt;&lt;span style="font-style: italic;"&gt; parsetree(“return”, [returnKw, e]) = ReturnStatementAST(makeAST(e)) |&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;fun&lt;/span&gt;&lt;span style="font-style: italic;"&gt; parsetree(“id”,[id]) = IdentifierAST(id)&lt;/span&gt; |&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;fun&lt;/span&gt;&lt;span style="font-style: italic;"&gt; parsetree(“number”,[in]) = NumberAST(in)&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Obviously, we’ve lost the type safety of the previous version.   Ironically, the inability of the language to generate types dynamically forces code to be less statically type safe.&lt;br /&gt;&lt;br /&gt;Now ask yourself - how does our combinator library produce values of type &lt;span style="font-style: italic;"&gt;parsetree&lt;/span&gt; with an appropriate tag? For each &lt;span style="font-style: italic;"&gt;parsetree&lt;/span&gt; value &lt;span style="font-style: italic;"&gt;p(tag, elements)&lt;/span&gt;, the tag is a string corresponding to the name of the production that was used to compute &lt;span style="font-style: italic;"&gt;p&lt;/span&gt;. How does our library know this tag? 
The tag is naturally specified via the name of the production function in the grammar. To get at it, one would need some introspection mechanism to get the name of a function at run time. Of course, no such mechanism exists in a standard functional language.  It looks like you’d have to force the user to specify this information redundantly as a string, in addition to the function name (you still need the function name so that other productions can refer to it).&lt;br /&gt;&lt;br /&gt;You might argue that we don’t really need the string tags - just return a concrete parse tree and distinguish the cases by pattern matching.  However, it isn’t generally possible to tell the parse tree for a number from that for an identifier without re-parsing. Even when you can tell parse trees apart, the resulting code is ugly and inefficient, as it is repeating some of the parser’s work.&lt;br /&gt;&lt;br /&gt;We could approach the problem via staged execution, writing a meta-program that statically transformed the grammar into a program that would provide us with the nice datatype constructors I suggested in the beginning. If you go that route, you might as well define an external DSL based on BNF or PEGs.&lt;br /&gt;&lt;br /&gt;So, I assert that reflection is essential to this task, and dynamic type generation would be helpful as well, which would require dependent types and additional reflective functionality.  However, maybe I’ve overlooked something and there is some other way to achieve the same goal. 
I’m sure someone will tell me - but remember, the  library must not burden the user by requiring redundant information or work, it must operate independent of the specifics of a given grammar, and it must keep semantic actions entirely separate.&lt;br /&gt;&lt;br /&gt;In any case, I think there is considerable value in adding at least a measure of introspection, and preferably full reflection, to traditional functional languages, and  interesting work to be done fleshing it out.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-6183408182308465326?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/6183408182308465326/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=6183408182308465326' title='19 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/6183408182308465326'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/6183408182308465326'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2010/12/reflecting-on-functional-programming.html' title='Reflecting on Functional Programming'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>19</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-3627304551852394907</id><published>2010-07-31T15:47:00.000-07:00</published><updated>2010-08-04T19:27:06.503-07:00</updated><title 
type='text'>Meta Morphosis</title><content type='html'>Recently, I was pointed at &lt;a href="http://antimatter15.com/misc/rotatedgooglecss3.html"&gt;rotated Google&lt;/a&gt;. This is cool in a perverse sort of way, and it immediately reminded me of Morphic.&lt;br /&gt;&lt;br /&gt;For those who don’t know, Morphic is the name of the Squeak (and in earlier times, Self) GUI. John Maloney (who nowadays does &lt;a href="http://info.scratch.mit.edu/About_Scratch"&gt;Scratch&lt;/a&gt;) introduced the original Morphic GUI back in the halcyon days of Self, and later adapted it to Squeak Smalltalk. The latest incarnation of a Morphic-style UI is Dan Ingalls’ &lt;a href="http://www.lively-kernel.org/"&gt;lively kernel&lt;/a&gt;, which adapted the ideas to JavaScript and the web. You can &lt;a href="http://www.lively-kernel.org/repository/lively-wiki/example.xhtml"&gt;check it out in your browser&lt;/a&gt; right now.&lt;br /&gt;&lt;br /&gt;What makes Morphic interesting is that it is compositional. The basic building block is a &lt;span style="font-weight: bold;"&gt;morph&lt;/span&gt;, which is just a graphical entity. The key is that everything in Morphic is a morph - including not just the basic morphs like lines and curves, polygons, circles, ellipses but also text, buttons, lists, windows ... you name it.&lt;br /&gt;&lt;br /&gt;All morphs support pretty general graphical combinators - translation, rotation, scaling, non-linear warping, changing color,  grouping/ungrouping etc. It follows that one can interactively rotate, scale or non-linearly warp an entire window running a live application.&lt;br /&gt;&lt;br /&gt;One of my favorite Squeak demos is a class browser that’s been animated so that it floats around the screen, rotating as it goes, coupled with sound effects (a croaking frog is my preference).  Of course you can keep using the browser and add methods or remove instance variables on the fly while it’s doing that.  
It’s an amazing display of the power of compositionality in action. It’s also perfectly useless (like rotated Google).&lt;br /&gt;&lt;br /&gt;When running Morphic, you can always interactively ungroup a composite morph and get at its pieces. So you can disassemble the UI and find out what it’s made of. You can also do the opposite and assemble a UI out of simpler morphs; in a sense, &lt;span style="font-weight: bold; font-style: italic;"&gt;the GUI is the GUI builder&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;The situation is quite analogous to the physical world. A real window (the kind used to let light into your house) is assembled from physical pieces, and can be disassembled as well. The window as a whole, and each of its components, can be manipulated in space in uniform ways.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Thankfully, the laws of physics are compositional, since they were not designed by software engineers on a standards committee.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Put another way, if the universe were built like most software, it would have crashed long ago; the big bang would have a different meaning.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;As a demonstration of good computer science, Morphic is brilliant. However, as a working UI it is problematic. You don’t really want your windows to fall apart in the user’s hands because they accidentally pressed some control sequence.&lt;br /&gt;&lt;br /&gt;Looking at how physical windows work, we see that when they are assembled, they are secured so they are not disassembled too easily.  Things are held together with glue or screws or whatever, and you need to make an effort to take the structure apart, perhaps using special tools.&lt;br /&gt;&lt;br /&gt;This points at the way morphic interfaces should evolve. It’s great to have the underlying flexibility that they give you, but we want mechanisms to prevent accidents. 
We don’t want our applications decomposing by mistake. We also don’t want loose windows rotating by mistake. We need the equivalent of screws to hold things in place. The nice thing about screws is that they can be used to build things up from parts compositionally, and they can be unscrewed when necessary. That way, we can take advantage of the flexibility of the underlying framework and do cool things with it, while keeping it safe for the end-user.&lt;br /&gt;&lt;br /&gt;As rotating Google and (more significantly) Lively show, the web opens up the possibility of such UIs reaching a broad audience.  I am sure we will get versions of morphic that are more refined, usable, attractive and polished  - all less than three decades since they were introduced in Self. Instant progress!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-3627304551852394907?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/3627304551852394907/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=3627304551852394907' title='11 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/3627304551852394907'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/3627304551852394907'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2010/07/meta-morphosis.html' title='Meta Morphosis'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>11</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-9149510859663683241</id><published>2010-07-11T11:09:00.000-07:00</published><updated>2010-07-11T11:20:01.702-07:00</updated><title type='text'>Converting Smalltalk to Newspeak</title><content type='html'>One of the things that has surprised me working with Newspeak is how easy it is to convert Smalltalk code to Newspeak.  We have converted most of  the core libraries of Smalltalk (aka ‘The Blue Book Libraries’) into Newspeak. Most of the conversion was done by people who had never coded in Newspeak before. In some cases, they had never coded in Smalltalk before.&lt;br /&gt;&lt;br /&gt;&lt;div&gt;The syntactic differences are fairly trivial (though they will grow in time) and easily handled - either automatically by tools or by very simple transformations in a text editor. Semantically, moving from Smalltalk to Newspeak is easy (see this document for a detailed discussion of the mechanics), but the opposite is harder.&lt;div&gt;&lt;br /&gt;The main issues one has are API differences between libraries; but this is no different than converting from one Smalltalk implementation to another.&lt;div&gt;&lt;br /&gt;The converted code may not be the most idiomatic Newspeak. The easiest conversion path sticks a bunch of related Smalltalk classes inside a Newspeak module class, without creating any interesting substructure. Nevertheless, the converted code is invariably better than the original. It becomes very clear what the code’s external dependencies are, what the module boundaries should be, who is responsible for initialization etc.  There is no longer any static state.  It’s much easier to tie libraries together (or tear them apart), test them independently and so forth. Once we start enforcing access control, it should be much clearer what the public API really is. 
All this without losing the flexibility and power of the original.&lt;div&gt;&lt;br /&gt;This is very encouraging, because it means Newspeakers can take advantage of the large body of Smalltalk code that already exists. Any useful Smalltalk library that is publicly available can be converted to Newspeak in pretty short order. It also means that Smalltalkers can take the plunge and migrate to Newspeak relatively easily.&lt;div&gt;&lt;br /&gt;Of course, I don’t expect all the world’s Smalltalkers to instantly convert to Newspeak.   Even if they did, Newspeak would still be a niche language.  For Newspeak to be a success,  we need to reach out to programmers in a variety of communities.  This post, however, is aimed squarely at the Smalltalk community. &lt;div&gt;&lt;br /&gt;There are no doubt many Smalltalk projects that are tied to a specific Smalltalk. There are also Smalltalkers who are too conservative, and don’t want to deal with any language changes.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Still, if you are (or were, in some happier time) a Smalltalker and want to move into the future rather than dwelling on the glorious past, I assert that Newspeak is for you.  If you are using an open source Smalltalk, it is likely you could do better using Newspeak. Newspeak explicitly addresses Smalltalk’s weaknesses: modularity, security, interoperability. Of course, some people aren’t bothered by these weaknesses. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;&lt;b&gt;Tangent:&lt;/b&gt; Arguably, the Smalltalk community is, by natural selection, composed largely of such people; if they were bothered by Smalltalk’s deficiencies, they wouldn’t use it.&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;Newspeak should interest those who appreciate the power of Smalltalk but want to move forward.&lt;div&gt;&lt;br /&gt;Of course, you have to be an early adopter by nature. Things will evolve and change under your feet. 
The syntax will become less Smalltalk-ish over time. Most importantly, you’ll have to learn something new. This is good for your neurons. You may have to port some libraries you rely on. You may have to make changes so that you do not rely on libraries you wouldn’t want to port (like Morphic). However, the result will likely be a product that is easier to deploy, more visually appealing and better integrated with its surrounding environment. Your code will be much more maintainable and better structured.&lt;br /&gt;&lt;br /&gt;I’ve written a simple &lt;a href="http://bit.ly/9kiZxr"&gt;guide that highlights how to go about converting Smalltalk to Newspeak&lt;/a&gt;. Give it a try!&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-9149510859663683241?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/9149510859663683241/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=9149510859663683241' title='15 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/9149510859663683241'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/9149510859663683241'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2010/07/converting-smalltalk-to-newspeak.html' title='Converting Smalltalk to Newspeak'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>15</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-6078982268712772069</id><published>2010-06-19T05:33:00.000-07:00</published><updated>2010-06-19T10:47:31.027-07:00</updated><title type='text'>Patterns as Objects in Newspeak</title><content type='html'>I now want to show, concretely, how pattern matching works in the Newspeak extension I mentioned in an earlier post. &lt;span style="font-weight: bold;"&gt;The material from here on is more or less stolen from Felix Geller’s thesis&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;Here are some simple pattern literals&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;&amp;lt;1&amp;gt; // matches 1&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&amp;lt;‘a’&amp;gt; // matches ‘a’&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&amp;lt _&amp;gt // matches anything&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Each of these is simply a sugared syntax for an instance of class &lt;span style="font-weight: bold;"&gt;Pattern&lt;/span&gt; (or some subclass thereof). &lt;span style="font-weight: bold;"&gt;Pattern&lt;/span&gt; supports a protocol for matching. When a pattern is asked to match an object, it invokes the object’s &lt;span style="font-weight: bold;"&gt;match:&lt;/span&gt; method, sending itself (the pattern) as the argument. It’s then up to the &lt;span style="font-weight: bold;"&gt;match:&lt;/span&gt; method to decide if the pattern is something that matches the object.  
This is somewhat similar to how extractors work in Scala; you should be able to see that such a protocol preserves data abstraction.&lt;br /&gt;&lt;br /&gt;Any class can implement matching logic as needed, just as any class can declare support for an interface in a statically typed setting.&lt;br /&gt;&lt;br /&gt;Patterns support methods that correspond to combinators, such as&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;p1 | p2 // matches anything that either p1 or p2 matches&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;p1 &amp;amp; p2 // matches whatever both p1 and p2 match&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;p not // matches if p doesn’t&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;p =&gt; actionBlock // matches if p matches; if so, execute actionBlock&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;As a simple example&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;fib: n = (&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    n case: &amp;lt 1&amp;gt | &amp;lt 2&amp;gt =&gt; [^n-1]&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;       otherwise: [^(fib: n-1) + (fib: n-2)]&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The method &lt;span style="font-weight: bold;"&gt;case:otherwise:&lt;/span&gt; is defined in &lt;span style="font-weight: bold;"&gt;Object&lt;/span&gt;. It takes a pattern as its first argument, and a closure as its second. If the pattern does not match the receiver, the closure is invoked.&lt;br /&gt;&lt;br /&gt;In our example &lt;span style="font-weight: bold;"&gt;&amp;lt 1 &amp;gt | &amp;lt 2 &amp;gt &lt;/span&gt; matches 1 or 2, as you expect. 
&lt;span style="font-weight: bold;"&gt;&amp;lt 1 &amp;gt | &amp;lt 2 &amp;gt =&gt; [^n-1]&lt;/span&gt; matches 1 or 2 as well; if the match succeeds, it invokes the closure &lt;span style="font-weight: bold;"&gt;[^n-1]&lt;/span&gt;. Evaluating the closure &lt;span style="font-weight: bold;"&gt;[^n-1]&lt;/span&gt; will cause the enclosing &lt;span style="font-weight: bold;"&gt;fib:&lt;/span&gt; method to return with the result &lt;span style="font-weight: bold;"&gt;n-1&lt;/span&gt;. So &lt;span style="font-weight: bold;"&gt;fib: 1&lt;/span&gt; yields 0, and &lt;span style="font-weight: bold;"&gt;fib: 2&lt;/span&gt; yields 1, as expected. For &lt;span style="font-weight: bold;"&gt;n&lt;/span&gt; &gt; 2,  &lt;span style="font-weight: bold;"&gt;fib: n&lt;/span&gt;  yields &lt;span style="font-weight: bold;"&gt;(fib: n-1) + (fib: n-2)&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Note:&lt;/span&gt; many of the examples below are derived from the Scala &lt;a href="http://lamp.epfl.ch/%7Eemir/written/MatchingObjectsWithPatterns-TR.pdf"&gt;extractor paper&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Here are some other pattern literals, known as &lt;span style="font-style: italic; font-weight: bold;"&gt;keyword patterns&lt;/span&gt;:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&amp;lt num: 1&amp;gt // a keyword pattern (user-defined)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&amp;lt multiply: left by: &amp;lt num: 1&amp;gt&amp;gt // nested patterns - this is where you win over visitors&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;A keyword pattern literal evaluates to an instance of  &lt;span style="font-weight: bold;"&gt;KeywordPattern&lt;/span&gt;, a subclass of &lt;span style="font-weight: bold;"&gt;Pattern&lt;/span&gt;. 
A keyword pattern literal specifies a method name (such as &lt;span style="font-weight: bold;"&gt;num:&lt;/span&gt; in the first example, and &lt;span style="font-weight: bold;"&gt;multiply:by:&lt;/span&gt; in the second); the resulting keyword pattern object supports a method of the same name.  For example &lt;span style="font-weight: bold;"&gt;&amp;lt num: 1&amp;gt &lt;/span&gt; &lt;span style="font-weight: bold;"&gt;&lt;/span&gt; responds to &lt;span style="font-weight: bold;"&gt;num:&lt;/span&gt;. What this method does is check if its argument matches the argument specified in the pattern literal - in our example, checking that it equals 1.  If the pattern specifies a nested pattern literal as an argument, the method matches the argument against that pattern recursively.&lt;br /&gt;&lt;br /&gt;An object that wants to match &lt;span style="font-weight: bold;"&gt;&lt;/span&gt; &lt;span style="font-weight: bold;"&gt;&amp;lt num: 1&amp;gt&lt;/span&gt; will define its &lt;span style="font-weight: bold;"&gt;match:&lt;/span&gt; method thus:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;match: p = (^p num: 1)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Similarly,  &lt;span style="font-weight: bold;"&gt;&amp;lt multiply:&amp;lt _&amp;gt by: 1&amp;gt &lt;/span&gt; supports a method &lt;span style="font-weight: bold;"&gt;multiply:by:&lt;/span&gt; that tests its arguments to see if they match the pattern. In this example, the first argument of &lt;span style="font-weight: bold;"&gt;multiply:by:&lt;/span&gt; would be recursively matched against the nested pattern &lt;span style="font-weight: bold;"&gt;&amp;lt _&amp;gt&lt;/span&gt; (which always succeeds) and the second argument would be tested for equality against 1.&lt;br /&gt;&lt;br /&gt;Imagine a class hierarchy for arithmetic expressions.  There are several kinds of terms: numbers, products of terms, and likely other things, like variables. 
Assume numbers match patterns of the form &lt;span style="font-weight: bold;"&gt;&amp;lt num: n&amp;gt &lt;/span&gt; for some &lt;span style="font-weight: bold;"&gt;n&lt;/span&gt;, and assume products are represented using a class &lt;span style="font-weight: bold;"&gt;Product&lt;/span&gt; with two slots for the product term’s subtrees, &lt;span style="font-weight: bold;"&gt;operand1&lt;/span&gt; and &lt;span style="font-weight: bold;"&gt;operand2&lt;/span&gt;. Instances of &lt;span style="font-weight: bold;"&gt;Product&lt;/span&gt; will match patterns of the form &lt;span style="font-weight: bold;"&gt;&amp;lt multiply: x by: y &amp;gt&lt;/span&gt; &lt;span style="font-weight: bold;"&gt;&lt;/span&gt; for some &lt;span style="font-weight: bold;"&gt;x&lt;/span&gt; and &lt;span style="font-weight: bold;"&gt;y&lt;/span&gt;. The &lt;span style="font-weight: bold;"&gt;match:&lt;/span&gt; method for &lt;span style="font-weight: bold;"&gt;Product&lt;/span&gt; can be written as&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;match: pat = (&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    ^pat multiply: operand1 by: operand2&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The result of a match is a &lt;span style="font-weight: bold; font-style: italic;"&gt;binding&lt;/span&gt;. 
This is an object that can tell you what the original object being matched was and how it matched - what values are associated with the various names specified in the pattern.&lt;br /&gt;&lt;br /&gt;We’ll use pattern matching to define a method &lt;span style="font-weight: bold;"&gt;simplify:&lt;/span&gt; that transforms product terms of the form &lt;span style="font-style: italic;"&gt;X*1&lt;/span&gt; to &lt;span style="font-style: italic;"&gt;X&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;simplify: expr = ( &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;^expr case: &amp;lt multiply: ?x by: &amp;lt num: 1&amp;gt&amp;gt =&gt; [x] &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;otherwise: [expr].&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;If &lt;span style="font-weight: bold;"&gt;expr&lt;/span&gt; is a product of some term &lt;span style="font-weight: bold;"&gt;t&lt;/span&gt; and the number 1,  &lt;span style="font-weight: bold;"&gt;simplify:&lt;/span&gt; will return &lt;span style="font-weight: bold;"&gt;t&lt;/span&gt;, otherwise it will return &lt;span style="font-weight: bold;"&gt;expr&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Note:&lt;/span&gt; The syntax &lt;span style="font-weight: bold;"&gt;?id&lt;/span&gt; is allowed inside pattern literals and will make the corresponding matched value available under the name &lt;span style="font-weight: bold;"&gt;id&lt;/span&gt; in other parts of the pattern. How? Read &lt;a href="http://bit.ly/cvMRBQ"&gt;the tech report&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;How does &lt;span style="font-weight: bold;"&gt;x&lt;/span&gt; end up available in the closure? Well, the &lt;span style="font-weight: bold;"&gt;=&gt;&lt;/span&gt; combinator manipulates the scope  of the closure to ensure that the desired accessors are available to it. 
Groovy programmers will recognize this idea, as will Smalltalkers familiar with GLORP.  If it turns your stomach - well, I had reservations at first too, but all power corrupts, and dynamic languages give you a lot of power :-). This kind of trick must be used with great restraint. In Newspeak, the object capability model is intended to give us control over who can access the required reflective capabilities, so it is not a free-for-all.&lt;br /&gt;&lt;br /&gt;The only language extension needed here is pattern literals. All the rest is library code. We could dispense with the language extension entirely and just use the library, but things would be slightly more awkward - say&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Pattern multiply: (Pattern variable: #x) by:  (Pattern num: 1) &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;instead of&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&amp;lt multiply: ?x by:&amp;lt num: 1&amp;gt&amp;gt &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;In principle, you could add such a library to a mainstream language such as Java. Of course, the typechecking, the absence of &lt;span style="font-weight: bold;"&gt;doesNotUnderstand:&lt;/span&gt;, the inability to add methods dynamically etc. would be crippling to the point where you wouldn’t really want to pursue the idea.&lt;br /&gt;&lt;br /&gt;There is a lot of potential for refinement here. Patterns can be used as first class queries in LINQ-like APIs that connect to databases or Prolog-style rule engines, for example.&lt;br /&gt;&lt;br /&gt;Check out &lt;a href="http://bit.ly/cvMRBQ"&gt;Felix’s work&lt;/a&gt; if you want to understand it all.  
Or wait for the updated Newspeak documentation that will accompany the next release.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-6078982268712772069?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/6078982268712772069/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=6078982268712772069' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/6078982268712772069'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/6078982268712772069'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2010/06/patterns-as-objects-in-newspeak.html' title='Patterns as Objects in Newspeak'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-8427279275091527485</id><published>2010-06-03T20:43:00.000-07:00</published><updated>2010-06-03T21:10:44.740-07:00</updated><title type='text'>A Nest of Classes</title><content type='html'>Nested classes, like classes and OO in general, come to us from Scandinavia. The language Beta introduced nested classes in a stunningly elegant way; Java popularized nested classes in a rather different way; and Newspeak builds on them pervasively, in a manner similar to Beta, but still distinct. 
In this post, I want to explore these variations on nested classes.&lt;br /&gt;&lt;br /&gt;Traditional OO is built on classes.  We note an obvious property:&lt;br /&gt;&lt;br /&gt;(1) Classes contain methods.&lt;br /&gt;&lt;br /&gt;Beta’s core idea is something they call a &lt;span style="font-weight: bold; font-style: italic;"&gt;pattern&lt;/span&gt;. A pattern in this context has nothing to do with pattern matching as we know it in functional programming. Instead, a pattern unifies classes and methods (and types and functions and procedures, but never mind all that).&lt;br /&gt;&lt;br /&gt;Since class = pattern = method, we can perform some substitutions on the remark given above&lt;br /&gt;&lt;br /&gt;(2) Patterns contain patterns.&lt;br /&gt;&lt;br /&gt;This is true in Beta, even if reading this right now, you aren’t clear what it means. Let’s do a few other substitutions of equals for equals:&lt;br /&gt;&lt;br /&gt;(3) Classes contain classes.&lt;br /&gt;(4) Methods contain classes.&lt;br /&gt;(5) Methods contain methods.&lt;br /&gt;&lt;br /&gt;Item (3) gives us member classes (using Java terminology); (4) gives us local classes; and (5) gives us nested procedures, like Pascal used to have.&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;The nice thing is that all of these features come for free! This is the power of composition. Furthermore, they are all governed by consistent rules, because they fall out of the same definitions.&lt;br /&gt;&lt;br /&gt;One can push this even further - the language gBeta unifies patterns with mixins as well.&lt;br /&gt;&lt;br /&gt;But wait - isn’t this going a bit too far? After all, there is abundant evidence suggesting that the distinction between nouns and verbs, or objects and procedures, is fundamental to human cognition. 
I know some very good (former) Beta programmers who confess that the uniformity can be confusing.&lt;br /&gt;&lt;br /&gt;That is why Newspeak doesn’t unify classes and methods this way. However, thinking about the unification is valuable even if you don’t go through with it. It will help you produce a consistent and flexible result.&lt;br /&gt;&lt;br /&gt;To be fair, there is a school of thought that argues that different things must be kept very different, and that they can then be specialized so that they are finely tuned to a specific need.  The Modula-3 report made this point nicely with the following quote:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Look into any carpenter's tool-bag and see how many different hammers, chisels, planes and screw-drivers he keeps there - not for ostentation or luxury, but for different sorts of jobs.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt; - Robert Graves and Alan Hodge&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Language design isn't carpentry, however.&lt;br /&gt;&lt;br /&gt;Beta’s ideas served as inspiration for Java’s nested classes.  However,  Java is a language in the mainstream tradition, with a philosophy closer to Modula than to Beta. When nested classes were being added to Java, Java already had distinct concepts for classes, methods and variables, and even distinct namespaces for them. This makes it hard to ensure that the rules are uniform and consistent. One tries; some of the smartest people I’ve ever met recommend building tables and cross checking systematically - but really, it is incredibly difficult.&lt;br /&gt;&lt;br /&gt;Hence, the rules for nested classes are quite different from those for methods nested in classes for example. Consider the effect of a modifier such as &lt;span style="font-weight: bold;"&gt;final&lt;/span&gt;. 
A &lt;span style="font-weight: bold;"&gt;final&lt;/span&gt; class means one thing (a class that may not be subclassed), a &lt;span style="font-weight: bold;"&gt;final&lt;/span&gt; variable another (an immutable variable), and a &lt;span style="font-weight: bold;"&gt;final&lt;/span&gt; method yet another (a method that may not be overridden).&lt;br /&gt;&lt;br /&gt;One of the most important discrepancies has to do with the instance vs. static distinction. A static variable is a property of a class, but an instance variable is a property of a specific instance. Conceptually, this is true of methods as well - instance methods are considered a property of an object. That is why they must be dynamically bound - two objects might have different methods. Therein lies the power of object-oriented programming.  However, in Java, a nested class is never a property of an individual object. This is different from Beta (and Newspeak) and carries major implications.&lt;br /&gt;&lt;br /&gt;If a class is a property of an object, then virtual classes arise naturally. Furthermore, the power of polymorphism applies to classes as well. Since we can abstract over objects, we can abstract over their members; those members are typically methods, which is why object oriented and functional programming are not as different as some would make them out to be. If the members of an object (rather than a class) can be classes, we can abstract over classes. This can help avoid the difficulties that dependency injection frameworks try so awkwardly to address. We’ve realized these benefits in Newspeak.&lt;br /&gt;&lt;br /&gt;Altogether, lack of uniformity prevents a language’s constructs from composing together easily to produce an exponential takeoff in expressivity. Instead, the rules themselves compose, yielding an exponential takeoff in interactions between them. 
That is why simplicity is so crucial in language design.&lt;br /&gt;&lt;br /&gt;Newspeak obtains its power not from unifying classes and methods, but from unifying the mechanisms by which they are referenced.  Names are always treated the same - as properties of objects, which are therefore late bound and subject to override by a single set of rules.&lt;br /&gt;&lt;br /&gt;a. A name refers to the nearest lexically enclosing declaration of that name, if there is one; otherwise, it refers to a declaration inherited from the immediately surrounding class’s superclass.&lt;br /&gt;&lt;br /&gt;b. Every declaration is subject to override; if you override a declaration in a subclass, it takes effect wherever the overridden declaration is used.&lt;br /&gt;&lt;br /&gt;That’s it. Note that for a name defined by an enclosing class to be visible, it must be &lt;span style="font-weight: bold;"&gt;declared&lt;/span&gt; in the lexically surrounding environment; it isn’t enough for it to be inherited. You must be able to see the declaration in the surrounding classes. If you don’t see it, it isn’t there.&lt;br /&gt;&lt;br /&gt;This is deliberate, and different from the rules you may know from Java (assuming you ever figured out what those really are) or even Beta. One advantage of this rule is that you cannot capture a name referred to in a nested class when you add a member to a superclass.&lt;br /&gt;&lt;br /&gt;From these very simple rules (and the lack of a global namespace) we get virtual classes, mixins, class hierarchy inheritance, and powerful modularity, as I’ve described in a number of forums.&lt;br /&gt;&lt;br /&gt;There is the small issue of specifying this behavior precisely, and beyond that, the small matter of making it work. These are not totally trivial. The spec describes the rules. As for the implementation, I plan to describe it in a paper; in the meantime, the source is out there. 
You want to look at changes we’ve made to the Squeak Interpreter, in particular the &lt;span style="font-style: italic;"&gt;pushImplicitReceiver&lt;/span&gt; byte code as implemented in Squeak’s &lt;span style="font-weight: bold;"&gt;Interpreter&lt;/span&gt; class.&lt;br /&gt;&lt;br /&gt;I hope I have convinced you that nested classes are at once simpler and more powerful than you might have imagined. As always, keeping language design simple results in a more powerful language.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-8427279275091527485?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/8427279275091527485/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=8427279275091527485' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/8427279275091527485'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/8427279275091527485'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2010/06/nest-of-classes.html' title='A Nest of Classes'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-5195174296060658187</id><published>2010-05-12T18:23:00.000-07:00</published><updated>2010-05-12T18:30:45.422-07:00</updated><title type='text'>Patterns of Dynamic 
Typechecking</title><content type='html'>Dynamic type tests are ubiquitous in programming. Regardless of whether the programming language is statically or dynamically typed, object-oriented or functional, the need to establish what sort of value you are dealing with at run time is always there.&lt;br /&gt;&lt;br /&gt;In object oriented languages constructs like Java’s &lt;span style="font-weight: bold;"&gt;instanceof&lt;/span&gt; (or C# &lt;span style="font-weight: bold;"&gt;is&lt;/span&gt;, or Smalltalk &lt;span style="font-weight: bold;"&gt;isKindOf:&lt;/span&gt;) serve this need - as do casts.&lt;br /&gt;&lt;br /&gt;Some functional languages put a large emphasis on being “completely statically typed”. In fact, these languages rely heavily on dynamic typechecking in the guise of pattern matching. Somewhere in the middle lie constructs like &lt;span style="font-weight: bold;"&gt;typecase&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;In most languages, constructs for dynamic typechecking undermine data abstraction.  They allow one to test for the implementation type of a value.  Checking whether an object is an instance of a particular class, or pattern matching against a specific datatype constructor suffer from this problem. In contrast, testing whether a value supports an interface is harmless. Dynamically typed languages don’t support that however.&lt;br /&gt;&lt;br /&gt;An old trick is to add methods that support this kind of type query. Where you might, in Java, define a class &lt;span style="font-weight: bold;"&gt;A&lt;/span&gt; that implements an interface &lt;span style="font-weight: bold;"&gt;T&lt;/span&gt;, you would simply define a method &lt;span style="font-weight: bold;"&gt;isT&lt;/span&gt; on &lt;span style="font-weight: bold;"&gt;A&lt;/span&gt;, that returns true. The problem is that you then need to define a version of &lt;span style="font-weight: bold;"&gt;isT&lt;/span&gt; that returns false for all other types that you might possibly encounter. 
To be safe, you’d add &lt;span style="font-weight: bold;"&gt;isT&lt;/span&gt; to &lt;span style="font-weight: bold;"&gt;Object&lt;/span&gt;. This is monkey patching of the highest order, and readers of this blog know my view “People don’t let Primitive Primates Program”.&lt;br /&gt;&lt;br /&gt;However, there is an out. Define the default version of &lt;span style="font-weight: bold;"&gt;doesNotUnderstand:&lt;/span&gt; to check if the method name is of the form &lt;span style="font-weight: bold;"&gt;isX&lt;/span&gt;, where &lt;span style="font-weight: bold;"&gt;X&lt;/span&gt; has the form of a type identifier. If so, return false. Now all you need to do is define &lt;span style="font-weight: bold;"&gt;isT&lt;/span&gt; for the classes that actually support the &lt;span style="font-weight: bold;"&gt;T&lt;/span&gt; interface. Credit to Peter Ahe for proposing this. It’s in the current Newspeak implementation.&lt;br /&gt;&lt;br /&gt;Pattern matching is much more general than &lt;span style="font-weight: bold;"&gt;isT&lt;/span&gt; however; using &lt;span style="font-weight: bold;"&gt;isT&lt;/span&gt; is equivalent to a simple nullary pattern - except for this annoying issue with data abstraction.&lt;br /&gt;&lt;br /&gt;Recently, we’ve seen approaches that overcome the data abstraction problem: Scala and F# introduce pattern matching mechanisms that allow you to preserve data abstraction. It was Scala’s influence, combined with demand from users at Cadence, that prompted me to examine the idea of supporting pattern matching in Newspeak.&lt;br /&gt;&lt;br /&gt;One can’t just go graft a pattern matching construct on to Newspeak however. The only operations allowed in Newspeak are message sends. If we can’t express the matching constructs via messages, they cannot go into the language. This led me to the idea of pattern literals: a special syntax for patterns, that denotes a pattern object. 
We already have special syntax for number objects, string objects,  closure objects etc., so adding pattern objects fits in quite naturally. The nice thing about such an approach is that patterns are first class values. Patterns can be abstracted over in methods, stored in slots and data structures, serialized to disk or what have you. You can’t do that with a pattern in ML or Haskell.&lt;br /&gt;&lt;br /&gt;So what exactly would pattern matching in Newspeak look like? Well, Felix Geller has recently completed his master’s thesis at HPI Potsdam on just that question. Felix has implemented an experimental language extension supporting pattern literals and integrated it into the Newspeak IDE. The new language features should be available in the next Newspeak refresh (coming to a computer near you, summer 2010).&lt;br /&gt;&lt;br /&gt;In the new extension (NS3) patterns are matched against objects via a protocol very similar to Scala’s extractors, thereby preserving data abstraction. However, there is no special construct for pattern matching. Instead, complex patterns can be composed out of simpler ones using combinators.&lt;br /&gt;&lt;br /&gt;Of course, a quick scan of the literature shows that the functional community got there first. Mark Tullsen proposed first class patterns with combinators for Haskell a decade ago, and there are several other papers dealing with the idea. However, having a proposal does not guarantee that a feature is in fact included in a language; in practice, none of the popular functional languages supports first class patterns.&lt;br /&gt;&lt;br /&gt;First class patterns combined with data abstraction are a potent combination. We mean to use them as first class queries that can be sent to objects of various sorts - ordinary in-memory collections, or databases, or logic programs. This is reminiscent of LINQ, except that the queries are first class so they can be abstracted over. 
You can write code that takes queries as parameters, accepts queries as input from the user, serializes and deserializes queries to persistent storage (allowing you to do meta-queries) and so on. And the beauty of a language like Newspeak is that all this requires but a fraction of the machinery you need in a mainstream setting.&lt;br /&gt;&lt;br /&gt;Felix’s thesis will soon be available as a technical report. I am reluctant to steal his thunder before that, but once the report is out, I'll put out a follow up post that illustrates how this works and why it is neat.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-5195174296060658187?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/5195174296060658187/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=5195174296060658187' title='17 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5195174296060658187'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5195174296060658187'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2010/05/patterns-of-dynamic-typechecking.html' title='Patterns of Dynamic Typechecking'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>17</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-5564390686079525730</id><published>2010-04-02T14:04:00.000-07:00</published><updated>2010-04-02T15:07:21.669-07:00</updated><title type='text'>The Brave New World of Full Service Computing</title><content type='html'>I’ve been trying to explain &lt;a href="http://gbracha.blogspot.com/2007/03/sobs.html"&gt;this&lt;/a&gt; for some six years with little success, but I’ll try again.  It may be easier this time.&lt;br /&gt;&lt;br /&gt;For the past generation, we have been living in a world of self-service computing, more commonly known as personal computing. The person using/owning a personal/self service computer is responsible for managing that computer themselves. All the digital chores - installing and updating software, preventing malware infection, backing up storage etc. - are the user’s responsibility. Most users hate that.&lt;br /&gt;&lt;br /&gt;Worse, most users really can’t handle these chores at all.  They run ancient versions of applications because they are deathly afraid to tamper with anything that somehow works. Their computers are warrens of digital disease, crawling with viruses. Their data is rarely backed up properly.&lt;br /&gt;&lt;br /&gt;The evolution of network technology enables a new model: full service computing.  Full service computing means that you abdicate a level of fine control over the computer to a service that handles your digital housekeeping.  
I discussed some implications of this in a &lt;a href="http://gbracha.blogspot.com/2010/01/closing-frontier.html"&gt;post back in January&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;You still choose applications that you want - but you must get them from the service; the service installs these applications for you; it keeps them up to date as long as you use the service; and it ensures that they are safe to use. The service also backs up your data automagically, allowing you to easily search through old versions as needed. In the extreme case, the service actually provides you with the computers you use. That helps it ensure seamless software/hardware compatibility.&lt;br /&gt;&lt;br /&gt;We can increasingly see vignettes of this brave new world. Web apps running in the cloud are one step in this direction; web apps storing data locally are a step further. A complete platform designed to hide the raw machine from the user is even closer to this vision. Such platforms started out in the world of mobile phones, and are clawing their way upwards from there: Android and iPhone/iPad. Windows 7 mobile is quite similar to the iPhone model (odd, isn’t it?). And ChromeOS is not dissimilar.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;Tangent:&lt;/span&gt; How does Google keep Android and ChromeOS aligned? It doesn’t - I believe they are making it up as they go along. Eventually they should converge.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;We still haven't seen an integration of the app store model with backup, but it can't be that far off. An option for MobileMe (insert Apple trademark here) subscribers to have an image of their iPad backed up on Apple's servers and browsable via a Time Machine (and again, Apple trademark) style interface is easy to imagine. 
This can then extend to data you want to share collaboratively, while still retaining a local copy for offline use.&lt;br /&gt;&lt;br /&gt;In time, most users will gravitate away from the self service computers they use today to software service platforms.  This goes hand in hand with simpler UIs such as the iPad’s. One of the technical features in this evolution is the disappearance of the file system in favor of a searchable database of objects, as I mentioned in &lt;a href="http://gbracha.blogspot.com/2010/02/nail-files.html"&gt;a previous post&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Tangent: &lt;/span&gt;&lt;span style="font-style: italic;"&gt;The move from personal computing to software services accessed via simpler devices is what is truly significant about the iPad. Those who worry about details such as whether it has a camera or Flash are completely missing the point.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;All this is already happening, which makes it easier to explain the idea of &lt;a href="http://bracha.org/objectsAsSoftwareServices.pdf"&gt;objects as software services&lt;/a&gt; (see the &lt;a href="http://www.youtube.com/watch?v=_cBGtvjaLM0"&gt;video&lt;/a&gt;) now than in 2004/2005. Consider the trend of programming languages toward a pure object model: all data are objects.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Tangent:&lt;/span&gt;&lt;span style="font-style: italic;"&gt; This is orthogonal to functional programming. Objects can be immutable, as in, e.g., Avarice.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;So, we now have programs dealing with objects in primary storage running on platforms  whose secondary storage consists of objects as well. Initially, programmers find that they must explicitly manage the transition of data from primary to secondary storage.&lt;br /&gt;&lt;br /&gt;The programmer’s task is simplified if that transition can be automated. 
This is an old idea known as orthogonal persistence.  The Achilles’ heel of orthogonal persistence was the divergence of in-memory data representation and persistent data representation.  This argued for manual management of the transition between primary and secondary storage.&lt;br /&gt;&lt;br /&gt;However, circumstances are changing. The difference between transient (in-memory) and persistent (secondary storage) representations was driven by several factors, none of which apply in the full service computing model.&lt;br /&gt;&lt;br /&gt;The first factor was that programs changed their internal representation while persistent data lived forever. Orthogonal persistence broke down when old data became incompatible with program data structures.&lt;br /&gt;&lt;br /&gt;In our brave new world, programs co-evolve with their data so this is not an issue. The service manages both the application and secondary storage, updating them in lock step. Thus, we have data that is not only orthogonally persistent, but orthogonally synchronized.&lt;br /&gt;&lt;br /&gt;The second factor was that persistent data formats were a medium of exchange between different programs with different internal data representations. This relied on the file system, may it rest in peace.  The new applications are in fact compelled to keep their persistent data in silos tied to the application, and communicate with other applications in constrained ways.&lt;br /&gt;&lt;br /&gt;This sets the stage for a model where programs are written in a purely object oriented language that supports orthogonal synchronization of both programs and data. This model eases the programmer’s task, and sits perfectly in the world of full service computing.&lt;br /&gt;&lt;br /&gt;I have been discussing language support for this vision of software services publicly since&lt;a href="http://www.bracha.org/oopsla05-dls-talk.pdf"&gt; my talk&lt;/a&gt; at DLS 2005, and privately long before. 
Newspeak’s modularity and reflection features dovetail perfectly with that.&lt;br /&gt;&lt;br /&gt;Newspeak allows programs to be represented as objects; these objects are compact and self contained, thanks to the modularity of Newspeak code. Newspeak programs can be updated on the fly, atomically. Together, these two features provide a foundation for synchronizing programs. And the Newspeak VM supports shallow object immutability, which helps us synchronize data. On this basis, we have implemented experimental support for orthogonal synchronization, though it is very immature. I hope that in the future, we can provide an open software service based on this idea.&lt;br /&gt;&lt;br /&gt;Of course, the radical edge of the objects as software services vision is the complete abolishment of versions for libraries as well as applications. That is, I admit, untested, controversial and futuristic. However, with respect to applications, it is already starting to happen.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-5564390686079525730?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/5564390686079525730/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=5564390686079525730' title='22 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5564390686079525730'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5564390686079525730'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2010/04/brave-new-world-of-full-service.html' title='The Brave New World of Full Service Computing'/><author><name>Gilad 
Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>22</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-8414586797829070643</id><published>2010-03-05T08:33:00.000-08:00</published><updated>2010-03-05T08:46:53.863-08:00</updated><title type='text'>Through the Looking Glass Darkly</title><content type='html'>In January, I gave a guest lecture in a class on reflection and metaprogramming at HPI Potsdam. A &lt;a href="http://www.hpi.uni-potsdam.de/hirschfeld/events/past/media/100105_Bracha_2010_LinguisticReflectionViaMirrors_HPI.mp4"&gt;screencast&lt;/a&gt; of the talk is now available. It’s an introduction to the concept of mirrors, which is the goodthink way of doing reflection. It’s mostly language neutral, but there is a brief demo using mirrors in Newspeak.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Because it’s a screencast rather than a video, occasionally some detail may be unclear, but by and large it is the most comprehensive introduction to mirrors available other than the OOPSLA paper.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Some people may not have an hour to watch the entire screen cast, and &lt;a href="http://www.bracha.org/mirrors.pdf"&gt;the paper&lt;/a&gt; is by no means an easy read, so I’ve decided to post the executive summary here.&lt;br /&gt;&lt;br /&gt;The classic approach to reflection in object-oriented programming languages originates with Smalltalk, and is used in most class based languages that support reflection: define a reflective API on Object. Typically, Object supports an operation like &lt;span style="font-style: italic;"&gt;getClass()&lt;/span&gt; which returns an object representing the class of the receiver. 
The API of classes defines most additional reflective operations available. For example, in Java, you can get reflective descriptors for a class’ methods (&lt;span style="font-weight: bold;"&gt;java.lang.reflect.Method&lt;/span&gt;), fields (&lt;span style="font-weight: bold;"&gt;java.lang.reflect.Field&lt;/span&gt;) and constructors (&lt;span style="font-weight: bold;"&gt;java.lang.reflect.Constructor&lt;/span&gt;). You can even use these descriptors to evaluate program code &lt;span style="font-style: italic;"&gt;dynamically&lt;/span&gt; - say, ask the user for the name of a method and invoke it. In Smalltalk, you can also add and remove methods and fields, change a class’ superclass, remove classes from the system etc.&lt;br /&gt;&lt;br /&gt;Another approach is used in many scripting languages. The language constructs themselves introduce code on the fly, modifying the program as they are executed. For example, a class comes into being when a class declaration is evaluated, and might change if another declaration of a class with the same name is executed later.&lt;br /&gt;&lt;br /&gt;The third approach is that of mirrors, and originates in Self.  Mirrors have been used in class based systems such as Strongtalk, and even in the Java world. JDI, the Java Debug Interface, is a mirror based reflective API. Here, the reflective operations are separated into distinct objects called &lt;span style="font-style: italic;"&gt;mirrors&lt;/span&gt;. This seemingly minor restructuring has significant implications. Reflection is no longer tied into the behavior of every object in the system (as it is via &lt;span style="font-style: italic;"&gt;getClass()&lt;/span&gt;) or (even worse) into the very syntax of the language. Instead, it resides in separable components that can be removed or replaced. Reflection is now a distinct capability, in the sense of the object capability model.&lt;br /&gt;&lt;br /&gt;If you are worried about security, this is good news. 
If you don’t provide a program with the means to manufacture mirrors (e.g., you do not provide the mirror factory object), said program cannot do any reflection. You can also provide mirrors with limited capabilities - say mirrors that only reflect the program’s own code, or mirrors that do not allow you to modify code or access non-public members etc.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Caveat:&lt;/span&gt; The truth is, mirrors have not really been used for security. Their utility for security seems clear, but a working API has yet to be demonstrated. &lt;br /&gt;&lt;br /&gt;Mirrors are good news for other reasons. Say your program doesn’t use reflection, and needs to fit into a small footprint such as an embedded device. It is easy to take it out. Another advantage is that you can easily plug in alternate implementations of reflection - so if you need to reflect on remote objects, you can do so.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;Historical note: &lt;/span&gt;This is why JDI uses mirrors; indeed, it is why JDI had to be introduced. The original intent was that Java reflection would be used to support debugging; but once you need to deal with cross-process debugging, you need a distinct implementation of reflection; core reflection is tied to a single built in implementation.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Mirrors support a clear boundary between the base-level of your program (the level which deals with the problem domain your program is intended to solve) and the meta-level (the level where your program is discussing itself, where reflection takes place). The classic design, where the class is the main repository of reflective information, tends to blur these lines. Classes often have both base level functionality (like creating instances) and meta-level functionality (reflection).  This is most acute in languages like Smalltalk and CLOS. 
In Java, the base level roles of classes are often supported by specialized constructs like constructors (which have &lt;a href="http://gbracha.blogspot.com/2007/06/constructors-considered-harmful.html"&gt;their own, worse, problems&lt;/a&gt;) and static members (&lt;a href="http://gbracha.blogspot.com/2008/02/cutting-out-static.html"&gt;likewise&lt;/a&gt;). Even in Java, class objects may be used in a base level capacity (as type tokens, for example).&lt;br /&gt;&lt;br /&gt;There is much work to be done in this area. No mirror API has yet fulfilled all my claims and ambitions - least of all the Newspeak mirror API, which needs extensive revisions. Still, I hope you’re curious enough to watch the talk and/or read the paper.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-8414586797829070643?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/8414586797829070643/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=8414586797829070643' title='23 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/8414586797829070643'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/8414586797829070643'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2010/03/through-looking-glass-darkly.html' title='Through the Looking Glass Darkly'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>23</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-2105840890987435034</id><published>2010-02-23T15:53:00.000-08:00</published><updated>2010-02-23T16:23:06.684-08:00</updated><title type='text'>Serialization Killer</title><content type='html'>Way back in &lt;a href="http://gbracha.blogspot.com/2009/10/image-problem.html"&gt;October 2009&lt;/a&gt;, I threatened to write a post about how serialization can serve as a binary format. The moment of reckoning has arrived.&lt;br /&gt;&lt;br /&gt;Object serialization is probably most widely known due to Java serialization, but of course has a long history before that.  Modula-2+ supported pickling long before Java, for example, as did Smalltalk systems.&lt;br /&gt;&lt;br /&gt;Java serialization serializes objects in a most un-object-oriented way: it separates the object’s data from its behavior. Only the data is actually serialized. The object’s behavior (namely its class) is represented symbolically (as fully qualified class names; more on that later). During deserialization, the symbolic class information is used to reconstruct the classes of objects.&lt;br /&gt;&lt;br /&gt;The problem is that this only works properly when both serializer and deserializer agree on the interpretation of the symbolic class information. For example, when two VMs running identical versions of the code communicate via RMI (the original use of Java serialization).&lt;br /&gt;&lt;br /&gt;If the code in the deserializer differs from that in the serializer, as is very often the case (say, when one wants to load old serialized data) problems arise. 
The serialized data may not describe instances of the class on the deserializing side at all, because the  private representation of the class may have changed.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;Tangent:&lt;/span&gt; Java serialization introduced an extra-linguistic mechanism for creating instances, that was not considered as part of the language design, which only foresaw objects being created via constructor calls. This too is problematic. What if the invariants imposed by the constructor change over time?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;To deal with these problems, one may opt to store data using a more stable schema than the in-memory representation (e.g., a database, an agreed external data format etc.).&lt;br /&gt;&lt;br /&gt;Alternatively, one can add conversion routines that map old representations into new ones. This requires identifying the version of the object’s class (aka the &lt;span style="font-style: italic;"&gt;serialVersionUID&lt;/span&gt;) when serializing an object.  This approach is problematic however. Each change of representation requires a new version number, and a new conversion routine. These must be in place before the objects are serialized.&lt;br /&gt;&lt;br /&gt;The reliance on class names is also an issue. What of anonymous classes? This is a problem in any case, but aggravated due to the reliance on names.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;Tangent:&lt;/span&gt; the serialization team was, however, perfectly justified in assuming every class had a well defined name. They were working with Java 1.0, before the introduction of inner classes. Likewise, the inner class team was working on a system without serialization. No one saw the conflict until after the release combining the two - when it was far too late to do much about it. 
&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;In contrast, if one actually serializes the objects rather than just the data (that is, one serializes the data and the behavior), the serialized objects are much more self contained. (at some point one still wants to cut things off, but at stable APIs like Object).&lt;br /&gt;&lt;br /&gt;If you want to bring old objects up to date, you must convert them; but:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;You don’t have to; they work exactly as they did the day they were serialized, just like a mummy come to life.&lt;/li&gt;&lt;li&gt;You can add the conversion after the fact, at any time; for example, you can deserialize and then convert. The only requirement is that the necessary information is available via the object’s public API.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;The serialization team didn’t have the option of serializing the classes in the manner just described. The Java byte code verifier makes that impossible. The verifier imposes a nominal type system, which means you cannot have two classes with the same name running in the same class loader.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Tangent:&lt;/span&gt;&lt;span style="font-style: italic;"&gt; The wonders of byte code verification probably deserve a post of their own. For now, just note this as another example of the kind of difficult-to-foresee interactions that occur between seemingly unrelated parts of a complex system.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Assume we have a system where we can serialize objects including their behavior.  Can we use the serialization format as a binary format for code? Specifically, can we use serialized class objects as our binary format?&lt;br /&gt;&lt;br /&gt;In Newspeak, top level classes, also known as module declarations,  are stateless. 
Hence the serialized form of these class objects is stateless as well, fulfilling a key requirement for a binary code representation.&lt;br /&gt;&lt;br /&gt;Module declarations have no external dependencies, so we needn’t serialize a great tangle of objects as is often the case with object serialization.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;Tangent:&lt;/span&gt; This is what a module is supposed to be: something that can be built &lt;span style="font-weight: bold;"&gt;independently&lt;/span&gt;! This also means that you can load module declarations in any order. I note with glee that this runs against the entire tradition of ADTs as the basis for software modularity.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Entire applications can be represented this way as well - it’s simply a matter of creating an object that ties together the various module declarations used by the application. The source form of this object acts like your makefile, and its serialized form is analogous to an executable (or a JAR or whatever).  To make this more concrete:&lt;br /&gt;&lt;br /&gt;A Newspeak application is an object conforming to a standard API. This API consists of a single method, &lt;span style="font-style: italic; font-weight: bold;"&gt;main:args:&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;class&lt;/span&gt;&lt;span style="font-style: italic;"&gt; BraveNewWorldExplorerApp fileBrowserClass: fb = (&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    | BraveNewWorldExplorer = fb. 
|&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;)(&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;“MAIN METHOD”&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;public&lt;/span&gt;&lt;span style="font-style: italic;"&gt;     &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;main:&lt;/span&gt;&lt;span style="font-style: italic;"&gt; platform &lt;/span&gt;&lt;span style="color: rgb(51, 51, 255); font-weight: bold; font-style: italic;"&gt;&lt;nsplatform&gt;&lt;/span&gt;&lt;span style="font-style: italic;"&gt; &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;args:&lt;/span&gt;&lt;span style="font-style: italic;"&gt; argv &lt;/span&gt;&lt;span style="color: rgb(51, 51, 255); font-weight: bold; font-style: italic;"&gt;&lt;list[string]&gt;&lt;/span&gt;&lt;span style="font-style: italic;"&gt; = (&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    | fn &lt;/span&gt;&lt;span style="font-weight: bold; color: rgb(51, 51, 255); font-style: italic;"&gt;&lt;string&gt;&lt;/span&gt;&lt;span style="font-style: italic;"&gt; |&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    fn:: argv at: 1 ifAbsent:[ 'C:/Users'].&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    platform hopscotch core HopscotchWindow openSubject: &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;        ((BraveNewWorldExplorer usingPlatform: platform)&lt;/span&gt;&lt;span style="font-style: italic;"&gt;  FileSubject onModel: fn&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;            )&lt;/span&gt;&lt;span style="font-style: italic;"&gt;)&lt;/span&gt;&lt;span style="font-style: italic;"&gt;)&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;br 
/&gt;&lt;br /&gt;You need not understand every detail here; what is important is the following:&lt;br /&gt;&lt;br /&gt;A serialized instance of &lt;span style="font-weight: bold; font-style: italic;"&gt;BraveNewWorldExplorerApp&lt;/span&gt; acts as our binary. The Newspeak runtime loads such a serialized object, deserializes it, and invokes its main:args: method. The latter invocation is very similar to what a JVM does when it loads the main class of a program and calls its &lt;span style="font-style: italic;"&gt;main()&lt;/span&gt; method, or for that matter, what C does with the &lt;span style="font-style: italic;"&gt;main()&lt;/span&gt; function.&lt;br /&gt;&lt;br /&gt;The method is invoked with two parameters (here we differ from the mainstream). The second of these represents any (command line) arguments to the program, just like &lt;span style="font-style: italic;"&gt;argv&lt;/span&gt; in a C program. What is different is the first argument, &lt;span style="font-style: italic;"&gt;platform&lt;/span&gt;, which represents the Newspeak platform.  The precise meaning of the expression inside the method is relatively unimportant. What matters is that we use the platform argument in two places: first, to instantiate the file browser module, so that it can make use of platform code; and second to access the GUI (&lt;span style="font-style: italic;"&gt;platform HopscotchFramework&lt;/span&gt;).&lt;br /&gt;&lt;br /&gt;In this case, the application instance is created with a single parameter, the module declaration for the file browser. More complex applications tie several modules together; in that case, the app module would be instantiated with a series of parameters, one for each module declaration required by the application.&lt;br /&gt;&lt;br /&gt;To make it easy for developers, our IDE uses a standard convention for instantiating and serializing application objects. 
If a top level class has a class method &lt;span style="font-style: italic; font-weight: bold;"&gt;packageUsing:&lt;/span&gt;, the IDE will assume the class represents an application, and allow us to create a deployable app with the push of a button.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;public&lt;/span&gt;&lt;span style="font-style: italic;"&gt; &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;packageUsing:&lt;/span&gt;&lt;span style="font-style: italic;"&gt; ideNamespace = (&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    ^BraveNewWorldExplorerApp fileBrowserClass: ideNamespace BraveNewWorldExplorer&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The IDE will call the class method, passing it a namespace object as a parameter. The method can use that namespace to look up any available module declarations that it needs to gather into the application, and compute an application object that references them all. This application object is then serialized. This packaging process is somewhat analogous to constructing a JAR file. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Semi-tangent:&lt;/span&gt;&lt;span style="font-style: italic;"&gt; We also allow you to output more common/mundane deployment formats like Windows executables.  Likewise, MacOS apps or Linux rpms can (and likely will) be added; a small matter of programming. Most interesting, and still in flight, is deployment as web pages.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;We have serialized/deserialized applications, such as the compiler and the IDE, into binary objects a few hundred kilobytes in size. However, this isn’t our standard modus operandi yet. 
Right now, we are still slowly untangling ourselves from the Squeak environment.&lt;br /&gt;&lt;br /&gt;What then is the moral of the story? Well, one moral is that a running application can be thought of as an object, combining state and behavior; moreover, classical binary formats like a.out can be thought of as serialized objects. &lt;br /&gt;&lt;br /&gt;Why is this profitable? Because we can cover more ground with fewer concepts and less implementation effort. For example, rather than class files, JAR files and serialized objects, we can do with serialized objects alone. Moreover, we can do better with this one mechanism than we did with the other three combined. Less is more. And that is the moral of many stories.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-2105840890987435034?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/2105840890987435034/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=2105840890987435034' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/2105840890987435034'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/2105840890987435034'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2010/02/serialization-killer.html' title='Serialization Killer'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-7827736697957440885</id><published>2010-02-01T08:26:00.000-08:00</published><updated>2010-02-01T11:37:37.859-08:00</updated><title type='text'>Nail Files</title><content type='html'>Files are extremely important in current computing experience. Much too important. Files should be put in their place; they should be put away.&lt;br /&gt;&lt;br /&gt;There are two aspects to this: the user experience, and the programmer experience. These are connected. Let’s start with the user experience.&lt;br /&gt;&lt;br /&gt;Users see a hierarchical file system (HFS), with directories represented as folders. The idea of an HFS goes way back. The folder was popularized by Apple - first with the Apple Lisa, the ill-fated precursor of the Mac, and then with the Mac itself.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;Historical Tangent:&lt;/span&gt; The desktop metaphor comes from Xerox PARC.  I know some of that history is controversial, but one thing Steve Jobs did NOT see at the legendary 1979 Smalltalk demo was a folder.  Smalltalk had no file system to put folders in. To be fair though, Smalltalk had a category hierarchy navigated via a multi-pane browser much like the file browsers we see in MacOS today. The folder came later, with the Xerox Star (1981 or so).&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;David Gelernter has said that computers are turning us into filing clerks. Sadly, his attempt to fix this was a commercial failure, but his point is well taken. We have seen attempts to improve the situation -  things like Apple’s Spotlight and Google desktop search - but this is only a transition.&lt;br /&gt;&lt;br /&gt;Vista was supposed to have a database as a file system. This is where we’re going. 
Web apps don’t have access to the file system. Instead we see mechanisms like persistent object stores and/or databases. Future computers will abstract away the underlying file system - just like the iPhone and iPad. Jobs gave us the folder (i.e., the graphical/UI metaphor for the HFS) and Jobs taketh away.&lt;br /&gt;&lt;br /&gt;This trend is driven in part by an attempt to improve the user experience, but there are also other considerations. One of these is security - and better security is also better user experience. Ultimately, it is about control: If you don’t have a file system, it becomes harder for you to download content from unauthorized sources. This is also good for security, and in a perverse way, for the user experience. And it’s also good for software service providers.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Tangent:&lt;/span&gt;&lt;span style="font-style: italic;"&gt; This is closely tied to my &lt;/span&gt;&lt;a style="font-style: italic;" href="http://gbracha.blogspot.com/2010/01/closing-frontier.html"&gt;previous post&lt;/a&gt;&lt;span style="font-style: italic;"&gt; regarding the trend toward software services that run on restricted clients.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Which brings us to the programmer experience. File APIs will disappear from client platforms (as in the web browser).  So programmers will become accustomed to working with persistent object stores as provided by HTML 5, Air etc. And as they do, they will get less attached to files as code representation as well.&lt;br /&gt;&lt;br /&gt;Most programmers today have a deep attachment to files as a program representation. Take the common convention of representing the Java package hierarchy as a directory hierarchy, with individual classes as files within those directories.&lt;br /&gt;&lt;br /&gt;Unlike C or C++, there is nothing in the Java programming language that requires this. 
Java wisely dispensed with header files, include directives and the C preprocessor culture. This is a great help in fighting bloat, inordinately long compilation times, platform dependencies etc.&lt;br /&gt;&lt;br /&gt;A Java program consists of packages, which in turn consist of compilation units. There are no files to be found.  And yet, the convention of using directories as a proxy for the package hierarchy persists.&lt;br /&gt;&lt;br /&gt;Of course, it’s not just Java programmers. Programmers in almost any language waste their time fretting over files. The only significant exception is (bien sûr!) Smalltalk (and its relatives).&lt;br /&gt;&lt;br /&gt;Files are an artifact that has nothing to do with the algorithms your program uses, or its data structures, or the problem the program is trying to solve. &lt;span style="font-style: italic; font-weight: bold;"&gt;You don’t need to know how your code is scattered among files any more than you need to know what disk sector it’s on.&lt;/span&gt;  Worrying about it is just unnecessary cognitive load. Programmers need not be filing clerks either.&lt;br /&gt;&lt;br /&gt;With modern IDEs, one can easily view the structure of the program instead. In fact, the IDE can load your Java program that much faster if it doesn’t use the standard convention.  You can still export your code in files for transport or storage but that is pretty much the only use for them.&lt;br /&gt;&lt;br /&gt;I suspect these comments will spur a heated response. Most programmers have used the file system as a surrogate IDE for so long that they find it hard to break old habits and imagine a cleaner, simpler way of doing business. 
But do note that I am &lt;span style="font-weight: bold;"&gt;not&lt;/span&gt; arguing for the Smalltalk image model here - I’ve discussed its strengths and weaknesses &lt;a href="http://gbracha.blogspot.com/2009/10/image-problem.html"&gt;elsewhere&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;What I am saying is that your data - including but not limited to program code  - should be viewed in a structured, searchable, semantically meaningful form. Not text files, not byte streams, but (collections of) objects.&lt;br /&gt;&lt;br /&gt;As file systems disappear from the user experience, and from client APIs, newer generations of coders will be increasingly open to the idea of storing their code in something more like a database or object store. It will take time, and better tooling (especially IDEs and source control systems) but it will happen.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-7827736697957440885?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/7827736697957440885/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=7827736697957440885' title='39 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/7827736697957440885'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/7827736697957440885'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2010/02/nail-files.html' title='Nail Files'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>39</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-5998640595021731128</id><published>2010-01-24T15:23:00.000-08:00</published><updated>2010-01-24T15:37:46.899-08:00</updated><title type='text'>Closing the Frontier?</title><content type='html'>How is a programming language interpreter like pornography? They’re both banned at the iPhone App Store. The restrictions on interpreters disturb me (this blog does not deal with the other topic).&lt;br /&gt;&lt;br /&gt;What if I want to write an iPhone app in my favorite programming language? Or my second favorite? The freedom to program in the language of my choice is a big concern of mine. Even though Squeak Smalltalk runs on the iPhone, and Newspeak runs on top of Squeak, I’m out of luck.  Packaging the entire runtime with each app is not attractive.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;Tangent:&lt;/span&gt; In principle, one should be able to write a compiler that spits out Objective-C. After all, Objective-C incorporates Smalltalk style message sends, down to the keyword syntax. In practice, Objective-C on the iPhone doesn’t support true garbage collection, so it just seems too painful. This may change soon - maybe as soon as Wednesday, with iPhone OS 4 and/or the iTablet (pick your favorite rumor). &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The iPhone is a prime example of a trend where our computing platforms become more restricted. As we move toward software as a service rather than an artifact, the computer is no longer as personal; it is very much under control of the service provider. In this case Apple, in other cases Amazon or Google or Microsoft. 
I’d be surprised if the rumored iTablet won’t work on the same model: rather than an open version of MacOS, a semi-closed world with an app store.&lt;br /&gt;&lt;br /&gt;I completely understand why service providers place these restrictions.  What if my app is a storefront for apps? Whither the app store’s revenue? And then there are security concerns.&lt;br /&gt;&lt;br /&gt;On the iPhone, there are administrative restrictions. Other platforms are more open in that respect - but you tend to run into technical problems.  Take Android - you can distribute a language runtime, but it is the Javanese VM that’s in charge.&lt;br /&gt;&lt;br /&gt;No one will prevent you from implementing your own system on a JVM or on .Net. However, the VM, by its architecture, will make things difficult. If it does not support the right primitives, you find that your language is a second or third class citizen. &lt;br /&gt;&lt;br /&gt;Examples: the typed VM instruction sets are a problem for dynamically typed languages. On .Net, the DLR provides some support, but you’ll never get to peak performance using it as it stands today.  On the JVM, &lt;span style="font-weight: bold;"&gt;invokedynamic&lt;/span&gt; will ship some sunny day, and that will be better. Even then, if you want real dynamism, you’ll find it very hard. How do you change the class of an object? How do you change the shape of a class? Or its superclass? The solutions are convoluted and/or suboptimal.&lt;br /&gt;&lt;br /&gt;On the desktop, you can choose to bypass these platforms, but the action is increasingly moving elsewhere.  Looking ahead, the dominant platforms won’t be traditional operating systems like Windows, MacOS and Linux; far fewer people will use them directly than today. Instead, we’ll have the web (as in, e.g., ChromeOS), and semi-enclosed service platforms: The moral equivalent of iTablet OS (regardless of what Apple does this Wednesday). 
The “real OS” may still be there underneath, but virtually no one will care.&lt;br /&gt;&lt;br /&gt;Each of these service platforms will combine a client side, running on phones, tablets, e-books, laptops (e.g., iPhone, Android, Kindle, ChromeOS) and desktops; and a server side (e.g., Azure, Google, E3, iTunes/MobileMe) supporting storage/backup, software update and distribution and services we haven't invented yet.&lt;br /&gt;&lt;br /&gt;The most open of these virtual platforms is, of course, the web browser. Mercifully, it is relatively free of administrative restrictions and its machine language is much more flexible than MSIL or JVM byte codes (JVML). Which is not to say that you don’t have to jump through some very convoluted and costly hoops. Think of weak pointers/finalization;  or stack manipulation (debuggers, tail recursion) etc.&lt;br /&gt;&lt;br /&gt;There are two threads here:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;The narrow technical one, which is about providing abstractions that are good enough to support alternate ways of doing things. One wants flexible designs, especially in the languages that are supposed to be at the heart of general purpose open platforms. Of course, this is much easier said than done. &lt;/li&gt;&lt;li&gt;The broad technological/social/commercial one: the trend toward software services and the cloud is pulling us away from the personal computer and the individual control it entails.  Of course, one of the great attractions here is the idea of a service that relieves much of the responsibility for digital housekeeping. The vast majority of people don’t want all that control with the trouble that goes with it. 
Balancing this with the freedom to innovate at the platform and language level, and the freedom to choose among languages, is hard.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;Long term, the platforms that have more flexibility will benefit more from new ideas, which gives me a basis for unnatural optimism :-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-5998640595021731128?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/5998640595021731128/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=5998640595021731128' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5998640595021731128'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5998640595021731128'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2010/01/closing-frontier.html' title='Closing the Frontier?'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-4481513246413300319</id><published>2010-01-02T12:32:00.000-08:00</published><updated>2010-01-17T17:33:01.584-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><title type='text'>Avarice and Sloth</title><content type='html'>&lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; 
line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;Are these my new year resolutions? No (ok; maybe, but we won't go there), they are the names of two hypothetical programming languages.&lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;The world is slowly digesting the idea that object-oriented and functional programming are not contradictory concepts. They are orthogonal, and can be arranged to be rather complementary. In that light, I’ve been dwelling on the idea of a purely functional subset of Newspeak: Avarice.&lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;There is only one place in Newspeak where mutable state can be introduced: slot declarations. Newspeak supports both mutable and immutable slots. For example,&lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: italic; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;x ::= 0. &lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;introduces a mutable slot; the declaration implicitly introduces both getter and setter methods (named x and x: respectively).  
An immutable slot declaration looks like this&lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: italic; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;y = 0.&lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;In this case, only a getter is created, so there is no way to modify &lt;span style="font-style: italic;"&gt;y&lt;/span&gt; after its introduction.&lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;Disallow mutable slots, and you’re almost there. &lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;There is one catch - the order of slot initialization is observable, and so you might detect a slot in its uninitialized state. 
&lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: italic; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;x = y.&lt;/span&gt;&lt;/p&gt;&lt;p   style="margin: 0px 0px 12px; font-style: italic; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;y = 0.&lt;/span&gt;&lt;/p&gt; &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;will set &lt;span style="font-style: italic;"&gt;x&lt;/span&gt; to nil. So much for referential transparency.&lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;However, we also have simultaneous slot declarations, which are a bit like &lt;span style="font-weight: bold;"&gt;letrec&lt;/span&gt;.  Alas, they aren’t implemented, and the spec needs tightening. Nevertheless, whatever variant we implement will prevent you from observing the uninitialized value of a slot.  Once we add this feature, we can subset Newspeak and produce Avarice. &lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;Of course, to make it work, we’d need to change a lot of the libraries. So maybe this is a little speculative at this point. 
This post will get even more speculative as we go along.&lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;There is one more little nagging issue: reflection. What happens if we reflectively modify our program? &lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;One answer is that we don’t; if we are serious about referential transparency, we cannot tolerate this sort of thing.  This means that reflection is restricted to introspection, just as in Java and C#. That’s still more than one gets in existing purely functional languages.&lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;Another answer is that we allow reflective change; we’re only functional at the base level, not the meta-level.  &lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;Note that the language doesn’t dictate a choice here: it depends on what reflective library your implementation provides. In principle, you can provide multiple libraries, i.e., one that only supports introspection, and one that does the full thing. 
Lots to think about here.&lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;Even restricted to introspection, Avarice would be rather different from most purely functional languages. It would carry none of the cultural baggage commonly associated with such languages: no currying, no Hindley-Milner type inference (indeed, no mandatory type system at all), no quasi-mathematical syntax. Is such a beast workable? Well, Erlang is.  However, Avarice would also be object-oriented. Who knows, it might be the best of both worlds.&lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;By now, my remaining readers are wondering what I’ve been smoking. Well, I don’t smoke, but red wine is good for you. In that spirit, we can take things further. &lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;What if we make things lazy, like Haskell?  This is not a Newspeak subset anymore - the semantics are rather different. However, syntactically, this language is identical to Avarice. We’ll call the lazy language Sloth, in contrast to its eager, dare I say even greedy, cousin Avarice. 
Sloth would be purely functional, and at the same time purely object oriented.&lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;At this point, Avarice and Sloth are just imaginary constructions. They only exist in my mind, in this post, and in these &lt;a href="http://bracha.org/newspeak-WG2.8.pdf"&gt;&lt;span style="text-decoration: underline;"&gt;slides&lt;/span&gt;&lt;/a&gt;. However, they make a nice thought experiment, and I hope to make them real in the fullness of time. &lt;/span&gt;&lt;/p&gt;  &lt;p   style="margin: 0px 0px 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal; font-size-adjust: none; font-stretch: normal;font-family:Helvetica;font-size:12px;"&gt;&lt;span style="letter-spacing: 0px;"&gt;Regardless, sooner or later, someone will build a purely functional object oriented language. 
It will probably have a Javanese syntax and be rather hacky; but I hope Avarice and Sloth will be out there as well, for those who appreciate the finer things in life.&lt;/span&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-4481513246413300319?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/4481513246413300319/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=4481513246413300319' title='9 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/4481513246413300319'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/4481513246413300319'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2010/01/avarice-and-sloth.html' title='Avarice and Sloth'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>9</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-8547216328564804808</id><published>2009-12-09T21:13:00.000-08:00</published><updated>2009-12-09T21:39:12.809-08:00</updated><title type='text'>Chased by One’s Own Tail</title><content type='html'>Functional programming languages generally allow you to program without resorting to explicit loops. You can use recursion instead. It’s more elegant and higher level. 
In particular, it allows one to code certain nonterminating computations (such as state machines) as (a set of mutually) recursive calls.&lt;br /&gt;&lt;br /&gt;In order to reliably support tail calls, implementations convert tail recursion into iteration. As a result, the activation frames for the recursive calls never come into existence, preventing nasty stack overflows.&lt;br /&gt;&lt;br /&gt;I recently came across a &lt;a href="http://projectfortress.sun.com/Projects/Community/blog/ObjectOrientedTailRecursion"&gt;truly brilliant post&lt;/a&gt; by Guy Steele arguing that tail recursion is actually essential to object oriented programming. Go read it, as well as the &lt;a href="http://portal.acm.org/citation.cfm?doid=1639949.1640133"&gt;terrific essay by William Cook &lt;/a&gt;that it references. &lt;br /&gt;&lt;br /&gt;And yet, object oriented languages usually don’t do tail call optimization (TCO for short). What’s their excuse?&lt;br /&gt;&lt;br /&gt;One excuse, especially in the Java world, is security. Specifically, stack frames carry information about protection domains that the classic J2SE model depends upon. Since I am not (and have never been) a fan of that particular model, I’ll dispense with that argument. There are better ways to get security.&lt;br /&gt;&lt;br /&gt;A more important excuse is debugging. Stack traces are very useful for debugging of course, and if stack frames are optimized away via tail recursion, key information is lost. &lt;br /&gt;I myself want complete stack traces, but I also want to write tail recursive programs and know that they will reliably work.&lt;br /&gt;&lt;br /&gt;One possibility is that upon stack overflow, we compress the stack - eliminating any tail recursive frame that is at its return point. This should be controllable via debugging flags. The excess frames are eliminated, but only when strictly necessary. Programs that work well without tail call optimization won’t be affected. 
So if you think a reliable stack trace is more important than tail call optimization, you have no need for concern.&lt;br /&gt;&lt;br /&gt;Programs that rely on tail calls might suffer performance loss in this mode, but they will still run - and be easier to debug. If this is a problem, one can specify a different  configuration which optimizes tail calls away. So if you favor tail call optimization over stack traces, you need not be alarmed either. &lt;br /&gt;&lt;br /&gt;Now let us change perspective slightly. Keep in mind that the stack compression suggested above is really just a particular form of garbage collection.&lt;br /&gt;&lt;br /&gt;It’s good to recall that the whole notion of the call stack is simply an optimization. One can allocate activation records on the heap and dispense with a call “stack” altogether. If you did that, you could make each activation point at its caller.  Tail recursive calls might produce activations that pointed not at the caller, but at the caller of the first activation of the tail recursive function. All intermediate frames would be subject to garbage collection.&lt;br /&gt;&lt;br /&gt;In the above model, aggressive garbage collection corresponds to traditional TCO. On the other hand, the idea of "collecting" the stack only when it runs out, gives us as much debugging information as a conventional non-TCO implementation. We only lose information in situations that simply don't work without TCO.&lt;br /&gt;&lt;br /&gt;Now let's pursue the analogy between stack frames and  heap storage further. We’d like to have all our past states in the heap, so we could examine them with a time traveling debugger, but that’s usually too costly.  If we are lucky enough to have a time traveling debugger at all, we must configure how much of the past is available. &lt;br /&gt;&lt;br /&gt;The past history of the call stack (i.e., what tail recursive calls took place) is not fundamentally different. 
If we are chasing our own tail, we can go back in time and see it chasing us. So my ideal development environment would allow me to recover stack frames that had been eliminated by TCO if I really wanted to.&lt;br /&gt;&lt;br /&gt;As usual, it’s easier to see the common abstraction if one considers everything, including activations, as an object.&lt;br /&gt;&lt;br /&gt;I therefore argue that:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Support for tail recursion should be required by the language specification. This is essential: if tail recursive programs only run on certain implementations, one cannot write portable code that relies on tail recursion.&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;How&lt;/span&gt; tail recursion is supported is an implementation detail. Whether tail recursive activations never come into being, or exist on the stack or the heap and get collected by one policy or another, is up to the implementation.&lt;/li&gt;&lt;li&gt;During development, the IDE should let you choose a policy on retaining computation history.  In particular, it should allow you to retain as much of the history of your computation as possible - on the stack and the heap (e.g., time traveling debugging, full stack traces etc.).&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;I’ve endeavored to support this outlook in Newspeak - but only partially so far.&lt;br /&gt;The Newspeak spec states that method or closure activations are objects, and a called activation is passed a reference to its &lt;span style="font-style: italic;"&gt;continuation activation&lt;/span&gt; (not its continuation!), which is either its calling activation, or the continuation activation of the caller if there is no further code to execute in the caller (i.e., the call is a tail call).&lt;br /&gt;&lt;br /&gt;This meets requirements (1) and (2). Alas, the implementation does not yet comply (so forget about (3)). 
I hope we can correct this in time.&lt;br /&gt;&lt;br /&gt;We need not compromise on either debugging or tail recursion. We do need to seek out common abstractions (like objects) and insist on better tools and languages rather than settling for the dubious services of the mainstream.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-8547216328564804808?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/8547216328564804808/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=8547216328564804808' title='12 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/8547216328564804808'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/8547216328564804808'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2009/12/chased-by-ones-own-tail.html' title='Chased by One’s Own Tail'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>12</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-4274862447783368029</id><published>2009-11-21T16:07:00.000-08:00</published><updated>2009-11-22T12:39:12.261-08:00</updated><title type='text'>Objects are not Hash Tables</title><content type='html'>Hash tables are objects, not the other way around, as they are modeled in many popular scripting languages.&lt;br /&gt;&lt;br /&gt;Ok, so objects aren't hash 
tables. They aren't cats either. What are they then? What would be a good definition of the term object as used in modern programming languages? I’d start by saying that an object is a self-referential record:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;point = &lt;span style="font-weight: bold;"&gt;record&lt;/span&gt; {&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;        rho = 0;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;        theta = 0;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;        x = &lt;span style="font-weight: bold;"&gt;fun&lt;/span&gt;(){return point.rho * cos(point.theta)};&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;        y = &lt;span style="font-weight: bold;"&gt;fun&lt;/span&gt;(){return point.rho * sin(point.theta)}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    }&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Notice how the self reference is explicit, via the name &lt;span style="font-style: italic;"&gt;point&lt;/span&gt;.  The example does not rely on static typing, classes, prototypes, inheritance, delegation or on whether the language is imperative or not (the example above could easily be in either). So we can discuss objects without getting into all those things.&lt;br /&gt;&lt;br /&gt;Self reference, on the other hand, is absolutely essential - a record (or struct, for those who’ve used C for too long) is not in itself an object.&lt;br /&gt;&lt;br /&gt;Popular scripting languages introduce a variation on this theme: they replace records with hash tables. The hash tables must still be self referential of course.&lt;br /&gt;&lt;br /&gt;Hash tables have huge advantages. The first is that they are indexed by first class values, which means you can abstract over their keys. 
Hence you can access an object member without statically knowing its name - the equivalent of Smalltalk’s &lt;span style="font-style: italic;"&gt;perform:&lt;/span&gt;, and a close cousin of &lt;span style="font-style: italic;"&gt;Method.invoke()&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;Field.get()/set()&lt;/span&gt; in Java. You can also iterate over the keys (&lt;span style="font-style: italic;"&gt;getMethods()&lt;/span&gt; etc.). You get reflection for free.&lt;br /&gt;&lt;br /&gt;In an imperative language, hash tables are a mutable data structure. You can add, remove or change elements. So you can  do schema changes on your objects and modify their behavior. Again, all this reflection comes for free - you need not even design an API for it.&lt;br /&gt;&lt;br /&gt;Therein lies the rub of course. The trick of exposing the data structures of your implementation at the language level is immensely powerful - but it does expose them, and that has very real disadvantages. It is very hard to typecheck, optimize, or (especially)  make any kind of security guarantees, without losing or restricting this great power.&lt;br /&gt;&lt;br /&gt;Hence the title of this post. There is more to objects than hash tables (even with self reference).  We should not confuse objects with the data structures that might implement them.&lt;br /&gt;&lt;br /&gt;Let’s revisit the definition at the top of this post. It has the advantage that it is general enough to fit anything that people might call an object, but it isn’t good enough. For me, an object is an &lt;span style="font-style: italic; font-weight: bold;"&gt;encapsulated&lt;/span&gt; self referential record. Encapsulation is open to interpretation: some people take it to mean “bundled together” while others expect it implies some sort of data abstraction/information hiding. 
In the above context, I hope it is clear that I mean the latter, as the word record already implies bundling/aggregation.&lt;br /&gt;&lt;br /&gt;This definition excludes the objects we find in common scripting languages; these rely on closures to encapsulate data. The definition also excludes mainstream statically typed languages, where encapsulation is enforced at the type level, by the type system:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;class&lt;/span&gt;&lt;span style="font-style: italic;"&gt; C { &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;private&lt;/span&gt;&lt;span style="font-style: italic;"&gt; internals; // so delicate, so secret&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;public&lt;/span&gt;&lt;span style="font-style: italic;"&gt; slashAndBurn(C victim){victim.internals = rubbish}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;As the above code illustrates, class or type based encapsulation does not protect one object from another. However, if your type system is strictly based on interfaces, then you do get object based encapsulation.&lt;br /&gt;&lt;br /&gt;I don’t advocate relying on a type system to ensure encapsulation: I’ve argued elsewhere against relying on mandatory type systems for security or any other crucial property.&lt;br /&gt;&lt;br /&gt;Instead, I view encapsulation as inherent to objects themselves. 
Objects expose a procedural interface, independent of any type system, and &lt;span style="font-style: italic;"&gt;explicitly&lt;/span&gt; define which members are public and which are hidden.&lt;br /&gt;&lt;br /&gt;This is of course the model of objects used in the Smalltalk family of languages: Smalltalk, Self and now Newspeak. Self and Newspeak go further, and require that even the hidden members be accessed via a procedural interface.&lt;br /&gt;&lt;br /&gt;Needless to say, this doesn’t imply losing the reflective power of the “hash table languages”. It does, however, force us to come up with a reflective API. Having a reflective API imposes structure that is lacking in typical scripting languages; this structure is a good thing. Making the reflective API secure is an open problem, but the fundamental approach of using &lt;a href="http://www.bracha.org/mirrors.pdf"&gt;mirrors&lt;/a&gt; means it is at least possible; but that is for another post.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-4274862447783368029?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/4274862447783368029/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=4274862447783368029' title='17 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/4274862447783368029'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/4274862447783368029'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2009/11/objects-are-not-hash-tables.html' title='Objects are not Hash Tables'/><author><name>Gilad 
Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>17</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-5018406936943857111</id><published>2009-10-31T11:47:00.001-07:00</published><updated>2010-01-17T17:37:37.571-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><category scheme='http://www.blogger.com/atom/ns#' term='Reflection'/><title type='text'>Atomic Install</title><content type='html'>A few months ago, I wrote a &lt;a href="http://gbracha.blogspot.com/2009/07/miracle-of-become.html"&gt;post&lt;/a&gt; about the &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt; operation in the Smalltalk family of languages. The post elicited a good discussion. One of the interesting points that came up, was that you could implement &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt; by changing the class of objects, provided all the changes were atomic - changing the class, modifying the object’s schema accordingly, and copying the data between the objects.&lt;br /&gt;&lt;br /&gt;Setting the discussion of &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt; aside, I observe that in general, reflective change should be atomic. What I mean by this, is that one should be able to group an arbitrary number of distinct reflective changes to the program, and install them into the running program all at once, in one atomic operation. I will use the term &lt;span style="font-style: italic;"&gt;atomic install&lt;/span&gt; to refer to this ability.&lt;br /&gt;&lt;br /&gt;Oddly enough, this point is not well understood. 
Most languages that support reflective change do not provide a way for atomically performing a set of changes.&lt;br /&gt;&lt;br /&gt;Smalltalk provides a reflective API centered around classes. You can ask a class to modify itself in a variety of ways - adding or removing variables, methods or changing its superclass. There is, however, no way to change several classes at once.&lt;br /&gt;&lt;br /&gt;The CLOS APIs are similar in this respect.&lt;br /&gt;&lt;br /&gt;Scripting languages don’t really have a reflective API as such. Instead, the reflective changes may come about as the result of executing some code (e.g., an assignment may add a variable). In other cases, the reflective capacity of the language comes about directly from exposing the data structures of the language implementation back to the language.&lt;br /&gt;&lt;br /&gt;In all these cases, one cannot perform a set of program changes as an atomic unit.&lt;br /&gt;&lt;br /&gt;Why do you need to atomically apply a set of changes instead of sequentially applying one change after another?&lt;br /&gt;&lt;br /&gt;One reason is that you’d like to apply the changes as a transaction. If a change fails (say, because you created a cycle in the class graph, or duplicated an instance variable) you don’t want any further changes to take place. It’s easier if you don’t have to catch exceptions etc. after each step.&lt;br /&gt;&lt;br /&gt;Another reason is that the changes are often dependent on each other. Applying one change without the other leaves your program in a broken state. The fact that intermediate states may be inconsistent even though the overall set of changes is correct means that it isn't sufficient to wrap a series of reflective changes in a transaction.&lt;br /&gt;&lt;br /&gt;Of course, most people don’t rely on reflective modification to develop their programs. Rather, they suffer through the classic edit-compile-run-debug cycle. 
In the absence of anything better, you typically edit source code in files. You then compile these files and load the resulting program. This actually has one big advantage: the load is atomic - all the changes that resulted from your edits are loaded as a unit.&lt;br /&gt;&lt;br /&gt;Evolving your program reflectively has the advantage that you can make corrections to the running program. Often, this is done during fix-and-continue debugging. Even then, in most cases, the “program” in question is an application that is stopped in the debugger, and you can apply fixes sequentially. But as noted above, it’s still easier if you can apply the changes as a transaction.&lt;br /&gt;&lt;br /&gt;More interesting cases are when you are modifying a program, and need to run it between the individual modifications. Unlike most people, I run into this often, when modifying the IDE I am using. However, there are additional situations of this nature.&lt;br /&gt;&lt;br /&gt;Long lived applications that must provide continuous service have this flavor. Erlang allows you to replace an entire module as a unit in these situations. In Java, you use class loaders to get the desired effect; it’s complicated, but it’s your only way out. A scenario I'm especially interested in is software services: the service updates applications on the fly without shutting them down - and in particular,  I may want to update the update mechanism itself.&lt;br /&gt;&lt;br /&gt;If you can’t make all the changes in one go, you find that you have to break the transition into a series of steps, each of which leaves the system in a consistent state while leading to the desired final program. This is tricky and error prone: you need to apply the right changes in just the right order.&lt;br /&gt;&lt;br /&gt;Scripting languages like Ruby often go through a series of program changes during program start up. Different modules are loaded, modifying the program in the process. 
This process is also order dependent, and therefore brittle. In many cases, this sort of reflective change isn’t actually essential; rather it’s an artifact of the language semantics. However, I suspect there are situations where it is used to real advantage.&lt;br /&gt;&lt;br /&gt;Overall, one can live for a long time without the ability to atomically apply a set of program changes. And yet there are some situations where atomic install seems to be very useful. It also has another advantage: batching the changes is more performant. Often one doesn’t care, but it doesn’t hurt to be faster, and on some occasions it actually can matter. One might as well see if one can come up with a reflective API that supports atomic install.&lt;br /&gt;&lt;br /&gt;In Strongtalk, the VM supported atomic install as a primitive operation. More recently, in the latest &lt;a href="http://newspeaklanguage.org/the-newspeak-programming-language/downloads/"&gt;Newspeak update release&lt;/a&gt;, I added an atomic install facility written entirely in Newspeak. Exactly 372 lines of code (whitespace and copious comments included). It is a tribute to the Squeak design that this is doable without any privileged access to the VM.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Tangent1: Thanks to Peter Ahe and Eliot Miranda for the discussions that led to this scheme; and to Lars Bak, for the discussions that led to the original notion of atomic install.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Tangent2: Squeak’s existing mechanism for changing classes, the ClassBuilder, is, on the other hand, rather unattractive. It’s three times as long, vastly more complicated, and provides only a subset of the functionality. 
It shows how tricky this kind of reflective change can get if you don’t conceptualize it the right way.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Naturally, the actual atomic step here is done using a variant of &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt;. Specifically, it’s a one-way &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt; on an array of objects. Given two object arrays &lt;span style="font-style: italic; font-weight: bold;"&gt;a&lt;/span&gt; and &lt;span style="font-weight: bold; font-style: italic;"&gt;b&lt;/span&gt;, both of size &lt;span style="font-style: italic; font-weight: bold;"&gt;n&lt;/span&gt;, all references to &lt;span style="font-weight: bold; font-style: italic;"&gt;a[i]&lt;/span&gt; are changed to refer to &lt;span style="font-weight: bold; font-style: italic;"&gt;b[i]&lt;/span&gt;, for &lt;span style="font-style: italic; font-weight: bold;"&gt;i = 1 .. n&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;The atomic installation process is nice and simple. The input is a list of mirrors describing mixins, and a namespace listing which of those mixins already exist in the system. The namespace parameter is crucial BTW, as there is no global namespace in Newspeak.&lt;br /&gt;&lt;br /&gt;We then produce a fresh set of mixins based on the input list. For any existing mixins, we locate all classes that are invocations of those mixins, and all subclasses thereof. We make fresh versions of these as well, reflecting the changed mixins involved. For each such class, we locate all its instances (does your system have &lt;span style="font-weight: bold;"&gt;allInstances&lt;/span&gt;?) and produce new instances based on the new descriptions. Each new object is associated with the old object by keeping them in parallel arrays. 
Then we just do the &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt; and we’re done.&lt;br /&gt;&lt;br /&gt;Unlike Smalltalk, in Newspeak we never have to recompile code that hasn’t been modified at the source level -  because Newspeak code is representation independent.&lt;br /&gt;&lt;br /&gt;It’s clearly harder to do atomic install in a system that does sophisticated JIT compilation and inlining - but as I said, Strongtalk supported it (albeit at the VM level, largely because of the extra complexity involved) in the mid-90s.&lt;br /&gt;&lt;br /&gt;How do we get this kind of reflective power on mainstream platforms? It is harder, because the widely used platforms don’t support the needed abstractions as well as Smalltalk VMs do.&lt;br /&gt;&lt;br /&gt;At least on the web browser, I expect to be able to get similar effects with Javascript, albeit with a very different implementation.&lt;br /&gt;&lt;br /&gt;In Java, the absence of the necessary primitives tends to force one to build one’s own custom object representation, which is costly in both developer time and machine time.&lt;br /&gt;&lt;br /&gt;Ironically (but not coincidentally), the bulk of the necessary machinery already exists in the Hotspot JVM, which is capable of changing at least method implementations on the fly (via JVMDI), including deoptimizing compiled code that may have inlined a method that has been modified. The problem is exposing it to the user - and especially exposing it securely. Mirrors can help here - but that is for a future post.&lt;br /&gt;&lt;br /&gt;On .Net, the DLR helps one construct one’s own custom representation. Conversely, there’s little support for deoptimization etc. 
in the CLR itself.&lt;br /&gt;&lt;br /&gt;Of course, one goal of this post is to encourage implementors to add such support, and do it in the right way.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-5018406936943857111?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/5018406936943857111/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=5018406936943857111' title='11 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5018406936943857111'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5018406936943857111'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2009/10/atomic-install.html' title='Atomic Install'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>11</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-6528819210793023627</id><published>2009-10-06T15:38:00.000-07:00</published><updated>2009-10-06T16:54:21.438-07:00</updated><title type='text'>An Image Problem</title><content type='html'>Imagine a library, say in Java,  that provided a universal facility for  saving the exact runtime state and configuration of any application. It would essentially capture the heap, all threads and their stacks, and save them to a file. 
&lt;br /&gt;&lt;br /&gt;You could then load this file from disk and reconstitute your application just as it was when it was saved: For example, windows would re-open just where they were (adjusting their location if the screen size differed).  The system would cope with open file handles and sockets in a reasonable fashion, re-opening them if they were available. All this would work irrespective of the underlying platform, and it would be fast as well. &lt;br /&gt;&lt;br /&gt;The facility would be transparent to the programmer - you wouldn’t need to change your program in any way, beyond calling the library function. Pretty neat.&lt;br /&gt;&lt;br /&gt;I don’t know of such a facility in Java or .Net. However, Smalltalk has had such a beast for over 30 years. In Smalltalk parlance, it’s called an image. Most Smalltalks are inseparably coupled to the notion of such an &lt;span style="font-style:italic;"&gt;image&lt;/span&gt;. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;Tangent: There are exceptions of course, notably &lt;a href="http://strongtalk.org/"&gt;Strongtalk&lt;/a&gt;.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The idea hasn’t caught on. Why?  Should it catch on, and if so, how?&lt;br /&gt;&lt;br /&gt;To be sure, the image is a very powerful mechanism, and has significant advantages. For example, in the context of the IDE, it gives the ability to save and share debugging sessions.&lt;br /&gt;&lt;br /&gt;One advantage of images is rapid start-up. If your program builds up a lot of state as it starts, it can be intolerably slow coming up. This is one of the problems that killed Java on the client. One should avoid computing that state at startup; one easy way to do this is to precompute most of it and store it in an image. 
It is typically much faster to read in an image than to try and compute its contents from scratch.&lt;br /&gt;&lt;br /&gt;The problem begins when one relies &lt;span style="font-weight:bold;"&gt;exclusively&lt;/span&gt; on the image mechanism.  It becomes very difficult to disentangle one’s program from the state of the ongoing computation.&lt;br /&gt; &lt;br /&gt;If your program misbehaves (and they always do) and corrupts the state of the process, things get tedious. Say you have a test that somehow failed to clean up properly. In a conventional setting, you fix the problem with your code, and run the test again. With an image, you have the added burden of cleaning up the mess your program made.&lt;br /&gt;&lt;br /&gt;In general, traditional environments tend to force you to start over with each run of the program: the &lt;a href="http://java.sun.com/docs/white/langenv/Interpreted.doc.html#283"&gt;edit-compile-link-load-throw-the-application-off-the-cliff-let-it-crash-and-start-all-over-again&lt;/a&gt; cycle, in the words of the &lt;a href="http://java.sun.com/docs/white/langenv/"&gt;original Java whitepaper&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;Tangent: That whitepaper is a masterpiece of technical rhetoric, and was visionary in its day.  Alas, Java never fully realized that vision. &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The thing is, sometimes it’s good to be able to start afresh. It may be easier to start from scratch than to mutate your existing process. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;Mega-tangent: incidentally, this is an argument for sexual reproduction as well. &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Of course sometimes starting anew isn’t so nice: think about fix-and-continue debugging.&lt;br /&gt;&lt;br /&gt;In some cases it is even more critical to separate your code from the computation. You often save your image just to save your program. 
It may take you a while to find out that your image has been corrupted. Now you need to go back to a correct image, and yet you need to extract your code safely from the corrupt image. To be sure, Smalltalk IDEs provide a variety of tools that can help you with that, but I have never been really happy with them.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;Tangent: this is where irate Smalltalkers berate me about change sets and logs and Envy and Monticello etc. etc.  Sorry, I don’t think it’s good enough.&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;In general, Smalltalk makes it hard to modularize one's code, and especially to separate the application from the IDE.  The exclusive reliance on the image model greatly aggravates these difficulties.&lt;br /&gt;&lt;br /&gt;Traditional development tools, primitive as they often are, naturally provide a persistent, stateless representation of the program. In fact they provide two: the source code, in a text file, and a binary. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;Semi-tangent: source code seems the most obvious thing in the world; but traditional Smalltalks have no real syntax above the method level! Classes are defined via the evaluation of reflective expressions, which rely on the reflective API. This is very problematic: the API often varies from one implementation to another. By the way, this is one of the ways Newspeak differs from almost every Smalltalk (the late, great &lt;a href="http://www.daimi.au.dk/~marius/documents/andersen2004esug.pdf"&gt;Resilient&lt;/a&gt; being the only exception I can recall). Newspeak has a true syntax. 
Furthermore, because Newspeak module declarations are fully parametric in all their external dependencies, they can be compiled at any time in any order - unlike code in most languages (say Java packages) where there are numerous constraints on compilation order (e.g., imports must be defined).&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;A binary is a stateless representation of the program code, but one that does not require a compiler to decode it.  Smalltalk doesn’t usually have a binary form. Code is embedded in the image. There are exceptions, and some Smalltalk flavors have ways of producing executables, but the classic approach ties code and computation together in the image  and makes it very hard to pry them apart. &lt;br /&gt;&lt;br /&gt;None of this means you can’t have an image as well as a binary format.  What is important is that you do not have &lt;span style="font-weight:bold;"&gt;just&lt;span style="font-style:italic;"&gt;&lt;/span&gt;&lt;/span&gt; an image. Ideally, you have images and a binary format. This is one of my goals with Newspeak, and we are pretty close.&lt;br /&gt;&lt;br /&gt;In Newspeak, serialized top level classes can serve as a binary format. I will expand on how serialization can serve as a binary format in an upcoming post. 
At the same time, we continue to use images, though I hope they will become much less central to our practice as time goes by.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-6528819210793023627?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/6528819210793023627/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=6528819210793023627' title='23 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/6528819210793023627'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/6528819210793023627'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2009/10/image-problem.html' title='An Image Problem'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>23</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-8480698125475534398</id><published>2009-09-03T11:15:00.000-07:00</published><updated>2009-09-03T13:06:46.967-07:00</updated><title type='text'>Systemic Overload</title><content type='html'>Type-based method overloading is a (mis)feature of mainstream statically typed languages. I am very much against it - I’ve spoken about it more than once in the past. I decided I should put my views on the matter in writing, just to get it out of my system. 
I’ll start with the better known problems, and work my way up to more exotic issues.&lt;br /&gt;&lt;br /&gt;Throughout this post, I’ll use the term overloading to denote type based overloading. Arity based overloading, as in, e.g., Erlang, is pretty harmless.&lt;br /&gt;&lt;br /&gt;When introducing a feature into a language, one has to consider the cost/benefit ratio. You might argue that overloading lets you separate the handling of different types into different methods, instead of doing the type dispatch inside one method, which is tiresome and costly.  Classic examples are mathematical operators - things like +.&lt;br /&gt;&lt;br /&gt;This argument would have merit, if the overloading was dynamic, as in multi-methods. Since it isn’t, overloading doesn’t solve this kind of problem. Not that I’m advocating multi-methods here - they have their own problems - but at least they are based on accurate type information, whereas overloading is based on crude static approximations.&lt;br /&gt;&lt;br /&gt;Consider this code (loosely based on an example from Dave Ungar’s &lt;a href="http://oopsla.org/oopsla2003/files/key-4.html"&gt;OOPSLA 2003 keynote&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;class VehicleUtilities { &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    int numberOfAxles(Vehicle v) { return 2;} // a plausible default&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    int numberOfAxles (Truck t){ return 3;} &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Vehicle v = new Truck();&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;VehicleUtilities  u = new VehicleUtilities();&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;u.numberOfAxles(v); // returns 2&lt;/span&gt;&lt;br 
/&gt;&lt;br /&gt;This simple example illustrates the dangerous disconnect between static and dynamic information engendered by overloading.&lt;br /&gt;&lt;br /&gt;What exactly is the benefit of type based overloading? It saves you some trouble inventing names. That’s it. This may seem important, but the only case where it is actually needed is when writing constructors - and only because the language doesn’t allow you to give them distinct names.&lt;br /&gt;&lt;br /&gt;Constructors are a bad idea, &lt;a href="http://gbracha.blogspot.com/2007/06/constructors-considered-harmful.html"&gt;as I’ve already explained&lt;/a&gt;, so let’s assume we don’t have them. For methods, type based overloading provides a trivial rewrite and that is all.   I don’t think it is that hard to give the different operations different names. Sometimes, the language syntax can make it easier for you (like keyword based method names in Smalltalk), but even in conventional syntax, it isn’t that difficult.&lt;br /&gt;&lt;br /&gt;You pay for this convenience in myriad ways. The code above exemplifies one set of issues.&lt;br /&gt;&lt;br /&gt;Another problem is the risk of ambiguity. In most overloading schemes, you can create situations where you can’t decide which method to call and  therefore declare the call illegal. Unfortunately, as the type hierarchy evolves, legal code can become illegal, or simply change its meaning.&lt;br /&gt;&lt;br /&gt;This means that existing code breaks when you recompile, or does the wrong thing if you don’t.&lt;br /&gt;&lt;br /&gt;Overloading is open to abuse: it allows you to give different operations the same name. Altogether,  you need style guides like Effective Java to warn you to use overloading as little as possible. Language constructs that require style guides to tell you not to use them are a bit suspect.&lt;br /&gt;&lt;br /&gt;Ok, you probably know all this. So what contribution does this post  make? 
Well, the systemic costs of overloading in terms of designing and engineering languages are less widely appreciated.&lt;br /&gt;&lt;br /&gt;Overloading makes it hard to interoperate with other languages.  It’s harder to call your methods from another language. Such a language may have a different type system and/or different overload rules. Or it may be dynamically typed.&lt;br /&gt;&lt;br /&gt;You often find a dynamic language implementing multi-method dispatch to approximate the behavior of overloaded methods it needs to call. This is costly at run time, and is a burden on the language implementor.&lt;br /&gt;&lt;br /&gt;Scala supports overloading primarily so it can call Java; you might say overloading  is a contagious disease, transmitted from one language to another through close contact.&lt;br /&gt;&lt;br /&gt;In general, overloading adds complexity to the language; it tends to interact with all sorts of other features, making those features harder to learn, harder to use, and harder to implement. In particular, any change to the type system is almost certain to interact with type based overloading.&lt;br /&gt;&lt;br /&gt;Here are some examples. Answers at the end of this post.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Exhibit 1: auto-boxing&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;void foo(Number n) { ... }&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;void foo(int i) { ...}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;foo(new Integer(3)); // quick, what does this do?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Exhibit 2:  the var args feature in Java&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;void foo(String s, Number... n) {...}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;void foo(String s, int i, Integer... 
n) {...}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;void foo(String s, int  i, int... n) {...}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;void foo(String s, Integer i, Integer... n) {...}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;void foo(String s, Integer i, int... n) {...}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;foo("What does this do?", 1, 2); &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Exhibit 3: Generics&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;void foo(Collection c) {...}&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;void foo(Collection&amp;lt;String&amp;gt; c){...}&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;void foo(Collection&amp;lt;Boolean&amp;gt; c){...}&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;void foo(Collection&amp;lt;? extends String&amp;gt; c){...}&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;void foo(Collection&amp;lt;? super String&amp;gt; c){...}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Collection&amp;lt;String&amp;gt; cs;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;foo(cs);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;/* Which of these is legal? What would happen if we didn’t use erasure? You have 3 seconds. */&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Each of the above exhibits shows a specific type system extension which gets entangled with overloading.&lt;br /&gt;&lt;br /&gt;You might say you don’t care; these are pretty sick examples, and the language designers sorted out the rules. What is their suffering to you? 
Well, these complications all have cost, and since resources are finite, they come at the expense of other, more beneficial things.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Tangent: People tend  to think that the language design and implementation teams at large companies like Sun or Microsoft command vast resources. In reality, the resources are spread pretty thin considering the scope of the task at hand.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The more time is spent chasing down these issues, the less is spent focusing on doing stuff that actually buys you something. Products ship later and have more bugs and less desirable features, because people spent time worrying about gratuitous complications.&lt;br /&gt;&lt;br /&gt;This is not hypothetical. I know the poor soul who wrote the spec for this might have spent his time doing something else, like day dreaming or adding closures. The compiler writers could have fixed more outstanding bugs, perhaps even reaching a state where there were no open bugs. I know they tried. The testers could have tested for other, more pressing issues and discovered more bugs to open, before shipping. These are the wages of sin - and the sin is unjustified complexity.&lt;br /&gt;&lt;br /&gt;Now, for the sake of balance, I should say that overloading, and language complexity in general, do have one advantage I haven’t yet mentioned. They open up great opportunities for training, support and consulting. You can even write some really cool books full of language puzzlers.&lt;br /&gt;&lt;br /&gt;It’s worth noting that this kind of overloading is only a concern in languages with a mandatory type system. If you use optional typing (or just dynamic typing), you aren’t allowed to let the static types of the arguments change the meaning of the method invocation. This keeps you honest.&lt;br /&gt;&lt;br /&gt;Will future language designs avoid the problems of overloading? 
I wish I was confident that was the case, but I realize overloading is an entrenched tradition, and thus not easily eradicated.&lt;br /&gt;&lt;br /&gt;However, the overarching point I want to make is that the costs of complexity are insidious - unobvious in the short term but pervasive over the longer haul. &lt;span style="font-weight: bold;"&gt;KISS&lt;/span&gt; (&lt;span style="font-weight: bold;"&gt;K&lt;/span&gt;eep &lt;span style="font-weight: bold;"&gt;I&lt;/span&gt;t &lt;span style="font-weight: bold;"&gt;S&lt;/span&gt;imple, &lt;span style="font-weight: bold;"&gt;S&lt;/span&gt;  - where the binding of &lt;span style="font-weight: bold;"&gt;S&lt;/span&gt; is open to interpretation).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Answers:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;The issue here is whether we auto-unbox &lt;span style="font-style: italic;"&gt;Integer(3)&lt;/span&gt; first to produce an &lt;span style="font-style: italic;"&gt;int&lt;/span&gt; (and call &lt;span style="font-style: italic;"&gt;foo(int)&lt;/span&gt;) or resolve overloading in favor of &lt;span style="font-style: italic;"&gt;foo(Number)&lt;/span&gt; and don’t unbox at all. Java does the latter. The reason is to remain compatible with older versions.&lt;/li&gt;&lt;li&gt;This is ambiguous. Except for the first declaration, no method is more specific than the others.&lt;/li&gt;&lt;li&gt;They all have the same erasure, and so the example is illegal. 
If we did not use erasure, then &lt;span style="font-style: italic;"&gt;foo(Collection&amp;lt;String&amp;gt;)&lt;/span&gt; would be the most specific method.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-8480698125475534398?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/8480698125475534398/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=8480698125475534398' title='14 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/8480698125475534398'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/8480698125475534398'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2009/09/systemic-overload.html' title='Systemic Overload'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>14</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-3254375134700233543</id><published>2009-07-30T19:34:00.000-07:00</published><updated>2010-01-17T17:37:37.571-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Reflection'/><title type='text'>The Miracle of become:</title><content type='html'>One of Smalltalk’s most distinctive and powerful features is also one of the least known outside the Smalltalk community. 
It’s a little method called &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt; .&lt;br /&gt;&lt;br /&gt;What &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt; does is swap the identities of its receiver and its argument.  That is, after&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;a become: b&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;all references to the object denoted by &lt;span style="font-style: italic;"&gt;a&lt;/span&gt; before the call point refer to the object that was denoted by &lt;span style="font-style: italic;"&gt;b&lt;/span&gt;, and vice versa.&lt;br /&gt;&lt;br /&gt;Take a minute to internalize this; you might misunderstand it as something trivial. This is &lt;span style="font-weight: bold;"&gt;not&lt;/span&gt; about swapping two variables - it is literally about one object becoming another. I am not aware of any other language that has this feature. It is a feature of enormous power - and danger.&lt;br /&gt;&lt;br /&gt;Consider the task of extending your language to support persistent objects. Say you want to load an object from disk, but don’t want to load all the objects it refers to transitively (otherwise, it’s just plain object deserialization). So you load the object itself, but instead of loading its direct references, you replace them with husk objects.&lt;br /&gt;&lt;br /&gt;The husks stand in for the real data on secondary storage. That data is loaded lazily. 
When you actually need to invoke a method on a husk, its &lt;span style="font-weight: bold;"&gt;doesNotUnderstand:&lt;/span&gt; method loads the corresponding data object from disk (but again, not transitively).&lt;br /&gt;&lt;br /&gt;Then, it does a &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt;, replacing all references to the husk with references to the newly loaded object, and retries the call.&lt;br /&gt;&lt;br /&gt;Some persistence engines have done this sort of thing for decades - but they usually relied on low level access to the representation. &lt;span style="font-weight: bold;"&gt;Become:&lt;/span&gt; lets you do this at the source code level.&lt;br /&gt;&lt;br /&gt;Now go do this in Java. Or even in another dynamic language. You will recognize that you can do a general form of futures this way, and hence laziness. All without privileged access to the workings of the implementation. It’s also useful for schema evolution - when you add an instance variable to a class, for example. You can “reshape” all the instances as needed.&lt;br /&gt;&lt;br /&gt;Of course, you shouldn’t use &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt; casually.  It comes at a cost, which may be prohibitive in many implementations. In early Smalltalks, &lt;span style="font-weight: bold;"&gt;become&lt;/span&gt;: was cheap, because all objects were referenced indirectly by means of an object table. In the absence of an object table, &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt; traverses the heap in a manner similar to a garbage collector. The more memory you have, the more expensive &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt; becomes.&lt;br /&gt;&lt;br /&gt;Having an object table takes up storage and slows down access; but it does buy you a great deal of flexibility.  Hardware support could ease the performance penalty. 
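&lt;br /&gt;&lt;br /&gt;To make the trade-off concrete, here is a minimal sketch in Java of what an object table buys you. It is an illustration only, not how any real Smalltalk VM is implemented: the class &lt;span style="font-style: italic;"&gt;ObjectTableSketch&lt;/span&gt; and its methods &lt;span style="font-style: italic;"&gt;allocate&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;deref&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;become&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;becomeForward&lt;/span&gt; are hypothetical names I made up for the purpose, and integer handles stand in for object references.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
```java
public class ObjectTableSketch {
    // Hypothetical object table: slot i holds the object that handle i denotes.
    // A fixed capacity keeps the sketch short; a real table would grow.
    private final Object[] slots = new Object[16];
    private int used = 0;

    // Register an object and return its handle (an index into the table).
    public int allocate(Object target) {
        slots[used] = target;
        return used++;
    }

    // Every access goes through the table: this lookup is the cost of indirection.
    public Object deref(int handle) {
        return slots[handle];
    }

    // Two-way become: swap what the two handles denote, in constant time.
    public void become(int a, int b) {
        Object tmp = slots[a];
        slots[a] = slots[b];
        slots[b] = tmp;
    }

    // One-way become: holders of handle a now see b's object; b is unchanged.
    public void becomeForward(int a, int b) {
        slots[a] = slots[b];
    }

    public static void main(String[] args) {
        ObjectTableSketch table = new ObjectTableSketch();
        int husk = table.allocate("husk");
        int loaded = table.allocate("loaded object");
        int[] clientRefs = { husk, husk }; // two clients hold the husk's handle

        table.become(husk, loaded); // no client reference is touched

        for (int ref : clientRefs) {
            System.out.println(table.deref(ref)); // prints "loaded object" each time
        }
    }
}
```
&lt;/pre&gt;The lookup in &lt;span style="font-style: italic;"&gt;deref&lt;/span&gt; is the price paid on every access; the constant-time swap in &lt;span style="font-style: italic;"&gt;become&lt;/span&gt;, with no heap traversal, is what it buys.&lt;br /&gt;&lt;br /&gt;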
The advantage is that many hard problems become quite tractable if you are willing to pay the cost of indirection via an object table up front.  Remember: every problem in computer science can be solved with extra levels of indirection. &lt;a href="http://www.tinlizzie.org/%7Eawarth/"&gt;Alex Warth&lt;/a&gt; has some &lt;a href="http://www.vpri.org/pdf/rn2008001_worlds.pdf"&gt;very interesting work&lt;/a&gt; that fits in this category, for example.&lt;br /&gt;&lt;br /&gt;Become: has several variations. One way &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt; changes the identity of an object &lt;span style="font-style: italic;"&gt;A&lt;/span&gt; to that of another object &lt;span style="font-style: italic;"&gt;B&lt;/span&gt;, so that references to &lt;span style="font-style: italic;"&gt;A&lt;/span&gt; now point at &lt;span style="font-style: italic;"&gt;B&lt;/span&gt;; references to &lt;span style="font-style: italic;"&gt;B&lt;/span&gt; remain unchanged.  It is often useful to do &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt; in bulk - transmuting the identities of all objects in an array (either unidirectionally or bidirectionally). A group &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt; which does its magic atomically is great for implementing reflective updates to a system, for example. You can change a whole set of classes and their instances in one go.&lt;br /&gt;&lt;br /&gt;You can even conceive of a type safe &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt;. 
Two way &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt; is only type safe if the type of &lt;span style="font-style: italic;"&gt;A&lt;/span&gt; is identical to that of &lt;span style="font-style: italic;"&gt;B&lt;/span&gt;,  but one way &lt;span style="font-weight: bold;"&gt;become:&lt;/span&gt; only requires that the new object be a subtype of the old one.&lt;br /&gt;&lt;br /&gt;It may be time to reconsider whether having an object table is actually a good thing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-3254375134700233543?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/3254375134700233543/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=3254375134700233543' title='34 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/3254375134700233543'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/3254375134700233543'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2009/07/miracle-of-become.html' title='The Miracle of become:'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>34</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-789867630057998884</id><published>2009-07-11T17:41:00.000-07:00</published><updated>2010-01-17T17:35:47.858-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' 
term='Newspeak'/><category scheme='http://www.blogger.com/atom/ns#' term='Modularity'/><title type='text'>A Ban on Imports (continued)</title><content type='html'>In my &lt;a href="http://gbracha.blogspot.com/2009/06/ban-on-imports.html"&gt;previous post&lt;/a&gt;, I characterized imports as evil, and promised to expand upon non-evil (yea verily, even &lt;span style="font-weight: bold;"&gt;good&lt;/span&gt;) alternatives. First, to recap:&lt;br /&gt;&lt;br /&gt;Imports are used for linking modules together. Unfortunately, they are embedded within the modules they link instead of being external to them. This embedding makes the modules containing the imports dependent on the specific linkage configuration the imports represent.&lt;br /&gt;&lt;br /&gt;Workarounds like dependency injection are just that: workarounds (e.g., see &lt;a href="http://gbracha.blogspot.com/2007/12/some-months-ago-i-wrote-couple-of-posts.html"&gt;this post&lt;/a&gt;). They are complex, cumbersome, heavyweight. OSGi even more so. Above all, they are unnecessary - provided the language has adequate modularity constructs.&lt;br /&gt;&lt;br /&gt;So, which languages have sufficient modularity support? I know of only two such languages: Newspeak and &lt;a href="http://www.plt-scheme.org/"&gt;PLT Scheme&lt;/a&gt;. ML has a very elaborate module system, but ultimately it does not meet my requirements.&lt;br /&gt;&lt;br /&gt;Modules and their definitions (these are two distinct things) should be first class and support mutual recursion. This isn’t the case in ML, though some dialects do support mutual recursion.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Tangent:&lt;/span&gt;&lt;span style="font-style: italic;"&gt; The difficulty in ML, incidentally, is rooted in the type system.  It is very hard to typecheck the kind of abstractions we are talking about. Worse, if you want to make your type declarations modular, your modules end up having types as members. 
This can lead you into deep water with types of types (making your type system undecidable). To avoid that trap, ML opts to stratify the system, so that modules (that contain types) are not values (that have types).&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Not surprisingly then, progress on these issues comes from the dynamically typed world. Over a decade ago, the Schemers introduced &lt;a href="http://docs.plt-scheme.org/guide/units.html"&gt;Units&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The biggest difference between Newspeak modularity constructs and units is probably the treatment of  inheritance. In Newspeak our module definitions are exactly top level classes, which reduces the number of concepts while allowing module definitions to benefit from inheritance.&lt;br /&gt;&lt;br /&gt;There are strong arguments against inheritance of module definitions.  For example, you cannot reliably add members to a module definition, because they might conflict with identically named members in the heirs of that definition.  Specifying a superclass (or super module definition) looks like a hardwired dependency as well.&lt;br /&gt;&lt;br /&gt;On the other hand, being able to reuse module definitions via inheritance is very attractive. Especially if you can mix them in freely.&lt;br /&gt;&lt;br /&gt;Ultimately, we decided that the benefits of unifying classes and module definitions outweighed the costs.&lt;br /&gt;&lt;br /&gt;Take the argument above regarding extending module definitions with new members. Newspeak was designed with an eye toward a &lt;a href="http://gbracha.blogspot.com/2007/03/sobs.html"&gt;completely networked world&lt;/a&gt;, where &lt;a href="http://www.youtube.com/watch?v=_cBGtvjaLM0"&gt;software is a service&lt;/a&gt;, not an artifact. In such a world, you can find all your heirs - just as if you were working on your own private application in your IDE.  
So if you need to add a member to a module definition, you should be able to check who is mixing it in and what names they have added.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;Tangent:&lt;/span&gt; This may still sound radical today, but this world is moving into place as we speak:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;V8 gives the web browser the performance needed to be a platform for serious client software.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;HTML 5, Gears etc. provide such software with persistent storage on the client.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Chrome OS makes it obvious (as if it wasn’t clear enough before) that this in turn commoditizes the OS, and that the missing pieces will keep coming. &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Likewise, in the absence of a global namespace, top level classes do not inherit from any specific superclass (and nested classes don’t either because all names are late bound). Overall, the downside of allowing inheritance on module definitions doesn't apply in Newspeak.&lt;br /&gt;&lt;br /&gt;The upside compared to conventional constructs is huge. It means you can easily take entire libraries, create multiple instances of them (each with its own configuration), mix them into new definitions, write polymorphic code that can work simultaneously with different instances or even different implementations of the API etc. You can store the libraries and their instances in variables, pass them as parameters, return them from computations, hold them in data structures, serialize them to disk or over the wire - all with the same mechanisms you use for ordinary classes and objects.&lt;br /&gt;&lt;br /&gt;This economy of mechanism is important. It means you don’t have to learn a variety of specialized and complex tools to build modular systems. 
The same basic tools you use to implement basic CS101 examples will serve across the board.  This will carry through to other areas like tooling:  an object inspector can be used to inspect a “package”, for example.  Altogether, your system can be much smaller - which makes it easier to learn, faster to load, likelier to fit on small devices etc. Simplicity is an advantage in itself.&lt;br /&gt;&lt;br /&gt;As I explained in the &lt;a href="http://gbracha.blogspot.com/2009/06/ban-on-imports.html"&gt;first half of this series&lt;/a&gt;, the only need for a global namespace is for configuration: linking the pieces of an application together. There are several ways you can deal with the configuration/linkage issue. It’s a tooling issue. We use the IDE, as I described in an &lt;a href="http://gbracha.blogspot.com/2008/12/living-without-global-namespaces.html"&gt;older post&lt;/a&gt;.  So using the running example from part 1, we can write:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;class&lt;/span&gt;&lt;span style="font-style: italic;"&gt; SoundSystem usingPlatform: platform andPlayer: player&lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt; = {&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;|&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    (* dependencies on platform might include things like the following: *)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    List = platform Collections List.&lt;/span&gt; &lt;span style="font-style: italic;"&gt;(* You can see how this replaces an import *)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    mp3Player = player usingPlatform: platform withDock: self. 
&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    |&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;}{ &lt;/span&gt;&lt;span style="font-style: italic;"&gt;... &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;class&lt;/span&gt;&lt;span style="font-style: italic;"&gt; IPhone usingPlatform: platform withDock: dock &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;= {      &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;    |  &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    (* dependencies on platform elided *)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    myDock &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;=&lt;/span&gt;&lt;span style="font-style: italic;"&gt; dock. &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;|&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;}{&lt;/span&gt;&lt;span style="font-style: italic;"&gt; ... 
&lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;class&lt;/span&gt;&lt;span style="font-style: italic;"&gt; Zune usingPlatform: platform withDock: dock &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;= { &lt;br /&gt;&lt;br /&gt;| &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    (* dependencies on platform elided *)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    theirDock &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;=&lt;/span&gt;&lt;span style="font-style: italic;"&gt; dock. &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;   &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;|&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;}{&lt;/span&gt;&lt;span style="font-style: italic;"&gt; ... &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;and then create instances in the IDE (which provides us with a namespace where &lt;span style="font-style: italic;"&gt;SoundSystem&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;iPhone&lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;(tm)&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;Zune&lt;span style="font-weight: bold;"&gt;(tm)&lt;/span&gt;&lt;/span&gt; are all bound to the classes defined above):&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;sys1:: SoundSystem usingPlatform: Platform new andPlayer: IPhone.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;sys2:: SoundSystem usingPlatform: Platform new andPlayer: Zune.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;tm:&lt;/span&gt; Did you know?  
iPhone is a trademark of Apple; Zune is a trademark of Microsoft.&lt;br /&gt;&lt;br /&gt;Variations on the above are possible; hopefully, you get the idea. If not - well, don’t worry, I probably won’t explain it again.&lt;br /&gt;&lt;br /&gt;The absence of a global namespace has additional advantages of course: there’s &lt;a href="http://gbracha.blogspot.com/2008/02/cutting-out-static.html"&gt;no static state&lt;/a&gt;, and it’s good for security (but that is for another day).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-789867630057998884?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/789867630057998884/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=789867630057998884' title='17 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/789867630057998884'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/789867630057998884'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2009/07/ban-on-imports-continued.html' title='A Ban on Imports (continued)'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>17</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-1844927628139698674</id><published>2009-06-30T16:28:00.000-07:00</published><updated>2010-01-17T17:35:47.858-08:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><category scheme='http://www.blogger.com/atom/ns#' term='Modularity'/><title type='text'>A Ban on Imports</title><content type='html'>This is not, of course, an essay on restricting free trade.  Rather, this post is about the evils of the &lt;span style="font-weight: bold;"&gt;import&lt;/span&gt; clause, which occurs in one form or another across a wide array of programming languages, from Ada, Oberon and the various Modulas, through Java and C#, and on to F# and Haskell.&lt;br /&gt;&lt;br /&gt;The import clause should be banned because it undermines modularity in a deep and insidious way. This is a point I’ve attempted to convey time and time again, with only limited success. I will now try to illustrate the problem via a hardware inspired example.&lt;br /&gt;&lt;br /&gt;Consider the not-so-humble MP3 player. An MP3 player is a hardware module. The market is full of them, as well as other hardware modules they can plug into: for example, sound systems where one can dock an MP3 player and have it play on stereo speakers.&lt;br /&gt;&lt;br /&gt;Let’s try and describe the analog of such a sound system using programming language modularity constructs that rely on imports:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;module&lt;/span&gt; SoundSystem&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;    import&lt;/span&gt; MP3Player;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;  ... wonderful functionality elided ...&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic; font-weight: bold;"&gt;&lt;br /&gt;end&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I want to describe how my sound system works, separately from the description of how an MP3 player works.  
I would like to later plug in a particular MP3 player, say a Zune(tm) or an iPod(tm).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;tm&lt;/span&gt;&lt;span style="font-style: italic;"&gt;: Zune and iPod are trademarks of Microsoft and Apple respectively, two companies with armies of lawyers who might harass me if I do not state the obvious.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Now the first problem is that neither Zune nor iPod is named MP3Player.  If I want to connect my sound system to a Zune, I will have to edit the definition of &lt;span style="font-style: italic;"&gt;SoundSystem&lt;/span&gt; to name the specific module I want to import.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;If you’re very petty, you might say that Zune and iPod do not share a common interface and cannot be docked into the same sound system.  Imagine that we wish to use our sound system with an iPhone (tm)  and an iPod Touch (tm)  of some compatible generation.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;tm&lt;/span&gt;: iPhone and iPod Touch are trademarks of Apple.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Say I decide to go with a Zune.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;module&lt;/span&gt; SoundSystem&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    &lt;span style="font-weight: bold;"&gt;import&lt;/span&gt; Zune;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;  ... wonderful functionality elided ...&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic; font-weight: bold;"&gt;&lt;br /&gt;end&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Later I change my mind for some reason, and want to hook up my system to an iPod. 
It’s easy: I just edit the definition of my system again, to import iPod:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;module&lt;/span&gt; SoundSystem&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;    &lt;span style="font-weight: bold;"&gt;import&lt;/span&gt; iPod;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;  ... wonderful functionality elided ...&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic; font-weight: bold;"&gt;&lt;br /&gt;end&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The question you should be asking is: Why should I edit the definition of my system each time I change the configuration? In reality, it is unlikely that I am actually the designer of &lt;span style="font-style: italic;"&gt;SoundSystem&lt;/span&gt;. I probably don’t even have access to its definition. I just want to configure it to work with my MP3 player.&lt;br /&gt;&lt;br /&gt;The problem is that &lt;span style="font-weight: bold;"&gt;import&lt;/span&gt; confounds &lt;span style="font-style: italic;"&gt;module definition&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;module configuration&lt;/span&gt;. Module definition describes the design of a module; module configuration describes how one hooks up different modules. The former has to do with module &lt;span style="font-style: italic; font-weight: bold;"&gt;internals&lt;/span&gt;; the latter should be done &lt;span style="font-weight: bold; font-style: italic;"&gt;externally&lt;/span&gt; to the modules involved, to allow them to be used in any context where they could function.&lt;br /&gt;&lt;br /&gt;We clearly want our sound system to abstract over the specific player being plugged in to it. Any player with a compatible interface will do.  A well known mechanism for abstracting things is parameterization.  
We might be happier if we defined our sound system parametrically with respect to the MP3 player:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;module&lt;/span&gt; SoundSystem(anMP3Player){&lt;br /&gt;... great wonders using anMP3Player ...&lt;br /&gt;}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;We could then configure our system to use an iPod:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;SoundSystem(iPod);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;or a Zune:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;SoundSystem(Zune);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;without having to modify (or even have access to) the source code for the definition of &lt;span style="font-style: italic;"&gt;SoundSystem&lt;/span&gt;. Hurray!&lt;br /&gt;&lt;br /&gt;The module definition looks a lot like a function, and the configuration code looks like a function application. This is very suggestive. Indeed, ML introduced a module system based on function-like things called &lt;span style="font-style: italic;"&gt;functors&lt;/span&gt; a quarter century ago.  But there’s a bit more to this.&lt;br /&gt;These hardware pieces tend to plug in to &lt;span style="font-style: italic; font-weight: bold;"&gt;each other&lt;/span&gt;.  For example, the definition of IPod is parametric too:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;IPod(dockingStation){... even greater wonders ...}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Our sound system does its thing by behaving like a docking station. It and the MP3 player are mutually recursive modules. 
Configuration therefore requires support for mutual recursion (which is not allowed in Standard ML):&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;letrec&lt;/span&gt; {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;  dock = SoundSystem(mp3Player);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;  mp3Player = IPod(dock);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;} &lt;span style="font-weight: bold;"&gt;in&lt;/span&gt; dock;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;If this notation is unfamiliar, please brush up on your functional programming skills before you become unemployable. Basically, ignore the  first and last line, and treat the two lines involving = as equations.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;So the module definitions are a lot like functions that yield modules. You could also think of module definitions as classes yielding instances. 
The instances are like physical hardware modules.&lt;br /&gt;&lt;br /&gt;Now we can add another sound system and use our old Zune&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;letrec&lt;/span&gt;&lt;span style="font-style: italic;"&gt; {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;dock2 = SoundSystem(oldMP3);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;oldMP3 = Zune(dock2);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;} &lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;in&lt;/span&gt;&lt;span style="font-style: italic;"&gt; dock2;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This is what is often called side by side deployment - multiple instances of the same design, configured differently.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;Tangent:&lt;/span&gt;  Yes, Virginia, you can achieve that sort of thing in Java despite imports, using class loaders. Imports hardwire names into your code, and class loaders can counteract that by letting you define multiple namespaces. These can have multiple copies of your code, potentially hardwired to different things  (even though they all have the same name). If you think class loaders offer  a simple, clean way of doing things that is easy to learn, use, understand and debug, this post is not for you. Nor will any amount of OSGi magic on top fundamentally change things.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;We might also choose to define things differently&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;module&lt;/span&gt;&lt;span style="font-style: italic;"&gt; SoundSystem(MP3Player) { &lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;    player = MP3Player(self);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt; ... 
&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Here, we are passing module definitions as parameters. We are also referring to &lt;span style="font-style: italic;"&gt;SoundSystem&lt;/span&gt;’s current instance from within itself - a lot like classes, no? We might configure things thusly&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;SoundSystem(iPod);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;So it looks like mutual recursion and first class module definitions are very natural things to have. And yet traditional languages do not support this - even though many languages have constructs like classes and functions that are first class values and can be defined in a mutually recursive fashion. &lt;br /&gt;&lt;br /&gt;One problem with using these constructs to define modules is that they are usually able to access anything in the global namespace. This makes it very hard to avoid implicit dependencies.&lt;br /&gt;&lt;br /&gt;Interestingly, the global namespace is exactly what &lt;span style="font-weight: bold;"&gt;import&lt;/span&gt; requires. Since we don’t need or want &lt;span style="font-weight: bold;"&gt;import&lt;/span&gt;, let’s do away with it and the global namespace.  We clearly will get a much more modular system without it; but wait - there seems to be one place where we really want the global namespace. That is when we write our configuration code, the code that wires our modules together. &lt;br /&gt;&lt;br /&gt;That’s fine - there are a number of solutions for that. It isn’t always clear that our configuration language is the same language as the programming language(s) that define our modules, for example. If you write a makefile, the global namespace is defined by your file system and accessible within the makefile. 
Not that I really want to recommend make and its ilk.&lt;br /&gt;&lt;br /&gt;I think we do want to code our configuration in a nice general purpose high level programming language. One solution is to have our IDE provide us with an object representing the known global namespace, and write our configuration code with respect to that namespace object. This is essentially what we do in Newspeak.&lt;br /&gt;&lt;br /&gt;In the next post, I’ll discuss more of the advantages of this approach, contrast how Newspeak handles things with other languages with powerful module systems, like Scheme (which for the past decade or so has had a system called Units that is quite close to what I’ve discussed so far) and ML, and show once more how one actually does configuration in Newspeak.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;To be continued.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-1844927628139698674?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/1844927628139698674/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=1844927628139698674' title='26 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/1844927628139698674'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/1844927628139698674'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2009/06/ban-on-imports.html' title='A Ban on Imports'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' 
height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>26</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-5605434425459409714</id><published>2009-05-25T16:10:00.000-07:00</published><updated>2009-05-25T18:10:35.004-07:00</updated><title type='text'>Original Sin</title><content type='html'>I’ve often said that Java’s original sin was not being a pure object oriented language - a language where everything is an object.&lt;br /&gt;&lt;br /&gt;As one example, consider type &lt;span style="font-weight: bold;"&gt;char&lt;/span&gt;.  When Java was introduced, the Unicode standard required 16 bits. This later changed, as 16 bits were inadequate to describe the world’s characters.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Tangent: Not really surprising if you think about it. For example, Apple’s internationalization support allocated 24 bits per character in the early 1990s. However, engineering shortcuts are endemic. &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;In the meantime, Java had committed to a 16 bit character type. Now, if characters were objects, their representation would be encapsulated, and nobody would be very much affected by how many bits are needed. A primitive type like &lt;span style="font-weight: bold;"&gt;char&lt;/span&gt;, however, advertises its representation to the world. Consequently, people dealing with Unicode in Java have to deal with encoding code points themselves.&lt;br /&gt;&lt;br /&gt;I was having this conversation for the umpteenth time last week. My interlocutor asked whether it would be possible for Java to be as efficient as it was without primitive types. 
The answer is yes, and prompted this post.&lt;br /&gt;&lt;br /&gt;So how would we go about getting rid of primitive types without incurring a significant performance penalty?&lt;br /&gt;&lt;br /&gt;Java has a mandatory static type system; it is compiled into a statically typed assembly language (Java byte codes, aka JVML). It supports final classes. I do not favor any of these features, but we will take them as a given.  The only changes we will propose are those necessary to eradicate primitive types.&lt;br /&gt;&lt;br /&gt;Assume that we have a final class &lt;span style="font-weight: bold;"&gt;Int&lt;/span&gt; representing 32 bit integers.  The compiler can translate occurrences of this type into type &lt;span style="font-weight: bold;"&gt;int&lt;/span&gt;. Hence, we can generate the same code for scalars as Java does today with no penalty whatsoever.&lt;br /&gt;&lt;br /&gt;To make &lt;span style="font-weight: bold;"&gt;Int&lt;/span&gt; a suitable replacement for &lt;span style="font-weight: bold;"&gt;int&lt;/span&gt;, we would like the syntactic convenience of using operators:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;font-size:100%;" &gt;Int twice(Int i) { return i + i;}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This is easy enough. We don’t want operator overloading of course. What we want is simply the ability to define methods whose names are operators.  The language can support the same set of binary operators it does today, with the same fixed precedence.  No need for special syntax and rules to define the precedence of operators etc. I leave that sort of thing to people who love complexity.&lt;br /&gt;&lt;br /&gt;It is crucial that &lt;span style="font-weight: bold;"&gt;Int&lt;/span&gt;s behave as values. This requires that the &lt;span style="font-weight: bold;"&gt;==&lt;/span&gt; operator on &lt;span style="font-weight: bold;"&gt;Int&lt;/span&gt;s behaves the same as equality. 
There should be no way to detect if we’ve allocated one copy of the number 3 or a million of them. Since we can define &lt;span style="font-weight: bold;"&gt;==&lt;/span&gt; as a method, we can define it in &lt;span style="font-weight: bold;"&gt;Object&lt;/span&gt; to work the way it does for most objects, and override it with a method in &lt;span style="font-weight: bold;"&gt;Int&lt;/span&gt; and other value classes to work differently. We’ll want to take care of &lt;span style="font-weight: bold;"&gt;identityHash&lt;/span&gt; as well of course.&lt;br /&gt;&lt;br /&gt;At this point, we can simply say that &lt;span style="font-weight: bold;"&gt;int&lt;/span&gt; stands for &lt;span style="font-weight: bold;"&gt;Int&lt;/span&gt;. Except for locking. What would it mean to synchronize on an instance of &lt;span style="font-weight: bold;"&gt;Int&lt;/span&gt;? To avoid this nastiness, we’ll just say that all instances of &lt;span style="font-weight: bold;"&gt;Int&lt;/span&gt; are locked when they are created. Hence no user code can ever synchronize on them.&lt;br /&gt;&lt;br /&gt;In this hypothetical dialect, literal integers such as 3 are considered instances of &lt;span style="font-weight: bold;"&gt;Int&lt;/span&gt;. They, and all other integer results, can be stored in collections, passed to polymorphic code that requires type &lt;span style="font-weight: bold;"&gt;Object&lt;/span&gt; etc. The compiler sees to it that they are usually represented as the primitive integer type of the JVM. When they are used as objects, they get boxed into real instances of &lt;span style="font-weight: bold;"&gt;Int&lt;/span&gt;. When an object gets cast to an &lt;span style="font-weight: bold;"&gt;Int&lt;/span&gt;, it gets unboxed.&lt;br /&gt;&lt;br /&gt;This is similar to what happens today, &lt;span style="font-style: italic;"&gt;except that it is completely transparent&lt;/span&gt;.  
Because these objects have true value semantics for identity, you can never tell if a new instance was allocated by the boxing or not; indeed, you cannot tell if boxing or unboxing occur at all.&lt;br /&gt;&lt;br /&gt;So far I haven’t described anything really new. A detailed proposal along the lines above existed years ago.  It would allow adding new user defined value types like &lt;span style="font-weight: bold;"&gt;Complex&lt;/span&gt; as well.&lt;br /&gt;&lt;br /&gt;A more restricted, but somewhat similar proposal was considered as part of JSR201 - the idea being that boxing would create value objects, without actually replacing the existing primitive types. It was rejected by the expert group. As I recall, only myself and Corky Cartwright supported it. There were concerns as to what would become of the existing wrapper classes (&lt;span style="font-weight: bold;"&gt;Integer&lt;/span&gt; and friends); those are used by reflection etc., and are hopelessly broken because they have public constructors that allocate new instances for all the world to see.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Tangent: I don’t know who came up with the idea of an Integer class that could have multiple distinct instances of 3. I assume it was another engineering shortcut (see above) by someone really clever.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;To be clear - &lt;span style="font-style: italic; font-weight: bold;"&gt;I am not suggesting that this discussion should be reopened. This is just a mental exercise&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;There is only one place where this proposal introduces a performance penalty: polymorphic code on type &lt;span style="font-weight: bold;"&gt;Object&lt;/span&gt; that tests for identity. Now that identity testing is a method, it will be slower. Not by all that much, and only for type &lt;span style="font-weight: bold;"&gt;Object&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Why? 
Because we preclude overriding &lt;span style="font-weight: bold;"&gt;==&lt;/span&gt; in non-value classes. There are several ways to arrange this. Either by fiat, or by rearranging the class hierarchy with types &lt;span style="font-weight: bold;"&gt;Value&lt;/span&gt; (for value types) and &lt;span style="font-weight: bold;"&gt;Reference&lt;/span&gt; (for regular objects) as the only subtypes of &lt;span style="font-weight: bold;"&gt;Object&lt;/span&gt;, making &lt;span style="font-weight: bold;"&gt;==&lt;/span&gt; final in those two types. Or some other variation.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;What has not been seriously discussed, to my knowledge, is what to do with arrays. As indicated above, scalar uses of primitive types don’t pose much of a problem.&lt;br /&gt;&lt;br /&gt;However, what of types like &lt;span style="font-weight: bold;"&gt;int[]&lt;/span&gt;? Again, we can allocate these as arrays of 32 bit integers just as we do today.  No memory penalty, no speed penalty when writing in &lt;span style="font-weight: bold;"&gt;int&lt;/span&gt;s or reading them out.&lt;br /&gt;&lt;br /&gt;What makes things complicated is this: If &lt;span style="font-weight: bold;"&gt;int&lt;/span&gt; is a subtype of &lt;span style="font-weight: bold;"&gt;Object&lt;/span&gt; (as it is in the above) we’d expect &lt;span style="font-weight: bold;"&gt;int[]&lt;/span&gt; to be a subtype of &lt;span style="font-weight: bold;"&gt;Object[]&lt;/span&gt;, because in Java we expect covariant subtyping among arrays.&lt;br /&gt;&lt;br /&gt;Of course that isn’t type safe, and one could certainly argue that this is our chance to correct that problem.  But I won’t. 
Instead, assume we want to preserve Java’s covariant array subtyping.&lt;br /&gt;&lt;br /&gt;The thing is, we do not want to box all the elements of an&lt;span style="font-weight: bold;"&gt; int[]&lt;/span&gt; when we assign it to an &lt;span style="font-weight: bold;"&gt;Object[]&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;The zeroth order solution is to leave the existing subtype relations among arrays of the predefined types  unchanged. It’s ugly, but we are no worse off than we are today - and have a bunch of advantages because we are free of primitive types. But we can do better.&lt;br /&gt;&lt;br /&gt;Suppose &lt;span style="font-weight: bold;"&gt;Object[]&lt;/span&gt; has methods &lt;span style="font-weight: bold;"&gt;getObjectElement&lt;/span&gt; and &lt;span style="font-weight: bold;"&gt;setObjectElement&lt;/span&gt;. These (respectively) retrieve and insert elements into the array.&lt;br /&gt;&lt;br /&gt;We can equip &lt;span style="font-weight: bold;"&gt;int[]&lt;/span&gt; with versions of these methods that box/unbox elements that are being retrieved or inserted. We ensure that &lt;span style="font-weight: bold;"&gt;a[i]&lt;/span&gt; and &lt;span style="font-weight: bold;"&gt;a[i] = e&lt;/span&gt; are compiled differently based upon the static type of &lt;span style="font-weight: bold;"&gt;a&lt;/span&gt;. If &lt;span style="font-weight: bold;"&gt;a&lt;/span&gt; is &lt;span style="font-weight: bold;"&gt;Object[]&lt;/span&gt;, we use the &lt;span style="font-weight: bold;"&gt;getObjectElement&lt;/span&gt; and &lt;span style="font-weight: bold;"&gt;setObjectElement&lt;/span&gt;.  
If &lt;span style="font-weight: bold;"&gt;a&lt;/span&gt; is of type&lt;span style="font-weight: bold;"&gt; int[]&lt;/span&gt;, we use methods that access the integer representation directly, say, &lt;span style="font-weight: bold;"&gt;getIntElement&lt;/span&gt; and &lt;span style="font-weight: bold;"&gt;setIntElement&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;At this point, we can even introduce meaningful relations among array types like &lt;span style="font-weight: bold;"&gt;int[]&lt;/span&gt; and &lt;span style="font-weight: bold;"&gt;short[] &lt;/span&gt;using the same technique.&lt;br /&gt;&lt;br /&gt;In essence, &lt;span style="font-weight: bold;"&gt;Object[]&lt;/span&gt; provides an interface that different array types may implement differently. We just provide a sugar for accessing this interface magically based on type.  On a modern implementation like Hotspot, the overhead for this would be minimal.&lt;br /&gt;&lt;br /&gt;Of course, if you make arrays invariant, the whole issue goes away. That just makes things too easy. Besides, I’ve come to the conclusion that Java made the right call on array covariance in the first place.&lt;br /&gt;&lt;br /&gt;All in all - Java could have been purely object oriented with no significant performance hit. But it wasn’t, isn’t and likely won’t be.   
&lt;span style="font-style: italic;"&gt;Sic Transit Gloria Mundi.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-5605434425459409714?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/5605434425459409714/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=5605434425459409714' title='50 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5605434425459409714'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5605434425459409714'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2009/05/original-sin.html' title='Original Sin'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>50</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-5734360635972064061</id><published>2009-04-30T19:09:00.000-07:00</published><updated>2010-01-17T17:37:37.571-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Reflection'/><title type='text'>The Need for More Lack of Understanding</title><content type='html'>People are always claiming that if only there was more understanding in the world, it would be a better place.  
This post will argue that less is more: we need less understanding - specifically  more not understanding.&lt;br /&gt;&lt;br /&gt;A couple of weeks ago I gave a &lt;a href="http://msdn.microsoft.com/en-us/oslo/dd727737.aspx"&gt;talk&lt;/a&gt; at &lt;a href="http://www.sellsbrothers.com/conference/"&gt;DSL Dev Con.&lt;/a&gt; One of the encouraging things that was evident there was the increased understanding that &lt;span style="font-style: italic; font-weight: bold;"&gt;not&lt;/span&gt; understanding is important.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Tangent: While I'm advertising this talk, I might as well advertise &lt;/span&gt;&lt;a style="font-style: italic;" href="http://channel9.msdn.com/shows/Going+Deep/Gilad-Bracha-Inside-Newspeak/"&gt;my interview on Microsoft's channel 9&lt;/a&gt; which&lt;span style="font-style: italic;"&gt; explains the motivation for Newspeak and its relation to cloud computing.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Several programming languages support a mechanism by which a class or object can declare a general-purpose handler for method invocations it does not explicitly support.&lt;br /&gt;&lt;br /&gt;Smalltalk was, AFAIK, the first language to introduce this idea. You do it by declaring a method called &lt;span style="font-weight: bold;"&gt;doesNotUnderstand:&lt;/span&gt; . The method takes a single argument that represents a reification of the call. The argument tells us the name of the method that was invoked, and the actual arguments passed.  If a method &lt;span style="font-style: italic;"&gt;m&lt;/span&gt; is invoked on an object that does not have a member method &lt;span style="font-style: italic;"&gt;m&lt;/span&gt; (that is, &lt;span style="font-style: italic;"&gt;m&lt;/span&gt; is not declared by the class of the object or any of its superclasses), then the object’s  &lt;span style="font-weight: bold;"&gt;doesNotUnderstand:&lt;/span&gt; method is invoked. 
The default implementation of &lt;span style="font-weight: bold;"&gt;doesNotUnderstand:&lt;/span&gt;, declared in class &lt;span style="font-weight: bold;"&gt;Object&lt;/span&gt;, is to throw an exception. By overriding  &lt;span style="font-weight: bold;"&gt;doesNotUnderstand:&lt;/span&gt; one can control the system’s behavior when such calls are made. Similar mechanisms exist in several other dynamic languages (e.g., &lt;span style="font-weight: bold;"&gt;method_missing&lt;/span&gt; in Ruby, &lt;span style="font-weight: bold;"&gt;methodMissing&lt;/span&gt; in Groovy, &lt;span style="font-weight: bold;"&gt;__noSuchMethod__&lt;/span&gt; in some dialects of JavaScript).&lt;br /&gt;&lt;br /&gt;Aficionados of these languages know that this is an extremely useful mechanism. However, users of mainstream object-oriented languages typically lack an appreciation of the power this mechanism can provide. I hope this post can  be a small step in rectifying that situation.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;DoesNotUnderstand:&lt;/span&gt; helps implement orthogonal persistence, lazy loading,  futures, and remote proxies, to name a few.  Recently, there’s been a surge of interest in domain-specific languages, and &lt;span style="font-weight: bold;"&gt;doesNotUnderstand:&lt;/span&gt; can help there as well.&lt;br /&gt;&lt;br /&gt;We’ll use an example from my talk at DSL Dev Con.  Consider how to interact with an OS shell like bash or csh from within a general purpose programming language.  We’ll use Newspeak as our general purpose language (what were you expecting?), because it works best (in my unbiased opinion).&lt;br /&gt;&lt;br /&gt;Suppose you want a listing of the files in the current directory. You could view ls as a method on a shell object, and write: &lt;span style="font-weight: bold;"&gt;shell ls&lt;/span&gt;.  
Of course, we won’t do something like the following Java code:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;class Shell {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    public Collection&amp;lt;String&amp;gt; ls() {...}&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;... an infinity of other stuff&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;There are any number of commands that a shell can understand, depending on the current path and the executables in the directories on that path. We cannot plausibly enumerate them all as a fixed set of methods in &lt;span style="font-weight: bold;"&gt;Shell&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;Instead, we can define a class &lt;span style="font-weight: bold;"&gt;NewShell&lt;/span&gt; with a &lt;span style="font-weight: bold;"&gt;doesNotUnderstand:&lt;/span&gt; method to look up the name of the message in the shell’s path and execute it.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;shell ls&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;If we write this code in the context of a subclass of &lt;span style="font-weight: bold;"&gt;NewShell&lt;/span&gt;, we can take advantage of Newspeak’s implicit receiver sends and just write&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;ls&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Nice, but not quite good enough. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;ls aFilename&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;doesn’t work at all.  We don’t want to invoke ls immediately here - we need to gather its arguments in some way.  One way to do this is to have &lt;span style="font-weight: bold;"&gt;doesNotUnderstand:&lt;/span&gt; return a  function object, that can be fed its arguments.  This is in fact what we do in our implementation. We call this object a &lt;span style="font-weight: bold;"&gt;CommandSession&lt;/span&gt;. 
To get a &lt;span style="font-weight: bold;"&gt;CommandSession&lt;/span&gt; to actually run the command, you call one of its value methods, with the desired arguments:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;ls value: aFileName&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This is less convenient for the simple case, where we need to write&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;ls value&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;to get ls to do something - but it is much more general.&lt;br /&gt;&lt;br /&gt;What about modifiers, as in &lt;span style="font-weight: bold;"&gt;ls -l&lt;/span&gt; ? We can make simple cases work slightly better by defining -  as a method on &lt;span style="font-weight: bold;"&gt;CommandSession&lt;/span&gt; :&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;ls -’l’&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;This is what the current implementation does.&lt;br /&gt;&lt;br /&gt;The most general approach is to treat them as arguments&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;ls value: ‘-l’&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;ls value: ‘-l’ value: aFileName&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;An alternative might be to leave ls as it was originally, but allow&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;ls: aFileName&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;as well. In this version, &lt;span style="font-weight: bold;"&gt;doesNotUnderstand:&lt;/span&gt; checks to see if the message takes an argument (i.e., it ends with a colon). If so it strips the colon off the message name, creates a &lt;span style="font-weight: bold;"&gt;CommandSession&lt;/span&gt; for the result, and calls its &lt;span style="font-weight: bold;"&gt;value:&lt;/span&gt; method with the argument. 
This handles modifiers pretty well:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;ls: '-l'&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;If there are multiple arguments, we can pass a tuple as the argument, and &lt;span style="font-weight: bold;"&gt;doesNotUnderstand:&lt;/span&gt; will unpack it as needed.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;ls: {'-l'. aFileName}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Now how about pipes?&lt;br /&gt;&lt;br /&gt;We could introduce &lt;span style="font-weight: bold;"&gt;pipeValue&lt;/span&gt; methods that produce an object that responds to the pipe operator. Or we could say that everything produces a &lt;span style="font-weight: bold;"&gt;CommandSession&lt;/span&gt; (and these understand “&lt;span style="font-weight: bold;"&gt;|&lt;/span&gt;”), and that a special action is needed to get a result (sending it an &lt;span style="font-weight: bold;"&gt;evaluate&lt;/span&gt; or &lt;span style="font-weight: bold;"&gt;end&lt;/span&gt; message). This action is the analog of the newline that tells the shell to go ahead and evaluate. This could be dispensed with in a lazy setting.&lt;br /&gt;&lt;br /&gt;Combining our second proposal above with this, we could say that value is used to derive a result. Then we can view the shell as a combinator library for &lt;span style="font-weight: bold;"&gt;CommandSessions&lt;/span&gt;. This does conflate two issues - the use of &lt;span style="font-weight: bold;"&gt;CommandSession&lt;/span&gt; to delay evaluation until a result is needed (the shell parses the input as a unit, ensuring laziness) and the use of real combinators on byte streams.&lt;br /&gt;&lt;br /&gt;We use &lt;span style="font-weight: bold;"&gt;NewShell&lt;/span&gt; in our IDE - for example, to manipulate subversion commands in the source control browser. 
It would be nice to refine it further, perhaps along the lines suggested above, but even in its current simplistic incarnation, it is quite useful.&lt;br /&gt;&lt;br /&gt;As I noted at the beginning of this post, there are a host of other cool uses for &lt;span style="font-weight: bold;"&gt;doesNotUnderstand:&lt;/span&gt;. I may return to those in another post.&lt;br /&gt;&lt;br /&gt;Of course, if you are a fan of mandatory static typing, you aren’t allowed to use &lt;span style="font-weight: bold;"&gt;doesNotUnderstand:&lt;/span&gt; in your language. In the general case, it simply cannot be statically typed - which is an argument against mandatory typing, not against &lt;span style="font-weight: bold;"&gt;doesNotUnderstand:&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;Just as switch statements, catch clauses and regular expressions all need defaults/catch-alls/wildcards, so does method dispatch.  There are situations where you cannot avoid uncertainty.  Reality is dynamic.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-5734360635972064061?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/5734360635972064061/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=5734360635972064061' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5734360635972064061'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5734360635972064061'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2009/04/need-for-more-lack-of-understanding.html' title='The Need for More Lack of Understanding'/><author><name>Gilad 
Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-6796281891265227069</id><published>2009-04-18T17:17:00.000-07:00</published><updated>2010-01-17T17:54:06.392-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Web Platform and Objects as Software Services'/><title type='text'>The Language Designer’s Dilemma</title><content type='html'>Last week I was at &lt;a href="http://langnetsymposium.com/"&gt;lang.net 09&lt;/a&gt; and &lt;a href="http://sellsbrothers.com/conference/"&gt;DSL Dev Con&lt;/a&gt;. It was great fun, as always. The &lt;a href="http://langnetsymposium.com/2009/talks/23-ErikMeijer-LiveLabsReactiveFramework.html"&gt;best talk was by Erik Meijer&lt;/a&gt;. Use the link to see it - but without live video of Erik in action, you can't really understand what it was like. Such is performance art.&lt;br /&gt;&lt;br /&gt;I also gave a &lt;a href="http://langnetsymposium.com/2009/talks/08-GiladBracha-Hopscotch.html"&gt;talk&lt;/a&gt;, and there were many others, but that is not the point of this post. Rather, this post was prompted by one specific, and excellent, talk - Lars Bak’s &lt;a href="http://langnetsymposium.com/2009/talks/18-LarsBak-JavaScript.html"&gt;presentation&lt;/a&gt; on V8.  
Lars clearly enjoyed his visit to the lion’s den; more importantly, perhaps Microsoft will finally shake off their apparent paralysis and produce a competitive Javascript engine.&lt;br /&gt;&lt;br /&gt;That, in turn, will make it possible to distribute serious applications targeting the web browser, with the knowledge that all major browsers have performant Javascript engines that can carry the load.&lt;br /&gt;&lt;br /&gt;It’s all part of the evolution of the web browser into an OS. This was what Netscape foresaw back in 1995.  And it is a perfect example of &lt;a href="http://www.google.com/books?id=lqKho8KWXmAC&amp;amp;dq=Innovator%27s+Dilemma&amp;amp;printsec=frontcover&amp;amp;source=bn"&gt;innovator’s dilemma&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Innovator’s dilemma applies very directly to programming languages, and Todd Proebsting already &lt;a href="http://www.google.com/url?sa=U&amp;amp;start=1&amp;amp;q=http://ll2.ai.mit.edu/talks/proebsting.ppt&amp;amp;ei=yXHqSc_KKaXoswOQ1_TfAQ&amp;amp;sig2=SKypcCim41psrQv4fBXymA&amp;amp;usg=AFQjCNFRF-2e-HeTHnEJ15nXu3yrhkFaGg"&gt;discussed this&lt;/a&gt; back in 2002.&lt;br /&gt;&lt;br /&gt;To recap, the basic idea is that established players in a market support sustaining innovation but not disruptive innovation. Examples of sustaining innovation would be the difference between Windows 2000 and Windows XP, or between Java 1.0 and Java 6.&lt;br /&gt;Sustaining innovations are gradual improvements and refinements - some improvement in performance, or some small new feature etc.&lt;br /&gt;&lt;br /&gt;Disruptive innovation tends to appear “off the radar” of the established players. It is often inferior to the established product in many ways. It tends to start in some small niche where it happens to work better than the established technology. The majors ignore it, as it clearly can’t handle the “real” problems they are focused on. 
Over time, the new technology grows more competent and eats its way upward, consuming the previous market leader’s lunch.&lt;br /&gt;&lt;br /&gt;We’ve seen this happen many times before. Remember Sun workstations? PCs came in and took over that market. Sun retreated to the server business, and PCs chase it up towards the high end of that market, until there’s nowhere left to run.&lt;br /&gt;&lt;br /&gt;In the programming language space, take Ruby as an example. Ruby is slow, and until quite recently lacked IDE support etc. It’s easy for the Javanese to dismiss it.  Over time, such technology can evolve to be much faster, have better tooling etc. And so it may grow into Java’s main market.&lt;br /&gt;&lt;br /&gt;Don’t believe me? Java in 1995 was just a language for writing applets. Its performance was poorer than Smalltalk’s, and nowhere near that of C++.  There were no IDEs. Who's laughing now?&lt;br /&gt;&lt;br /&gt;Javascript is an even clearer case in point.  It was just a scripting language for web browsers.  Incredibly slow implementations, restricted to interacting with just the browser.  No debuggers, no tools, no software engineering support to speak of.&lt;br /&gt;&lt;br /&gt;Then V8 came along and showed people that it can be much faster. Lars has doubled performance since the release in September, and expects to double it again within a year, and again 2-3 years after that.&lt;br /&gt;&lt;br /&gt;Javascript security and modularity are evolving as well. By the time that’s done, I suspect it will far outstrip clunky packaging mechanisms like OSGi.  This doesn’t mean you have to write everything in Javascript - I don’t believe in a monolingual world.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Tangent: Yes, I did spend ten years at Sun. A lot of it was spent arguing that Java was not the final step in the evolution of programming languages. 
I guess I’m just not very persuasive.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The web browser itself is one of the most disruptive technologies in computing. It is poised to become the OS. Javascript performance is an essential ingredient but there are others. SVG provides a graphics model.  HTML 5 brings with it persistent storage - a file system, as it were - only better. You may recall that Vista was supposed to have a database-like file system. Vista didn’t deliver, but web browsers will.&lt;br /&gt;&lt;br /&gt;This trend seems to make the traditional OS irrelevant. Who cares what’s managing the disk drive? It’s just some commoditized component, like a power supply. We can already see how netbooks are prospering. This isn’t good news for Windows.&lt;br /&gt;&lt;br /&gt;Of course, you might ask why Microsoft would go along with this. Is IE’s Javascript so inadequate on purpose? Maybe it is part of the master plan? Well, I don’t believe in conspiracy theories.&lt;br /&gt;&lt;br /&gt;It does look as if Microsoft is facing a lose-lose proposition. Making IE competitive supports the movement toward a web-browser-as-OS world. Conversely, if they let IE languish, as they have, they can expect its market share to continue to drop.&lt;br /&gt;&lt;br /&gt;However, you cannot stop the trend toward the Web OS by making your tools inferior. You can only compete by innovating, not by standing still.&lt;br /&gt;&lt;br /&gt;I wouldn’t count Redmond out just yet. They have huge assets - a terrific research lab, armies of smart people in product land as well, market dominance, and vast amounts of money. They also have huge liabilities of course - and those add up to innovator’s dilemma.&lt;br /&gt;&lt;br /&gt;In any case, for language implementors, it’s clear that one needs to be able to compile to the internet platform and Javascript is its assembly language. 
Web programming will evolve into general purpose programming, using the persistent storage provided by the browser to function off line as well as online.&lt;br /&gt;&lt;br /&gt;As many readers of this blog will recognize, this sets the stage for the brave new world of objects as software services. I hope to bring Newspeak to the point where it is an ideal language for this coming world. It’s a bit difficult with the very limited resources at our disposal, but we are making progress.&lt;br /&gt;&lt;br /&gt;The entire progression from conventional computing to the world of the Web OS is taking longer than one expects, but is happening. The time will come when we hit the tipping point, and things will change very rapidly.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-6796281891265227069?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/6796281891265227069/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=6796281891265227069' title='17 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/6796281891265227069'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/6796281891265227069'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2009/04/last-week-i-was-at-lang.html' title='The Language Designer’s Dilemma'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>17</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-7076023764071614736</id><published>2009-03-29T12:44:00.000-07:00</published><updated>2010-01-17T17:42:05.892-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Modularity'/><title type='text'>Subsuming Packages and other Stories</title><content type='html'>Barbara Liskov recently won the Turing award. It’s great to see the importance of programming language work recognized once again. Even if you aren’t into the academic literature, you have probably heard her name, for example, when people mention the Liskov substitution principle (LSP). The LSP  states that if a property P is provable for an object of type S, and T is a subtype of S, then P is provable for an object of type T.&lt;br /&gt;&lt;br /&gt;This is behavioral subtyping - instances of subtypes are semantically substitutable where supertypes are expected. This is a very strong requirement, but not something a practical programming language can enforce.&lt;br /&gt;&lt;br /&gt;Closely related (and enforceable)  is the subsumption rule of type systems for OO languages. This rule states that if an expression e has type T, and T is a subtype of S, then e has type S as well.  This post is about the importance of subsumption in programming language design.&lt;br /&gt;&lt;br /&gt;The subsumption rule is a cornerstone of formal type systems for OO. More importantly, it captures a deep intuition that programmers have about subtyping. After all, what could be more natural: if e: T and T &amp;lt;: S, then e: S. I hope readers agree with me that this property is intuitive, perhaps even so obvious that you wonder why anyone would bother belaboring the point.&lt;br /&gt;&lt;br /&gt;***&lt;br /&gt;&lt;br /&gt;So it is most unfortunate that mainstream programming languages violate subsumption left and right. 
Below, we’ll look at some examples, and draw some conclusions about language design.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Case 1: Hiding fields.&lt;/span&gt; In Java (and C++) you can define a field in a subclass with type X, even though the superclass has a field of the same name, with unrelated type Y. The subclass’ field &lt;span style="font-style: italic;"&gt;hides&lt;/span&gt; the superclass field.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;class&lt;/span&gt; S { &lt;span style="font-weight: bold;"&gt;final&lt;/span&gt; &lt;span style="font-weight: bold;"&gt;int&lt;/span&gt; f = 42;}&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;class&lt;/span&gt; T &lt;span style="font-weight: bold;"&gt;extends&lt;/span&gt; S { &lt;span style="font-weight: bold;"&gt;final&lt;/span&gt; String f = "!";}&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;new&lt;/span&gt; T().f;  // "!"&lt;br /&gt;((S)&lt;span style="font-weight: bold;"&gt; new&lt;/span&gt; T()).f; // 42&lt;br /&gt;&lt;br /&gt;In case it isn’t obvious, any object of type S has the property that its &lt;span style="font-style: italic;"&gt;f&lt;/span&gt; field has type &lt;span style="font-weight: bold;"&gt;int&lt;/span&gt;. An instance of T should also have the same property, but doesn’t. To get at the field defined by S, we have to coerce the T value into an S.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Tangent: It would be much better to ensure that fields are always private, or better yet, avoid field references altogether, as in Self or Newspeak.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;A typical programmer in the Java/C++ tradition may ask what’s the harm in the rules as they are.&lt;br /&gt;&lt;br /&gt;The harm is precisely in having rules that contradict strong and valid intuition. 
The contradiction will bite you in the end, because you will end up thinking some program property holds based on your intuition, while the rules of the language contradict that intuition. Years of exposure to these misfeatures may have inured the mind to their toxicity, but they are toxic all the same.&lt;br /&gt;&lt;br /&gt;However, there is much more harm, as we’ll see as this post progresses.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Case 2: Hiding static methods.&lt;/span&gt;  The same sort of situation occurs with static methods. Static methods aren’t really methods at all, as no run time object is involved in determining the meaning of their invocations. No overriding can occur between static methods. The meaning of a static method is statically bound: static method is a misnomer for subroutine (as in FORTRAN subroutine circa 1955). And so Java allows static methods to hide each other much like fields.  You can easily recreate the above example with static methods.&lt;br /&gt;&lt;br /&gt;I’ve written elsewhere about the problems of all things &lt;span style="font-weight: bold;"&gt;static&lt;/span&gt;; add this to the list. Maybe you’re beginning to get the sense that violating subsumption correlates with iffy language constructs.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Case 3: Package privacy.&lt;/span&gt; This brings us back to Barbara Liskov. Her work on CLU in the mid 70s brought David Parnas’ notion of information hiding into programming languages, with support for abstract datatypes (ADTs). Packages are one of the mechanisms that Java uses to support ADTs. Packages have had all kinds of problems; lack of subsumption is at the root of many of them.&lt;br /&gt;&lt;br /&gt;For example, recall that in Java, protected access implies package access. Package access is always relative to the current package. 
So when a protected member is overridden by a subclass in another package, it eliminates the access from the superclass package (while granting access to its own package):&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;package&lt;/span&gt; P1;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;public&lt;/span&gt; &lt;span style="font-weight: bold;"&gt;class&lt;/span&gt; A {&lt;span style="font-weight: bold;"&gt;protected&lt;/span&gt; &lt;span style="font-weight: bold;"&gt;int&lt;/span&gt; foo(){&lt;span style="font-weight: bold;"&gt;return&lt;/span&gt; 91;}}&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;class&lt;/span&gt; C {&lt;span style="font-weight: bold;"&gt;static&lt;/span&gt; &lt;span style="font-weight: bold;"&gt;int&lt;/span&gt; i = &lt;span style="font-weight: bold;"&gt;(new&lt;/span&gt; P2.B()).foo();}&lt;br /&gt;&lt;br /&gt;**&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;package&lt;/span&gt; P2;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;public&lt;/span&gt; &lt;span style="font-weight: bold;"&gt;class&lt;/span&gt; B &lt;span style="font-weight: bold;"&gt;extends&lt;/span&gt; P1.A {}&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;This code is legal. If we then change B such that&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;public class&lt;/span&gt; B &lt;span style="font-weight: bold;"&gt;extends&lt;/span&gt; P1.A {&lt;span style="font-weight: bold;"&gt;protected&lt;/span&gt; &lt;span style="font-weight: bold;"&gt;int&lt;/span&gt; foo() {&lt;span style="font-weight: bold;"&gt;return&lt;/span&gt; 42;}}&lt;br /&gt;&lt;br /&gt;The call in C is no longer allowed!  We have not added or removed members in the subclass, or changed their accessibility (or have we? Did we have a choice?) 
and yet we’ve broken client code.&lt;br /&gt;&lt;br /&gt;I don’t want to go into the details of an even more bizarre &lt;a href="http://bracha.org/x-pkg.rtf"&gt;example&lt;/a&gt; (due, as far as I can recall, to David Chase) - but &lt;a href="http://dow.ngra.de/2009/02/16/the-ultimate-java-puzzler/"&gt;others have&lt;/a&gt;. Suffice it to say that there was a huge bug tail around such examples; it took years to resolve them. Compiler engineers did not correctly diagnose the problems, because their intuition led them to try and preserve subsumption, to no avail. I have several patents to my name that deal with the mechanisms of implementing the desired behavior.&lt;br /&gt;&lt;br /&gt;And why should you care?  Because these issues led to subtle incompatibilities between compilers and VMs, across vendors and releases; these led to subtle program bugs, to jokes about &lt;span style="font-style: italic;"&gt;write once debug everywhere&lt;/span&gt; etc.&lt;br /&gt;&lt;br /&gt;Packages are one way in which Java supports ADTs.  It turns out that you cannot have ADTs, inheritance and subsumption in one language. You must make a choice - and I argue that subsumption must always be preserved.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Case 4: Class-based Encapsulation.&lt;/span&gt;&lt;br /&gt;Another mechanism that supports ADTs in all the mainstream OO languages is class-based encapsulation. This is the idea that privacy is per class, not per object. 
This makes it possible to write code like&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;class&lt;/span&gt; C {&lt;br /&gt;   &lt;span style="font-weight: bold;"&gt;private int&lt;/span&gt; f(){&lt;span style="font-weight: bold;"&gt;return&lt;/span&gt; 42;}&lt;br /&gt;   &lt;span style="font-weight: bold;"&gt;public int&lt;/span&gt; foo(C c){&lt;span style="font-weight: bold;"&gt;return&lt;/span&gt; c.f();} // I can get at c’s f(), because we’re both C’s&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;It also makes it possible to violate subsumption:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;public class &lt;/span&gt;A {&lt;br /&gt;   {&lt;br /&gt;   ((A)&lt;span style="font-weight: bold;"&gt; new&lt;/span&gt; B()).f(); // 42&lt;br /&gt;   &lt;span style="font-weight: bold;"&gt;new&lt;/span&gt; B().f(); // "!"&lt;br /&gt;   }&lt;br /&gt;   &lt;span style="font-weight: bold;"&gt;private int&lt;/span&gt; f(){&lt;span style="font-weight: bold;"&gt;return&lt;/span&gt; 42;} // private to A&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;public class &lt;/span&gt;B &lt;span style="font-weight: bold;"&gt;extends&lt;/span&gt; A {&lt;br /&gt;   &lt;span style="font-weight: bold;"&gt;public&lt;/span&gt; String f(){&lt;span style="font-weight: bold;"&gt;return&lt;/span&gt; "!";}&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;An alternative to class-based encapsulation is &lt;span style="font-style: italic;"&gt;object-based encapsulation&lt;/span&gt;. Privacy is per object.  You can enforce it via context-free syntax. For example, you can arrange that only method invocations on &lt;span style="font-weight: bold;"&gt;self&lt;/span&gt; (aka &lt;span style="font-weight: bold;"&gt;this&lt;/span&gt;) attempt to look up private methods. You don’t need a typechecker (aka verifier) to prevent malicious parties accessing private methods or fields. 
Consequently, you don’t have to rely on a complex byte code verifier to prevent such access as one does in Java and .NET.&lt;br /&gt;&lt;br /&gt;This fits perfectly with the object-capability security model. It is essentially what we do in Newspeak; but the ideas are universal.&lt;br /&gt;&lt;br /&gt;Oh, and object-based encapsulation works with both subsumption and inheritance.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Case 5: Privileged access within “nests” of classes.&lt;/span&gt; Giving enclosing classes privileged access to their nested classes, and giving sibling classes privileged access to each other, is another violation of subsumption (but giving a nested class free access to its surrounding scope, including private elements of its enclosing class, is fine).&lt;br /&gt;&lt;br /&gt;Another way to look at all this is that object-based encapsulation falls out naturally from the use of interfaces. Consider what would happen if the type C above was, by language definition, the public interface of C (as it is in Strongtalk, for example). Subsumption would fall out automatically.&lt;br /&gt;&lt;br /&gt;Subsumption is a very good litmus test for language constructs; if a construct violates subsumption, it will cause grief. Ditch it.&lt;br /&gt;&lt;br /&gt;Strict adherence to subsumption reduces cognitive dissonance for programmers, compiler writers, VM implementors and language designers. It helps avoid bugs in user code, compilers and VMs, enhancing reliability and portability. It makes it easier to build secure systems.&lt;br /&gt;&lt;br /&gt;So if you design a type system for an OO language, preserve subsumption at all costs. Do not try and combine ADTs with inheritance (because then you lose subsumption).&lt;br /&gt;&lt;br /&gt;Now if we can’t have ADTs and inheritance, we have to choose between them. I believe the benefits of inheritance far outweigh the costs. 
Those who feel differently are unlikely to do better than &lt;a href="http://lucacardelli.name/"&gt;Luca Cardelli&lt;/a&gt;’s sublimely beautiful &lt;a href="http://lucacardelli.name/Papers/TypefulProg.pdf"&gt;Quest&lt;/a&gt;. On the other hand, if one chooses inheritance, one should go with object-based encapsulation.&lt;br /&gt;&lt;br /&gt;Put another way, if you take the adage &lt;span style="font-style: italic;"&gt;program to an interface, not an implementation&lt;/span&gt; seriously, it will prevent a host of problems. This is what we are doing with Newspeak.&lt;br /&gt;&lt;br /&gt;At a meta-level, the lesson is: pay attention to programming language theory; occasionally, you can learn from it.&lt;br /&gt;&lt;br /&gt;Update: Yardena pointed out &lt;a href="http://dow.ngra.de/2009/02/16/the-ultimate-java-puzzler/"&gt;this discussion of Java packages&lt;/a&gt;. It goes through the problems in detail. I've added a link to it in the main post as well.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-7076023764071614736?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/7076023764071614736/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=7076023764071614736' title='12 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/7076023764071614736'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/7076023764071614736'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2009/03/subsuming-packages-and-other-stories.html' title='Subsuming Packages and other Stories'/><author><name>Gilad 
Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>12</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-142319848391306271</id><published>2009-02-27T23:30:00.000-08:00</published><updated>2010-01-17T17:34:23.522-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><title type='text'>Newspeak Prototype Escapes into the Wild</title><content type='html'>The Newspeak prototype is now available at &lt;a href="http://newspeaklanguage.org/downloads/"&gt;http://newspeaklanguage.org/downloads/&lt;/a&gt; . We had planned to release it in early January, but decided to complete a few more things - like better source control support, a fully functional Hopscotch based debugger and a new GUI for unit testing (shown below).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_k3ghkt1e0I8/SajovPJraDI/AAAAAAAAAEA/uQcp6ELtWKE/s1600-h/testResults2.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 395px;" src="http://2.bp.blogspot.com/_k3ghkt1e0I8/SajovPJraDI/AAAAAAAAAEA/uQcp6ELtWKE/s400/testResults2.png" alt="" id="BLOGGER_PHOTO_ID_5307748059074750514" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;Newspeak is a long way from being finished - we are less than half way to realizing the vision of a robust, high performance, secure, network aware, high productivity platform. We can only get there if people care enough to use it and contribute to it.  
I hope you'll check it out.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-142319848391306271?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/142319848391306271/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=142319848391306271' title='19 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/142319848391306271'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/142319848391306271'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2009/02/newspeak-prototype-escapes-into-wild.html' title='Newspeak Prototype Escapes into the Wild'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_k3ghkt1e0I8/SajovPJraDI/AAAAAAAAAEA/uQcp6ELtWKE/s72-c/testResults2.png' height='72' width='72'/><thr:total>19</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-1728887574143465074</id><published>2009-01-13T10:43:00.000-08:00</published><updated>2010-01-17T17:56:27.729-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><title type='text'>Apologia</title><content type='html'>Here’s a quick update on the status of the Newspeak prototype release. 
If anyone is keeping score - I had said we’d put something out in the first week of January 2009.  So we’re behind schedule. I apologize, but what are you going to do - cut my funding?&lt;br /&gt;&lt;br /&gt;Nevertheless, we will be putting a prototype out soon. I’m holding off so that a few small things can be added.&lt;br /&gt;&lt;br /&gt;In the meantime, I wanted to take the opportunity to say a bit about the prototype. Obviously, due to our reduced circumstances, it is a lot less ambitious than I had once hoped.&lt;br /&gt;&lt;br /&gt;We do have a native GUI binding on Windows; I had hoped to have native GUI bindings for Mac and Linux, but that will depend on future open source contributions. You can of course run the system on those platforms, but you’ll have to live with the Morphic binding for now.&lt;br /&gt;&lt;br /&gt;The language itself remains incomplete as well - but the key features are there. I expect to rev the syntax and add the missing features (like object literals) over time. One of the reasons these features have been delayed is that I want to add them in the context of a completely new compiler.  Such a compiler will also make it easier to do ports - e.g., Newspeak on V8.&lt;br /&gt;&lt;br /&gt;One of the biggest deficits is the lack of libraries written in Newspeak. We continue to rely heavily on the existing Squeak libraries. We have made some progress on this front, largely through the efforts of some volunteers who got early access to the system. I want to take this opportunity to thank Yardena Meymann, David Pennell and Stephen Pair for their work on the library port. They are all busy professionals who were willing to take the time to do something concrete to help move Newspeak forward. I hope that we’ll see a lot more of that after the release!&lt;br /&gt;&lt;br /&gt;What we’ve been doing is porting some of the core libraries from Strongtalk to Newspeak. 
These libraries have several important advantages: they are small, yet complete enough to run a real system; they are blue book compatible (give or take), so we have a good chance of replacing our uses of Squeak code with them, without excessive disruption; they are quite cleanly written; they have type annotations; and, it so happens, they have a liberal license.&lt;br /&gt;&lt;br /&gt;On the other hand, these libraries were not purpose-built for Newspeak; no thought has been given to security, the designs do not leverage features like mixins as much as one might, etc.  Still, they provide us with the most realistic path to getting a small standalone system in the foreseeable future.&lt;br /&gt;&lt;br /&gt;This library code will not be anywhere near complete and integrated when we put out the prototype. But whatever we have will be available, and we’ll move on from there.&lt;br /&gt;There are any number of other things lacking, and no doubt many will be eager to point them out. However, that scarcely matters. What matters is the &lt;span style="font-style: italic;"&gt;potential&lt;/span&gt; this system has. It’s being put out, so those with the imagination, sophistication and, above all, &lt;span style="font-weight: bold;"&gt;taste&lt;/span&gt; to appreciate it can start using it and contributing to it - eventually realizing that potential.&lt;br /&gt;&lt;br /&gt;So for that small elite that has shown interest and appreciation for Newspeak so far - thanks, and hang in there. 
It will be available, Real Soon Now :-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-1728887574143465074?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/1728887574143465074/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=1728887574143465074' title='14 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/1728887574143465074'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/1728887574143465074'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2009/01/apologia.html' title='Apologia'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>14</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-9220526657191250140</id><published>2008-12-09T19:07:00.000-08:00</published><updated>2010-01-17T17:34:23.522-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><title type='text'>Living without Global Namespaces</title><content type='html'>Newspeak differs from most programming languages in that it doesn’t provide a global namespace. 
And it differs from most imperative programming languages, because it has no static state.&lt;br /&gt;&lt;br /&gt;I’ve spoken and written a fair amount about &lt;a href="http://gbracha.blogspot.com/2008_02_01_archive.html"&gt;why the absence of static state is a good thing&lt;/a&gt; . What I haven’t discussed much is how you actually organize programs in this way. There have been a lot of questions along these lines. This post is an attempt to answer some of them.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;Caveat&lt;/span&gt;: Some of the details here differ in the current prototype.  Some of the features are still incomplete. What's described here is how things are supposed to work. We're not far from that.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;First, let’s tackle the question of static state. It should be obvious: anything that you expected to put in a static variable goes in an instance variable of a module. What about singleton classes? How do I ensure that there’s only one instance? The easiest way is to initialize a read only slot of a module with an object literal. What happens if there are multiple instances of the module declaration? Well, each module has its own “singleton”. That’s exactly what happens with singleton classes in Java when they are defined by multiple class loaders.&lt;br /&gt;&lt;br /&gt;What if your class defines some service process and you need to be really sure there’s only one in the entire system? First, in many cases you may find that the system in question is your subsystem, defined by your modules, and the answer above applies.&lt;br /&gt;&lt;br /&gt;Now if you really mean “the entire system”, then you need to control that via some state in the platform object - through its links to the world’s state (e.g., the file system) or by having some registry in the platform object. 
Of course, not all code may see the true platform object, so it isn’t really global either; but it won’t matter.&lt;br /&gt;&lt;br /&gt;Having no static state doesn’t preclude having a global namespace, as long as that namespace doesn’t contain any stateful objects. The original plan for Newspeak was to have a global namespace of pure values, structured as an inversion of the internet domain namespace. This would have been much like the convention for naming Java packages (except that the scopes of namespaces would nest properly, as you’d expect). It was the only idea from Java that I saw a use for in Newspeak. It’s a good idea, but it turns out to be unnecessary.&lt;br /&gt;&lt;br /&gt;So, given no global namespace, what can I write at the top level? Remember, I can’t refer to any names, even things like &lt;span style="font-style: italic;"&gt;Object&lt;/span&gt; or &lt;span style="font-style: italic;"&gt;String&lt;/span&gt; that presumably exist in every implementation. This seems awkward. Not to worry - we won’t be writing SKI combinators or even plain old lambdas.&lt;br /&gt;&lt;br /&gt;We might be able to write some literal expressions like 1 + 2, but that isn’t all that interesting, and isn’t even necessary. What we need to write are things that produce new kinds of objects, like classes.&lt;br /&gt;&lt;br /&gt;Happily, we can write a top level class declaration, with one caveat: A top level class declaration cannot declare a superclass explicitly since there is no way to name it, because there is no enclosing namespace.  In that case, by special dispensation, the superclass will be the class &lt;span style="font-style: italic;"&gt;Object&lt;/span&gt; provided by the underlying platform. 
Similar rules apply to object literals (which can be thought of as “anonymous classes done right”).&lt;br /&gt;&lt;br /&gt;Ok, so now we can write a class, which can have other classes nested inside it, so it can be an entire library; and since there is no surrounding namespace, it is necessarily independent of any specifics of the environment - it is a module declaration. An example of such a module declaration would be the Newspeak AST:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;class&lt;/span&gt; NewspeakAST usingLib: platform { ....&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;... lots of nested AST classes ....&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;A similar class would be the &lt;span style="font-style: italic;"&gt;CombinatorialParsing&lt;/span&gt; library I’ve written about before.&lt;br /&gt;&lt;br /&gt;There’s just one little problem. How do I use such a class? I gave it a name, but no one can refer to it, since there isn’t any surrounding namespace for the name to be bound!&lt;br /&gt;&lt;br /&gt;Suppose I want to create a parser that builds an AST, using the two classes mentioned above. I need a grammar, which should be defined by a subclass of the parser library, and the parser class itself would in turn be a subclass of the grammar. 
Call these classes &lt;span style="font-style: italic;"&gt;Grammar&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;Parser&lt;/span&gt;.&lt;br /&gt;Since I can’t name the superclass of &lt;span style="font-style: italic;"&gt;Grammar&lt;/span&gt;, I’ll just define it as a mixin, and worry about how to pair it with the superclass later.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;class&lt;/span&gt; Grammar  =  { ....}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Likewise with &lt;span style="font-style: italic;"&gt;Parser&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;class&lt;/span&gt;&lt;span style="font-style: italic;"&gt; Parser usingLib: platform astLib: ast = { ...}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;That way I can define all the actual code required. The problem remaining is how to link all these pieces together.&lt;br /&gt;&lt;br /&gt;If I actually had a namespace where I could refer to the pieces, I could write linking code like:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;“confused”&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;main: platform {&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    MyGrammar = Grammar |&gt; CombinatorialParsing usingLib: platform.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    MyParser = Parser usingLib: platform astLib: NewspeakAST  |&gt; MyGrammar.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;    return:: MyParser  parse: ‘a string in my language, perhaps?’&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;So how would I go about creating such a namespace? This is ultimately a question of tooling. Suppose my IDE lets me load class objects dynamically - say by reading in serialized class objects saved in files on disk. 
When it loads such a class object, it can reflect on it to find out its name, and store the class object in a slot of the same name in some new object it creates.&lt;br /&gt;&lt;br /&gt;If I choose to load the classes, &lt;span style="font-style: italic;"&gt;Grammar&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;Parser&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;CombinatorialParsing&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;NewspeakAST&lt;/span&gt;, I can create an object that is precisely the namespace I needed. I can then modify its class by adding the &lt;span style="font-style: italic;"&gt;main:&lt;/span&gt; method listed above. This object is now an application, whose behavior is defined by its &lt;span style="font-style: italic;"&gt;main:&lt;/span&gt; method. I can serialize this application object to disk.&lt;br /&gt;&lt;br /&gt;Running my program then amounts to deserializing the object, and invoking its &lt;span style="font-style: italic;"&gt;main:&lt;/span&gt; method with an object representing the current platform.&lt;br /&gt;&lt;br /&gt;I’ve glossed over some crucial details here. We don’t really want to serialize the entire object, as it points to objects in our IDE, like &lt;span style="font-style: italic;"&gt;Object&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;Class&lt;/span&gt; and a few others. These are standard, and we can cut off the object graph  with symbolic links at these standard points, and have the deserializer hook up their equivalents on the destination.&lt;br /&gt;&lt;br /&gt;Is using the IDE this way cheating? After all, it ultimately resorts to using the namespace of the underlying file system (or the network, or a global IDE namespace, depending where the IDE fetches class objects from).  I think not. The truth is that this is what any language in the world does at some level. 
Whether we rely on a compiler that uses a &lt;span style="font-style: italic; font-weight: bold;"&gt;CLASSPATH&lt;/span&gt; environment variable to define a set of local directories, or on the IDE, or on makefiles in a given directory to link separately compiled files, it is ultimately the same: some tool uses the operating system to find pieces of program.&lt;br /&gt;&lt;br /&gt;We don’t have to use the IDE; we could use a preprocessor that understood directives that referred to classes in the file system instead. It could even use something as inane as &lt;span style="font-weight: bold; font-style: italic;"&gt;CLASSPATH&lt;/span&gt;. Of course, I’m not really recommending that.&lt;br /&gt;&lt;br /&gt;My key point is that the language needs nothing more than objects to serve as its namespaces.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-9220526657191250140?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/9220526657191250140/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=9220526657191250140' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/9220526657191250140'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/9220526657191250140'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2008/12/living-without-global-namespaces.html' title='Living without Global Namespaces'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-6152967728783830038</id><published>2008-12-05T15:14:00.000-08:00</published><updated>2010-01-17T17:34:23.522-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><title type='text'>Unidentified Foreign Objects  (UFOs)</title><content type='html'>I recently found out that Newspeak’s basic foreign function interface (FFI), called  &lt;span style="font-weight: bold;"&gt;Aliens&lt;/span&gt;, is being &lt;a href="http://wiki.squeak.org/squeak/6100"&gt;made available in Squeak&lt;/a&gt; (though that will require new VMs with the required primitives).  Thanks to John McIntosh for doing this.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;I should also thank Eliot Miranda for most of the original work on aliens, and  Vassili Bykov, Peter Ahe and Bill Maddox for the rest. Also thanks to Lars Bak, whose work on the Strongtalk FFI inspired the VM level view of aliens; and to Dave Ungar,  who was the first to understand that objects were all you needed on the language side of an FFI. Lastly, this post benefited immensely from conversations with Vassili.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;So I figured I’d write a little bit about Aliens from a high level perspective. As usual, the ideas apply to programming languages in general.&lt;br /&gt;&lt;br /&gt;In Smalltalk,  there isn’t a standard FFI. Various dialects provide different solutions, with varying degrees of functionality, performance and ease of use. To be honest, they are usually a poor fit with the surrounding language and fairly awkward to use. This inhibits Smalltalk’s interoperability with the rest of the world. 
I’d argue that the absence of a good, standard FFI has cost the Smalltalk community dearly.&lt;br /&gt;&lt;br /&gt;In Java, by contrast, native methods and JNI provide a standardized FFI. This mechanism is far from perfect, but at least there is a more or less standard solution.&lt;br /&gt;&lt;br /&gt;What these and other systems have in common is support for a special construct (such as the native modifier for methods, or declarations like &lt;span style="font-weight: bold;"&gt;extern C&lt;/span&gt;, or the truly ugly ad hoc FFI syntax extensions used in various Smalltalks) for foreign functions.&lt;br /&gt;&lt;br /&gt;Newspeak’s FFI was strongly influenced by the Strongtalk FFI; but unlike Strongtalk, Newspeak doesn’t have a special syntax for foreign calls. As &lt;span style="font-weight: bold;"&gt;Self&lt;/span&gt; showed many years ago, one doesn’t really need a special syntax for the FFI.  The foreign functions, APIs, DLLs etc. can all be represented as objects. They just happen to be &lt;span style="font-style: italic;"&gt;foreign&lt;/span&gt; objects.&lt;br /&gt;&lt;br /&gt;The idea of a foreign object, which we call an &lt;span style="font-weight: bold; font-style: italic;"&gt;alien&lt;/span&gt;, is at the foundation of the Newspeak FFI.&lt;br /&gt;&lt;br /&gt;For starters, any decent language should be able to represent functions as values; and in an object-oriented language, these values are objects, accessed via a standard interface. Foreign functions are just a different implementation of that interface.&lt;br /&gt;&lt;br /&gt;Another natural way to model a foreign function is as a method defined on a foreign object.  For example, one can view an entire DLL as an object with a set of methods corresponding to the functions defined by the DLL. 
Better yet, we could represent an entire API as an object, independently of what DLLs actually defined it.&lt;br /&gt;&lt;br /&gt;Aliens can be defined for different foreign languages; for example, while &lt;span style="font-weight: bold;"&gt;Alien&lt;/span&gt; is used to interface with C, we also have a class called &lt;span style="font-weight: bold;"&gt;ObjectiveCAlien&lt;/span&gt; that can be used to interface with ObjectiveC, which is the native language on MacOS X.  C Aliens and ObjectiveC Aliens do not interfere with each other, and when/if we need to add Java Aliens or CLR Aliens we can do that as well.&lt;br /&gt;&lt;br /&gt;The alien approach is also a good fit with security: one need not be concerned that code may bypass high level language safety guarantees by calling out to C; untrusted code can be prevented from doing that, simply by not providing any Alien library objects to it.&lt;br /&gt;&lt;br /&gt;Newspeak’s C Alien implementation is fast, but also dangerous. An alien is basically a blob of memory. The user of an Alien is responsible for interpreting and accessing that data correctly.  There is no checking being done for you.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Tangent: It's  worth noting that the basic Alien layer may evolve further; for example, we aren't thrilled with the practice of subclassing Alien. It's not clear if the Alien class really needs to change, or just the pattern of using it.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;On top of this foundation, safer and/or more convenient abstractions can be built.  We have built objects that support not just methods corresponding to the functions of an API, but also methods that provide factories for the various datatypes used in the function’s signatures, including those defined by macros. 
These objects wrap the basic alien API, and help with error prone book keeping - converting between Newspeak types (e.g., Strings) and foreign types, freeing aliens after use etc.&lt;br /&gt;&lt;br /&gt;At the moment, both the declarations of low level aliens and higher level APIs are constructed manually, which is tedious and  error prone. We’ve been planning on a higher level tool called CSlick, which would allow you to specify a set of .h files and the requisite DLLs, and obtain an object that supports the desired functions automatically.&lt;br /&gt;&lt;br /&gt;As a first approximation, you could think of CSlick as a function:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic; font-weight: bold;"&gt;CSlick: List&lt;headerfiles&gt; -&gt; List&lt;dll&gt; -&gt; ForeignAPI&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The signature above is deliberately curried, because you may actually want to be able to specify just the header files, and later bind different DLLs to provide the actual functionality, just as a .h file can be associated with different .c files.&lt;br /&gt;&lt;br /&gt;When this will happen is anyone’s guess right now; but &lt;a href="http://blog.3plus4.org/"&gt;Vassili&lt;/a&gt; has done this before (in the context of Lisp) and I’m sure he can do it again.&lt;br /&gt;&lt;br /&gt;The resulting foreign API should incorporate the low level alien API, and, as much as possible, a higher level API as well.&lt;br /&gt;&lt;br /&gt;The CSlick implementation will need to know how to parse C header files, and how to reflectively manufacture the low level code that actually invokes the C functions. Fortunately we have a strong parsing infrastructure, so that isn’t as daunting as it sounds.&lt;br /&gt;&lt;br /&gt;When I’ve told people about CSlick, they often mention SWIG. However, I believe CSlick can be made substantially easier to use than SWIG. SWIG has to cope with multiple languages, each with a pre-existing story on how to do foreign calls. 
In contrast, we can integrate CSlick more tightly with the language. Ultimately, that should translate to a simpler model for the user.&lt;br /&gt;&lt;br /&gt;The key take away is that objects are all you really need to interact with foreign programming languages.  They are better than built in language constructs in terms of ease of use, security, and multiple language support. As usual, less is more.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-6152967728783830038?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/6152967728783830038/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=6152967728783830038' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/6152967728783830038'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/6152967728783830038'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2008/12/unidentified-foreign-objects-ufos.html' title='Unidentified Foreign Objects  (UFOs)'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-5192679980025218518</id><published>2008-11-24T17:11:00.000-08:00</published><updated>2010-01-17T17:34:23.523-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><title type='text'>We have 
Good news, and we have Bad news</title><content type='html'>First, the good news. We will be releasing a Newspeak prototype in the first week of January 2009. This prototype is anything but finished, but it will have to do, because of the bad news (insert dramatic pause here).&lt;br /&gt;&lt;br /&gt;What's the bad news? Well, funding for the Newspeak team is being discontinued in early January, another victim of the times.&lt;br /&gt;&lt;br /&gt;We are currently seeking a new home for Newspeak, but it is by no means certain that such a thing can be found in the current economic climate.  Perhaps I should take the Newspeak private jet to Washington and demand a bail-out - but alas, the jet is indisposed (when I call it, all I get is &lt;span style="font-style: italic;"&gt;message not understood&lt;/span&gt;).&lt;br /&gt;&lt;br /&gt;We expect to keep pushing the Newspeak platform forward in any event; that said, there's a big difference between having several developers fully dedicated to a project, and the limited efforts people can make in their spare time. So the goal is to find support for the project soon - otherwise we'll all go get other jobs and progress on Newspeak will slow down accordingly.&lt;br /&gt;&lt;br /&gt;Of course, once we put out the prototype, others can help fill in the blanks. I hope that will happen, irrespective of whether Newspeak gets further funding. 
I also plan to write up our work on Newspeak for academic publication.&lt;br /&gt;&lt;br /&gt;Hopefully, this is only a temporary setback, and in it lies new opportunity: &lt;span style="font-style: italic;"&gt;per aspera ad astra&lt;/span&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-5192679980025218518?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/5192679980025218518/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=5192679980025218518' title='15 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5192679980025218518'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5192679980025218518'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2008/11/we-have-good-news-and-we-have-bad-news.html' title='We have Good news, and we have Bad news'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>15</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-685389284173953354</id><published>2008-11-01T20:29:00.000-07:00</published><updated>2010-01-17T17:37:37.572-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Reflection'/><title type='text'>Dynamic IDEs for Dynamic Languages!</title><content type='html'>Dynamic languages are all the rage these days, and rightly so. 
Historically, most of the currently popular dynamic languages have had negligible IDE support (since they are scripting languages).  More recently, we see a flurry of activity around IDE support for these languages. However, the mainstream IDEs are designed for, and implemented in, static languages. Consequently, even when dealing with a dynamic language, the dynamism stops when it comes to the IDE itself.&lt;br /&gt;&lt;br /&gt;What do I mean by this? In an IDE written in a dynamic language (such as Smalltalk or Self or Lisp), the IDE code itself can be modified on the fly. This is a double edged sword, as I’ll explain shortly. I’ll start with the good edge.&lt;br /&gt;&lt;br /&gt;Suppose you want to modify the IDE for some reason. It lacks a feature you need, or something is buggy. If it’s a proprietary system, you can file a bug with the vendor, and hope that they pay attention, so that in the next release, in a year or two, you’ll get the fix.&lt;br /&gt;&lt;br /&gt;If the system is open source, you can just go ahead and implement the fix/feature. Now if that system is written in a mainstream, conventional language, you just need to recompile the compilation unit with the fix, and rebuild your system.  Of course, in order to develop the fix, you had to load a project with the source for your IDE and probably go through several edit-compile-debug cycles. When it all worked, you replace your existing installation with your new build. I have the strange feeling that all this setting up the IDE as a project within itself, rebuilding and reinstalling could take quite a lot of time and effort.&lt;br /&gt;&lt;br /&gt;In an IDE like Smalltalk (or Newspeak, or Self), the IDE source is always available from within itself.  In such a system, if you change its source code, the change takes effect immediately. For example, the Newspeak class browser  supports a pop up menu for methods, that lets you find references to the method and to the messages sent within it.  
In the screen shot below, the speech bubble on the right denotes this menu:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_k3ghkt1e0I8/SQ0gTpFLsrI/AAAAAAAAADY/TSVnK1_vv-U/s1600-h/inspectPresenter0.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 55px;" src="http://1.bp.blogspot.com/_k3ghkt1e0I8/SQ0gTpFLsrI/AAAAAAAAADY/TSVnK1_vv-U/s400/inspectPresenter0.png" alt="" id="BLOGGER_PHOTO_ID_5263899061283173042" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;However, the initial version did not support the same functionality for slots.&lt;br /&gt;How do we fix this? Well, for starters, we need to find where the relevant source code is. How do you do this? The old fashioned way is to go on a trek through the source code trying to figure it out. What subdirectory should we start with? Which file should we open first? What should we grep for?&lt;br /&gt;&lt;br /&gt;Fortunately, Newspeak makes it easy, as most IDE components have an &lt;span style="font-style: italic;"&gt;Inspect Presenter&lt;/span&gt; menu item, that will open an object inspector on the actual presenter in question; a presenter on a presenter.&lt;br /&gt;&lt;br /&gt;For example, if we want to find out how the references menu for methods works, we can look at a method  like the one above. 
On the far right is a drop down menu:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_k3ghkt1e0I8/SQ0gotzDWkI/AAAAAAAAADg/OVd9k8JIruI/s1600-h/inspectPresenter1.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 194px; height: 100px;" src="http://1.bp.blogspot.com/_k3ghkt1e0I8/SQ0gotzDWkI/AAAAAAAAADg/OVd9k8JIruI/s400/inspectPresenter1.png" alt="" id="BLOGGER_PHOTO_ID_5263899423326558786" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This opens an object presenter, like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_k3ghkt1e0I8/SQ0gyhOkY1I/AAAAAAAAADo/g7QepAEsAb8/s1600-h/inspectPresenter2.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 328px;" src="http://2.bp.blogspot.com/_k3ghkt1e0I8/SQ0gyhOkY1I/AAAAAAAAADo/g7QepAEsAb8/s400/inspectPresenter2.png" alt="" id="BLOGGER_PHOTO_ID_5263899591751000914" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Toward the top is a line marked class; it contains a hyperlink to the class of the object being inspected - the class &lt;span style="font-style: italic;"&gt;ExpandableMethodPresenter&lt;/span&gt;. Now we know where the code that presents methods resides! If we click on the link, the Hopscotch browser will show us the class.&lt;br /&gt;&lt;br /&gt;Having found the code that manages the presentation of methods, and the implementation of the references menu, we next want to find the code that presents slots, so we can modify it. We do the same thing again - invoke the &lt;span style="font-style: italic;"&gt;Inspect Presenter&lt;/span&gt;  menu item, but this time on a slot.&lt;br /&gt;&lt;br /&gt;Once we’ve found the code and made the change, we can test it right away. 
Just hit refresh in the browser, and you’ll see the new reference bubble next to slots.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_k3ghkt1e0I8/SQ0jG3N8-wI/AAAAAAAAAD4/lMdnRrfN-Vg/s1600-h/header.PNG"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 154px;" src="http://3.bp.blogspot.com/_k3ghkt1e0I8/SQ0jG3N8-wI/AAAAAAAAAD4/lMdnRrfN-Vg/s400/header.PNG" alt="" id="BLOGGER_PHOTO_ID_5263902140274637570" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;All of this is part of what Dan Ingalls, and Smalltalkers in general, mean when they say the system is &lt;span style="font-weight: bold; font-style: italic;"&gt;alive&lt;/span&gt;. It’s the key property you seek in every system - its ability to respond to changes and mutate while running, which gives you the kind of interactivity you get in the real world with real objects.  &lt;span style="font-weight: bold; font-style: italic;"&gt;Live code&lt;/span&gt; in the IDE is the opposite of &lt;span style="font-style: italic; font-weight: bold;"&gt;dead code&lt;/span&gt; - stuff written in files, which needs to be explicitly loaded into a runtime to do anything.&lt;br /&gt;&lt;br /&gt;This all is very cool, but it also has a downside. Suppose you have a bug in your code, causing it to crash. You may be able to fix it in the debugger - but maybe the root of the problem is in another class. You need to browse it - but now, every time you try and browse a class, it crashes. You’ve shot yourself in the foot.&lt;br /&gt;&lt;br /&gt;This is where you almost wish you were looking at dead code, by browsing files in an editor (or a conventional IDE). 
Fortunately, we know how to deal with this problem as well.&lt;br /&gt;&lt;br /&gt;In this case, we want to manipulate the modified IDE from another, unmodified one, &lt;span style="font-style: italic; font-weight: bold;"&gt;as a dynamic program&lt;/span&gt;. That way we still enjoy immediate feedback for our changes, while keeping our feet safe.  Once it works, we want to be able to easily suck the changes into the unmodified version - without restarting, rebuilding, or going back to the big bang.&lt;br /&gt;&lt;br /&gt;We haven’t implemented this yet - but much of it’s been done before, in the mid-90s, by Allen Wirfs-Brock in the FireWall project at ParcPlace.  You do it using mirror based reflection.&lt;br /&gt;&lt;br /&gt;One of the many things mirrors facilitate is managing reflection in a distributed setting. Java’s JDI is an example of a mirror API that was designed to do just that - though it has serious limitations because it has to work on a variety of JVMs. If you design the mirror API correctly, and build your IDE on it, the IDE can work almost identically on programs within the same process, in another process, or across the internet.&lt;br /&gt;&lt;br /&gt;This ability to manage reflective changes to the IDE via a separate process is a luxury, which probably explains why it generally isn’t implemented. The reality is that shooting yourself in the foot is infrequent, and you can always recover (if only by saving your changes to a file before testing them). The benefits of liveness far outweigh the risks.&lt;br /&gt;&lt;br /&gt;Nevertheless, in time, I expect to address this in the Newspeak IDE. In the meantime, it’s alive, even if support for death is lacking. Of course, the broader lesson is that IDEs &lt;span style="font-weight: bold; font-style: italic;"&gt;especially&lt;/span&gt;, should be implemented using dynamic languages.&lt;br /&gt;&lt;br /&gt;The above echoes my previous post from June, but I hope that the concrete examples are helpful. 
The liveness makes it easy to implement features like &lt;span style="font-style: italic;"&gt;Inspect Presenter&lt;/span&gt;, which let you identify where to change the IDE, but  even more critically, liveness allows you to actually make the change and get instant feedback.&lt;br /&gt;&lt;br /&gt;In closing, I want to emphasize that this is not an unprincipled approach. All of the above is possible because the IDE can reflect on itself.  Self reference is at the heart of computing. It’s all about recursion. I find it strange that many theoretically oriented computer scientists, who can wax poetic about the Y combinator, eschew languages and systems that apply the same principles of self-application to themselves.  I’m sure this will change though; the good ideas win in the end.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-685389284173953354?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/685389284173953354/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=685389284173953354' title='15 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/685389284173953354'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/685389284173953354'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2008/11/dynamic-ides-for-dynamic-languages.html' title='Dynamic IDEs for Dynamic Languages!'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_k3ghkt1e0I8/SQ0gTpFLsrI/AAAAAAAAADY/TSVnK1_vv-U/s72-c/inspectPresenter0.png' height='72' width='72'/><thr:total>15</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-6860101018441749150</id><published>2008-09-20T14:25:00.000-07:00</published><updated>2010-01-17T17:34:23.523-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><title type='text'>Skinning Newspeak?</title><content type='html'>Whenever it comes to discussing language syntax, &lt;a href="http://en.wikipedia.org/wiki/Color_of_the_bikeshed"&gt;Parkinson’s law of triviality&lt;/a&gt; comes to mind.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Incidentally, the book is back in print! If you haven’t read it, check out &lt;a href="http://books.google.com/books?id=a3ktAAAAMAAJ&amp;amp;q=Parkinson%27s+Law&amp;amp;dq=Parkinson%27s+Law&amp;amp;ei=OG_VSMXXC5zOswPx1YmOBA&amp;amp;client=safari&amp;amp;pgis=1"&gt;this priceless classic&lt;/a&gt;.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Newspeak’s syntax currently resembles Smalltalk and Self. For the most part, this is a fine thing, but I recognize that it can be a barrier to adoption. So, as we get closer to releasing Newspeak we may have to consider, with heavy heart,  changes that could ease the learning curve.&lt;br /&gt;&lt;br /&gt;One approach is the idea of syntactic skins. You keep the same abstract language underneath, but adjust its concrete syntax. In theory, you can have a skin that looks Smalltalkish, and one that looks Javanese, and another sort of like Visual Basic etc.&lt;br /&gt;&lt;br /&gt;The whole idea of skins is closely related to Charles Simonyi’s notion of intentional programming. 
Cutting through the vapor, one of the few concrete things one can extract is the idea (not new or original with Simonyi) of an IDE that can present programs in a rich variety of skins, some of which are graphical.  That, and support for defining DSLs for the domain experts to program in. This is all a fine thing, as long as you understand that’s all it is. This is still a pretty tall order.&lt;br /&gt;In any case, Magnus Christerson is doing a superb job of making that vision a reality.&lt;br /&gt;&lt;br /&gt;It is of course crucial that any program can be automatically displayed in any skin in the IDE. And designing skins requires thought, and is prone to abuse, which makes me hesitate.&lt;br /&gt;&lt;br /&gt;Naming conventions that may make sense in one syntax may not really work in another, for example.  Take maps (dictionaries in Smalltalk). In Smalltalk, the number of arguments to a method is encoded in the method name. So class &lt;span style="font-weight: bold;"&gt;Dictionary&lt;/span&gt; has a method called &lt;span style="font-weight: bold;"&gt;at:put:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic; font-weight: bold;"&gt;aMap at: 3 put: 9.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;In a javanese language, you’d tend to use a different name, say, &lt;span style="font-style: italic; font-weight: bold;"&gt;put&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic; font-weight: bold;"&gt;aMap.put(3, 9);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;However, with skins, you need to either use the very same name&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;aMap.at:put:(3, 9);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;which looks weird and may even conflict with other parts of the syntax, or have some automated transformation&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;aMap.atPut(3, 9);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;All of which looks odd and may have issues (after 
all &lt;span style="font-weight: bold;"&gt;at:Put:&lt;/span&gt; would be a distinct, legal Newspeak method name which would also map to &lt;span style="font-weight: bold;"&gt;atPut&lt;/span&gt;). And what happens if you start writing your code in the javanese syntax? How do I map &lt;span style="font-weight: bold;"&gt;put&lt;/span&gt; into a 2 argument Newspeak method name? &lt;span style="font-weight: bold;"&gt;p:ut:&lt;/span&gt;? &lt;span style="font-weight: bold;"&gt;pu:t:&lt;/span&gt;? Maybe in this case, it takes a single tuple as its argument: &lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;aMap put: {3. 9}.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;There may be a creative way out; &lt;a href="http://blogs.msdn.com/madst/"&gt;Mads Torgersen&lt;/a&gt; once suggested a syntax like&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;aMap.at(3) put(9)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Or maybe we map all names into &lt;span style="font-weight: bold;"&gt;ratHole&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;The standard procedure call syntax also has more substantial difficulties. Without a typechecker, it’s hard to get the number of arguments right. The Smalltalk keyword syntax, while unfamiliar to many, has a huge advantage in a dynamically typed setting - no arity errors (if you’ve written any javascript, you probably know what I mean).&lt;br /&gt;&lt;br /&gt;In addition, the Smalltalk keyword notation is really nice for defining embedded DSLs, as &lt;a href="http://blog.3plus4.org/"&gt;Vassili&lt;/a&gt; notes in &lt;a href="http://blog.3plus4.org/2008/09/13/a-taste-of-implicit-receivers/"&gt;a recent post&lt;/a&gt;. This is a point that I want to expand upon at some future time.&lt;br /&gt;&lt;br /&gt;So I’m pretty sure that regardless of whether we use skins or not, we’ll retain the keyword message send syntax across all skins. It’s just a good idea for this sort of language. 
&lt;br /&gt;&lt;br /&gt;There are syntactic elements that are easy to tweak so that they are more familiar to a wider audience. In some cases, there are standard syntactic conventions that work well, and we can just adopt them.  For example, using curly braces as delimiters for class and method bodies (and also closures), or using &lt;span style="font-weight: bold;"&gt;return:&lt;/span&gt; instead of &lt;span style="font-weight: bold;"&gt;^&lt;/span&gt;. If these were the only issues, one might not really consider skins, since the differences are minor. The &lt;a href="http://bracha.org/newspeak-spec.pdf"&gt;current draft spec&lt;/a&gt; mentions some of these.&lt;br /&gt;&lt;br /&gt;Skins may be most valuable for issues in between the two extremes cited above. One of the most obvious is operator precedence. People have been taught a set of precedences in elementary school, and most have never recovered. Programmers have also learned C or something similar, so they have even more expectations in this regard.&lt;br /&gt;&lt;br /&gt;Newspeak, like Smalltalk, gives all binary operators equal precedence, evaluating them in order from left to right.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;5 + 4 * 3&lt;/span&gt; evaluates to &lt;span style="font-style: italic; font-weight: bold;"&gt;27&lt;/span&gt; in Smalltalk, not to &lt;span style="font-style: italic; font-weight: bold;"&gt;17&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;Now I have never, ever had a bug due to this, but many people get all worked up over this issue. Why not just give in, and follow standard precedence rules? Well, there is the question of whose rules - C, Java, Perl?  What about operators those languages don’t have (ok, so Perl probably has all operators in the known universe and some to spare)?&lt;br /&gt;&lt;br /&gt;Another issue is that Newspeak is designed to let you embed domain specific languages as libraries. Then the standard choices don’t always make sense. 
Allow people to set precedence explicitly you say? This is problematic. Newspeak aims to stay simple. This is a matter of taste and judgement. If you like an endless supply of bells and whistles, look elsewhere.&lt;br /&gt;&lt;br /&gt;Skins might give us an out.  Some skins would dictate the precedence of popular operators (leaving the rest left-to-right, as in Scala for example). This means your DSL may look odd in another skin, but maybe that’s tolerable.&lt;br /&gt;&lt;br /&gt;Once you have skins, you can also address issues that otherwise aren’t worth dealing with - like dots. If you really feel the need to write &lt;span style="font-weight: bold;"&gt;aList.get&lt;/span&gt; instead of &lt;span style="font-weight: bold;"&gt;aList get&lt;/span&gt;, a suitable skin could be made available.&lt;br /&gt;&lt;br /&gt;It looks like language skins can be used to bridge over minor syntactic differences, but not much more. On the other hand, if you don’t have skins, you have a better chance of establishing a shared lingua franca amongst a programming community.&lt;br /&gt;&lt;br /&gt;Overall, my sense is that such skins are more trouble than they’re worth.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-6860101018441749150?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/6860101018441749150/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=6860101018441749150' title='19 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/6860101018441749150'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/6860101018441749150'/><link rel='alternate' type='text/html' 
href='http://gbracha.blogspot.com/2008/09/skinning-newspeak.html' title='Skinning Newspeak?'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>19</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-7885861475997677863</id><published>2008-08-31T12:15:00.000-07:00</published><updated>2010-01-17T17:37:37.572-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><category scheme='http://www.blogger.com/atom/ns#' term='Reflection'/><title type='text'>Foreign functions, VM primitives and Mirrors</title><content type='html'>An issue that crops up in systems based on a virtual machine is: what are the primitives provided by the VM and how are they represented?&lt;br /&gt;&lt;br /&gt;One answer would be that those are simply the instructions constituting the virtual machine language (often referred to as byte codes). However, one typically finds that there are some operations that do not fit this mold. An example would be the &lt;span style="font-weight: bold; font-style: italic;"&gt;defineClass()&lt;/span&gt; method, whose job is to take a class definition in JVML (Java Virtual Machine Language) and install it into the running JVM.  Another would be the &lt;span style="font-weight: bold; font-style: italic;"&gt;getClass()&lt;/span&gt; method that every Java object supports.&lt;br /&gt;&lt;br /&gt;These operations cannot be expressed directly by the high level programming languages running on the VM, and no machine instruction is provided for them either. Instead, the VM provides a procedural interface. 
So while the Java platform exposes &lt;span style="font-style: italic; font-weight: bold;"&gt;getClass()&lt;/span&gt;, &lt;span style="font-weight: bold; font-style: italic;"&gt;defineClass()&lt;/span&gt; and the like, behind the scenes these Java methods invoke a VM primitive to do their job.&lt;br /&gt;&lt;br /&gt;Why aren’t primitives supported by their own, dedicated virtual machine language instructions? One reason is there are typically too many of them, and giving each an instruction might disrupt the instruction set architecture (because you might need too many bits for opcodes, for example). It’s also useful to have an open ended set of primitives, rather than hardwiring them in the instruction set.&lt;br /&gt;&lt;br /&gt;You won’t find much discussion of VM primitives in the Java world. Java provides no distinct mechanism for calling VM primitives. Instead, primitives are treated as native methods (aka foreign functions) and called using that mechanism. Indeed, in Java there is no distinction between a foreign function and a VM primitive: a VM primitive is a foreign function implemented by the VM.&lt;br /&gt;&lt;br /&gt;On its face, this seems reasonable. The JVM is typically implemented in a foreign language (usually C or C++) and it can expose any desired primitives as C functions that can then be accessed as native methods. It is very tempting to use one common mechanism for both purposes.&lt;br /&gt;&lt;br /&gt;One of the goals of this post is to explain why this is wrong, and why foreign functions and VM primitives differ and should be treated differently.&lt;br /&gt;&lt;br /&gt;Curiously, while Smalltalk defines no standardized FFI (Foreign Function Interface), the original specification defines a standard set of VM primitives. Part of the reason is historical: Smalltalk was in a sense the native language on the systems where it originated. 
Hence there was no need for an FFI (just as no one ever talks about an FFI in C), and hence primitives could not be defined in terms of an FFI and had to be thought of distinctly.&lt;br /&gt;&lt;br /&gt;However, the distinction is useful regardless. Calling a foreign function requires marshaling of data crossing the interface. This raises issues of different data formats, calling conventions, garbage collection etc. Calling a VM primitive is much simpler: the VM knows all there is to know about the management of data passed between it and the higher level language.&lt;br /&gt;&lt;br /&gt;The set of primitives is moreover small and under the control of the VM implementor. The set of foreign functions is unbounded and needs to be extended routinely by application programmers. So the two have different usability requirements.&lt;br /&gt;&lt;br /&gt;Finally, the primitives may not be written in a foreign language at all, but in the same language in a separate layer.&lt;br /&gt;&lt;br /&gt;So, I’d argue that in general one needs both an FFI and a notion of VM primitives (as in, to take a random example, Strongtalk). Moreover, I would base an FFI on VM primitives rather than the other way around. That is, a foreign call is implemented by a particular primitive (&lt;span style="font-style: italic; font-weight: bold;"&gt;call-foreign-function&lt;/span&gt;).&lt;br /&gt;&lt;br /&gt;Consider that native methods in Java are implemented with VM support; the JVM’s method representation marks native methods specially, and the method invocation instructions handle native calls accordingly. &lt;br /&gt;&lt;br /&gt;The Smalltalk blue book’s handling of primitives is similar; primitive methods are marked specially and handled as needed by the method invocation (send) instructions.&lt;br /&gt;&lt;br /&gt;It might be good to have one instruction, &lt;span style="font-style: italic; font-weight: bold;"&gt;invokeprimitive&lt;/span&gt;, dedicated to calling primitives. 
Each primitive would have an identifying code, and one assumes that the set of primitives would never exceed some predetermined size (8 bits?). That would keep the control of the VM entirely within the instruction set.&lt;br /&gt;&lt;br /&gt;It is good to have a standardized set of VM primitives, as Smalltalk-80 did. It makes the interface between the VM and built in libraries cleaner, so these libraries can be portable. We discussed doing this for the JVM about nine or ten years ago, but it never went anywhere.&lt;br /&gt;&lt;br /&gt;If primitives aren’t just FFI calls, how does one invoke them at the language level? Smalltalk has a special syntax for them, but I believe this is a mistake. In Newspeak, we view a primitive call as a message send to the VM. So it is natural to reify the VM via a VM mirror that supports messages corresponding to all the primitives.&lt;br /&gt;&lt;br /&gt;A nice thing about using a mirror in this way, is that access to primitives is now controlled by a capability (the VM mirror), so the standard object-capability architecture handles access to primitives just like anything else.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;To get this to really work reliably, the low level mirror system must prohibit installation of primitive methods by compilers etc.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Another desirable property of this scheme is that you can emulate the primitives in a regular object for purposes of testing, profiling or whatever. 
It's all a natural outgrowth of using objects and message passing throughout.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-7885861475997677863?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/7885861475997677863/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=7885861475997677863' title='11 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/7885861475997677863'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/7885861475997677863'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2008/08/foreign-functions-vm-primitives-and.html' title='Foreign functions, VM primitives and Mirrors'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>11</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-7477168280176165957</id><published>2008-07-27T11:30:00.000-07:00</published><updated>2010-01-17T17:56:27.729-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><title type='text'>Invisible Video</title><content type='html'>A quick update; several people have told me that the Smalltalk Solutions video of the Hopscotch demo is unhelpful, since you can't see the screen. I should have watched it before linking to it; I apologize. 
Since I was in the room during the presentation, I didn't think to watch it again. My bad.  I've taken the link down. We'll produce a viewable demo for download and post it in the near future.  Again, apologies to anyone who spent time downloading/watching the unwatchable.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-7477168280176165957?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/7477168280176165957/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=7477168280176165957' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/7477168280176165957'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/7477168280176165957'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2008/07/invisiible-video.html' title='Invisible Video'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-5523663196277658696</id><published>2008-07-26T12:39:00.000-07:00</published><updated>2010-01-17T17:58:08.351-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><category scheme='http://www.blogger.com/atom/ns#' term='Reflection'/><title type='text'>Debugging Visual Metaphors</title><content type='html'>My previous post commented on the unsatisfactory 
state of mainstream IDEs. Continuing with this theme, I want to focus on one of my long term pet peeves - debugger UIs.&lt;br /&gt;&lt;br /&gt;Donald Norman, in his book &lt;a href="http://www.amazon.com/Design-Everyday-Things-Donald-Norman/dp/0385267746"&gt;&lt;span style="font-style: italic;"&gt;The Design of Everyday Things&lt;/span&gt;&lt;/a&gt;, makes the point that truly good designs are easy to use, because they make it intuitive how they should be used - in his words, they &lt;span style="font-style: italic;"&gt;afford&lt;/span&gt; a usage pattern.&lt;br /&gt;&lt;br /&gt;How do you describe the state of a running program on a whiteboard? You draw the stack. So it seems to me that a stack is the natural visual metaphor for a debugger. Here is a screen shot of our new Hopscotch-based debugger, which will soon replace the Squeak debugger in the Newspeak IDE.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://bracha.org/Hopscotch%20debugger.png" /&gt;&lt;br /&gt;&lt;br /&gt;You can see that the debugger looks like a stack trace. Every entry in the stack trace can be expanded into a presenter that shows that stack frame - including the source code for the method in question, and a view of the state of the frame - the receiver, the variables in the activation, the top of the evaluation stack for expressions (i.e., the last value computed).&lt;br /&gt;&lt;br /&gt;Of course, this isn’t my idea. 
It’s another one of those great ideas from Self, as evidenced in this screenshot:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://bracha.org/Self%20debugger.png" /&gt;&lt;br /&gt;&lt;br /&gt;We already stole the idea from Self once before, in Strongtalk, as shown here.&lt;br /&gt;&lt;br /&gt;&lt;img style="width: 709px; height: 572px;" src="http://bracha.org/StrongtalkDebugger.jpg" /&gt;&lt;br /&gt;&lt;br /&gt;One of the nice properties of the stack oriented view is that you can view multiple stack frames at once, so you can see and reason about a series of calls, much as you would at your whiteboard.&lt;br /&gt;&lt;br /&gt;In contrast, most IDEs (even Smalltalk IDEs) show a multi-pane view, with one pane showing the source for the current method (in its file, naturally), one pane showing the state of the current frame, and one showing the stack trace.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://bracha.org/EclipseDebugger.jpg" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;You can’t easily see the state of the caller of the current method, or its code, while simultaneously looking at the current method (or another activation in the call chain). And if you modify the current method’s code (assuming you can do that at all), you’re likely locked into a mode, and can’t see the other frames unless you save or cancel your changes.&lt;br /&gt;&lt;br /&gt;Hopscotch GUIs are inherently non-modal - so you can modify any one of the methods you’re viewing, and then link to another page to view, say, the complete class declaration, all without opening another window and without having to save or lose your work.&lt;br /&gt;&lt;br /&gt;The fact that one rarely needs more than one window is one of the things I really like about Hopscotch. There’s no need for a docking bar, or tabs for that matter. 
Tabs are popular these days, but they don’t scale: they occupy valuable screen real estate, and beyond half a dozen or so become disorienting and unmanageable.&lt;br /&gt;Hopscotch does better than the mainstream, and better than previous efforts like Strongtalk or Self, partly because of its navigation model, and partly because of the inherent &lt;span style="font-weight: bold; font-style: italic;"&gt;compositionality&lt;/span&gt; of tools built with it.  The fact that I can move from the debugger to the class browser in the same window did not require special planning - it’s inherent in the way Hopscotch based tools behave - they can be composed by means of a set of combinators. Compositionality is one of the most crucial, and most often overlooked, properties in software design; it’s what sets Hopscotch apart.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;You can find out more about Hopscotch in this &lt;a href="http://bracha.org/hopscotch-wasdett.pdf"&gt;paper&lt;/a&gt; and on &lt;a href="http://blog.3plus4.org/"&gt;Vassili’s blog&lt;/a&gt;.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The design of debugger UIs is one example of something that needs to change in modern IDEs. There are others of course. Many are related to the basic problems of modality, navigation and proliferation of panes/windows noted above. Overall, your typical IDE UI is too much like the &lt;a href="http://uscockpits.com/Jet%20Bombers/B-52G%20Stratofortress%20center.JPG"&gt;control panel of a B-52 bomber&lt;/a&gt; or an &lt;a href="http://www.space1.com/Spacecraft_Data/Handbook_Illustrations/Apollo/Apollo_Control_Panel/apollo_control_panel.html"&gt;Apollo space capsule&lt;/a&gt;: a mind boggling array of switches, gauges, controls and wizards that interact with each other in myriad and confusing ways.  This is neither necessary nor desirable. 
Like explicit memory management and primitive types, in time we will progress beyond these.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-5523663196277658696?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/5523663196277658696/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=5523663196277658696' title='10 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5523663196277658696'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5523663196277658696'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2008/07/debugging-visual-metaphors.html' title='Debugging Visual Metaphors'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>10</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-8970800758112672874</id><published>2008-06-07T16:30:00.000-07:00</published><updated>2010-01-17T17:37:37.573-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Reflection'/><title type='text'>Incremental Development Environments</title><content type='html'>Back in 1997, when I started working at Sun, I did not expect to do much programming language design. After all, Java was completely specified in the original JLS, or so it was thought.&lt;br /&gt;&lt;br /&gt;What I actually expected to do was to work on a Java IDE. 
Given my Smalltalk background, I was of course very much aware of how valuable a good IDE is for programmers. The Java IDE I had in mind never materialized. Management at the time thought this was an issue to be left to other vendors. If this seems a little strange today - well, it seemed strange back then too. In any case, the Java world has since learned that IDEs are very important and strategic.&lt;br /&gt;&lt;br /&gt;That said, today’s Java IDEs are still very different from what I envisioned based on my Smalltalk background.&lt;br /&gt;&lt;br /&gt;Java IDEs have yet to fully embrace the idea of incremental development.  Look in any such system, and you’ll find a button or menu labeled &lt;span style="font-weight: bold; font-style: italic;"&gt;Build&lt;/span&gt;. The underlying idea is that a program is assembled from scratch every time you change it.&lt;br /&gt;&lt;br /&gt;Of course, that is how the real world works, right? You find a broken window on the Empire State Building, so you tear it down and rebuild it with the fixed window. If you're clever, you might be able to do it pretty fast. Ultimately, it doesn't scale.&lt;br /&gt;&lt;br /&gt;The build button comes along with an obsession with files. In C and its ilk, the notion of files is part of the language, because of #include.  Fortunately, Java did away with that legacy. Java programmers can think in terms of packages, independent of their representation on disk.  Why spend time worrying about class paths and directory structures?  What do these have to do with the structure of your program and the problem you’re trying to solve?&lt;br /&gt;&lt;br /&gt;The only point where files are useful in this context is as a medium for sharing source code or distributing binary code.  
Files are genuinely useful for those purposes, and Smalltalk IDEs have generally gone overboard in the opposite direction; but I digress.&lt;br /&gt;&lt;br /&gt;A consequence of the file/build oriented view is that the smallest unit of incremental change is the file - something that is often too big (if you ever notice the time it takes to compile a change, that’s too long) and moreover, not even a concept in the language.&lt;br /&gt;&lt;br /&gt;More fundamentally, what’s being changed is a static, external representation of the program code; there is no support for changing the live process derived from the code so that it matches up with the code that describes it. It’s like having a set of blueprints for a building (the code in the file) which doesn’t match the building (the process generated from the code).&lt;br /&gt;&lt;br /&gt;For example, once you add an instance variable to a class, what happens to all the existing instances on the heap? In Smalltalk, they all have the new instance variable. In Java - well, nothing happens.&lt;br /&gt;&lt;br /&gt;In general, any change you make to the code should be reflected immediately in the entire environment. This implies a very powerful form of fix-and-continue debugging (I note in amazement that after all these years, Eclipse still doesn’t support even the most basic form of fix-and-continue).&lt;br /&gt;&lt;br /&gt;All this is of course a very tall order for a language with a mandatory static type system.&lt;br /&gt;I’m not aware of a JVM implementation that can begin to deal with class schema changes (that is, changing the shapes of objects because their class has lost or acquired fields). It’s not impossible, but it is hard.&lt;br /&gt;&lt;br /&gt;Consider that removing a field requires invalidating any code that refers to it. In a language where fields are private, the amount of code to invalidate is nicely bounded to the class (good design decisions tend to simplify life). 
Public fields, apart from their nasty software engineering properties, add complexity to the underlying system.&lt;br /&gt;&lt;br /&gt;This isn’t just a headache for the IDE. The VM has to provide support for changing the classes of existing instances. However, in the absence of VM support there are all kinds of tricks one can play. If you compile all code yourself, you can ensure that no one accesses fields directly - everything goes through accessors. You can even rewrite imported binaries. With enough effort, I believe you can make it all work on an existing JVM with good JVMDI support.&lt;br /&gt;&lt;br /&gt;Changing code in a method is supported by JVMDI (well, the JVMDI spec allows for supporting schema changes as well - it’s just that it isn’t required and no one ever implemented it). However, what happens if you change the signature of a method?  Any number of existing callers may be invalid due to type errors.  The IDE needs to tell you about this pretty much instantaneously, invalidating all these callers. Most of this worked in Trellis/Owl back in the late 80s. The presence of the byte code verifier means that this applies to binary code as well.&lt;br /&gt;&lt;br /&gt;Achieving true incremental development is very hard. Still, given the number of people working on Java, you’d think it would have happened after all these years. It hasn’t, and I don’t expect it to.&lt;br /&gt;&lt;br /&gt;Someone will rightly make the point that mandatory typing can be very helpful in an IDE - it’s easier to find callers of methods, implementors, and references to fields or classes, and to refactor (though, oddly, all these features originated in IDEs for dynamic languages - Smalltalk or Lisp; speculating why that is would make for another controversial post). 
This post isn’t really about static/dynamic typing - it’s about incrementality in the IDE.&lt;br /&gt;&lt;br /&gt;Of course, mainstream IDEs annoy me for other reasons: the bloat, the slow startup, and most of all the B-52 bomber/Apollo space capsule style UI. That probably deserves another post.&lt;br /&gt;&lt;br /&gt;In the meantime, I can go back to &lt;a href="http://blog.3plus4.org/"&gt;Vassili’s&lt;/a&gt; fabulous &lt;a href="http://bracha.org/hopscotch-wasdett.pdf"&gt;Hopscotch&lt;/a&gt; browsers and leave the mainstream to cope with all the docking bars, tabs and panes too small to print their name. You, dear reader, if you’re using a mainstream IDE, may not realize what you’re missing. To an extent, these things have to be experienced to be appreciated. Still I encourage you to demand better - demand true incremental development support, in whatever language you use. Because in the end, there are only two kinds of development - incremental, and excremental.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-8970800758112672874?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/8970800758112672874/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=8970800758112672874' title='14 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/8970800758112672874'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/8970800758112672874'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2008/06/incremental-development-environments.html' title='Incremental Development Environments'/><author><name>Gilad 
Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>14</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-6406299503701798118</id><published>2008-05-06T09:56:00.000-07:00</published><updated>2010-01-17T17:34:23.523-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><title type='text'>The Future of Newspeak</title><content type='html'>Several people have asked me when Newspeak will be released. Well, I still don’t know, but at least now I know it &lt;span style="font-weight: bold;"&gt;will&lt;/span&gt; be released. Cadence has generously agreed to make Newspeak available as open source software under the Apache 2.0 license.&lt;br /&gt;&lt;br /&gt;We will be publishing a draft spec for Newspeak soon; I say draft because I expect Newspeak to continue to evolve substantially for the next year at least, and because the initial spec will necessarily be incomplete.&lt;br /&gt;&lt;br /&gt;It will be a while before we are ready to make a proper public release of Newspeak. In the meantime, I’ve gathered some information on my &lt;a href="http://bracha.org/Site/Newspeak.html"&gt;personal web site&lt;/a&gt;. 
We plan to set up an official Newspeak web site in the near future.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-6406299503701798118?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/6406299503701798118/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=6406299503701798118' title='20 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/6406299503701798118'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/6406299503701798118'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2008/05/future-of-newspeak.html' title='The Future of Newspeak'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>20</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-8743722749815113628</id><published>2008-04-26T07:43:00.000-07:00</published><updated>2010-01-17T17:52:38.012-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Web Platform and Objects as Software Services'/><title type='text'>java'scrypt</title><content type='html'>Everyone is talking about cloud computing these days; I should add some vapor to the mist.&lt;br /&gt;&lt;br /&gt;I began thinking seriously about the topic a bit after April Fools’ Day 2004, when Gmail was released. I started using Gmail shortly thereafter.  
After about 30 minutes, I was convinced that web clients had a long way to go.&lt;br /&gt;&lt;br /&gt;It is true that Gmail showed the world that you could do far more with Javascript in a browser than I or most other people had realized.&lt;br /&gt;&lt;br /&gt;Javascript and Ajax have come a long way since then. For an impressive demonstration of what’s possible, see the &lt;a href="http://research.sun.com/projects/lively/"&gt;Lively Kernel&lt;/a&gt;, a realization of John Maloney’s  Morphic GUI ideas of Self and Squeak, in Javascript. It only works in Safari 3.0 and some advanced Firefox builds, but that situation will improve in time.&lt;br /&gt;&lt;br /&gt;Javascript’s performance remains an issue; this will also improve dramatically in the foreseeable future. However, with all the talk about computing in the cloud, I think some caveats are in order.&lt;br /&gt;&lt;br /&gt;Many real applications will need to be used offline, and so will need to store state on the client.  They are NOT going to be pure server-side web applications. Google has finally acknowledged as much with the introduction of Google Gears. Microsoft’s LiveMesh naturally emphasizes this even more.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;I say naturally, because Microsoft has an interest in the personal computer, while Google has an interest in taking us to an updated version of X terminals and 1960s time sharing.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Of course, there are other technologies in this space, like Adobe AIR.  
Macromedia (now part of Adobe) saw this coming earlier than most, going back to at least 2003 with its &lt;/span&gt;&lt;a style="font-style: italic;" href="http://www.adobe.com/products/central/"&gt;Central project&lt;/a&gt;&lt;span style="font-style: italic;"&gt;.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;With all these pieces in place, we may finally have the underlying elements necessary to build good applications delivered through the web browser.&lt;br /&gt;&lt;br /&gt;Why has this taken so long? In theory, Java was supposed to deliver all this over ten years ago. Java started as a client technology, and applets were supposed to do exactly what advanced Ajax applications do today, only better.&lt;br /&gt;&lt;br /&gt;Of course, the problem was that Java clients behaved very poorly; rather than fix the client technology, Sun captured the lucrative enterprise business. To understand why Sun behaved that way, why it had to behave that way, read  &lt;a href="http://books.google.com/books?id=SIexi_qgq2gC&amp;amp;dq=innovator%27s+dillema&amp;amp;pg=PP1&amp;amp;ots=AhrUcHA7Em&amp;amp;sig=9cwl6_QMtk4bSvX7ldw4i_UvC48&amp;amp;hl=en&amp;amp;prev=http://www.google.com/search?client=safari&amp;amp;rls=en-us&amp;amp;q=Innovator%27s+Dillema&amp;amp;ie=UTF-8&amp;amp;oe=UTF-8&amp;amp;sa=X&amp;amp;oi=print&amp;amp;ct=title&amp;amp;cad=one-book-with-thumbnail"&gt;The Innovator’s Dilemma&lt;/a&gt;.&lt;br /&gt;Maybe I’ll expand on that theme in another post.&lt;br /&gt;&lt;br /&gt;The result of Java’s failure on the client was that the entire vision of rich network-enabled client applications was put on hold for a decade.   It is now once again coming to the fore,  but there are other languages in that space now, Javascript chief among them.&lt;br /&gt;&lt;br /&gt;I’m not advocating writing clients in Javascript directly. Javascript is the assembly language of the internet platform (and the browser is the OS). 
It’s flexible enough to support almost anything on top of it, and poorly structured enough that humans shouldn’t be writing sizable applications in it.&lt;br /&gt;&lt;br /&gt;I am not in favor of the attempts to make &lt;a href="http://www.ecmascript.org/docs.php"&gt;Javascript 2&lt;/a&gt; be another Java either. Javascript’s importance stems from the existing version’s position in the browser, and from its flexibility.   It should be kept simple. Javascript's value increases the more uniformly it is implemented across browsers. A new, complex language will take a long time to reach that point.&lt;br /&gt;&lt;br /&gt;People should program in a variety of languages that suit them, and have these compiled to Javascript and HTML (GWT is an example of that, though not one I really like).  One shouldn't have to deal with the hodgepodge of Javascript and HTML (broadly defined to include XML, CSS etc.)  that constitutes web programming today.&lt;br /&gt;&lt;br /&gt;I don’t expect native clients to go away any time soon though. They will provide a better experience than the browser in many ways for quite a long time. The advantage of writing in a clean, high-level language and platform is that the code can be compiled to run either on the “WebOS” or on a native OS.&lt;br /&gt;&lt;br /&gt;What one wants to see in such a platform is the ability to take advantage of the network in a deep way - for software delivery and update and for data synchronization. Which brings me back to my theme of objects as software services (see the &lt;a href="http://video.google.com/videoplay?docid=-162051834912297779"&gt;video&lt;/a&gt;), which I’ve been preaching publicly since 2005 (see my &lt;a href="http://www.bracha.org/oopsla05-dls-talk.pdf"&gt;OOPSLA 2005 presentation&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;One should be able to write an application once, and deploy it both to the web browser and natively to any major platform. 
In all those scenarios, one should be able to update the software and data on the fly,  run it on- and offline and monitor it over the network continuously. And all this needs to be done in a way that preserves users’ security. This last bit is still a big challenge.&lt;br /&gt;&lt;br /&gt;The technology to do this is becoming available, even though everything takes much longer than one expects.  The vision has its roots in research in the 1980s, continued in the 1990s with Java (and others, now largely forgotten, like &lt;a href="http://en.wikipedia.org/wiki/General_Magic"&gt;General Magic&lt;/a&gt;'s  &lt;a href="http://cgibin.erols.com/ziring/cgi-bin/cep/cep.pl?_key=Telescript"&gt;Telescript&lt;/a&gt;) and will probably be realized in the 2010s, per Alan Kay’s 30-year rule.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-8743722749815113628?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/8743722749815113628/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=8743722749815113628' title='16 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/8743722749815113628'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/8743722749815113628'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2008/04/everyone-is-talking-about-cloud.html' title='java&apos;scrypt'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>16</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-5601241685320247528</id><published>2008-03-23T10:48:00.000-07:00</published><updated>2010-01-17T17:36:35.103-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Modularity'/><title type='text'>Monkey Patching</title><content type='html'>Earlier this month I spoke at the &lt;a href="http://www.sfi.org.pl/festival"&gt;International Computer Science Festival in Krakow&lt;/a&gt;. Krakow is a beautiful city with several universities, and it is becoming a high tech center, with branches of companies like IBM and Google. The CS festival draws well over a thousand participants; the whole thing is organized by students. While much of the program was in Polish, there were quite a few talks in English.&lt;br /&gt;&lt;br /&gt;Among these was &lt;a href="http://chadfowler.com/"&gt;Chad Fowler&lt;/a&gt;’s talk on Ruby. Chad is a very good speaker, who did an excellent job of conveying the sense of working in a dynamic language like Ruby. Almost everything he said would apply to Smalltalk as well.&lt;br /&gt;&lt;br /&gt;One of the points that came up was the custom, prevalent in Ruby and in Smalltalk, of extending existing classes with new methods in the service of some other application or library. Such methods are often referred to as extension methods in Smalltalk, and the practice is supported by tools such as the Monticello source control system.&lt;br /&gt;&lt;br /&gt;As an example, I’ll use my parser combinator library, which I’ve described in previous posts. 
To define a production for an &lt;span style="font-weight: bold;"&gt;if&lt;/span&gt; statement, you might write:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;if:: tokenFromSymbol: #if.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;then:: tokenFromSymbol: #then.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;ifStat:: if, expression, then, statement.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;assuming that productions for &lt;span style="font-weight: bold; font-style: italic;"&gt;expression&lt;/span&gt; and &lt;span style="font-style: italic; font-weight: bold;"&gt;statement&lt;/span&gt; already exist. The purpose of the rules if and then is to produce a tokenizing parser that accepts the symbols &lt;span style="font-style: italic; font-weight: bold;"&gt;#if&lt;/span&gt; and &lt;span style="font-style: italic; font-weight: bold;"&gt;#then&lt;/span&gt; respectively.  It might be nicer to just write:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;ifStat:: #if, expression, #then, statement.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;and have the system figure out that, in this context, you want to denote a parser for a symbol by the symbol directly, much as you would when writing the BNF.&lt;br /&gt;&lt;br /&gt;One way of achieving this would be to actually go and add the necessary methods to the class &lt;span style="font-style: italic; font-weight: bold;"&gt;Symbol&lt;/span&gt;, so that symbols could behave like parsers.  I know otherwise intelligent people who are prepared to argue for this approach. As I said, Smalltalkers would call these additions extension methods, but I find the more informal term &lt;span style="font-weight: bold;"&gt;monkey patching&lt;/span&gt; conveys a better intuition.&lt;br /&gt;&lt;br /&gt;Typically, one wants to deliver a set of such methods as a unit, to be installed when a certain class or library gets loaded.  
So these changes are often provided as a patch that is applied dynamically. Not a problem in Smalltalk or Ruby or Python (though I gathered from the Pythoners in Krakow that they, to their credit, frown on the practice).&lt;br /&gt;&lt;br /&gt;Apparently, there is a need to explain why monkey patching is a really bad idea. For starters, the methods in one monkey’s patch might conflict with those in some other monkey’s patch. In our example, the sequencing operator for parsers (the comma) conflicts with the comma that symbols already use for concatenation.&lt;br /&gt;&lt;br /&gt;A mere flesh wound, says our programming primate: I usually don’t get conflicts, so I’ll pretend they won’t happen. The thing is, as things scale up, rare occurrences get more frequent, and the costs can be very high.&lt;br /&gt;&lt;br /&gt;Another problem is API bloat. You can see this in &lt;a href="http://www.squeak.org/"&gt;Squeak&lt;/a&gt; images, where a lot of monkeying about has taken place over the years. Classes like &lt;span style="font-weight: bold; font-style: italic;"&gt;Object&lt;/span&gt; and &lt;span style="font-weight: bold; font-style: italic;"&gt;String&lt;/span&gt; are polluted with dozens of methods contributed by enterprising individuals who felt that their favorite convenience method was something the whole world needs to benefit from.&lt;br /&gt;&lt;br /&gt;Even in your own code, you need to exercise restraint lest your API become obese with convenience methods. Big APIs eat up memory for both people and machinery, reducing responsiveness as well as learnability.&lt;br /&gt;&lt;br /&gt;Then there is the small matter of security. 
If people are free to patch the definition of a class like &lt;span style="font-style: italic; font-weight: bold;"&gt;String&lt;/span&gt; (typically on the fly when their code gets loaded), what’s to stop malicious macaques from replacing critical methods with really damaging stuff?&lt;br /&gt;&lt;br /&gt;The counterargument is that in many cases (though not in this example), the patch is designed to avoid the use of &lt;span style="font-weight: bold; font-style: italic;"&gt;typecase/switch/instance-of&lt;/span&gt; constructs, which bring  their own set of evils to the table.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://lamp.epfl.ch/%7Eemir/written/MatchingObjectsWithPatterns-TR.pdf"&gt;Extractors&lt;/a&gt; are a new approach to pattern matching developed by &lt;a href="http://lamp.epfl.ch/%7Eodersky/"&gt;Martin Odersky&lt;/a&gt; for &lt;a href="http://www.scala-lang.org/"&gt;Scala&lt;/a&gt;. They overcome the usual difficulty with pattern matching, which is that it violates encapsulation by exposing the implementation type of data, just like &lt;span style="font-style: italic; font-weight: bold;"&gt;instance-of&lt;/span&gt;.  They may be part of the answer here as well.&lt;br /&gt;&lt;br /&gt;However, many monkey patches are motivated by a desire for syntactic sugar, as the example shows.  Extractors won’t help here.&lt;br /&gt;&lt;br /&gt;A variety of language constructs have been devised to deal with this and related situations. &lt;a href="http://www.iam.unibe.ch/%7Escg/Archive/Papers/Berg03aClassboxes.pdf"&gt;Class boxes&lt;/a&gt; and selector namespaces in Smalltalk dialects, &lt;a href="http://www.swa.hpi.uni-potsdam.de/cop/"&gt;context-oriented programming&lt;/a&gt; in Lisp and Smalltalk, &lt;a href="http://en.wikipedia.org/wiki/Extension_method"&gt;static extension methods in C#&lt;/a&gt; and even &lt;a href="http://www.haskell.org/tutorial/classes.html"&gt;Haskell type classes&lt;/a&gt; are related. 
These mechanisms don’t all provide the same functionality, of course. I confess that I find none of them attractive. Each comes at a price that is too high for what it provides.&lt;br /&gt;&lt;br /&gt;For example, C# extension methods rely on mandatory typing. Furthermore, they would not address the example above, because we need the literal symbols we use in the grammar to behave like parsers when passed into the parser combinator library code, not just in the lexical scope of the grammar.&lt;br /&gt;&lt;br /&gt;Haskell type classes are much better. They would work for this problem (and many others), but also rely crucially on mandatory typing.&lt;br /&gt;&lt;br /&gt;Class boxes are dynamic, but again only affect the immediate lexical scope. The same is true of simple formulations of selector namespaces. Richer versions let you import the desired selectors elsewhere, but I find this gets rather baroque. I'm not sure how COP (context-oriented programming) meshes with security; so far it seems too complex for me to consider.&lt;br /&gt;&lt;br /&gt;I’ve contemplated a change to the Newspeak semantics that would accommodate the above example, but it hasn’t been implemented, and I have mixed feelings about it. If a literal like  &lt;span style="font-weight: bold; font-style: italic;"&gt;#if &lt;/span&gt;is interpreted as an invocation of a factory method on &lt;span style="font-weight: bold; font-style: italic;"&gt;Symbol&lt;/span&gt;, then we can override &lt;span style="font-weight: bold; font-style: italic;"&gt;Symbol&lt;/span&gt; so that it supports the parser combinators. This only affects symbols created in a given scope, but isn’t just syntactic sugar like the C# extension methods suggested above.&lt;br /&gt;&lt;br /&gt;Of course, this can be horribly abused; one shudders to think what a band of baboons might make of the freedom to redefine the language’s literals. 
On the other hand, used judiciously, it is great for supporting domain specific languages.&lt;br /&gt;&lt;br /&gt;So far, I have no firm conclusions about how to best address the problems monkey patching is trying to solve. I don’t deny that it is expedient and tempting. Much of the appeal of dynamic languages is of course the freedom to do such things. The contrast with a language like Java is instructive.  Adding a method to &lt;span style="font-weight: bold; font-style: italic;"&gt;String&lt;/span&gt; is pretty much impossible. One has to sacrifice one’s first-born to the gods of the &lt;a href="http://jcp.org/en/home/index"&gt;JCP&lt;/a&gt; and wait seven years for them to decide whether to add the method or not. I’m not endorsing that model either: I know it only too well.&lt;br /&gt;&lt;br /&gt;Regardless, given my flattering portrayals of primate practices, you may deduce that my main comment on monkey patching is “just say no”.  The problems it induces far outweigh its benefits. If you feel tempted, think hard about design alternatives. 
One can do better.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-5601241685320247528?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/5601241685320247528/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=5601241685320247528' title='21 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5601241685320247528'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5601241685320247528'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2008/03/monkey-patching.html' title='Monkey Patching'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>21</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-7020647932848543962</id><published>2008-02-17T17:22:00.000-08:00</published><updated>2008-02-17T17:28:06.445-08:00</updated><title type='text'>Cutting out Static</title><content type='html'>Most imperative languages have some notion of static variable. This is unfortunate, since static variables have many disadvantages. I have argued against static state for quite a few years (at least since the dawn of the millennium), and in Newspeak, I’m finally able to eradicate it entirely.  Why is static state so bad, you ask?&lt;br /&gt;&lt;br /&gt;Static variables are bad for security. 
See the &lt;a href="http://www.erights.org/"&gt;E literature&lt;/a&gt; for extensive discussion on this topic. The key idea is that static state represents an ambient capability to do things to your system, one that may be taken advantage of by evildoers.&lt;br /&gt;&lt;br /&gt;Static variables are bad for distribution. Static state needs to either be replicated and sync’ed across all nodes of a distributed system, or kept on a central node accessible by all others, or handled by some compromise between the two. This is all difficult/expensive/unreliable.&lt;br /&gt;&lt;br /&gt;Static variables are bad for re-entrancy. Code that accesses such state is not re-entrant. It is all too easy to produce such code. Case in point: javac. Originally conceived as a batch compiler, javac had to undergo extensive reconstructive surgery to make it suitable for use in IDEs. A major problem was that one could not create multiple instances of the compiler to be used by different parts of an IDE, because javac had significant static state. In contrast, the code in a Newspeak module definition is always re-entrant, which makes it easy to deploy multiple versions of a module definition side by side, for example.&lt;br /&gt;&lt;br /&gt;Static variables are bad for memory management. This state has to be handled specially by implementations, complicating garbage collection. The woeful tale of class unloading in Java revolves around this problem. Early JVMs lost applications’ static state when trying to unload classes. Even though the rules for class unloading were already implicit in the specification, I had to add a section to the JLS to state them explicitly, so overzealous implementors wouldn’t throw away static application state that was not entirely obvious.&lt;br /&gt;&lt;br /&gt;Static variables are bad for startup time. They encourage excess initialization up front. 
Not to mention the complexities that static initialization engenders: it can deadlock, applications can see uninitialized state, and unless you have a really smart runtime, you find it hard to compile efficiently (because you need to test if things are initialized on every use).&lt;br /&gt;&lt;br /&gt;Static variables are bad for concurrency. Of course, any shared state is bad for concurrency, but static state is one more subtle time bomb that can catch you by surprise.&lt;br /&gt;&lt;br /&gt;Given all these downsides, surely there must be some significant upside, something to trade off against the host of evils mentioned above? Well, not really.  It’s just a mistake, hallowed by long tradition. Which is why Newspeak has dispensed with it.&lt;br /&gt;&lt;br /&gt;It may seem like you need static state, somewhere to start things off, but you don’t. You start off by creating an object, and you keep your state in that object and in objects it references. In Newspeak, those objects are modules.&lt;br /&gt;&lt;br /&gt;Newspeak isn’t the only language to eliminate static state. E has also done so, out of concern for security.  And so has Scala, though its close cohabitation with Java means Scala’s purity is easily violated. The bottom line, though, should be clear. 
Static state will disappear from modern programming languages, and should be eliminated from modern programming practice.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-7020647932848543962?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/7020647932848543962/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=7020647932848543962' title='47 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/7020647932848543962'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/7020647932848543962'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2008/02/cutting-out-static.html' title='Cutting out Static'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>47</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-4363727735499118288</id><published>2007-12-31T17:09:00.000-08:00</published><updated>2010-01-17T17:36:35.104-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><category scheme='http://www.blogger.com/atom/ns#' term='Modularity'/><title type='text'>More on Modules</title><content type='html'>My posts seem to raise more questions than they answer. This is as it should be, in accordance with the Computational Theologist Full Employment Act. 
In this post, I’ll try and answer some of the questions that arose from my last one.&lt;br /&gt;&lt;br /&gt;How does one actually hook modules together and get something going? As I mentioned before, module definitions are top level classes - classes that are defined in a namespace, rather than in another class.&lt;br /&gt;&lt;br /&gt;Defining a top level class makes its name available in the surrounding namespace. More precisely, it causes the compiler to define a getter method with the name of the class on the namespace; the method will return the class object.&lt;br /&gt; &lt;br /&gt;Since a module definition is just a class, one needs to instantiate it, by calling a constructor - which is a class side method. Continuing with the example from the previous post:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;MovieLister finderClass: ColonDelimitedMovieFinder &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Now this isn’t quite realistic, since &lt;span style="font-style:italic;"&gt;ColonDelimitedMovieFinder&lt;/span&gt; probably needs access to things like collections and files to do its job. So it’s probable that it takes at least one parameter itself. The typical situation is that a module definition takes a parameter representing the necessary parts of the platform libraries. 
It might look something like this:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;ColonDelimitedMovieFinder usingLib: platform = (&lt;br /&gt;| &lt;br /&gt;OrderedCollection = platform Collections OrderedCollection.&lt;br /&gt;FileStream = platform Streams FileStream.&lt;br /&gt;|&lt;br /&gt;)...&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;So we’d really create the application this way:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;MovieLister finderClass: (ColonDelimitedMovieFinder usingLib: Platform new)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;where &lt;span style="font-style:italic;"&gt;Platform&lt;/span&gt; is a predefined module that provides access to the built-in libraries. &lt;br /&gt;&lt;br /&gt;Bob Lee points out that if I change &lt;span style="font-style:italic;"&gt;MovieLister&lt;/span&gt; so that it takes another parameter, I have to update all the creation sites for &lt;span style="font-style:italic;"&gt;MovieLister&lt;/span&gt;, whereas using a good DIF I only declare what needs to be injected and where. &lt;br /&gt;&lt;br /&gt;In many cases, I could address this issue by declaring a secondary constructor that feeds the second argument to the primary one.  
&lt;br /&gt;&lt;br /&gt;Say we changed &lt;span style="font-style:italic;"&gt;MovieLister&lt;/span&gt; because it too needed access to some platform library:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;class MovieLister usingLib: platform finderClass: MovieFinder = ...&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;We might be able to define a secondary constructor:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;class MovieLister usingLib: platform finderClass: MovieFinder = (&lt;br /&gt;...&lt;br /&gt;): ( &lt;br /&gt;  finderClass: MovieFinder = (&lt;br /&gt;     ^usingLib: Platform new finderClass: MovieFinder&lt;br /&gt; )&lt;br /&gt;)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;There are, however, two situations where this won’t work in Newspeak. &lt;br /&gt;&lt;br /&gt;One is inheritance, because subclasses must call the primary constructor. I showed how to deal with that in one of my August 2007 posts: don’t change the primary constructor, change the superclass.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;class MovieLister finderClass: MovieFinder =  NewMovieLister usingLib: Platform new finderClass: MovieFinder&lt;br /&gt;(...)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The other problematic case is for module definitions. In most cases, the solutions above won’t help; they won’t be able to provide a good default for the additional parameter, because they won’t have access to the surrounding scope. For this last situation I have no good answer yet. I will say that the public API of a top level module definition should be pretty stable, and the number of calls relatively few.&lt;br /&gt;&lt;br /&gt;So overall, I think Bob makes an important point - DIFs give you a declarative way of specifying how objects are to be created. On the other hand, it gets a bit complicated when different arguments are needed in different places, or if we don’t want to compute so many things up front at object creation time.  
Guice has mechanisms to help with that, but I find them a bit rich for my blood.  In those cases, I really prefer to specify things naturally in my code.  &lt;br /&gt;&lt;br /&gt;Another advantage of abstracting freely over classes is that you can inherit from classes that are provided as parameters. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;class MyCollections usingLib: platform = (&lt;br /&gt;| ArrayList = platform ArrayList. |&lt;br /&gt;)( &lt;br /&gt;ExtendedArrayList = ArrayList (...)&lt;br /&gt;)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Now, depending on what library you actually provide to &lt;span style="font-style:italic;"&gt;MyCollections&lt;/span&gt; as an argument, you can obtain distinct subclasses (in fact, there’s an easier way to do this, but this post is once again getting too long). Correct me if I’m wrong, but I don’t think a DIF helps here.&lt;br /&gt;&lt;br /&gt;You can also do class hierarchy inheritance: modify an entire library embedded within a module by subclassing it and changing only what’s needed. This is somewhat less modular (inheritance always causes problems) but the tradeoff is well worth it in my opinion.&lt;br /&gt;&lt;br /&gt;I spoke about class hierarchy inheritance at JAOO, and will likely speak about it again in one or more of my upcoming talks on Newspeak, at Google in Kirkland on January 8th, at &lt;a href="http://fool08.kuis.kyoto-u.ac.jp/"&gt;FOOL&lt;/a&gt; on January 13th, or at &lt;a href="http://www.langnetsymposium.com/"&gt;Lang.Net 2008&lt;/a&gt; in Redmond in late January.&lt;br /&gt; &lt;br /&gt;I’m trying to make each of these talks somewhat different, but they will necessarily have some overlap. 
I hope that some of these talks will make it onto the net and make these ideas more accessible.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-4363727735499118288?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/4363727735499118288/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=4363727735499118288' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/4363727735499118288'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/4363727735499118288'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2007/12/more-on-modules.html' title='More on Modules'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-1781570237423980139</id><published>2007-12-16T17:25:00.000-08:00</published><updated>2010-01-17T17:45:42.985-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><category scheme='http://www.blogger.com/atom/ns#' term='Modularity'/><title type='text'>Lethal Injection</title><content type='html'>Some months ago, I wrote a &lt;a href="http://gbracha.blogspot.com/2007_08_01_archive.html"&gt;couple of posts&lt;/a&gt; about object construction and initialization. 
I made the claim that so-called dependency-injection frameworks (DIFs) are completely unnecessary in a language like Newspeak, and promised to expand on that point in a later post.  Four months should definitely qualify as “later”, so here is the promised explanation.&lt;br /&gt;&lt;br /&gt;I won’t explain DIFs in detail here - read &lt;a href="http://martinfowler.com/articles/injection.html"&gt;Martin Fowler’s excellent overview&lt;/a&gt; if you need an introduction. The salient information about DIFs is that they are used to write code that does not have undue references to concrete classes embedded in it. These references are usually calls to constructors or static methods. These concrete references create undue intermodule dependencies.&lt;br /&gt;&lt;br /&gt;The root of the problem DIFs address is that mainstream languages provide inadequate mechanisms to abstract over classes.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Terminology rant: DIFs should more properly be called &lt;span style="font-weight: bold;"&gt;dependee&lt;/span&gt;-injection frameworks. A dependency is a relationship between a dependent (better called &lt;span style="font-weight: bold;"&gt;depender&lt;/span&gt;) and a &lt;span style="font-weight: bold;"&gt;dependee&lt;/span&gt;. The dependencies are what we do not want in our code; we certainly don’t want to inject more of them. Instead, DIFs inject instances of the dependees, so the dependers don’t have to create them.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;DIFs require you to write your code in a specific way, where you avoid creating instances of dependees. Instead, you make sure that there is a way to provide the dependee instance (in DIF terminology, to inject it) from outside the object.  You then tell the framework where and what to inject. 
The reason injection traffics in instances rather than the classes themselves is that there’s no good way to abstract over the classes.&lt;br /&gt;&lt;br /&gt;Having recapped DIFs, let’s move on to Newspeak. Newspeak modules are defined in namespaces. Namespaces are simply objects that are required to be deeply immutable; they are stateless.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Tangent: This ensures that there is no global or static state in Newspeak, which gives us many nice properties.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Namespaces are organized like Java packages, as an inversion of the internet domain namespace. Unlike Java packages, sub-namespaces can see their enclosing namespace.&lt;br /&gt;&lt;br /&gt;A module is a top-level class, that is, a class defined directly within a namespace. Newspeak classes can nest arbitrarily, so a module can contain an entire class library or framework, which can in turn be subdivided into subsystems to any depth. Nested classes can freely access the lexical scope of their enclosing class.&lt;br /&gt;&lt;br /&gt;Modules, like all classes, are reified as objects that support constructor methods. Recall that in Newspeak, a constructor invocation is indistinguishable from an ordinary method invocation. Objects are constructed by sending messages to (invoking virtual methods on) another object. That object may or may not be a class; it makes no difference.  
Hence all the usual abstraction mechanisms in the language apply to classes - in particular, parameterization.&lt;br /&gt;&lt;br /&gt;Here is a trivial top level class, modeled after &lt;a href="http://martinfowler.com/articles/injection.html#ANaiveExample"&gt;the motivating example for DIFs given in Fowler’s article&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;&lt;/span&gt;&lt;span style="font-size:100%;"&gt;&lt;blockquote&gt;public class MovieLister = (&lt;br /&gt;|&lt;br /&gt; private movieDB = ColonDelimitedMovieFinder from:’movies.txt’.&lt;br /&gt;|)&lt;br /&gt;(&lt;br /&gt;   public moviesDirectedBy: directorName = (&lt;br /&gt;       ^movieDB findAll select:[:m |&lt;br /&gt;                           m director = directorName&lt;br /&gt;                        ].&lt;br /&gt;   )&lt;br /&gt;)&lt;/blockquote&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;The idea is that MovieLister supports one method, moviesDirectedBy:, which takes a string that contains the name of a director and returns a collection of movies directed by said director. The point of Fowler’s example is that there is an undesirable dependency on a class, ColonDelimitedMovieFinder, embedded in MovieLister. If we want to use a different database, we need to change the code.&lt;br /&gt;&lt;br /&gt;However, this code won’t actually work in Newspeak. The reason is that the enclosing namespace is not visible inside a Newspeak module. Any external dependencies must be made explicit by passing them to the module as parameters to a constructor method. These parameters could be other modules, namespaces, or classes and instances of any kind.&lt;br /&gt;&lt;br /&gt;In this specific case, ColonDelimitedMovieFinder cannot be referenced from MovieLister. If we try to create a MovieLister by writing: MovieLister new, creation will fail with a “message not understood” error on ColonDelimitedMovieFinder. 
We’d have to declare a constructor for MovieLister with the movie finder as a parameter:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;public class MovieLister finderClass: MovieFinder = (&lt;br /&gt;|&lt;br /&gt; private movieDB = MovieFinder from:’movies.txt’.&lt;br /&gt;|)&lt;br /&gt;(&lt;br /&gt;   public moviesDirectedBy: directorName = (&lt;br /&gt;       ^movieDB findAll select:[:m |&lt;br /&gt;                           m director = directorName&lt;br /&gt;                        ].&lt;br /&gt;   )&lt;br /&gt;)&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;At this point, we can immediately see that we can replace ColonDelimitedMovieFinder with any class that supports the same interface, which was the object of the entire exercise. Newspeak won’t let you create a module with concrete external dependencies, because that wouldn’t really be a module, would it?&lt;br /&gt;&lt;br /&gt;In Newspeak, code in a module doesn’t have any concrete external dependencies, and no dependees need to be injected. What’s more, we can subclass or mix in any class coming in as a parameter - something a DIF won’t handle.&lt;br /&gt;&lt;br /&gt;What about a subsystem within a module? What if I don’t want it using the same name binding as the enclosing module?  I can explicitly parameterize my subsystem, though that requires pre-planning. &lt;br /&gt;&lt;br /&gt;I can also override any class binding in a subclass. Newspeak is message-based, so all names are late-bound. Hence any reference to the name of a class can be overridden in a subclass. Classes can be overridden by methods or slots or other classes in any combination. So even if you do not explicitly parameterize your code to allow for another class to be used to construct an object, you can still override the binding of the class name as necessary.&lt;br /&gt;&lt;br /&gt;In summary, Newspeak is designed to support, even induce, loose coupling. That’s the point of message-based programming languages. 
DIFs are an expedient technique to reduce code coupling in the sad world of mainstream development, but in a language like Newspeak, they are pointless.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-1781570237423980139?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/1781570237423980139/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=1781570237423980139' title='11 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/1781570237423980139'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/1781570237423980139'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2007/12/some-months-ago-i-wrote-couple-of-posts.html' title='Lethal Injection'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>11</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-5103070001746073480</id><published>2007-09-25T11:50:00.001-07:00</published><updated>2010-01-17T17:35:06.149-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><title type='text'>Executable Grammars</title><content type='html'>Way back in January, &lt;a href="http://gbracha.blogspot.com/2007/01/parser-combinators.html"&gt;I described a parser combinator library&lt;/a&gt; I’d built in Smalltalk. 
Since then, we’ve moved from Smalltalk to Newspeak,  and refined the library in interesting ways.&lt;br /&gt;&lt;br /&gt;The original  parser combinator library has  been a great success, but grammars built with it were still polluted by two solution-space artifacts. One is the need to use self sends, as in&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;id := self letter, [(self letter | [self digit]) star]&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;In Newspeak, self sends can be implicit, and so this problem goes away. We could write&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;id:: letter, [(letter | [digit]) star]&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The other problem is the use of blocks. We’d much rather have written&lt;br /&gt;&lt;br /&gt;id:&lt;span style="font-style: italic;"&gt;: letter, (letter | digit) star&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The solution is to have the framework pre-initialize all slots representing productions with a forward reference parser that acts as a stand-in for the real parser that will be computed later.&lt;br /&gt;&lt;br /&gt;When the tree of parsers representing the grammar has been computed, it will contain stand-ins exactly in those places where forward references occurred due to mutual recursion. When such a parser gets invoked, it looks up the real parser (now stored in the corresponding slot) and forwards all parsing requests to it.&lt;br /&gt;&lt;br /&gt;Our parsers are now entirely unpolluted by solution-space boilerplate, so I feel justified in calling them executable grammars. 
They really are executable specifications, that can be shared among all the tools that need access to a language’s grammar.&lt;br /&gt;&lt;br /&gt;Below is a small but complete grammar in Newspeak:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;class ExampleGrammar1 = ExecutableGrammar (&lt;br /&gt;|&lt;br /&gt; digit = self charBetween: $0 and: $9.&lt;br /&gt; letter = (self charBetween: $a and: $z) | (self charBetween: $A and: $Z).&lt;br /&gt; id = letter, (letter | digit) star.&lt;br /&gt; identifier = tokenFor: id.&lt;br /&gt; hat = tokenFromChar: $^.&lt;br /&gt; expression = identifier.&lt;br /&gt; returnStatement = hat, expression.&lt;br /&gt;|&lt;br /&gt;) ()&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;If you want to understand all the details, check out &lt;a href="http://bracha.org/executableGrammars.pdf"&gt;this paper&lt;/a&gt;; if you can, you might also look at my &lt;a href="http://jaoo.dk/"&gt;JAOO &lt;/a&gt;2007 &lt;a href="http://bracha.org/newspeak-parsers.pdf"&gt;talk&lt;/a&gt;, which also speculates on how we can make things look even nicer, e.g.:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;class ExampleGrammar3 = ExecutableGrammar (&lt;br /&gt;|&lt;br /&gt;digit = $0 - $9.&lt;br /&gt;letter = ($a - $z) | ($A - $Z).&lt;br /&gt;id = letter, (letter | digit) star.&lt;br /&gt;identifier = tokenFor: id.&lt;br /&gt;expression = identifier.&lt;br /&gt;returnStatement = $^, expression.&lt;br /&gt;|&lt;br /&gt;)() &lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-5103070001746073480?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/5103070001746073480/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=5103070001746073480' title='4 
Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5103070001746073480'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5103070001746073480'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2007/09/executable-grammars.html' title='Executable Grammars'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-2103408822566191738</id><published>2007-08-15T13:30:00.000-07:00</published><updated>2010-01-17T17:39:25.958-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='constructors'/><category scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><title type='text'>Object Initialization and Construction Revisited</title><content type='html'>In my last post, which discussed object initialization and construction,  I had promised to come back to the topic and clarify it with concrete examples. I've finally found time to do that; hopefully I will dispel some of the misunderstandings that the last post engendered, no doubt replacing them with fresh, deeper misunderstandings.&lt;br /&gt;&lt;br /&gt;Below is a standard example - a class representing points in the plane. What’s non-standard is that it is written in &lt;span style="font-weight: bold;"&gt;Newspeak&lt;/span&gt;, an experimental language in the style of Smalltalk, which I and some of my colleagues are working on right now.  
In cases where the syntax is non-obvious, I’ll use comments (Pascal style, like so: &lt;span style="font-style: italic;"&gt;(* this is a comment *)&lt;/span&gt;) to show how a similar construct might be written in a more conventional (and less effective) notation.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;pre&gt;class Point2D x: i y: j = ( &lt;br /&gt;(* Javanese version might look like this : &lt;br /&gt;   class Point2D setXY(i, j) { ...} *)&lt;br /&gt;(* A class representing points in 2-space *)&lt;br /&gt;|&lt;br /&gt;  public x ::= i.&lt;br /&gt;  public y ::= j.&lt;br /&gt;|&lt;br /&gt;) (    (* instance side *)&lt;br /&gt;&lt;br /&gt;  public printString = (&lt;br /&gt;    ^ ’ x = ’, x printString, ’ y = ’, y printString&lt;br /&gt;  ) &lt;br /&gt;)&lt;/pre&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This declaration introduces the class &lt;span style="font-style: italic;"&gt;Point2D&lt;/span&gt;.  The class name is immediately followed by a message pattern (method signature for readers of Javanese) &lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;x: i y: j&lt;/span&gt;&lt;/span&gt;.  This pattern describes the primary constructor message for the class. The pattern introduces two formal parameters, &lt;span style="font-style: italic;"&gt;i&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;j&lt;/span&gt;, which are in scope in the class body. 
The result of sending this message to the class is a fresh instance, e.g.:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;pre&gt;Point2D x: 42 y: 91 &lt;br /&gt;(* In Javanese, you might write Point2D.setXY(42, 91);  &lt;br /&gt;   But don’t even think of interpreting setXY as a static method!&lt;br /&gt;*)&lt;/pre&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;yields a new instance of &lt;span style="font-style: italic;"&gt;Point2D&lt;/span&gt; with &lt;span style="font-style: italic;"&gt;x = 42&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;y = 91&lt;/span&gt;. The message causes a new instance to be allocated and executes the slot initializers for that instance, in this case&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;pre&gt;x ::= i.&lt;br /&gt;y ::= j.&lt;/pre&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The slots are accessed only through automatically generated getters (&lt;span style="font-style: italic;"&gt;x&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;y&lt;/span&gt;) and setters (&lt;span style="font-style: italic;"&gt;x:&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;y:&lt;/span&gt;).&lt;br /&gt;&lt;br /&gt;How is all this different from mainstream constructors?&lt;br /&gt;Because an instance is created by sending a message to an object, and not by some special construct like a constructor invocation, we can replace the receiver of that message with any object that responds to that message. It can be another class (say, an implementation based on polar coordinates), or it can be a factory object that isn’t a class at all. 
&lt;br /&gt;&lt;br /&gt;Here is a method that takes the class/factory as a parameter:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;pre&gt;makePoint: pointFactory = (&lt;br /&gt;(* In Javanese: &lt;br /&gt;   makePoint(pointFactory) {&lt;br /&gt;     return pointFactory.setXY(100, 200)&lt;br /&gt;   } &lt;br /&gt;*)&lt;br /&gt;  ^pointFactory x: 100 y: 200&lt;br /&gt;)&lt;/pre&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;We can invoke this so:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;pre&gt;makePoint: Point2D&lt;/pre&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;but also so:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;pre&gt;makePoint: Polar2D&lt;br /&gt;&lt;/pre&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;where &lt;span style="font-style: italic;"&gt;Polar2D&lt;/span&gt; might be written as follows:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;pre&gt;class Polar2D rho: r theta: t = (&lt;br /&gt;(* A class representing points in 2-space *)&lt;br /&gt;|&lt;br /&gt;  public rho ::= r.&lt;br /&gt;  public theta ::= t.&lt;br /&gt;|&lt;br /&gt;) (   (* instance side *)&lt;br /&gt;  public x = ( ^rho * theta cos) (* emulate x/y interface *)&lt;br /&gt;  public y = (^rho * theta sin)&lt;br /&gt;  ...&lt;br /&gt;  public printString  = (&lt;br /&gt;    ^ ’ x = ’, x printString, ’ y = ’, y printString&lt;br /&gt;  )&lt;br /&gt;) : (  (* class side begins here *)&lt;br /&gt;  public x: i y: j = (&lt;br /&gt;    | r t |&lt;br /&gt;    t := i arcCos.&lt;br /&gt;    r := j/ t sin.&lt;br /&gt;    ^rho: r theta: t&lt;br /&gt;  )&lt;br /&gt;)&lt;br /&gt;&lt;/pre&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Here, &lt;span style="font-style: italic;"&gt;Polar2D&lt;/span&gt; has a secondary constructor, a class method &lt;span style="font-style: italic;"&gt;x:y:&lt;/span&gt;, which will be invoked by &lt;span style="font-style: italic;"&gt;makePoint:&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;You cannot do this with 
constructors or with static factories; you simply cannot abstract over them.&lt;br /&gt;&lt;br /&gt;You could use reflection in Java, passing the &lt;span style="font-style: italic;"&gt;Class&lt;/span&gt; object as a parameter and then searching for a constructor or static method matching the desired signature. Even then, you would have to commit to using a class. Here we can use any object that responds to the message &lt;span style="font-style: italic;"&gt;x:y:&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;Using Java core reflection in this case is awkward and verbose, and historically hasn’t been available on configurations like JavaME. And it doesn’t work well with proxy objects either (see the &lt;a href="http://www.bracha.org/mirrors.pdf"&gt;OOPSLA 2004 paper&lt;/a&gt; we wrote for details).  What’s more, you may not have the right security permissions to do it. The situation is not much better with the VM from the makers of Zune (tm) either.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;Zune is a trademark of Microsoft Corporation.  Microsoft is also a trademark of Microsoft Corporation. But GNU’s not Unix&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Alternatively, you could also define the necessary factory interface, implement it with factory classes, create factory objects and only pass those around.  You’d have to do this for every class of course, whether you declared it or not. This is tedious, error prone, and very hard to enforce. The language should be doing this for you.&lt;br /&gt;&lt;br /&gt;So far, we’ve shown how to manufacture instances of a class. What about subclassing? This is usually where things get sticky.&lt;br /&gt;&lt;br /&gt;Here’s a class of 3D points&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;pre&gt;class Point3D x: i y: j z: k = Point2D x: i y: j (&lt;br /&gt;(* A class representing points in 3-space *)&lt;br /&gt;| public z ::= k. 
|&lt;br /&gt;)  (* end class header *)&lt;br /&gt;(  (* begin instance side *)&lt;br /&gt;   public printString = (&lt;br /&gt;    ^super printString, ' z = ', z printString&lt;br /&gt;  )&lt;br /&gt;)&lt;/pre&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;One detail that’s new here is the superclass clause: &lt;span style="font-style: italic;"&gt;Point3D&lt;/span&gt; inherits from &lt;span style="font-style: italic;"&gt;Point2D&lt;/span&gt;, and calls &lt;span style="font-style: italic;"&gt;Point2D’s&lt;/span&gt; primary constructor.  This is a requirement, enforced dynamically at instance creation time. It helps ensure that an object is always completely initialized.&lt;br /&gt;&lt;br /&gt;Unlike Smalltalk, one cannot call a superclass’ constructors on a subclass. This prevents you from partially instantiating an object, say by writing:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;pre&gt;Point3D x: 1 y: 2  (* illegal! *)&lt;/pre&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;without initializing &lt;span style="font-style: italic;"&gt;z&lt;/span&gt; as intended. Also, unlike Smalltalk, there’s no instance method that does the initialization on behalf of the class object. So you cannot initialize an object multiple times, unless the designer deliberately creates an API to allow it. The idea is to ensure every object is initialized once and only once, but without the downsides associated with constructors.&lt;br /&gt;&lt;br /&gt;Preventing malicious subclasses from undermining the superclass initialization takes care. We’re still considering potential solutions. The situation is no worse than in Java, it seems, and we may be able to make it better.&lt;br /&gt;&lt;br /&gt;A different concern is that the subclass must call the primary constructor of the superclass. So what happens when I want to change the primary constructor? Say I want to change &lt;span style="font-style: italic;"&gt;Point2D&lt;/span&gt; to use polar representation. 
Can I make &lt;span style="font-style: italic;"&gt;rho:theta:&lt;/span&gt; the primary constructor? How can I do this without breaking subclasses of &lt;span style="font-style: italic;"&gt;Point2D&lt;/span&gt;, such as &lt;span style="font-style: italic;"&gt;Point3D&lt;/span&gt;? We can't do it directly yet (though we should have a fix for that before too long), but I can redefine &lt;span style="font-style: italic;"&gt;Point2D&lt;/span&gt; as&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;pre&gt;class Point2D x: i y: j =  Polar2D rho:  ... theta: ... ()()&lt;br /&gt;: ( (* class side begins here *)&lt;br /&gt;(* secondary constructor *)&lt;br /&gt;  public rho: r theta: t = (&lt;br /&gt;    ^x: r * t cos y: r * t sin&lt;br /&gt;  )&lt;br /&gt;)&lt;/pre&gt; &lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Now anyone who uses a &lt;span style="font-style: italic;"&gt;Point2D&lt;/span&gt; gets a point in polar representation, while preserving the existing interface. And anyone who wants to can of course create polar points using the secondary constructor. I can also arrange for that constructor to return instances of &lt;span style="font-style: italic;"&gt;Polar2D&lt;/span&gt; directly:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&lt;pre&gt;public rho: r theta: t = (&lt;br /&gt;  ^Polar2D rho: r theta: t&lt;br /&gt;)&lt;/pre&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;If you find this interesting, you might want to read a short &lt;a href="http://dyla2007.unibe.ch/?download=dyla07-Gilad.pdf"&gt;position paper&lt;/a&gt; I wrote for the &lt;a href="http://dyla2007.unibe.ch/"&gt;Dynamic Languages Workshop&lt;/a&gt; at ECOOP. 
It only deals with one specific issue regarding the interaction of nested classes and inheritance, and it’s just a position paper describing work in progress, but if you’ve gotten this far, you might take a look.&lt;br /&gt;&lt;br /&gt;I still haven’t explained why I see no need for dependency inversion frameworks. The short answer is that because &lt;span style="font-weight: bold;"&gt;Newspeak&lt;/span&gt; classes nest arbitrarily, we can define a whole class library nested inside a class, and parameterize that class with respect to any external classes the library depends on. That probably needs more explanation; indeed, I think there’s a significant academic paper to be written on the subject. Given the length of this post,  I won’t expand on the topic just yet.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-2103408822566191738?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/2103408822566191738/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=2103408822566191738' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/2103408822566191738'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/2103408822566191738'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2007/08/object-initialization-and-construction.html' title='Object Initialization and Construction Revisited'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-7589575897199820043</id><published>2007-06-24T20:47:00.000-07:00</published><updated>2010-01-17T17:38:16.452-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='constructors'/><title type='text'>Constructors Considered Harmful</title><content type='html'>In mainstream object oriented programming languages, objects are created by invoking constructors. This is rather ironic, since you can say a lot about constructors, but you cannot honestly say that they are object oriented. Ok, so what? Isn’t “object-oriented” just an old buzzword? If constructors work well, who cares?&lt;br /&gt;&lt;br /&gt;Of course, the problem is that constructors don’t work very well at all. Understanding why helps to understand what “object oriented” really means, and why it is important.&lt;br /&gt;&lt;br /&gt;Constructors come with a host of special rules and regulations: they cannot be overridden like instance methods; they need to call another constructor, or a superclass constructor etc. Try defining mixins in the presence of constructors - it’s tricky because the interface for creating instances gets bundled with the interface of the instances themselves. &lt;br /&gt;&lt;br /&gt;The basic issue is a failure of abstraction, as Dave Ungar put it in his OOPSLA keynote in 2003.&lt;br /&gt;&lt;br /&gt;Suppose we have a constructor C(x) and code that creates objects by calling it. What happens if we find that we actually need to return an instance of a class other than C? For example,  we might want to lazily load data from secondary storage, and need to return some sort of placeholder object that behaves just like a C, but isn’t? 
What if we want to avoid reallocating a fresh object, and use a cached one instead?&lt;br /&gt;&lt;br /&gt;So it’s clear that you don’t want to publicize your constructors to the clients of your API, as all they can do is tie you down.&lt;br /&gt; &lt;br /&gt;The standard recommended solution is to use a static factory. This, however, does nothing to help the other victims of constructors - their callers. As a consumer of an API, you don’t want to use constructors: they induce a tight coupling between your code and specific implementations. You can’t abstract over static methods, just as you can’t abstract over constructors. In both cases, there is no object that is the target of the operation and a conventional interface declaration cannot describe it.  The absence of an object means that constructors don’t have the benefits objects bring - dynamic binding of method calls chief among these. Which is why constructors and static methods don’t work well, and, incidentally, aren’t object oriented.&lt;br /&gt;&lt;br /&gt;Having dismissed constructors and static factories, it seems we need to define a factory class whose instances will support an interface that includes a method that constructs the desired objects. How will you create the factory object? By calling a constructor? Or by defining a meta-factory? After how many meta-meta-meta- .. meta-factories do you give up and call a constructor?&lt;br /&gt;&lt;br /&gt;What about using a dependency injection framework (DIF)? Ignoring the imbecilic name, I think that if you’re stuck with a mainstream language, that may be a reasonable workaround. It requires a significant degree of preplanning, and makes your application dependent on one more piece of machinery that has nothing to do with the actual problem the application is trying to solve. On the positive side, it helps guarantee employment for software engineers. 
That said, it’s important to understand that DIFs are just a workaround for a deficiency in the underlying language. &lt;br /&gt;&lt;br /&gt;So why not get rid of constructors and have a class declaration create a factory object instead? Well, Smalltalk did just that a generation ago. Every time you define a class, you define the factory object for its instances. I won’t explain the Smalltalk metaclass hierarchy here. Suffice to say that it is a thing of beauty, resolving a potential infinite regress with an elegant circularity.&lt;br /&gt;&lt;br /&gt;Despite this, Smalltalk still does not provide an ideal solution for creating and initializing instances. While it preserves abstraction, it does not easily enable the class to ensure that every instance will always be properly initialized, or that initialization will take place only once. To be sure, these are difficult goals, and Java, despite enormous efforts and complexity focused on these goals, does not fully achieve them either. However, it comes close - at the cost of abstraction failure brought about by the use of constructors.&lt;br /&gt;&lt;br /&gt;So can we do better? Of course.  The solution combines elements of Scala constructors (which are cleaner than Java constructors) with elements of the Smalltalk approach.&lt;br /&gt;&lt;br /&gt;Scala defines a class as a parametric entity, with formal parameters that are in scope in the class body.  The class name, together with its formal parameters, defines the primary constructor of the class. This allows the instance to initialize itself, accessing the parameters to the constructor without exposing an initialization method that can be called multiple times on an instance. The latter is one of the problems in Smalltalk.&lt;br /&gt;&lt;br /&gt;We use a similar device to provide access to the constructor parameters from within the instance. However, we require that the class provide a method header for its primary constructor. 
Instead of creating instances by a special construct (constructor invocation) as in Java or Scala, we create them via method invocation. The class declaration introduces a factory object that supports the primary constructor as an instance method.&lt;br /&gt;&lt;br /&gt;Because we create objects only by invoking methods on other objects, we preserve abstraction. We can create objects by invoking the constructor method on a parameter. We can always define alternative factory objects that support the same constructor method with different behavior, and pass them instead of the class. Furthermore, in a message based programming language, references to the class’ name are always virtual, and can be overridden.&lt;br /&gt;&lt;br /&gt;Unlike Smalltalk, the factory class is not a subclass of the superclass factory class. This prevents the possibility of calling superclass constructors and thereby creating partially initialized objects (preventing this requires special effort in Smalltalk - one has to manually override the superclass constructors so that they fail; this is tedious and error prone, and not done much in practice).&lt;br /&gt;&lt;br /&gt;I know I should be writing examples to make this all clear.  However, this post is getting long, so that will wait for another post. I’ll be speaking about some of this work next month at the dynamic language workshop at ECOOP. 
By then, I imagine I’ll put out some examples of what we’ve been doing these past few months.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-7589575897199820043?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/7589575897199820043/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=7589575897199820043' title='31 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/7589575897199820043'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/7589575897199820043'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2007/06/constructors-considered-harmful.html' title='Constructors Considered Harmful'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>31</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-1285730350106453081</id><published>2007-05-12T18:29:00.001-07:00</published><updated>2010-01-17T17:35:06.150-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Newspeak'/><title type='text'>Message Based Programming</title><content type='html'>Alan Kay was recently quoted as saying that Smalltalk should have been message oriented rather than object oriented. I’m not sure what he meant by that, but it got me thinking.&lt;br /&gt;&lt;br /&gt;Smalltalk terminology refers to method invocations as message sends. 
Message passing is often associated with asynchrony, but it doesn’t have to be. Smalltalk message sends are synchronous. As such, they seem indistinguishable from virtual method invocations. However, the terminology matters. Insisting that objects communicate exclusively via message sends rules out aberrations such as static methods, non-virtual methods, constructors and public fields. More than that:&lt;br /&gt;&lt;br /&gt;It means that one cannot access another object’s internals - we have to send the object a message. So when we say that an object encapsulates its data, encapsulation can’t be interpreted as just bundling - it means data abstraction. &lt;br /&gt;&lt;br /&gt;It implies that the only thing that matters about an object is how it responds to messages. Two objects that respond the same way to all messages are indistinguishable, regardless of their implementation details. This excludes implementation types for objects, and hence class based encapsulation, as in:&lt;br /&gt;&lt;br /&gt;class Foo {&lt;br /&gt;  private Integer bar = 101; // my secret code&lt;br /&gt;  public Integer baz(Foo f) {return f.bar;}&lt;br /&gt;}&lt;br /&gt; &lt;br /&gt;Overall, the message passing terminology precludes the interpretation of objects we see in mainstream languages - the spawn of C. This has great value. However, while saying that all objects communicate via message passing gives a strong notion of object, it doesn’t ban things that aren’t objects, such as values of primitive type like ints and floats. It doesn’t guarantee that a language is purely object oriented, like Smalltalk.&lt;br /&gt;&lt;br /&gt;We can nevertheless ask: is Smalltalk a message based programming language? I think not. I would take message-based programming to have an even stronger requirement: all computation is done via message passing. That includes the computation done within a single object as well. 
Whereas Smalltalk objects can access variables and assign them, message based programming would require that an object use messages internally as well. This is exactly what happens in Self, as I discussed in an earlier post about representation independent code.&lt;br /&gt;&lt;br /&gt;The implications of this are quite strong. In particular, this formulation does carry with it the requirement that everything is an object (at least at run time), since the only entities one computes with are those that respond to messages. &lt;br /&gt;&lt;br /&gt;I like the term Message-based Programming (MBP). It implies a lot of valuable design decisions I strongly believe in, while leaving many design alternatives open. The term is, I hope, free of the baggage that is associated with object oriented programming, which has too many flawed interpretations.&lt;br /&gt;&lt;br /&gt;I believe that future programming languages should be message based, in the sense I’ve defined above. This still leaves language developers with a huge design space to work with: message based programming languages can be imperative or declarative; statically or dynamically typed; class-based or prototype based; reflective or not; eager or lazy; and synchronous or asynchronous, to name a few important design options.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-1285730350106453081?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/1285730350106453081/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=1285730350106453081' title='16 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/1285730350106453081'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/2447174102813539049/posts/default/1285730350106453081'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2007/05/message-based-programming.html' title='Message Based Programming'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>16</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-2375357816491925619</id><published>2007-03-16T19:09:00.000-07:00</published><updated>2010-01-17T17:52:23.290-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Web Platform and Objects as Software Services'/><title type='text'>SOBs</title><content type='html'>A lurid headline is a proven way of grabbing attention. SOBs are &lt;span style="font-style: italic; font-weight: bold"&gt;Serviced Objects&lt;/span&gt;, objects that can be downloaded to your machine from a server, and thereafter serviced by it.  Any modifications you make to the SOB get saved on the server, and any updates to the SOB by the service provider get delivered to you automatically. The former provides the SOB (and you) with backup, audit trail, and sharing across multiple users or devices. The latter provides for bug fixes and feature updates to the SOB on an ongoing basis. &lt;br /&gt;&lt;br /&gt;All of this sounds suspiciously like a web app, except for the fact that the SOB gets downloaded - which means it is available offline, can use all the features of your machine and in general doesn’t suffer the limitations of web apps. 
Another important distinction is that the core underlying technology is hotswapping, so the applications don’t need to be restarted when updated.&lt;br /&gt;&lt;br /&gt;I’ve given a number of talks about this over the past few years; the latest was at Google earlier this month. Since &lt;a href="http://video.google.com/videoplay?docid=-5886267052339478036" &gt; that talk is available via Google Video&lt;/a&gt;, I thought I’d bring up the topic here.&lt;br /&gt;&lt;br /&gt;The term I usually use for this idea is &lt;span style="font-style: italic; font-weight: bold"&gt;Objects as Software Services.&lt;/span&gt;  It is more descriptive than SOB, but could be confused with generic stuff like SOAs and conventional web apps. Another descriptive name would be &lt;span style="font-style: italic; font-weight: bold"&gt;Live Network-Serviced Applications (LNSA).&lt;/span&gt; The “live” distinguishes them from most existing NSAs, which typically need to be taken down before being updated.&lt;br /&gt;&lt;br /&gt;This leads us to NSOOP, &lt;span style="font-style: italic; font-weight: bold"&gt;Network-Serviced Object Oriented Programming,&lt;/span&gt; and its support via NSOOPLs (&lt;span style="font-style: italic; font-weight: bold"&gt;NSOOP Languages&lt;/span&gt;).  Though I just made these terms up, that is a big part of what the talk is about: language and platform features that enable LNSAs. &lt;br /&gt;&lt;br /&gt;If this interests you, please see the talk. 
There are also &lt;a href="http://www.bracha.org/oopsla05-dls-talk.pdf"&gt;earlier slides&lt;/a&gt;, and a &lt;a href= "http://bracha.org/objectsAsSoftwareServices.pdf"&gt;brief write up&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-2375357816491925619?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/2375357816491925619/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=2375357816491925619' title='9 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/2375357816491925619'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/2375357816491925619'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2007/03/sobs.html' title='SOBs'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>9</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-3023648275949867996</id><published>2007-02-24T16:47:00.000-08:00</published><updated>2007-02-24T16:58:26.938-08:00</updated><title type='text'>Tuples</title><content type='html'>Tuples are a handy construct found in many programming languages. Oddly enough, they are lacking in mainstream languages like Java, C#, C++ etc.&lt;br /&gt;&lt;br /&gt;Java was actually designed to have tuples from the start, but they never quite got in. 
At one point, I tried to add a tuple-like construct as part of JSR-65. We wanted to extend array initializers to full-blown expressions.  Eventually, that effort got quashed; I didn’t really resist, since the extension was painful, as such things always are in Java. We should have just done straight tuples as a separate construct - but at the time, that was frowned upon too.&lt;br /&gt;&lt;br /&gt;Tuples are useful for a variety of reasons. If you need to package several objects together, but don’t want to define a class just for that purpose, you can use a tuple. This subsumes features like “multiple-value returns” which people have been seeking in Java for many years (without success). It also pretty much covers the need for variable-arity methods. Java went with a special sugar for these instead, a decision I initially opposed; I was eventually convinced to cave in on this, which I really regret (the responsible parties know who they are).  &lt;br /&gt;&lt;br /&gt;Anyway, that’s all water (well, coffee actually) under the bridge. Back to the main point.&lt;br /&gt;&lt;br /&gt;Once you have tuples, handy uses crop up frequently, and you wonder how you ever got along without them.&lt;br /&gt;Smalltalk doesn’t quite have tuples. Instead it has array literals, which are compile time constants and so rather limiting. Squeak does have tuples, though they are not very often used.  It would be best to forgo array literals entirely and replace them with tuples.&lt;br /&gt;&lt;br /&gt;There are subtleties. Literal tuples are best defined as read-only. &lt;br /&gt;One reason for this is that read-only tuples are more polymorphic. Long tuples are subtypes of short ones:&lt;br /&gt;&lt;br /&gt;{S. T. U. V } &lt;= {S. T. U} &lt;= {S. T} &lt;= {S}&lt;br /&gt;&lt;br /&gt;Read-only tuples are covariant:&lt;br /&gt;&lt;br /&gt;T1 &lt;= T2, S1 &lt;= S2 ==&gt; {S1. T1} &lt;= {S2. 
T2}&lt;br /&gt;&lt;br /&gt;And a read-only literal tuple can be viewed as a list:&lt;br /&gt;&lt;br /&gt;S &lt;= U, T &lt;= U, s: S , t : T ==&gt; {s. t} : List[U]&lt;br /&gt;&lt;br /&gt;where List[E] is a generic (note I'm using square brackets for type parameters) readonly type for lists (such as SeqCltn[E] in Strongtalk).  It should be clear that it is very important to relate tuples to the general collection hierarchy.&lt;br /&gt;Note that it is unsound to assume that&lt;br /&gt;&lt;br /&gt;S &lt;= U, T &lt;= U  ==&gt; {S. T}  &lt;= List[U]&lt;br /&gt;&lt;br /&gt;since&lt;br /&gt;&lt;br /&gt;{S. T. V} &lt;= {S.T} for all V, but if V &lt;= U does not hold, {S. T. V} &lt;= List[U] does not hold either.&lt;br /&gt;&lt;br /&gt;Now if you want writable tuples, you can use an idiom like&lt;br /&gt;{e1. e2. e3} asWritableTuple, which should create a copy of the literal that is writable (don’t worry about the cost of the copy; let the system figure it out).&lt;br /&gt;&lt;br /&gt;So, to summarize: tuples are great. 
Every language should have them.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-3023648275949867996?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/3023648275949867996/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=3023648275949867996' title='18 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/3023648275949867996'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/3023648275949867996'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2007/02/tuples.html' title='Tuples'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>18</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-5565472747578580880</id><published>2007-01-13T17:24:00.000-08:00</published><updated>2007-01-13T17:32:59.018-08:00</updated><title type='text'>Representation Independent Code</title><content type='html'>In most object oriented languages, replacing a field with a method requires you to track down the uses of that field and change them from field accesses to method invocations. The canonical example is a class of points. 
You decided to change your representation from cartesian to polar coordinates, and now all the places you wrote ‘x’ have to be rewritten as ‘x()’.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;This example isn’t so bad, because the odds are you already had an x() method, and you probably had the sense to avoid making the x field public. But maybe you made it protected (perhaps your language is smart enough to disallow public fields, but simple-minded enough to force them to always be protected, like Smalltalk). If x is protected, you’ll need to find all the subclasses. Maybe you don’t have access to all of them, and you can never get rid of the field x.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Or maybe you made a stupid mistake, and made x public, perhaps in the mad rush toward a release. Won’t happen to you? Take a look at Java’s System.out and ask yourself how it got to be there. Now go find all the uses of x and change them. Even if you can, it’s pretty tiresome.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The fact is, given the ability to publicize a field, programmers will do so. Once that’s happened, tracking down the uses may be impossible, and in any case is a huge amount of work.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;It would be nice if you didn’t have to worry about this sort of thing. If everyone using your object went through a procedural interface, for example. Smalltalk makes all uses outside of an object do that - but uses within the object, in its class and subclasses, are exempt. As for mainstream languages like Java and C# - they allow you to declare public fields; it’s your funeral.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;About 20 years ago, Dave Ungar and Randy Smith introduced &lt;a href= "http://en.wikipedia.org/wiki/Self_programming_language"&gt;Self&lt;/a&gt;, which fixed this problem.  All communication is through method calls (or synchronous message sends, if you will) - even the object’s own code works exclusively by sending messages to itself and other objects. 
Fields (slots in Selfish) are defined declaratively, and automatically define access methods. The only way to get or set a field is by invoking a method. So if you get rid of the field and replace it with a method that computes the value instead, no source code anywhere can tell the difference. The code is representation independent. Self’s syntax makes it very easy and natural to send a message/call a method - there is no overhead compared to accessing a field in other languages.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;In C# they have a thing called properties, which is similar. Except that C# also has fields, and so it requires careful attention by the programmer to ensure representation independence.  In other words, it cannot be relied upon to happen. I don’t know why the designers of C# chose to support both fields and properties. I should ask my friends at Microsoft (yes, I have a few; I’m very non-judgmental). In complex languages, there are always all kinds of strange gotchas and constraints.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;There are of course other ways that languages can undermine representation independence.  In particular, the type system can support class types that make code dependent on which class you use, rather than on what interface is supported.  I don’t want to dive into that right now.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The point of this post is to draw attention to the importance of representation independence. If you are using C#  or something else with such a construct, I’d suggest you make the best of it and use properties or their equivalent religiously.  
And future languages should follow Self’s lead and ensure representation independence.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-5565472747578580880?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/5565472747578580880/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=5565472747578580880' title='15 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5565472747578580880'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/5565472747578580880'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2007/01/representation-independent-code.html' title='Representation Independent Code'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>15</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-828259184472764778</id><published>2007-01-06T11:31:00.000-08:00</published><updated>2007-01-06T14:10:41.312-08:00</updated><title type='text'>Parser Combinators</title><content type='html'>Many excellent ideas from functional programming apply to object oriented programming (functions are objects too, you know). 
Parser combinators in particular are a technique with a long history in the functional programming community.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;This is a blog post, so I’m not going to give a proper bibliography of the idea; suffice to say that it goes back decades, and that &lt;a href = "http://homepages.inf.ed.ac.uk/wadler/"&gt;Phil Wadler,&lt;/a&gt; and &lt;a href="http://research.microsoft.com/~emeijer/"&gt;Erik Meijer&lt;/a&gt;, among others, have done important work in this area. I myself was inspired to look into the topic by &lt;a href="http://lampwww.epfl.ch/~odersky/"&gt;Martin Odersky&lt;/a&gt;’s &lt;br /&gt;&lt;a href="http://scala.epfl.ch/docu/files/ScalaByExample.pdf"&gt;Scala tutorial&lt;/a&gt;.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;I’ve built a simple parser combinator library in Smalltalk, and it seems to work out very nicely. I thought it would be good to write about it here, coming at the topic from an OO perspective.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;So, what are parser combinators exactly? The basic idea is to view the operators of BNF (or regular expressions for that matter) as methods that operate on objects representing productions of a grammar. Each such object is a parser that accepts the language specified by a particular production. The results of the method invocations are also such parsers. 
The operations are called combinators for rather esoteric technical reasons (and to intimidate unwashed illiterates wherever they might lurk).&lt;br /&gt;&lt;br&gt;&lt;br /&gt;To make this concrete, let’s look at a fairly standard rule for identifiers:&lt;br /&gt;&lt;br&gt;&lt;br /&gt;&lt;div class="paragraph Body" style="line-height: 14pt; font-style: italic; font-weight: bold; line-height: 14pt;"&gt;id -&amp;gt; letter (letter | digit)*&lt;/div&gt;&lt;br /&gt;&lt;br&gt;&lt;br /&gt;Using my combinator library in Smalltalk, one defines a subclass of CombinatorialParser, and inside it one writes&lt;br /&gt;&lt;br&gt;&lt;br /&gt;&lt;div class="paragraph Body" style="line-height: 14pt; font-weight: bold; line-height: 14pt;"&gt;id := self letter, [(self letter | [self digit]) star]&lt;/div&gt;&lt;br /&gt;&lt;br&gt;&lt;br /&gt;Here,  &lt;span style="font-weight: bold"&gt;letter&lt;/span&gt; is a parser that accepts a single letter; &lt;span style="font-weight: bold"&gt;digit&lt;/span&gt; is a parser that accepts a single digit. Both are obtained by invoking a method on  &lt;span style="font-weight: bold"&gt;self&lt;/span&gt; (&lt;span style="font-weight: bold"&gt;this&lt;/span&gt;, for those unfortunates who don’t program in Smalltalk). The subexpression  &lt;span style="font-weight: bold"&gt;self letter | [self digit]&lt;/span&gt; invokes the method  &lt;span style="font-weight: bold"&gt;|&lt;/span&gt; on the parser that accepts a letter, with an argument that accepts a digit (ignore the square brackets for a moment). The result will be a parser that accepts either a letter or a digit.  
&lt;br /&gt;&lt;br&gt;&lt;br /&gt;&lt;div class="paragraph Body" style="line-height: 13pt; margin-bottom: 0pt; margin-left: 0pt; margin-right: 0pt; margin-top: 0pt; padding-bottom: 12pt; padding-top: 0pt; text-align: left; text-indent: 0pt; font-family: 'Helvetica', 'Arial', 'sans-serif'; font-size: 11pt; font-style: italic; font-variant: normal; font-weight: normal; letter-spacing: 0; line-height: 13pt; opacity: 1.00; text-decoration: none; text-transform: none;"&gt;&lt;br /&gt;tangent: No Virginia, Smalltalk does not have operator overloading. It simply allows method names using non-alphanumeric characters. These are always infix and all have the same fixed precedence. How can it be so simple? It’s called minimalism, and it’s not for everyone. Like Mies van der Rohe vs. Rococo. &lt;/div&gt;&lt;br /&gt;&lt;br&gt;&lt;br /&gt;The only detail I’ve glossed over is the brackets. The brackets denote closures, so that  &lt;span style="font-weight: bold"&gt;[self digit]&lt;/span&gt; is a closure that when applied, will yield a parser that accepts a digit.  Why do I do this? Because grammar rules are often mutually recursive. If a production A is used in a production B and vice versa, one of them needs to be defined first (say, A), at which point the other (B) is not yet defined and yet must be referenced.  Wrapping the reference to the other production in a closure delays its evaluation and gets round this problem. In a lazy language like Haskell this is not an issue - which is one key reason Haskell is very good at defining DSLs. However, Smalltalk’s closure syntax is very lightweight (lighter than lambdas in most functional languages!) so this is not a big deal. 
And Smalltalk’s infix binary methods and  postfix unary methods give a very pleasing result overall.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;We then invoke the method star on the result&lt;br /&gt;&lt;br&gt;&lt;br /&gt;&lt;div class="paragraph Body" style="line-height: 14pt; font-weight: bold; line-height: 14pt;"&gt;&lt;br /&gt;(self letter | [self digit]) star &lt;/div&gt;&lt;br /&gt;&lt;br&gt;&lt;br /&gt;which yields a parser that accepts zero or more occurrences of either a letter or a digit. &lt;br /&gt;&lt;div class="paragraph Body" style="line-height: 13pt; margin-bottom: 0pt; margin-left: 0pt; margin-right: 0pt; margin-top: 0pt; padding-bottom: 12pt; padding-top: 0pt; text-align: left; text-indent: 0pt;  font-family: 'Helvetica', 'Arial', 'sans-serif'; font-size: 11pt; font-style: italic; font-variant: normal; font-weight: normal; letter-spacing: 0; line-height: 13pt; opacity: 1.00; text-decoration: none; text-transform: none;"&gt;&lt;br /&gt;&lt;br&gt;&lt;br /&gt;In a syntax more people understand, this would look something like: &lt;br /&gt;&lt;br&gt;&lt;br /&gt;   (1) letter().or( new DelayedParser(){ public Parser value(){ return digit();} }).star()&lt;br /&gt;&lt;br&gt;&lt;br /&gt;If Java had closures, it might look like this:&lt;br /&gt;&lt;br&gt;&lt;br /&gt;  (2)  letter().or({=&gt;  digit()}).star()&lt;br /&gt;&lt;br&gt;&lt;br /&gt;This is better, but either way,  the goal of writing an executable grammar tends to get lost in the noise. Nevertheless, it seems most people prefer (1), and the vast majority of the rest prefer (2) over the “bizarre” Smalltalk syntax. Who knows what darkness lies in the hearts of men.  &lt;/div&gt;&lt;br /&gt;&lt;br&gt;&lt;br /&gt;We pass this parser as an argument to the method “,”, which we invoke on &lt;span style="font-weight: bold"&gt;letter&lt;/span&gt;. The “,” method is the sequencing combinator (which is implicit in BNF). 
It returns a parser that first accepts the language of the receiver (target, in Javanese) and then accepts the language of its argument. In this case, this means the result accepts a single letter, followed by zero or more occurrences of either a letter or a digit, just as we’d expect. Finally, we assign this result to id, which will now represent the production for identifiers. Other rules can use it by invoking its accessor (i.e., &lt;span style="font-weight: bold"&gt;self id&lt;/span&gt;).&lt;br /&gt;&lt;br&gt;&lt;br /&gt;The example also shows that this approach to parsing covers both lexers and parsers.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;The lack of distinction between lexing and parsing is a bit of a problem. Traditionally, we rely on a lexer to tokenize the input. As it does that, it does away with whitespace (ignoring languages that make whitespace significant) and comments.  This is easily dealt with by defining a new operator, &lt;span style="font-weight: bold"&gt;tokenFor:&lt;/span&gt;, that takes a parser p and returns a new parser that skips any leading whitespace and comments and then accepts whatever p accepts.  This parser can also attach start and end source indices to the result, which is very handy when integrating a parser into an IDE. From the point of view of higher-level grammar productions, it’s useful to refer to a production identifier that produces such tokenized results:&lt;br /&gt;&lt;br&gt;&lt;br /&gt;&lt;div class="paragraph Body" style="line-height: 14pt; font-weight: bold; line-height: 14pt;"&gt;&lt;br /&gt; identifier :=  self tokenFor: self id.&lt;/div&gt;&lt;br /&gt;&lt;br&gt;&lt;br /&gt;We would naturally do this for all the tokens in our language, and then define the syntactical grammar without concern for whitespace or comments, just as we would in traditional BNF. 
As an example, here’s the rule for the return statement in Smalltalk&lt;br /&gt;&lt;br&gt;&lt;br /&gt;&lt;div class="paragraph Body" style="line-height: 14pt; font-weight: bold; line-height: 14pt;"&gt;&lt;br /&gt;returnStatement := self hat,  [self expression]. &lt;/div&gt;&lt;br /&gt;&lt;br&gt;&lt;br /&gt;Ok, so you can define a parser pretty much by writing down the grammar. However, just accepting a language isn’t all that useful. Typically, you need to produce an AST as a result. To address this, we introduce a new operator, &lt;span style="font-weight: bold"&gt;wrapper:&lt;/span&gt; . The result of this operator is a parser that accepts the same language as the receiver. However, the result it produces from parsing differs. Instead of returning the tokens parsed, it processes these tokens using a closure which it takes as its sole parameter. The closure accepts the output of the parse as input, and yields some result - typically an abstract syntax tree.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;&lt;div class="paragraph Body" style="line-height: 14pt; font-weight: bold; line-height: 14pt;"&gt;returnStatement := self hat,  [self expression]&lt;/div&gt;&lt;br /&gt;&lt;div class="paragraph Body" style="line-height: 14pt; font-weight: bold; line-height: 14pt;"&gt;     wrapper:[:r :e  | ReturnStatAST new expr:e; start: r start; end: e end].   &lt;/div&gt;&lt;br /&gt;&lt;br&gt;&lt;br /&gt;The grammar production is still clearly separated, with the AST generation on a separate line.  However, I would prefer to leave the grammar pristine. 
That’s easy - put all the AST generation code in a subclass, where the grammar production accessors are overridden, so:&lt;br /&gt;&lt;br&gt;&lt;br /&gt;&lt;div class="paragraph Body" style="line-height: 14pt; font-weight: bold; line-height: 14pt;"&gt;returnStatement&lt;/div&gt;&lt;br /&gt;&lt;div class="paragraph Body" style="line-height: 14pt; font-weight: bold; line-height: 14pt;"&gt;^super returnStatement&lt;/div&gt;&lt;br /&gt;&lt;div class="paragraph Body" style="line-height: 14pt; font-weight: bold; line-height: 14pt;"&gt;    wrapper:[:r :e  | ReturnStatAST new expr:e; start: r start; end: e end].    &lt;/div&gt;&lt;br /&gt;&lt;br&gt;   &lt;br /&gt;This is nice, for example, if you want to parse the same language and feed it to different back ends that each accept their own AST;  or if you need to use the parser for a different purpose, like syntax coloring, but want to share the grammar. Another nice thing about this approach is that one can factor out language extensions very cleanly (especially if you can use mixins). It's one of the benefits of embedding a DSL in a general purpose language - your DSL inherits all the features of the host language. In this case, it inherits inheritance.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;So what’s not to like? Well, one could imagine more efficient approaches to parsing. In Smalltalk, one usually parses one method at a time, and methods tend to be short. Even though I'm using Squeak, which isn't all that fast, and  parses a method on every keystroke to do syntax coloring, it's perfectly acceptable.  For large methods, the delay can be noticeable though. However, there are ways to tune things to improve performance. 
We have our methods ...&lt;br /&gt;&lt;br&gt;&lt;br /&gt;Another problem is left recursion, as in:&lt;br /&gt;&lt;br&gt;&lt;br /&gt;&lt;div class="paragraph Body" style="line-height: 14pt; font-style: italic; font-weight: bold; line-height: 14pt;"&gt;&lt;br /&gt;expr -&gt; expr + expr | expr * expr | id &lt;/div&gt;&lt;br /&gt;&lt;br&gt;&lt;br /&gt;In such cases one has to refactor the grammar. I don’t see this as a big deal, and in principle, the parser could refactor itself dynamically to solve the problem; this is one of the things that one can do relatively easily in Smalltalk, and that tend to be harder in other languages.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;In summary, parser combinators are really cool. They work out beautifully in Smalltalk. I had a blast implementing them and using them. Most important, they are a great example of how object orientation and functional programming synergize. If you want to learn more, there’s a lot of literature out there, mainly in the Haskell world.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-828259184472764778?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/828259184472764778/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=828259184472764778' title='17 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/828259184472764778'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/828259184472764778'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2007/01/parser-combinators.html' title='Parser Combinators'/><author><name>Gilad 
Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>17</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-2052534572687542734</id><published>2006-12-03T12:07:00.000-08:00</published><updated>2006-12-03T13:44:03.978-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Aspects Modularity'/><title type='text'>Foozle Wars</title><content type='html'>I saw &lt;a href="http://hope.cs.rice.edu/twiki/pub/WG211/M3Schedule/foozles.pdf"&gt;this piece about Foozles&lt;/a&gt; on &lt;a href="http://homepages.inf.ed.ac.uk/wadler/" &gt;Phil Wadler&lt;/a&gt;’s &lt;a href="http://wadler.blogspot.com/" &gt;blog&lt;/a&gt;. It’s been there for a while, but I missed it until now. It’s great satire, covering many, er, infirmities of our field, from Aspectitis through Theorititis.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;Since Foozles are a category theoretic dual of Aspects, I might as well mention that one of the most interesting things at OOPSLA was the debate over aspects. For the record, I have never believed in AOP. Not that there aren’t real problems that the AOP community highlights; my problem is with the alleged solution.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;In particular, colleagues I respect, like &lt;a href="http://www.st.informatik.tu-darmstadt.de/public/StaffDetail.jsp?id=2" &gt;Mira Mezini&lt;/a&gt;, have told me that AOP is in large part about modularity. And yet, it’s become quite clear that AOP has serious problems with  modularity. 
I’d encourage everyone to read &lt;a href="http://www.fernuni-hagen.de/ps/steimann.shtml" &gt;Friedrich Steimann&lt;/a&gt;’s excellent &lt;a href="http://www.fernuni-hagen.de/ps/forschung/publikationen/publikation_27624.shtml" &gt;essay&lt;/a&gt; on the topic.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;&lt;a href="http://www.cs.virginia.edu/~sullivan/" title="http://www.cs.virginia.edu/~sullivan/" &gt;Kevin Sullivan&lt;/a&gt; has done a great job of pinpointing the modularity problem with AOP and of proposing fixes. There’s fascinating work out of Harvard Business School that informs his approach. See his &lt;a href="http://www.cs.virginia.edu/~sullivan/OOPSLA%202006%20Tutorial%20Final.ppt"&gt;OOPSLA tutorial&lt;/a&gt; for starters.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;The modularity problems with aspects are related to those with inheritance, but more severe.  Let’s look at inheritance. If one simply copies a class (textually) and modifies it, the copy is independent of the original. We can change the original without considering repercussions on the validity of the copy. With inheritance, we have a linguistic mechanism that performs the derivation for us, recognizes the dependency and maintains it. The advantages are well known - changes to the original automatically propagate to the subclass, type systems understand the relationship between the two, etc. And there is another benefit - the delta defined by the subclass is defined separately. The disadvantage is that we are much more restricted in what changes we can make to the original without causing breakage in subclasses.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;The situation with aspects is not dissimilar.  Aspects, like inheritance, are a linguistic mechanism that automatically derives a variant of the original base code, and maintains a dependency from the base code to the aspects. The aspect is also textually separated - this is what makes people think it helps modularity. 
If the base code changes, the aspect can still apply - as long as the changes are restricted enough; on the downside, if the base code changes something the aspect relies on, the aspect will break, and so we are limited in what we can change in the base code.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;Over time, we’ve recognized what the interface between superclasses and subclasses is, made much of it explicit, and understood what limitations that places on superclasses  (and superinterfaces).  &lt;br /&gt;&lt;br&gt;&lt;br /&gt;For example, widely distributed interfaces cannot have methods added to them, which is a pretty harsh restriction. Incidentally, while the common wisdom is that you can add concrete methods to abstract classes, the truth is that these may conflict with subclasses and in principle should be avoided as well.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;With aspects, this interface remains implicit, and without an interface, there is no modularity. The base code does not commit to any interface that the aspects can rely on; they can be undermined at any time. The problem is compounded because the coupling with the base is very tight. Resolving this requires giving up on obliviousness - the idea that the base code is unaware of the aspects. Instead, base code will have to declare itself as being an aspect-base in some way. The restrictions on its evolution in that situation are likely to be pretty limiting.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;Interestingly, the cloning solution - making a copy of the original and applying the aspect’s changes to it - is more viable if you take the view that aspects are a reflective change to the base program. We can achieve the effect of an aspect using a suitable reflective API that can quantify over and modify base code from the outside. Using reflection, the base code need not be edited, and the aspect remains textually separate. Unlike real aspects, no dependency is maintained between the base code and the changes to it. 
Our “aspects” are independent of changes to the original. Of course, we do have a maintenance problem, which may be eased by tools that track and warn about changes to the base - but do not enforce any dependency. One can even take a similar tack with respect to inheritance.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;Given how onerous the restrictions on base code are likely to be if they commit to supporting aspects (or how weak the aspects will be if the commitments by base code are not onerous), I doubt modular aspects will be worth supporting at the language level; I’d put my money on reflection and tool support instead - but time will tell.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-2052534572687542734?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/2052534572687542734/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=2052534572687542734' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/2052534572687542734'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/2052534572687542734'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2006/12/foozle-wars.html' title='Foozle Wars'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-6906934024391333083</id><published>2006-10-28T15:09:00.000-07:00</published><updated>2006-10-28T15:40:13.085-07:00</updated><title type='text'>And the Winner is ... Self</title><content type='html'>I just got back from OOPSLA.  Far too much going on to report all of it here. One thing I will mention was the awards session. &lt;br /&gt;&lt;br /&gt;I was very pleased to see Dave Ungar recognized twice - once as an ACM Distinguished Engineer, and once more, with Randy Smith, for the original &lt;a href="http://portal.acm.org/citation.cfm?id=38765.38828&amp;amp;coll=portal&amp;amp;dl=ACM&amp;amp;type=series&amp;amp;idx=38765&amp;amp;part=Proceedings&amp;amp;WantType=Proceedings&amp;amp;title=Conference%2520on%2520Object%2520Oriented%2520Programming%2520Systems%2520Languages%2520and%2520Applications&amp;amp;CFID=4638062&amp;amp;CFTOKEN=77606746" title="http://portal.acm.org/citation.cfm?id=38765.38828&amp;amp;coll=portal&amp;amp;dl=ACM&amp;amp;type=series&amp;amp;idx=38765&amp;amp;part=Proceedings&amp;amp;WantType=Proceedings&amp;amp;title=Conference%2520on%2520Object%2520Oriented%2520Programming%2520Systems%2520Languages%2520and%2520Applications&amp;amp;CFID=4638062&amp;amp;CFTOKEN=77606746" style="color: #1500a0; line-height: 14pt; opacity: 1.00; text-decoration: underline; "&gt;Self paper&lt;/a&gt;&lt;span&gt;,(&lt;/span&gt;&lt;a href="http://research.sun.com/self/papers/selfPower.ps.gz" title="http://research.sun.com/self/papers/selfPower.ps.gz" style="color: #1500a0; line-height: 14pt; opacity: 1.00; text-decoration: underline; "&gt;if you have some difficulties with ACM's digital library, here's the journal version&lt;/a&gt;&lt;span&gt;), which was chosen as one of the three most influential OOPSLA papers published in the years 1986-1996.&lt;br /&gt;&lt;br 
/&gt;David Bacon got an ACM Distinguished Scientist award as well.&lt;br /&gt;&lt;br /&gt;It was also great to see Bill Harrison and Harold Ossher get an award for their &lt;a href="http://portal.acm.org/citation.cfm?id=165932&amp;amp;coll=portal&amp;amp;dl=ACM&amp;amp;CFID=4638062&amp;amp;CFTOKEN=77606746" title="http://portal.acm.org/citation.cfm?id=165932&amp;amp;coll=portal&amp;amp;dl=ACM&amp;amp;CFID=4638062&amp;amp;CFTOKEN=77606746" style="color: #1500a0; line-height: 14pt; opacity: 1.00; text-decoration: underline; "&gt;paper on Subjectivity&lt;/a&gt;.  It’s especially nice when the winners are such nice people. &lt;br /&gt;&lt;br /&gt;I don’t personally know Pattie Maes, the author of the other (not the third - there was no ranking among the winning papers) &lt;a href="http://portal.acm.org/citation.cfm?id=38765.38821&amp;amp;coll=portal&amp;amp;dl=ACM&amp;amp;type=series&amp;amp;idx=38765&amp;amp;part=Proceedings&amp;amp;WantType=Proceedings&amp;amp;title=Conference%2520on%2520Object%2520Oriented%2520Programming%2520Systems%2520Languages%2520and%2520Applications&amp;amp;CFID=4638062&amp;amp;CFTOKEN=77606746" title="http://portal.acm.org/citation.cfm?id=38765.38821&amp;amp;coll=portal&amp;amp;dl=ACM&amp;amp;type=series&amp;amp;idx=38765&amp;amp;part=Proceedings&amp;amp;WantType=Proceedings&amp;amp;title=Conference%2520on%2520Object%2520Oriented%2520Programming%2520Systems%2520Languages%2520and%2520Applications&amp;amp;CFID=4638062&amp;amp;CFTOKEN=77606746" style="color: #1500a0; line-height: 14pt; opacity: 1.00; text-decoration: underline; "&gt;award winning paper&lt;/a&gt;, but it is a classic and truly deserving of recognition.&lt;br /&gt;&lt;br /&gt;I hope these awards get people to go back and read those papers again. They are all great idea papers. 
As Dave said when accepting the award, the Self paper had no proofs, no discussion of implementation, and was not an extension of &lt;/span&gt;&lt;span style="font-style: italic; line-height: 14pt; "&gt;anything&lt;/span&gt;&lt;span&gt;. This is in sharp contrast to most of the work that gets published today, which is largely incremental. It’s almost impossible to get a paper accepted that doesn’t have a fairly complete implementation, or a lot of formalism.  This is perhaps inevitable in a maturing field, but it isn’t nearly as much fun, as interesting, as inspiring or as important.&lt;br /&gt;&lt;br /&gt;It would be very good if more people understood the ideas rather than the details of particular manifestations we see today. Comparing Self with some of the languages that are popular right now would be a very good exercise for anyone interested in programming language design.  It helps gauge the quality of the designs and the implementations, and gives a perspective that most people are missing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-6906934024391333083?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/6906934024391333083/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=6906934024391333083' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/6906934024391333083'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/6906934024391333083'/><link rel='alternate' type='text/html' href='http://gbracha.blogspot.com/2006/10/and-winner-is-self.html' title='And the Winner is ... 
Self'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2447174102813539049.post-8730134726785731292</id><published>2006-10-16T19:43:00.000-07:00</published><updated>2006-10-16T19:56:30.871-07:00</updated><title type='text'>Why is this blog named this way?</title><content type='html'>Well, there's the obvious reasons of course. Apart from those, in a previous life I was required to attend JavaOne. I noticed that the room where they handed out goodies to speakers was numbered 101.&lt;br /&gt;&lt;br /&gt;Apparently, no one found this disturbing.  I guess they haven't come out with "The complete idiot's guide to Orwell for dummies in 21 days" yet.&lt;br /&gt;In any event, I hope to find time to occasionally post thoughts about programming languages and platforms.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2447174102813539049-8730134726785731292?l=gbracha.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gbracha.blogspot.com/feeds/8730134726785731292/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2447174102813539049&amp;postID=8730134726785731292' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/8730134726785731292'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2447174102813539049/posts/default/8730134726785731292'/><link rel='alternate' type='text/html' 
href='http://gbracha.blogspot.com/2006/10/why-is-this-blog-named-this-way.html' title='Why is this blog named this way?'/><author><name>Gilad Bracha</name><uri>http://www.blogger.com/profile/17934280339206214042</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_k3ghkt1e0I8/S2mQSPNoVpI/AAAAAAAAAFY/K4eUHh7PBMU/S220/IMG_2036.JPG'/></author><thr:total>7</thr:total></entry></feed>
