Room 101: 2022

Monday, August 15, 2022

What You Want Is What You Get

How do we resolve the classic tension between WYSIWYG and markup? Alas, one can't explain that properly in Blogger, but if you follow this link, you'll see what I mean.

Thursday, June 30, 2022

The Prospect of an Execution: The Hidden Objects Among Us

Depend upon it, Sir, when a man knows he is to be hanged in a fortnight, it concentrates his mind wonderfully.
-- Samuel Johnson

I wish to concentrate your mind, gentle reader, by focusing on an execution (not yours of course! I see you are already losing focus - no matter ...). My goal is to make you see the objects that are in front of you every day, hiding in plain sight.

So who are we executing? Or what? The condemned operates under a wide variety of aliases that obscure its true nature: a.out alias .exe alias ELF file alias binary and more. I mean to expose the identity that hides beneath these guises: it is an object!

When we run an executable file, we call a function in that file, and that function accesses the data in the file, possibly calling other functions in the same file recursively. Replace function with method, call with invoke and file with object in the previous sentence and you will begin to see what I mean:

When we run an executable object, we invoke a method in that object, and that method accesses the data in the object, possibly invoking other methods in the same object recursively.

The initial function, the entry point, is often called main(). Consider the venerable a.out: it is a serialized object on disk. When the system loads it, it's deserializing it. The system then invokes the object's main() method; essentially, the system expects the executable to have an interface:

interface Executable {main(argc: Integer, argv: Array[String])}

ELF can also be viewed as a serialization format for objects. We aren't used to thinking of loading and running in this way, but that doesn't detract from the point. Once you see it, you cannot unsee it.

Newspeak makes this point of view explicit. In Newspeak, an application is an object which supports the method main:args:. This method takes two arguments: a platform object, and an array object whose elements are any specific arguments required. The platform object provides access to standard Newspeak platform functionality that is not part of the application itself. To deploy an application, we serialize it using conventional object serialization. Objects reference their class, and classes reference mixins which reference methods. All of these are objects, and get serialized when we serialize the application object. So all the application's code gets serialized with it. Running the deployed app is a matter of deserializing it and calling main:args:.
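To make the deploy-by-serialization idea concrete, here is a minimal sketch in Python rather than Newspeak. It is not how Newspeak does it. The names serialize_app, deserialize_app and Greeter, and the use of a plain dictionary as the platform object, are all assumptions made for illustration. Python's pickle stores classes by reference rather than by value, so the sketch carries the code along as source text to mimic code and data travelling together.

```python
import json


def serialize_app(class_name, source, state):
    """Serialize an application: its code (as source text) and its data together."""
    return json.dumps({"class": class_name, "source": source, "state": state})


def deserialize_app(blob):
    """Rebuild the application object from the serialized code and data."""
    record = json.loads(blob)
    namespace = {}
    exec(record["source"], namespace)        # recreate the class from its code
    cls = namespace[record["class"]]
    app = cls.__new__(cls)                   # allocate without running __init__
    app.__dict__.update(record["state"])     # restore the object's data
    return app


# Hypothetical application: an object that supports main(platform, args).
APP_SOURCE = '''
class Greeter:
    def main(self, platform, args):
        return platform["greeting"] + ", " + ", ".join(args) + self.suffix
'''

blob = serialize_app("Greeter", APP_SOURCE, {"suffix": "!"})

# "Running the deployed app": deserialize it, then call main.
app = deserialize_app(blob)
result = app.main({"greeting": "Hello"}, ["world"])
print(result)
```

The same blob format serves for both saving data and shipping the program, which is the economy of mechanism in miniature.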

The advantage of recognizing this explicitly is conceptual parsimony, which yields an economy of mechanism. You can literally reuse your object serialization format as a deployment format. Serializing data and serializing code are one and the same.

Executables aren't the only objects that aren't recognized as such. Libraries are also objects. It doesn't matter if we are talking about DLLs at the operating system level or about packages/modules/units at the programming language level, or packages in the package-manager sense. The key point about all these things is that they support an API - an Application Programming Interface. We'll dispense with the acronym-happy jargon and just say interface. In all these cases, we have a set of named procedures that are made accessible to callers. They may make use of additional procedures, some publicly available via the interface, and some not. They may access data encapsulated behind the interface; that data may be mutable or not. The key thing is the notion of an interface.

Even if you are programming in a pure functional setting, such objects will make an appearance. The packages of Haskell, and certainly the structures of ML, are not all that different. They may be statically typed or they may not. They may be statically bound at some level - but as long as we have separate compilation, this is just an optimization that relies on certain rigidities of the programming model. That is, your language may not treat these things as first-class values, but your compilation units can bind to different implementations of the same package interface, even if they can only bind to one at a time. So even if the language itself does not treat these entities as true objects with dynamically bound properties, they have to act as objects in the surrounding environment.

In many systems, the API might expose variables directly, and very often it exposes classes directly as well. Nevertheless, these are all late-bound at the level of linking across compilation units or OS libraries.
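A small illustration of a library acting as an object behind an interface, sketched in Python. The function hypotenuse is hypothetical; math and cmath are standard library modules that both happen to export a procedure named sqrt, so the caller can be bound to either one.

```python
import cmath
import math


def hypotenuse(lib, a, b):
    """Client code that depends only on an interface: 'lib' must provide sqrt."""
    return lib.sqrt(a * a + b * b)


# The same client, bound to two different implementations of the interface.
real_result = hypotenuse(math, 3, 4)
complex_result = hypotenuse(cmath, 3, 4)

print(real_result, complex_result)
```

Here the binding happens at call time. A statically linked language makes the same choice at link time, one implementation at a time, which is the optimization described above.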

The notion of an interface is what truly characterizes objects - not classes, not inheritance, not mutable state. Read William Cook's classic essay for a deep discussion of this.

So the next time someone tells you that they don't believe in objects, that objects are bad and one shouldn't and needn't use them, you can politely inform them that they shouldn't confuse objects with Java and the like, or even with imperative programming. Objects are always with us, because the concept of abstracting over implementations via an interface is immensely valuable.

Tuesday, April 19, 2022

Bitrot Revisited: Local First Software and Orthogonal Synchronization

This post is based on an invited talk I gave recently at the Programming 22 conference.

The talk wasn't recorded but I've recorded a reprise at: https://youtu.be/qx6ekxXdidI

The definition of insanity notwithstanding, I decided to revisit a topic I have discussed many times before: Objects as Software Services. In particular, I wanted to relate it to recent work others have been doing.

The earliest public presentation I gave on this was at the DLS in 2005. There are recordings of talks I gave at Google and Microsoft Research, as well as several blog posts (March 2007, April 2008, April 2009, January 2010 and April 2010). You can also download the original write up.

The goal is software that combines the advantages of old-school personal computing and modern web-based applications. There are two parts to this.

First, software should be available at all times. Like native apps, software should be available even if the network is slow, unreliable or absent, or if the cloud is otherwise inaccessible (say due to denial-of-service, natural disaster, war or whatever). And, like a cloud app, it should be accessible from any machine at any location (modulo network access if it hasn't run there before). Recently, this idea has started to get more attention under the name local-first software.

Tangent: I've proposed a number of terms for closely related ideas in older posts such as Software Objects, Rich Network Enabled Clients, Network Serviced Applications and Full-Service Computing, but whatever name gets traction is fine with me.

Second, software should always be up-to-date (this is where Bitrot comes in). That means we always want to run the latest version available, just like a web page. This implies automatically updating application code over the network without disrupting the end-user. This latter point goes well beyond the idea of local-first software as I've seen it discussed.

Let's take these two goals in order.

For offline availability, one has to store the application and its data locally on the client device. However, unlike classical personal computing, the data has to be made available, locally, to multiple clients. Now we have multiple replicas of our data, and they have to be kept in sync somehow. My proposal was to turn that responsibility over to the programming language via a concept I dubbed Orthogonal Synchronization. The idea was to extend the concept of orthogonal persistence, which held that the program would identify which fields in every data structure were deemed persistent, and the system would take care of serializing and deserializing their contents, recursively. With orthogonal synchronization, the data would not only be persisted automatically, but synchronized.

To keep the software up-to-date without disrupting the user, we want good support for dynamic software update. When the application code changes, we update the app live. How do we know when the code changes? Well, code is just data, albeit of a particular kind. Hence we sync it, just like any other persistent data. We reuse much of the same orthogonal synchronization mechanism, and since we sync both code and data at the same time, we can migrate data seamlessly whenever the code and data format changes. As I've discussed in the past, this has potentially profound implications for versioning, release cycles and software development. All this goes well beyond the focus of local-first software, and is way outside the scope of this post. See the original materials cited above for more on that aspect.
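The orthogonal persistence idea that orthogonal synchronization builds on can be sketched roughly as follows, in Python. This is only an illustration under assumptions of my choosing: the Persistent base class, the persistent_fields declaration and the snapshot function are hypothetical names, not part of any real system. The program declares which fields are persistent, and a generic routine walks them recursively.

```python
class Persistent:
    """Marker base class; subclasses list the fields they deem persistent."""
    persistent_fields = ()


def snapshot(obj):
    """Recursively capture only the fields declared persistent."""
    if isinstance(obj, Persistent):
        return {name: snapshot(getattr(obj, name)) for name in obj.persistent_fields}
    if isinstance(obj, (list, tuple)):
        return [snapshot(item) for item in obj]
    if isinstance(obj, dict):
        return {key: snapshot(value) for key, value in obj.items()}
    return obj  # assume anything else is a plain value


class Address(Persistent):
    persistent_fields = ("city",)

    def __init__(self, city):
        self.city = city
        self.cache = object()      # transient: never persisted


class Person(Persistent):
    persistent_fields = ("name", "address")

    def __init__(self, name, address):
        self.name = name
        self.address = address
        self.session = "temporary"  # transient: never persisted


state = snapshot(Person("Ada", Address("London")))
print(state)
```

The data types themselves contain no serialization code; that is the sense in which persistence is orthogonal to them. Synchronization would extend the same walk to ship changes between replicas.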

There's only one small problem: merge conflicts. The natural tendency is to diff the persistent representations to compute a set of changes and detect conflicts. An alternative is to record changes directly, whenever setters of persistent objects are called. Either way, we are comparing the application state at the level of individual objects. This is very low level; it is an extensional approach, which yields no insight into the intention of the changes. As an example, consider a set, represented as an array of elements and an integer indicating the cardinality of the set. If two clients each add a distinct object to the set, we find that they both have the same set object, but the arrays differ. The system has no way to resolve the conflict in a satisfactory manner: choosing either replica is wrong. If one understands the intention of the change, one could decide to resolve the conflict by performing both additions on the original set.
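The set example can be played out in a few lines of Python. This is a sketch under assumptions: ArraySet and the tuple encoding of changes are hypothetical, chosen only to mirror the array-plus-cardinality representation described above.

```python
class ArraySet:
    """A set represented as an array of elements plus a cardinality."""

    def __init__(self, elements=()):
        self.elements = list(elements)
        self.size = len(self.elements)

    def add(self, item):
        if item not in self.elements:
            self.elements.append(item)
            self.size += 1


original = ArraySet(["a"])

# Two clients each add a distinct object to their own replica.
replica1 = ArraySet(original.elements)
replica1.add("b")
replica2 = ArraySet(original.elements)
replica2.add("c")

# Extensional view: the same slot holds different arrays, so we have a
# conflict, and picking either replica loses an element.
extensional_conflict = replica1.elements != replica2.elements

# Intentional view: we know each client meant "add x", so we perform
# both additions on the original set.
intended_changes = [("add", "b"), ("add", "c")]
merged = ArraySet(original.elements)
for operation, argument in intended_changes:
    getattr(merged, operation)(argument)

print(extensional_conflict, merged.elements, merged.size)
```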

Local-first computing approaches this problem differently. It still needs to synchronize the replicas. However, the problem of conflicts is elegantly defined away. The idea is to use Conflict-free Replicated Data Types (CRDTs) for all shareable data, and so conflicts cannot arise. This is truly brilliant as far as it goes. And CRDTs go further than one might think.

CRDT libraries record intentional changes at the level of the CRDT object (in our example, the set, assuming we use a CRDT implementation of a set); sync is then just the union of the change sets, and no conflicts arise. However, the fact that no formal conflict occurs does not necessarily mean that the result is what we actually expect. And CRDTs don't provide a good solution for code update.
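As a rough illustration of sync as a union of change sets, here is a minimal grow-only set in Python. It is a sketch, not the API of any particular CRDT library; the class name GrowOnlySet and its methods are assumptions.

```python
class GrowOnlySet:
    """A grow-only set: its state is simply the set of recorded 'add' changes."""

    def __init__(self, changes=()):
        self.changes = set(changes)

    def add(self, item):
        self.changes.add(("add", item))

    def merge(self, other):
        """Sync is just the union of the two change sets."""
        return GrowOnlySet(self.changes | other.changes)

    def value(self):
        return {item for _, item in self.changes}


replica1 = GrowOnlySet([("add", "a")])
replica2 = GrowOnlySet([("add", "a")])
replica1.add("b")
replica2.add("c")

merged_12 = replica1.merge(replica2)
merged_21 = replica2.merge(replica1)
print(sorted(merged_12.value()))
```

Because union is commutative, associative and idempotent, the replicas converge regardless of the order in which they sync. Whether the converged result is what the users actually wanted is a separate question, as noted above.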

Can we apply lessons from CRDTs to orthogonal synchronization? The two approaches seem quite contradictory: CRDTs fly in the face of orthogonal persistence/synchronization. The 'orthogonal' in these terms means that persistence/synchronization is orthogonal to the datatype being persisted/synced. You can persist/sync any datatype. In contrast, using CRDTs for sync means you have to use specific datatypes. One conclusion might be that orthogonal sync is just a bad idea. Maybe we should build software services by using CRDTs for data, and structured source control for code. However, perhaps there's another way.

Notice that the concept of capturing intentional changes is distinct from the core idea of CRDTs. It's just that, once you have intentional changes, CRDTs yield an exceptionally simple merge strategy. So perhaps we can use orthogonal sync, but incorporate intentional change data and then use custom merge functions for specific datatypes. CRDTs could fit into this framework; they'd just use a specific merge strategy that happens to be conflict-free. However, we now have additional options. For example, we can merge code with a special strategy that works a bit like traditional source control (we can do better, but that's not my point here). As a default merge strategy when no intent is specified, we could treat setter operations on persistent slots as changes and just ask the user for help in case of conflict. We always have the option to specify an alternate strategy such as last-write-wins (LWW).
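Here is one way the pluggable merge strategies might look, sketched in Python. Everything named here (Change, merge_lww, merge_ask, the resolve callback standing in for the user) is an assumption for illustration; the changes are setter operations on persistent slots, as in the default strategy described above.

```python
from typing import Any, NamedTuple


class Change(NamedTuple):
    """A recorded setter call on a persistent slot."""
    timestamp: float
    slot: str
    value: Any


def merge_lww(base, ours, theirs):
    """Last-write-wins: apply every change in timestamp order."""
    merged = dict(base)
    for change in sorted(ours + theirs, key=lambda c: c.timestamp):
        merged[change.slot] = change.value
    return merged


def merge_ask(base, ours, theirs, resolve):
    """Default strategy: apply setter changes, asking for help on conflict."""
    merged = dict(base)
    final_ours = {c.slot: c.value for c in ours}      # last value set per slot
    final_theirs = {c.slot: c.value for c in theirs}
    for slot in final_ours.keys() | final_theirs.keys():
        in_both = slot in final_ours and slot in final_theirs
        if in_both and final_ours[slot] != final_theirs[slot]:
            merged[slot] = resolve(slot, final_ours[slot], final_theirs[slot])
        elif slot in final_ours:
            merged[slot] = final_ours[slot]
        else:
            merged[slot] = final_theirs[slot]
    return merged


base = {"title": "Draft", "done": False}
ours = [Change(1.0, "title", "A")]
theirs = [Change(2.0, "title", "B"), Change(3.0, "done", True)]

lww_result = merge_lww(base, ours, theirs)
# A stand-in for asking the user: keep both titles.
asked_result = merge_ask(base, ours, theirs, lambda slot, a, b: a + "/" + b)
print(lww_result, asked_result)
```

A CRDT would plug into the same shape with a merge function that never needs the resolve callback.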

How might we specify what constitutes an intentional change, and what merge strategy to use? One idea is to annotate mutator methods with metadata indicating that they are changes associated with a given merge strategy. Here is what this might look like for a simple counter CRDT:

class Counter = (| count ::= 0. |) (
  public value = (^count)
  public increment (* :crdt_change: *) = (
    count: count + 1
  )
  public decrement (* :crdt_change: *) = (
    count: count - 1
  )
)

The metadata tag (crdt_change in this case) identifies a tool that modifies the annotated method so that calls are recorded as change records with salient information (name of the called method, timestamp, arguments). It also identifies a merge method that processes such changes according to a standardized API.
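For readers who want to experiment, the annotation idea can be approximated in Python with a decorator. This is a sketch under assumptions, not the Newspeak tooling: the decorator name crdt_change is borrowed from the tag above, while the change identifiers (replica id plus sequence number), the log dictionary and the merge method are hypothetical design choices of this example.

```python
import functools
import itertools
import time


def crdt_change(method):
    """Wrap a mutator so every call is recorded as a change record."""
    @functools.wraps(method)
    def wrapper(self, *args):
        change_id = (self.replica_id, next(self._sequence))
        # Salient information: name of called method, timestamp, arguments.
        self.log[change_id] = (method.__name__, time.time(), args)
        return method(self, *args)
    return wrapper


class Counter:
    def __init__(self, replica_id):
        self.replica_id = replica_id
        self._sequence = itertools.count()
        self.log = {}
        self.count = 0

    @property
    def value(self):
        return self.count

    @crdt_change
    def increment(self):
        self.count += 1

    @crdt_change
    def decrement(self):
        self.count -= 1

    def merge(self, other):
        """Union the change logs, applying only changes not seen before."""
        for change_id, (name, timestamp, args) in other.log.items():
            if change_id not in self.log:
                self.log[change_id] = (name, timestamp, args)
                # Call the undecorated method so the change is not re-recorded.
                getattr(Counter, name).__wrapped__(self, *args)


a = Counter("a")
b = Counter("b")
a.increment()
a.increment()
b.decrement()
a.merge(b)
b.merge(a)
print(a.value, b.value)
```

The Counter methods themselves contain no recording or serialization code; the decorator supplies it, which mirrors the division of labor the metadata tag is meant to achieve.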

Now, to what extent is this orthogonal sync anymore? Unlike orthogonal persistence, we can't just mark slots as persistent and be done; we have to provide merge strategies. Well, since we have a default, we can still argue that sync is supported regardless of datatype. Besides, quibbling over terminology is not the point. We've gained real flexibility, in that we can support both CRDTs and non-CRDTs like code. And CRDT implementations don't need to incorporate special code for serialization and change reporting. The system can do that for them based on the metadata.

I've glossed over many details. If you watch the old talks, you'll see many issues discussed and answered. Of course, the proof of the pudding is in creating such a system and building working applications on top. I only managed to gather funding for such work once, which is how we created Newspeak, but that funding evaporated before we got very far with the sync problem. Sebastián Krynski worked on some prototypes, but again, without funding it's hard to make much progress. Nevertheless, there is more recognition that there is a problem with traditional cloud-based apps. As the saying goes: this time it's different.
