Depend upon it, Sir, when a man knows he is to be hanged in a fortnight, it concentrates his mind wonderfully.
-- Samuel Johnson
I wish to concentrate your mind, gentle reader, by focusing on an execution (not yours of course! I see you are already losing focus - no matter ...). My goal is to make you see the objects that are in front of you every day, hiding in plain sight.
So who are we executing? Or what? The condemned operates under a wide variety of aliases that obscure its true nature: a.out alias .exe alias ELF file alias binary and more. I mean to expose the identity that hides beneath these guises: it is an object!
When we run an executable file, we call a function in that file, and that function accesses the data in the file, possibly calling other functions in the same file recursively. Replace function with method, call with invoke and file with object in the previous sentence and you will begin to see what I mean:
When we run an executable object, we invoke a method in that object, and that method accesses the data in the object, possibly invoking other methods in the same object recursively.
The initial function, the entry point, is often called main().
Consider the venerable a.out: it is a serialized object on disk. When the system loads it, it's deserializing it. The system then invokes the object's main() method; essentially, the system expects the executable to have an interface:
interface Executable {main(argc: Integer, argv: Array[String])}
ELF can also be viewed as a serialization format for objects in this way. We aren't used to thinking of loading and running in this way, but that doesn't detract from the point. Once you see it, you cannot unsee it.
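To make the analogy concrete, here is a minimal sketch in Python (not the author's code, and not how any real loader works). The Executable protocol mirrors the interface above; the Echo class, its emit helper, and the execute function are hypothetical names invented for illustration.

```python
from typing import Protocol, Sequence


class Executable(Protocol):
    """The interface the 'system' expects of anything it runs."""

    def main(self, argc: int, argv: Sequence[str]) -> int: ...


class Echo:
    """A hypothetical 'program': data plus methods behind main()."""

    def __init__(self) -> None:
        self.lines: list = []  # data encapsulated in the object

    def main(self, argc: int, argv: Sequence[str]) -> int:
        # The entry point invokes another method of the same object.
        self.emit(" ".join(argv[1:argc]))
        return 0  # exit status

    def emit(self, text: str) -> None:
        self.lines.append(text)


def execute(program: Executable, argv: Sequence[str]) -> int:
    """Stand-in for the OS: knows nothing about the program except main()."""
    return program.main(len(argv), argv)


echo = Echo()
status = execute(echo, ["echo", "hello", "world"])
```

The execute function never inspects what kind of object it was given; any object with a suitable main method will do, which is the sense in which the system treats every executable through one interface.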
Newspeak makes this point of view explicit. In Newspeak, an application is an object which supports the method main:args:. This method takes two arguments: a platform object, and an array object whose elements are any specific arguments required. The platform object provides access to standard Newspeak platform functionality that is not part of the application itself. To deploy an application, we serialize it using conventional object serialization. Objects reference their class, and classes reference mixins which reference methods. All of these are objects, and get serialized when we serialize the application object. So all the application's code gets serialized with it. Running the deployed app is a matter of deserializing it and calling main:args:.
The advantage of recognizing this explicitly is conceptual parsimony, which yields an economy of mechanism. You can literally reuse your object serialization format as your deployment format. Serializing data and serializing code are one and the same.
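The deploy-by-serializing idea can be approximated in Python. This is only a sketch under loud assumptions: it is not Newspeak's serializer, it handles a single function rather than a graph of objects, classes and mixins, and the names deploy, run and hello_main are hypothetical. It uses the standard marshal module, whose format is specific to the Python version and must not be fed untrusted bytes.

```python
import builtins
import marshal
import types


def deploy(main, state):
    """Serialize the app: its code and its data end up in one byte string."""
    return marshal.dumps((main.__code__, state))


def run(image, platform, args):
    """Deserialize the app, then invoke its entry point."""
    code, state = marshal.loads(image)
    # Rebuild a callable from the code object; it sees only builtins,
    # so everything else must arrive via state, platform, or args.
    main = types.FunctionType(code, {"__builtins__": builtins})
    return main(state, platform, args)


def hello_main(state, platform, args):
    # Hypothetical application entry point, loosely modelled on main:args:.
    return state["greeting"] + ", " + platform["user"] + "! " + " ".join(args)


image = deploy(hello_main, {"greeting": "Hello"})
result = run(image, {"user": "reader"}, ["welcome", "back"])
```

Here image plays the role of the deployed executable: a single serialized value containing both code and data, which run deserializes before calling the entry point with a platform and an argument array.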
Executables aren't the only objects that aren't recognized as such. Libraries are also objects. It doesn't matter if we are talking about DLLs at the operating system level or about packages/modules/units at the programming language level, or packages in the package-manager sense. The key point about all these things is that they support an API - an Application Programming Interface. We'll dispense with the acronym-happy jargon and just say interface. In all these cases, we have a set of named procedures that are made accessible to callers. They may make use of additional procedures, some publicly available via the interface, and some not. They may access data encapsulated behind the interface; that data may be mutable or not. The key thing is the notion of an interface.
Even if you are programming in a pure functional setting, such objects will make an appearance. The packages of Haskell, and certainly the structures of ML, are not all that different. They may be statically typed or they may not. They may be statically bound at some level - but as long as we have separate compilation, this is just an optimization that relies on certain rigidities of the programming model. That is, your language may not treat these things as first-class values, but your compilation units can bind to different implementations of the same package interface, even if they can only bind to one at a time. So even if the language itself does not treat these entities as true objects with dynamically bound properties, they have to act as objects in the surrounding environment.
In many systems, the API might expose variables directly. And they very often may expose classes directly as well. Nevertheless, these are all late-bound at the level of linking across compilation units or OS libraries.
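Python happens to make the library-as-object view visible, because its modules are ordinary values. The sketch below is an illustration rather than anything from the original post: the client round_trip is bound to whichever "library" it is handed, and the only thing it relies on is an interface of two named procedures, dumps and loads. The json and pickle modules are real standard-library modules that both offer those names; ReprLib and round_trip are hypothetical names made up for this example.

```python
import ast
import json
import pickle


class ReprLib:
    """A hand-written 'library' exposing the same two-procedure interface."""

    @staticmethod
    def dumps(value):
        return repr(value)

    @staticmethod
    def loads(data):
        # literal_eval only parses Python literals; it does not run code.
        return ast.literal_eval(data)


def round_trip(lib, value):
    """Client code: late-bound to any object offering dumps() and loads()."""
    return lib.loads(lib.dumps(value))


value = {"a": [1, 2]}
via_json = round_trip(json, value)       # a module used as an object
via_pickle = round_trip(pickle, value)   # a different implementation
via_repr = round_trip(ReprLib, value)    # a plain object, same interface
```

The three implementations store the value in entirely different ways, yet the client cannot tell them apart, which is exactly the abstraction over implementations that an interface provides.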
The notion of an interface is what truly characterizes objects - not classes, not inheritance, not mutable state. Read William Cook's classic essay for a deep discussion on this.
So the next time someone tells you that they don't believe in objects, that objects are bad and one shouldn't and needn't use them, you can politely inform them that they shouldn't confuse objects with Java and the like, or even with imperative programming. Objects are always with us, because the concept of abstracting over implementations via an interface is immensely valuable.
A place to be (re)educated in Newspeak
Thursday, June 30, 2022
14 comments:
An excellent essay about a worthy language.
My preferred definition of "object" is different, though not contradictory: Following Parnas (the paper is roughly: "on the decomposition of systems into modules"), the most important part of an object is putting the code with the data, as opposed to having just variables and functions with no relation between them.
Classes (and prototypes) are one way, strong typing with type-based overloading is another (more static) way. The latter is not thought of as "object-oriented", but if you squint hard enough, it achieves something similar. I'd want an IDE that grouped types and relevant functions together in that case.
Hi Dave,
I recognize high praise when I see it! Thank you.
One interesting way this was embraced by an otherwise conventional-appearing operating system:
Apollo Computer's Domain/OS design principles based on a distributed single-level store.
http://bitsavers.org/pdf/apollo/014962-A00_Domain_OS_Design_Principles_Jan89.pdf
I remember the Apollo system. It was so much nicer than the Sun workstations that took over. All written in a systems oriented version of Pascal, not C. Had a very nice symbolic debugger for it too; and you could read and write memory (and share windows from/to) other workstations on the network (which was a token-ring). That last bit had serious security issues. In general, another one of those sad worse-is-better (or rather, better-is-worse) stories that characterize computing.
I don't think this is quite right. A key advantage of OO systems is that two objects (and we can think of closures as a special case) can share the same interface, either because they are instances of the same class or they belong to different classes which overlap in a method signature. I'm not an expert, but I don't think you can link two object files that implement the same top-level symbol, can you? You could try to work around this by forking, unlinking an object file, and relinking a different object file, but that has the unwanted side effect of duplicating all of your other "objects".
Perhaps Newspeak does it differently.
The "objectness" here is that the OS uses the same interface for different executables (logically, the main() function/method), not that different binaries (ironically called object files) are dynamically linked against each other.
A different way to look at it might be to say that if you have a single implementation of a symbol, the system can inline (or just statically bind) it. By design, linking object files requires unique symbols, so that is always the case.
But thanks for highlighting that subtlety. It wasn't what I was trying to say, but it's a good point nonetheless.
See the design principles PDF in my comment above about Apollo Domain/OS. While Domain/OS provides Unix compatibility this is achieved using a distributed object-oriented single-level store. Most functionality is in user space. The top level of what amounts to an object file are global instance identifiers rather than names. Network objects are swizzled into a process address space. Methods on the instances are overloaded by type.
A company since bought by IBM tried to emulate this on Unix and succeeded at the cost of terrible complexity.
Yeah the whole system is fundamentally aimed at a trusting workgroup.
Another object-like OS was QIX which provided Unix compatibility above tuple spaces for all interprocess communication, environments, etc. All namespaces are lightweight dictionaries. All services are user-space participants in the same generative messaging system.
Patrick, thanks again for the comments on Apollo. My memories go back to the original, pure, non-Unix version (Aegis), which I encountered in 1984 I think (an important year both symbolically and concretely - I learned about Emacs, Prolog and Smalltalk that year as well).
Yeah, by the time I used Apollos, Domain/OS had become the combination of Aegis, SysV, and BSD. Sadly I was using them during the HP takeover. I wasn't really paying attention until then to what the future had in store.
I remembered from past conversations that you had experience with Apollos. I just take every opportunity to point out to other readers that the object-oriented nature and compatibility with industry standards have worked commercially in the past and could be a way out of this current mess we're in.
Remember what Dan Ingalls said about the OS: "An operating system is a collection of things that don't fit into a language. There shouldn't be one." The Alto had the best OS ever.
/me looking for the blogspot "like" button.
@Gilad Ah, now I get it. Thanks for the clarification!