A place to be (re)educated in Newspeak

Friday, March 05, 2010

Through the Looking Glass Darkly

In January, I gave a guest lecture in a class on reflection and metaprogramming at HPI Potsdam. A screencast of the talk is now available. It’s an introduction to the concept of mirrors, which is the goodthink way of doing reflection. It’s mostly language neutral, but there is a brief demo using mirrors in Newspeak.

Because it’s a screencast rather than a video, occasionally some detail may be unclear, but by and large it is the most comprehensive introduction to mirrors available other than the OOPSLA paper.

Some people may not have an hour to watch the entire screencast, and the paper is by no means an easy read, so I’ve decided to post the executive summary here.

The classic approach to reflection in object-oriented programming languages originates with Smalltalk, and is used in most class-based languages that support reflection: define a reflective API on Object. Typically, Object supports an operation like getClass() which returns an object representing the class of the receiver. The API of classes defines most additional reflective operations available. For example, in Java, you can get reflective descriptors for a class’ methods (java.lang.reflect.Method), fields (java.lang.reflect.Field) and constructors (java.lang.reflect.Constructor). You can even use these descriptors to evaluate program code dynamically - say, ask the user for the name of a method and invoke it. In Smalltalk, you can also add and remove methods and fields, change a class’ superclass, remove classes from the system etc.
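To make the classic style concrete, here is a minimal sketch in Python rather than Java or Smalltalk. The Account class is invented for illustration; the point is only that reflection starts from the object itself (type(obj) plays the role of getClass()) and that a method can be looked up and invoked by name.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance


account = Account(100)

# The object itself is the entry point to reflection, much like getClass().
cls = type(account)

# The class object then answers further reflective questions,
# here: which ordinary methods does it declare?
method_names = [name for name, value in vars(cls).items()
                if callable(value) and not name.startswith("__")]

# Invoke a method chosen by name at run time
# (the name could just as well come from user input).
chosen = "deposit"
result = getattr(account, chosen)(50)
```

Note that any code holding a reference to account can do all of this; nothing has to be granted first.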

Another approach is used in many scripting languages. The language constructs themselves introduce code on the fly, modifying the program as they are executed. For example, a class comes into being when a class declaration is evaluated, and might change if another declaration of a class with the same name is executed later.
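A small Python sketch of this second style, with an invented Greeter class: the declarations are just code, and evaluating a second declaration with the same name changes what the program means. One caveat on the assumption here: in Python the second declaration rebinds the name to a new class rather than patching the old one in place, whereas some scripting languages reopen the existing class.

```python
first_declaration = """
class Greeter:
    def greet(self):
        return "hello"
"""

second_declaration = """
class Greeter:
    def greet(self):
        return "goodbye"
"""

namespace = {}

# The class comes into being when its declaration is evaluated.
exec(first_declaration, namespace)
before = namespace["Greeter"]().greet()

# Evaluating another declaration with the same name changes the program.
exec(second_declaration, namespace)
after = namespace["Greeter"]().greet()
```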

The third approach is that of mirrors, and originates in Self. Mirrors have been used in class-based systems such as Strongtalk, and even in the Java world. JDI, the Java Debug Interface, is a mirror-based reflective API. Here, the reflective operations are separated into distinct objects called mirrors. This seemingly minor restructuring has significant implications. Reflection is no longer tied into the behavior of every object in the system (as it is via getClass()) or (even worse) into the very syntax of the language. Instead, it resides in separable components that can be removed or replaced. Reflection is now a distinct capability, in the sense of the object capability model.

If you are worried about security, this is good news. If you don’t provide a program with the means to manufacture mirrors (e.g., you do not provide the mirror factory object), said program cannot do any reflection. You can also provide mirrors with limited capabilities - say mirrors that only reflect the program’s own code, or mirrors that do not allow you to modify code or access non-public members etc.
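Here is a rough sketch of that idea in Python. All the names (ObjectMirror, ReadOnlyPublicMirror, MirrorFactory, untrusted_plugin) are hypothetical and are not the Newspeak API. Python cannot actually enforce any of this, since any code can call vars() directly, so read it as the shape of the design under the assumption that mirrors are the only route to reflection.

```python
class ObjectMirror:
    """Full mirror: can read and modify any field of the reflectee."""

    def __init__(self, reflectee):
        self._reflectee = reflectee

    def field_names(self):
        return sorted(vars(self._reflectee))

    def get_field(self, name):
        return getattr(self._reflectee, name)

    def set_field(self, name, value):
        setattr(self._reflectee, name, value)


class ReadOnlyPublicMirror:
    """Restricted mirror: public fields only, and no modification."""

    def __init__(self, reflectee):
        self._mirror = ObjectMirror(reflectee)

    def field_names(self):
        return [n for n in self._mirror.field_names() if not n.startswith("_")]

    def get_field(self, name):
        if name.startswith("_"):
            raise PermissionError(name + " is not public")
        return self._mirror.get_field(name)


class MirrorFactory:
    """The capability: whoever holds a factory can manufacture mirrors."""

    def __init__(self, mirror_class):
        self._mirror_class = mirror_class

    def reflect(self, obj):
        return self._mirror_class(obj)


class Account:
    def __init__(self):
        self.owner = "alice"
        self._pin = "1234"


def untrusted_plugin(account, mirrors=None):
    # Without a factory, the plugin has no means of reflecting at all.
    if mirrors is None:
        return None
    return mirrors.reflect(account).field_names()
```

Handing the plugin MirrorFactory(ReadOnlyPublicMirror) lets it see owner but not _pin; handing it nothing lets it see nothing.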

Caveat: The truth is, mirrors have not really been used for security. Their utility for security seems clear, but a working API has yet to be demonstrated.

Mirrors are good news for other reasons. Say your program doesn’t use reflection, and needs to fit into a small footprint such as an embedded device. It is easy to take it out. Another advantage is that you can easily plug in alternate implementations of reflection - so if you need to reflect on remote objects, you can do so.
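The pluggability point can be sketched as follows, again in Python with invented names. FakeRemoteProcess merely simulates another process by accepting only JSON strings; a real remote mirror would talk over a socket or a debugger wire protocol. What matters is that the tool code (describe) depends only on the mirror interface.

```python
import json


class LocalMirror:
    def __init__(self, reflectee):
        self._reflectee = reflectee

    def field_names(self):
        return sorted(vars(self._reflectee))

    def get_field(self, name):
        return getattr(self._reflectee, name)


class FakeRemoteProcess:
    """Stands in for another process; it only accepts JSON request strings."""

    def __init__(self, objects):
        self._objects = objects  # object id -> object living "over there"

    def handle(self, request_text):
        request = json.loads(request_text)
        mirror = LocalMirror(self._objects[request["id"]])
        if request["op"] == "field_names":
            reply = mirror.field_names()
        else:
            reply = mirror.get_field(request["name"])
        return json.dumps(reply)


class RemoteMirror:
    """Same interface as LocalMirror, but every operation is a message."""

    def __init__(self, process, object_id):
        self._process = process
        self._object_id = object_id

    def _ask(self, **request):
        request["id"] = self._object_id
        return json.loads(self._process.handle(json.dumps(request)))

    def field_names(self):
        return self._ask(op="field_names")

    def get_field(self, name):
        return self._ask(op="get_field", name=name)


def describe(mirror):
    # Tool code (a debugger, say) depends only on the mirror interface.
    return {name: mirror.get_field(name) for name in mirror.field_names()}


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


local_view = describe(LocalMirror(Point(1, 2)))
remote_view = describe(RemoteMirror(FakeRemoteProcess({"p1": Point(1, 2)}), "p1"))
```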

Historical note: This is why JDI uses mirrors; indeed, it is why JDI had to be introduced. The original intent was that Java reflection would be used to support debugging; but once you need to deal with cross-process debugging, you need a distinct implementation of reflection; core reflection is tied to a single built in implementation.

Mirrors support a clear boundary between the base-level of your program (the level which deals with the problem domain your program is intended to solve) and the meta-level (the level where your program is discussing itself, where reflection takes place). The classic design, where the class is the main repository of reflective information, tends to blur these lines. Classes often have both base level functionality (like creating instances) and meta-level functionality (reflection). This is most acute in languages like Smalltalk and CLOS. In Java, the base level roles of classes are often supported by specialized constructs like constructors (which have their own, worse, problems) and static members (likewise). Even in Java, class objects may be used in a base level capacity (as type tokens, for example).

There is much work to be done in this area. No mirror API has yet fulfilled all my claims and ambitions - least of all the Newspeak mirror API, which needs extensive revisions. Still, I hope you’re curious enough to watch the talk and/or read the paper.

23 comments:

Paul said...

Hi Gilad,

Just watched the presentation. Interesting stuff. The applications seem boundless. You mentioned stratification as a goal and that most dynamic languages today (Smalltalk, Ruby etc) don't separate their reflective API from their base API. It occurred to me that you could retrofit Mirrors to such languages by having their existing reflective API delegate to an internal mirror. By default this mirror would be benign and not do much (a sort of null object), but if the program wanted true reflective power (and was trusted) then a fully functional mirror could be injected somehow.

Any thoughts?

Great presentation BTW, and thanks for making this material available.

Regards,

Paul.

Gilad Bracha said...

Hi Paul,

You can certainly introduce mirrors into Smalltalk - we did that with Strongtalk. But that does mean restricting the traditional API.

I don't quite see how you hide mirrors behind the traditional API though. How do you choose what mirrors you want to use (remote, local, source etc.)? In some globally modal way. Ugh. How do you specify failure modes for remote stuff - you need to expect new exceptions etc. So existing code is going to break. Give it up and change the API.

As for Ruby - I don't believe this applies. If you fix it, it will be a different language. Someone in Ruby-land is free to try of course.

Paul said...

Hi Gilad,

Makes sense. I guess you can't buck the need for a clean separation of concerns. Like you point out, languages aren't immune to the need to adhere to good design principles themselves :)

Which brings me to:

>> As for Ruby - I don't believe this applies.

I think I know what you mean :).

Regards,

Paul.

Mark Lee Smith said...

Hi Gilad,

Another interesting and wonderful talk (and article). Thanks! :)

As I put it to you briefly before, it's my contention that if mirrors are configured externally they negate an object's encapsulation.

For example, you can consider that:

ObjectMirror reflecting: self.

Does not break encapsulation, while:

ObjectMirror reflecting: shoe.

Does.

Moreover, the latter makes the unsettling assumption that shoe returns something we actually want to reflect in an ObjectMirror.

As I understand it, the problem of selecting the appropriate mirror is solved in most systems using factory methods that query the object for its personal information, but since objects are free to lie about their personal details this seems like a recipe for disaster.

When we last talked you asked me how I would organise the API to solve this problem. Needless to say I didn't have an answer then, but I've thought a lot about it since and given that the object is the only thing that really knows how it should be reflected on, it seems only logical that all mirrors be configured internally. Thus, mirrors won't break encapsulation since the object is responsible for positioning itself in front of the mirror (applied vanity?).

So instead of telling the ObjectMirror to reflect on the shoe, you give the shoe the ObjectMirror, and let it configure it itself. If the shoe provides a method for returning its instance variables, it could give the mirror this method. Then, when the mirror is asked for the instance variables it would call the method and return the instance variables.

Note: The method may be a block or some other chunk of code. The idea is simply that the object gets to choose the behaviour.

This satisfies encapsulation, and subsequently supports stratification and correspondence, and solves the problem that I outlined above.

In this way you get a mirror-based API which should be consistent across all objects, and maybe, though not necessarily, a more conventional (but potentially variable) API that underlies this mirror-based API, for use inside objects.

Note: Of course, the methods underlying the externally configured mirrors may themselves be implemented using internally configured mirrors, because it's safe to configure mirrors internally.

Note: Either way, if the object requires reflection to do its stuff, you can't readily remove reflection from the system anyway.

How then would we support unanticipated reflection (needed for browsers and debuggers etc.)? Well, in the worst case you're back to guessing at the right mirror, but if you apply this internal configuration approach consistently across your system the problem would seem to be solved... or that's my feeling here.

Any thoughts? It's quite possible that I'm way off the mark here :)

Gilad Bracha said...

Hi Mark,

Thanks for your comment. It is a hard question.

Tangent: Reading your comment in gmail, I enjoy advertising related to precision optics and make-up mirrors :-)

I agree that mirrors (or any reflection) violates encapsulation. What's good about mirrors is that not everybody gets them. If you get a mirror that can check out another object's internals, all bets are off.

The fact is that in some cases that is what you want, and in others not.

Leaving it up to the reflectee is tempting, but even the reflectee can't always decide. Do I want the debugger checking out my state? It depends who's debugging - my developer, or some snoop (this is a really interesting case from the software as service perspective, BTW).

Besides, as you say - once you do this, you can't get rid of reflection anymore.

So for the most part, I'm content to rely on the capability approach to provide the right mirror and prevent abuse.

Mark Lee Smith said...

I'm not sure how easy that was to understand so here's some pseudo-code to go along with what I wrote above. It shows how an ObjectMirror may be used safely internally to configure an external mirror that will correctly return its public fields while respecting the object's god-given right to encapsulate.

...

object: apple is:
{
    public-method: (reflectInMirror: mirror) is:
    {
        mirror publicFields:
        {
            (mirrors ObjectMirrors reflecting: self) publicFields.
        }
    }
}

...

Note: the language used here is similar to Newspeak in that it uses nesting to structure programs, and uses top-level objects as modules, but is object-based, not class-based, and uses message-passing exclusively to implement control-structures etc. A work in progress :).
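For readers who find the pseudo-code hard to follow, here is a rough Python rendering of the same idea. It is a sketch only: ExternalMirror, bind and reflect_in_mirror are hypothetical names chosen to mirror the pseudo-code above, and Stone is an invented example of an object that exposes nothing.

```python
class ObjectMirror:
    """Internal mirror; only the object itself is meant to create one on self."""

    def __init__(self, reflectee):
        self._reflectee = reflectee

    def public_fields(self):
        return {name: value for name, value in vars(self._reflectee).items()
                if not name.startswith("_")}


class ExternalMirror:
    """Starts out empty; the reflectee decides which operations it supports."""

    def __init__(self):
        self._operations = {}

    def bind(self, name, operation):
        self._operations[name] = operation

    def public_fields(self):
        operation = self._operations.get("public_fields")
        if operation is None:
            raise PermissionError("reflectee did not expose public_fields")
        return operation()


class Apple:
    def __init__(self):
        self.colour = "red"
        self._worm = True

    def reflect_in_mirror(self, mirror):
        # The object configures the external mirror using a mirror on self.
        mirror.bind("public_fields", lambda: ObjectMirror(self).public_fields())


class Stone:
    def reflect_in_mirror(self, mirror):
        pass  # exposes nothing: the default of no access


apple_mirror = ExternalMirror()
Apple().reflect_in_mirror(apple_mirror)
apple_fields = apple_mirror.public_fields()
```

The caller never names ObjectMirror and never points it at the apple; it only hands over an empty mirror and gets back whatever the apple chose to bind.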

Gilad Bracha said...

Mark,

I think I did understand your point. So, some questions:

What is the default behavior (not everyone needs to specify how they are reflected when they write 'Hello world')?

If the default is no access, how does my debugger work? If it is free access, whither security?

My view is that the right to encapsulation is only at the base level. At the meta level, there are no guarantees. Reflection is fundamentally subjective, so it really isn't up to the object to decide.

I might be convinced otherwise, but it won't be easy :-)

PS: your language is interesting. Classes vs. prototypes is another big discussion.

Mark Lee Smith said...

> I agree that mirrors (or any reflection) violates encapsulation.

I'm not so sure. If you're familiar with the Agora research you'll recall that this is a reflective language which fully respects encapsulation, by providing all reflective operations (and literally everything else) through message-passing.

That meant that only the object could reflect upon itself (encapsulation doesn't exist within objects after all), and if it didn't expose this functionality, you simply couldn't reflect on it.

I completely agree that that is problematic, since unanticipated reflection is so damn useful - and not only for browsers and debugging etc.

Other than that, Agora offered a pretty conventional reflective API that didn't provide any of the loveable organisational qualities that mirrors do.

Still the promise of having safe(ish) reflection – reflection that respects encapsulation by default, and frees you from the assumption I wrote about earlier, but still allows you to make that assumption when you want to do unanticipated reflection (and is mirror-based) – would seem to be the ideal solution.

Then, I do see your point about it being more difficult to separate the reflection from the system in this approach. I'll have to think about that some more.

It would be nice to have your cake and eat it too.

Still, great talking it through with you :).

Mark Lee Smith said...

> What is the default behavior (not everyone needs to specify how they are reflected when they write 'Hello world')?

The default would be no-access; I'm treating this as an organisational strategy for mirror-based APIs where anticipated reflection respects encapsulation, but where the system still supports unanticipated reflection through the normal channels.

> If the default is no access, how does my debugger work? If it is free access, whither security?

Your debugger would likely use the latter-mentioned "normal channels", but your average programmer, wanting to expose a reflective API on the objects in his program, would use the former.

Note: I think this is the difference between two legitimate uses of reflection: reflection at the domain-level (base-level?), and reflection at the system-level (meta-level?). The two don't necessarily have the same organisational solution.

My feeling is that if you know you want your objects to be reflected on ahead of time [1] you probably should (and could) specify how you want them to be reflected on.

This gives the architect more control over how their objects are used [2]. It wouldn't place any definite restrictions on the user of the objects, but should serve to discourage misuse of reflection (and externally configured mirrors).

There's just something unnerving about:

SomeMirror reflecting: something.

What happens if you try to use a MethodMirror on a Class object in Newspeak?


> I might be convinced otherwise, but it won't be easy :-)

I might give it a go. It's a fascinating subject, and even if I don't succeed I'm sure I'll have learnt a lot in the process :).

> PS: your language is interesting. Classes vs. prototypes is another big discussion.

Thanks. I'd say I'm more in the middle. I think you can put classes and prototypes together in the same language without conflict (hence, object-based.)

Inheritance is a little different (I'm using nested mixin-methods rather than class-hierarchy inheritance), and there are no meta-classes etc. but the experiment is yielding some cute code structures.


[1] Say, you want to write a reflective API on an object, and expose it using the mirrors in the system, and not through the object itself, as in conventional APIs.

[2] It could be argued that this is a false sense of control :)

Gilad Bracha said...

Mark,

So if there was, say, a protected method on Object called #mirror that returned an ObjectMirror on self, would you be satisfied?

Mark Lee Smith said...

Ostensibly yes – provided there were a mirror for every useful variation this solution would be perfectly fine, but it might be problematic if ObjectMirror provided a lot of functionality. If for argument's sake we say that ObjectMirror provides n methods and there are m useful combinations of these methods, we would need to define n*m different classes... and more still if there are a dozen different mirrors, e.g. ObjectMirror, MethodMirror etc.

If on the other hand we used ObjectMirror and kin internally, but had a single ExternalMirror which we could set up with the different functionality as needed, it would be quite easy to capture any and all variants, just by binding different methods on local mirrors to their external counterparts.

The freedom to just put custom code into the external mirror would just be a bonus and isn't really required.

Actually, thinking of it like that it sounds quite trivial and probably wasn't worth wasting your time with :).

P.S. I don't see that the method needs to be protected, although it may well be. Since this choice is encoded inside the object (class), the choice of what mirror to use, and what functionality to expose, has been made at this point - so there's no need to guess, or use a factory method.

Take care.

Mark Lee Smith said...

On second thought I don't think that would really cut it:

By exposing a method #mirror returning an ObjectMirror, you're making a decision you wouldn't normally make using externally configured mirrors. That feels kind of wrong.

Note: This argument only applies to a non-private #mirror.

For example, ObjectMirror may be a great choice internally, but you don't want the users of your API to be "stuck" with a mirror that can only handle reflection locally, and have to resort to external configuration to get the job done... because there go any decisions the designer of the object made with respect to how the object should be encapsulated.

In contrast, passing #reflectInMirror: a DistributedMirror to configure still allows the user to make that decision after the fact, and use the same API, which respects the object's encapsulation, but from a distance.

Note: You could of course make ObjectMirror distributed by getting a remote proxy on the ObjectMirror, but I think the idea extends beyond this example... I might be wrong :).

P.S.

It occurs to me that if I were to take your stance that encapsulation only applies at the base-level then I should ask: since these mirrors do respect encapsulation, are you still at the meta-level when using these mirrors? And, does using mirrors for your domain-level reflective API (at the base-level) break stratification?

If the answer to only one of these is yes there would seem to be something strange going on.

Gilad Bracha said...

Mark,

a. I'm not exposing it. #mirror is protected, where protected is enforced. So it can only be sent to self.

b. The suggestion was made in the overall context of capability based mirrors. So encapsulation can still be violated by those who are granted access to suitable mirrors.

c. I don't quite see how things like debuggers work if objects can choose to refuse to be exposed (i.e., #reflectInMirror: is the basis for reflection).

I'll let you know if I figure it out.

Mark Lee Smith said...

Hi Gilad,

Maybe I didn't do the best job of explaining the idea; I'm only suggesting a way of organising mirror-based APIs such that you obtain mirrors from the objects you wish to reflect upon, as an alternative to the factory method.

Mirrors retrieved in this way will respect the object's encapsulation, and don't require any guess-work on the part of the programmer, i.e. if you retrieve a mirror in this way there's no possibility that you'll try to make a MethodMirror reflect on a class object.

The only thing that's changed is how you go about getting mirrors on objects in your application.

Instead of:

http://pastebin.com/wYv9qGzC

Where we have to ask what if the object is an asteroid that is roughly shaped like a spaceship?

I suggest:

http://pastebin.com/dhugFjCA

Where we would still get the right mirror, even if the object is an asteroid that looks exactly like a spaceship.

If we get the mirror from the object, we can only do what the designer of the object explicitly allowed us to do... at least as long as we ourselves don't have a reference to ObjectMirror.

Onto debugging.

> I don't quite see how things like debuggers work if objects can choose to refuse to be exposed (i.e., #reflectInMirror: is the basis for reflection).

Debuggers would work in the normal way; by doing something along the lines of:

ObjectMirror reflect: object.

Where object is the object you wish to point the debugger at.

#reflectInMirror: just gives us (what is in my opinion) a nice way of influencing what mirror we get back. A public #mirror method would do just fine, but it doesn't give us this influence.

> I'm not exposing it. #mirror is protected, where protected is enforced. So it can only be sent to self.

Then this is fine.

I see no problem with mirrors that are reflecting self. My suggestion only becomes interesting when we want to reflect on other objects, and/or pass those capabilities to other objects.



Sorry for any confusion. I hope that clears things up.

kdabw said...

Sorry for my belated message, I had to think a bit how to make it short.

I agree & assume that protected reflectInMirror: is the basis for reflection.

In the unencapsulated world, an exception is thrown to the (unwanted) sender of protected #reflectInMirror:.

This is like Mr. and Mrs. shoe calling for a plumber (whom they let do almost everything in their house) and never being notified that the guy knocked at the door.

But in the encapsulated world, in this situation a message like #doesNotUnderstand: is sent to the receiver of protected #reflectInMirror:, which can answer with (^self reflectInMirror: argv), which now passes protection, or can supply another mirror which hides (for example) certain or all things from a debugger (except perhaps the receiver's identity or existence).

I think that access control and security go hand in hand, and encapsulation is the basis for both.

Gilad Bracha said...

Hi Kdabw,

I haven't really digested your proposed design yet. As you say, encapsulation and reflection have to work together, and I'm not convinced this is how it can/should work - at least in Newspeak. I hope to show progress on this front later this year.

Mark Lee Smith said...

I hadn't really considered the security concerns as they might be if mirrors didn't respect encapsulation, but it occurred to me that such mirrors don't just offer the capability to do reflection, but the capability to steal any other capability in the system.

In essence: if you have access to such mirrors you're no longer bound by the rules of the capability-model.

Want to write to a file? If there's any object in the system that can, and you have a mirror, so can you. What's more, you're then free to distribute that capability throughout the entire system on a whim.


And a very stupid example:

If there's an object in the system that can "launch the bombs", as it were, so can you. No matter if that object is hidden behind layers and layers of security, and can only be accessed normally by password and what not.


Maybe this is too stupid to consider? I just woke up, sorry :).

Gilad Bracha said...

Mark,

Unrestricted access to mirrors does of course imply complete loss of security. The only check on this is that you can still only mirror objects you can get hold of; in Newspeak, the absence of a global namespace means that you might not be able to get at everything. However, you likely can wreak enough havoc that it won't matter.

The answer to that is that not everyone gets such mirrors. Most of the time, the mirrors are only capable of reflecting selected objects in limited ways.
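As an illustration of both halves of this exchange, here is a small Python sketch with invented names. FileWriter stands in for a real capability and Logger holds it privately. The sketch assumes a language that actually enforces privacy outside of mirrors; Python does not, so the underscore is only a convention here.

```python
class FileWriter:
    """Stands in for a real capability, such as the right to write a file."""

    def __init__(self):
        self.written = []

    def write(self, text):
        self.written.append(text)


class Logger:
    """Holds the capability privately and exposes only a narrow use of it."""

    def __init__(self, writer):
        self._writer = writer

    def log(self, message):
        self._writer.write("LOG: " + message)


class UnrestrictedMirror:
    def __init__(self, reflectee):
        self._reflectee = reflectee

    def get_field(self, name):
        return getattr(self._reflectee, name)


class SelectiveMirror:
    """Reflects only objects on an allow-list, and only their public fields."""

    def __init__(self, reflectee, allowed):
        if not any(reflectee is obj for obj in allowed):
            raise PermissionError("this mirror may not reflect that object")
        self._reflectee = reflectee

    def get_field(self, name):
        if name.startswith("_"):
            raise PermissionError(name + " is not public")
        return getattr(self._reflectee, name)


writer = FileWriter()
logger = Logger(writer)

# With an unrestricted mirror on the logger, the writer capability leaks out.
stolen = UnrestrictedMirror(logger).get_field("_writer")
stolen.write("anything at all")
```

With a SelectiveMirror on the same logger, asking for _writer raises PermissionError, and the mirror cannot be created at all for objects outside its allow-list.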

The precise design is a real challenge, and will no doubt go through several iterations.

Pascal Costanza said...

Hi Gilad,

Your characterization of CLOS is not quite correct: In CLOS, classes don't have functionality, but they are essentially just data containers organized in an inheritance hierarchy. Functionality is defined as methods on generic functions, which reside outside of classes. The way to retrieve the class of an object is by calling the function CLASS-OF, as for example in (class-of object). This function could be dropped when deploying a system, and further functions on class metaobjects could also be dropped or added at will. In fact, most of the CLOS MOP functionality is optional and not available in some Common Lisp implementations. Furthermore, it can surely be imagined to have your own MY-CLASS-OF functions that return 'mirrors' for your objects other than the default class metaobjects - I have just never encountered that.

So I am not convinced that mirrors are so far away from the architecture of the CLOS MOP. In fact, CLASS-OF leads to a relatively clean separation into meta-levels, which is not a coincidence, since the CLOS MOP was strongly influenced by experiences with 3-Lisp, whose primary contribution was the reflective tower with cleanly separated levels of reflection.

Gilad Bracha said...

Hi Pascal,

Never argue with a Lisper :-)
Sorry if I've misrepresented CLOS. But I don't yet see how you deal with, e.g., security.

Pascal Costanza said...

I don't know what you mean by "security" in this context. What do you want your programs to be secure against?