Sunday, February 17, 2008

Cutting out Static

Most imperative languages have some notion of static variable. This is unfortunate, since static variables have many disadvantages. I have argued against static state for quite a few years (at least since the dawn of the millennium), and in Newspeak, I’m finally able to eradicate it entirely. Why is static state so bad, you ask?

Static variables are bad for security. See the E literature for extensive discussion on this topic. The key idea is that static state represents an ambient capability to do things to your system, that may be taken advantage of by evildoers.
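
To make "ambient capability" concrete, here is a minimal Python sketch. It is not taken from the E literature, and the names `AuditLog`, `GLOBAL_LOG`, `plugin_ambient`, and `plugin_confined` are hypothetical. A module-level variable stands in for a static: any code that can name it can use it, whereas an object passed as an argument is available only to the code it was handed to. Python does not enforce this discipline, so the sketch shows only the shape of the idea.

```python
class AuditLog:
    """A resource worth protecting: an append-only record of events."""

    def __init__(self):
        self.entries = []

    def record(self, message):
        self.entries.append(message)


# Static-style state: reachable by any code that can name it.
GLOBAL_LOG = AuditLog()


def plugin_ambient():
    # Nobody granted this plugin access to the log; it simply reaches for it.
    GLOBAL_LOG.entries.clear()


def plugin_confined(log):
    # This plugin can affect only the object it was explicitly handed.
    log.record("plugin ran")


GLOBAL_LOG.record("startup")
plugin_ambient()
ambient_entries = list(GLOBAL_LOG.entries)  # the "startup" entry has been wiped

private_log = AuditLog()   # held by the application, not globally named
private_log.record("startup")
scratch = AuditLog()       # the only authority the plugin receives
plugin_confined(scratch)
```

In the second half, the plugin never receives a reference to `private_log`, so its record of "startup" survives.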

Static variables are bad for distribution. Static state needs to either be replicated and sync’ed across all nodes of a distributed system, or kept on a central node accessible by all others, or some compromise between the former and the latter. This is all difficult/expensive/unreliable.

Static variables are bad for re-entrancy. Code that accesses such state is not re-entrant. It is all too easy to produce such code. Case in point: javac. Originally conceived as a batch compiler, javac had to undergo extensive reconstructive surgery to make it suitable for use in IDEs. A major problem was that one could not create multiple instances of the compiler to be used by different parts of an IDE, because javac had significant static state. In contrast, the code in a Newspeak module definition is always re-entrant, which makes it easy to deploy multiple versions of a module definition side-by-side, for example.
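
The re-entrancy problem can be sketched in a few lines of Python, with a class attribute playing the role of a Java static field. This is an illustration only, not javac's actual structure; `StaticCompiler`, `InstanceCompiler`, and their toy symbol tables are hypothetical.

```python
class StaticCompiler:
    # Class attribute: one symbol table shared by every instance,
    # playing the role of a Java static field.
    symbols = {}

    def compile(self, source):
        for name in source.split():
            StaticCompiler.symbols[name] = len(StaticCompiler.symbols)
        return sorted(StaticCompiler.symbols)


class InstanceCompiler:
    def __init__(self):
        # Instance attribute: each compiler owns its own symbol table.
        self.symbols = {}

    def compile(self, source):
        for name in source.split():
            self.symbols[name] = len(self.symbols)
        return sorted(self.symbols)


# Two supposedly independent compilers, e.g. one per IDE component.
editor, debugger = StaticCompiler(), StaticCompiler()
editor.compile("foo bar")
leaked = debugger.compile("baz")      # also contains the editor's symbols

editor2, debugger2 = InstanceCompiler(), InstanceCompiler()
editor2.compile("foo bar")
isolated = debugger2.compile("baz")   # contains only its own symbols
```

With the shared table, `leaked` contains `foo` and `bar` even though the second compiler never saw them; with per-instance state, `isolated` holds only `baz`.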

Static variables are bad for memory management. This state has to be handled specially by implementations, complicating garbage collection. The woeful tale of class unloading in Java revolves around this problem. Early JVMs lost applications’ static state when trying to unload classes. Even though the rules for class unloading were already implicit in the specification, I had to add a section to the JLS to state them explicitly, so overzealous implementors wouldn’t throw away static application state that was not entirely obvious.

Static variables are bad for startup time. They encourage excess initialization up front. Not to mention the complexities that static initialization engenders: it can deadlock, applications can see uninitialized state, and unless you have a really smart runtime, you find it hard to compile efficiently (because you need to test if things are initialized on every use).
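
The "test on every use" cost can be imitated by hand in Python. This is only an analogy for the check a runtime must make before touching lazily initialized static state; `Config`, `Session`, and the `checks` counter are hypothetical and exist purely to make the guard visible.

```python
class Config:
    """Static-style holder: state lives on the class and is initialized lazily."""

    _settings = None   # shared by all users of the class
    checks = 0         # counts how often the guard runs (illustration only)

    @classmethod
    def get(cls, key):
        cls.checks += 1
        if cls._settings is None:        # guard executed on every access
            cls._settings = {"mode": "batch"}
        return cls._settings[key]


class Session:
    """Instance-style holder: state is established once, in the constructor."""

    def __init__(self, settings):
        self._settings = dict(settings)

    def get(self, key):
        return self._settings[key]       # no initialization guard needed


for _ in range(3):
    Config.get("mode")

session = Session({"mode": "batch"})
mode = session.get("mode")
```

Once a `Session` exists, its state is known to be initialized, so no later access needs the guard that `Config.get` runs each time.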

Static variables are bad for concurrency. Of course, any shared state is bad for concurrency, but static state is one more subtle time bomb that can catch you by surprise.

Given all these downsides, surely there must be some significant upside, something to trade off against the host of evils mentioned above? Well, not really. It’s just a mistake, hallowed by long tradition. Which is why Newspeak has dispensed with it.

It may seem like you need static state, somewhere to start things off, but you don’t. You start off by creating an object, and you keep your state in that object and in objects it references. In Newspeak, those objects are modules.
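
As a rough illustration of starting from an object rather than from static state, here is a Python sketch. It does not use Newspeak syntax or semantics; `AppModule`, `Counter`, `main`, and the list used as a stand-in for the platform are all hypothetical.

```python
class Counter:
    """Plain object holding state that might otherwise live in a static."""

    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1
        return self.value


class AppModule:
    """Hypothetical stand-in for a module: all state hangs off an instance."""

    def __init__(self, platform):
        self.platform = platform   # the outside world, passed in explicitly
        self.counter = Counter()   # the module's own state

    def handle_request(self):
        n = self.counter.increment()
        self.platform.append(f"request {n}")
        return n


def main(platform):
    # The entry point creates the root object; nothing is stored globally.
    app = AppModule(platform)
    app.handle_request()
    app.handle_request()
    return app


output_a, output_b = [], []
app_a = main(output_a)
app_b = main(output_b)   # a second, fully independent instance
```

Because every piece of state is reachable only from an `AppModule` instance, running two copies side by side requires no special care.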

Newspeak isn’t the only language to eliminate static state. E has also done so, out of concern for security. And so has Scala, though its close cohabitation with Java means Scala’s purity is easily violated. The bottom line, though, should be clear. Static state will disappear from modern programming languages, and should be eliminated from modern programming practice.

48 comments:

  1. Aren't Scala's objects effectively statics? That's how I think of them. I move any Java statics into a Scala companion object.

    What is different about them that you don't think of them as static?

  2. Good point. A top level mutable object literal in Scala is indeed a form of static state.

    I believe this feature is motivated by interoperability with Java (e.g., you model a Java class with static state as a pair of class and object). This is (partly) why I said that Scala's purity is easily violated.

    Scala's a great language, but it has made some compromises to play well in the Java world. This is one of them.

  3. Gilad,

    I think you left out testing. I constantly find myself having to do any number of rather insane things to get around statics in order to test a class. Some of that is covered by re-entrancy, but I think it's more than just that. Statics are a dependency that just can't be broken without removing them entirely. The problem isn't always that I want my code to be able to get back into a method; most of the time I don't want my code to be calling a static method at all! I find that these methods often force nearly every class in the system to be loaded. I find myself thinking, "If I could just get around this one static method, then maybe I could actually test this class."

    But then I try to remove it, and find that static code is also unnecessarily difficult to refactor and remove.

    One could argue that in that case the system hasn't been designed well enough, or isn't object oriented enough, etc, etc. I might as well just go ahead and blame those things on statics too.

    If "regular" developers weren't given such a bad tool, then maybe systems wouldn't get into such poor states. Then again, maybe those developers will always find a way to get there, regardless. Maybe that's a bit of a rant, but maybe not. It seems that less experienced developers are somehow drawn to use statics more.

  4. Isn't static an inheritance of the C-programming line? I don't remember Smalltalk having statics. One surely has class members, but these again are instance members of metaclasses.
    Will Newspeak include metaobjects?

  5. Oh yes Smalltalk has statics. Lots of statics. In fact, Smalltalk in Smalltalk is the system dictionary --- precisely where many statics live. Then there are the class-side variables, shared between classes, class instance variables, pools, and on it goes.

    Self tried to solve the problem by making everything into objects, but again there is a global object, the lobby that plays the part of statics in Self.

  6. Jack:

    I agree that statics cause problems for testing etc. In a system that provides statics, it is just too easy to structure things to depend upon them. So the only way forward is to completely remove them from the language.

    Stefan: As james says, Smalltalk has plenty of static state. Global variables; Pool variables, of which Class variables are just a special case. Class variables are not the same as class instance variables - rather they are a precise analog of Java static variables.

    Newspeak has metaclasses (I really wanted to get rid of them, but so far that hasn't worked out). However, their instances (the class objects) have no instance variables, only methods. The methods act as factory methods (as previous posts explain).

    Consequently, top level classes are completely stateless, so there is no static state whatsoever.

  7. And what about "static final"?
    Most of your points don't hold up for static final variables (think of constants).

  8. srt:

    Don't confuse shallow immutability (final) with deep immutability (absence of state). If a static final variable is set to a mutable object, everything I've said holds. The variable doesn't change its binding, but the thing it references changes.

    If, on the other hand, the static final is an immutable (say, an int) then it doesn't constitute state at all, and doesn't pose a problem - as long as you can't observe it before it is initialized.

    This is the case with a deeply immutable value in Newspeak (or in E). In Java of course, you can sometimes detect when statics get initialized, so there can still be a problem - indeed, an exceedingly subtle one.

  9. In the absence of "static final", constants are easily modeled with methods:

    pi = ( ^3.141592653589793238462643383279502884197169399375105820974945)

    There is no need to mark this method final, put it on the class side, etc. A decent modern virtual machine will automatically detect if the method is indeed final and inline the value if appropriate.

  10. Gilad, you say that we should do away with statics as much as possible. What about utility classes which have no state at all? They just have utility methods, available in a static way. How can these be re-modeled in a language like Java? Or do they constitute a valid use of statics? I know that in dynamic languages you can extend the objects themselves on the fly to add required behaviour. But, what about Java?

  11. What is your opinion on ThreadLocals as an intermediary between static and instance state? I have found them useful occasionally as a means to provide contextual state without having to add additional arguments to several layers of APIs. And they do not suffer the concurrency problems of static variables.

  12. This comment has been removed by the author.

  13. I haven't been following the discussion properly, but isn't this more or less what the functional programming people have been saying for quite some time?

  14. It seems to me that constructors are best regarded as a form of static method, and in fact, they hide a form of static mutable state (memory allocation, object initialization are observable effects). This is also the point of view represented by dependency injection frameworks, which eschew the direct use of constructors along with other statics. How do you feel about this issue? Do you consider direct use of constructors an acceptable form of static linkage, or does Newspeak address this as well?

    (If this should already be clear, I apologize for not following all the Newspeak design discussion.)

  15. I'm all for the anti-static cause, but I think there's more substance to the popularity of "static" in C++ and Java than just hysterical raisins.

    "static" offers a hint of two features you don't otherwise get in those languages: implicit parameters and open classes.

    "static" allows the definition of the top-level program/process to be distributed across independent source modules, for better and for worse.

    The way to get rid of static is to give other ways to pass controlled amounts of environment and context without making argument lists fat and unwieldy, or centralizing too much definition. Scala (composition), Haskell (type classes), and Newspeak (modules) all do that. Those of us stuck with C++ or Java have to live with a lot more manual plumbing and explication if we want the rewards of static-freedom.

    For me, as a C++ developer, testability is the final doom of "static". All the other downsides of static are negotiable -- occasionally static is both correct and far simpler than the alternatives. But as soon as you want dependency injection for testing, every static becomes a toxic liability.

  16. Kodeninja and Matt:

    This post was about static state, not static methods. Static methods have their own issues, closely related to the problems with constructors (see my earlier posts on constructors). It's really hard to avoid static methods entirely in Java, though dependency injection frameworks (DIFs) can help. I've discussed constructors, DIFs etc. at length in several prior posts.

  17. Noteventime:

    You write: "isn't this more or less what the functional programming people have been saying for quite some time?"

    Not really. Pure functional programming takes a much more extreme position - eradicate all state, static or not. Impure functional programming allows state on occasion, but seeks to keep it to a minimum. There is much to be said on this topic, but this post was focused on imperative languages.

    The point is that even if you fully support imperative (and particularly imperative OO) programming, static state should be avoided.

  18. Stefan notes that Thread locals can serve as a halfway house between ordinary statics and instance variables. Thread locals can help avoid the horrible problems of concurrency and shared state, but static state is bad independent of concurrency, and thread locals are just as bad as any other static in that respect.

    The plumbing problems that arise can be alleviated using (non-static) nested classes. Beta, Scala, and Newspeak use variations on this idea. Christopher made that point in his comment, which I completely agree with.

    In Java, nested classes bring in certain complications, but the formulation used in Newspeak (inspired by Beta) doesn't.

  19. When do you think we will see a release of the Newspeak language? Hopefully, as good or as bad as some view the Arc release.

  20. Hifimaven:

    When will Newspeak be released? I wish I knew the answer to that question, but I don't right now, and I don't want to speculate.

    FWIW, we've been working on it for a little over a year, which is not very long as such things go. So we can certainly use some more time - but hopefully less time than it took for Arc!

  21. I think singleton objects in Scala are meant to be used as first class modules (think Haskell type classes), not necessarily as containers for global state. They are much more than Java's static. You can of course use them as such, but that's up to you.

    See this blog entry for some examples of proper usage of Scala objects.

  22. Suppose you have a pure function whose evaluation takes a long time. In order to optimize, you may want to save previous results and reuse them when arguments are the same as in previous calls.
    I believe keeping those results in static variables is OK since it is transparent to the user.
    In short, static variables may be used in such a way that does not add static state, and avoiding static state does not mean eliminating static variables.

  23. Ah, I think I mixed up static state and static members et al., too. I agree that static state is not really a necessary feature of a programming language, but on the other hand not having static state will increase dependencies between state-holders, I think. What about information that stays the same throughout an application's lifetime, e.g., system information?
    For caches etc. (like keeping results of previous operations) no static container is needed, as long as there is a caching instance passed along to operation executors (or operations themselves).

  24. > You start off by creating an object, and you keep your state in that object and in objects it references.

    You haven't eliminated static variables at all, you have just moved them.

  25. It looks like you have to go all the way to full immutability to satisfy your complaints against statics. Shared objects cause as many problems for concurrency and re-entrancy as static objects anyway. So ... what ... Erlang, Haskell?

  26. Hi Gilad,

    I've never thought of all the guises that exist of static state and the associated pitfalls. I wonder if there is a catalogue of code smells somewhere that we can add these to?

    My dislike of static state stems from my procedural design days when good practice meant avoiding global variables. Globals are the worst culprits when it comes to data coupling.

    Most design issues come down to coupling and cohesion. Static state is just another example.

  27. Interesting thought: "Newspeak has metaclasses (I really wanted to get rid of them, but so far that hasn't worked out)". One consequence could be that class Class has no class variables or class methods. Why can't you get rid of metaclasses? What would Newspeak be missing?

  28. kdabw:

    In Newspeak, classes serve as factories for instances, and therefore support factory methods. Without metaclasses, we wouldn't be able to support distinct factory methods for each class. We'd have to fall back on a generic way of producing instances from classes, like "new" or "new:". This is more error prone and less readable than the solution we have.

    Paul: I agree with your comments; however, I try to apply these general principles to specific language constructs.

  29. Gilad,
    factory methods and metaclasses
    Thank you, your comment on supporting distinct factory methods made me get rid of static semantic value of message names from my instance creation/factory invocation constructs :) All that I use now is

    o object to ask (class|factory)
    o number of optional slots
    o parameters by name

    and no static (class|factory) message name :)

  30. kdabw:

    Your idea is intriguing, though the post is a bit vague. Does " parameters by name" mean true keyword parameters? Or call-by-name? The former makes sense to me in this context.

    However, I assume this means no secondary factories? Keyword parameters would handle defaults, but there are examples, like polar vs. cartesian points, where secondary factories are useful and are not being used to provide default parameters.

  31. Gilad,
    parameters by name
    I confess I had JSON's object (RFC 4627) in mind, plain key/value pairs (and no restriction on depth).
    secondary factories
    Yes, delegation to another factory (with and without defaults) must be possible, in principle between any pair of (class|factory)'s, eventually resulting in a new or already existing object.

    What this means to syntax, I think depends on style in language. I plan to be in Potsdam on March 11, there's perhaps some time for this.

  32. kdabw:

    I'd love to discuss this further in Potsdam. Or send me e-mail.

  33. What about special variables as in Common Lisp? I'm not sure about the others, but they don't seem to have a problem with reentrancy and testability.

  34. It's been a long time since I looked at Common Lisp, but you can certainly have static state there. I don't recall what special variables are, but it scarcely matters. What matters is the principle: you can apply it to any language you want.

  35. This comment has been removed by the author.

  36. By "special variable" I meant dynamic scoping. Here is a demonstration of it:

    (defvar *x* "top-level")

    (defun print-x ()
      (print *x*))

    (defun foo ()
      (let ((*x* "foo"))
        (print-x))
      (print-x))

    (defun bar ()
      (let ((*x* "bar"))
        (foo))
      (foo))

    (foo)

    And the output:
    "foo"
    "bar"
    "foo"
    "top-level"

    With 'let' you create a new binding for a variable and called functions inside 'let' see the new binding. When you go outside of 'let', the previous binding of the variable is used.

  37. Correction: the last line of the code is (bar) and not (foo)

  38. So when building "service"-style constructs that would be initialized once and then used Singleton style, how would one do that in Newspeak? Or would we have to pass those instances around?

  39. Christoffer,

    A singleton is simply a unique object. In most languages, you can use the static state associated with a class to ensure it only has one instance, and make singletons that way. But this only works because the class itself is a singleton, and the system takes care of that for you by having a global namespace.

    In Newspeak, there is no global namespace. If you need singletons in your application, they are simply instance variables of your module. When you load your application, you hook up your modules and make a single copy of each.

    If, on the other hand, you need a service that's accessible to an open-ended set of users, it has to be available at some public place - this could be a URL on the internet (the real global state) or a platform wide registry. In other words, it's part of the outside world's state.

    Such world state may be injected into your application when it starts up (but only in as much as the platform trusts you to access it).

    Not sure if this helps. The habit of static state is pervasive in computing and it's hard for people to get rid of it - but we will.

  40. Let me see if I understand this right...

    Instead of using a single global namespace with state, you instead provide a way to create local namespaces with state?

  41. Christoffer,

    Yes; they're called objects, and everybody has them. The point is that by allowing them to nest correctly, you can avoid the need for a global namespace (with state or otherwise). The world's global state (or some perspective on it or subset thereof) is a parameter to these local namespaces, so they only know what they need to know.

  42. Sorry for being so slow-witted, but do you mean you create an object and load classes inside the context of that object, so that they become a bit like inner classes in Java in the sense that they appear as regular classes within the context of the object, but not outside of it?

  43. Christoffer,

    We lack bandwidth to communicate properly about this. What you say may or may not be true of Newspeak - hard to tell.

    The issue of what the base case is in a language without a global namespace, and how one uses it in practice, deserves a detailed discussion.

    I think I'll need to write a separate post to explain this properly. It takes more space and time than I have right now. All I can do at the moment is point you at the material that's out there - earlier posts on this blog, and stuff on the Newspeak web site.

    Sorry if this leaves some things unclear for now.

  44. I think I'm finally getting the hang of it, but I'm looking forward to a more detailed post some time in the future.

    As for other posts, I did read the posts on "modules", but I think I am too unused to Smalltalk to properly parse the example code.
    I am a bit embarrassed to suggest it, but maybe for the sake of clarity you could dumb things down and give some code examples in a java-like syntax?

  45. I'm interested in hearing about this further as well.

  46. I should have gone for the spec http://bracha.org/newspeak-spec.pdf immediately, I think it explains things the best.

  47. Christoffer,

    I'm glad something worked. I'm actually quite surprised the spec is what you found most useful!
    I'll try and write up a proper explanation at some point in the not-too-distant-future. But I'd better get back to doing real work for now.

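
The singleton exchange in comments 38 to 43 can be sketched in Python as follows. This is a loose illustration of "singletons are simply instance variables of your module", not Newspeak code; `Application`, `Billing`, `Reporting`, `SystemClock`, and `FixedClock` are hypothetical names, and the clock stands in for world state that the platform injects at startup.

```python
import time


class SystemClock:
    """World state supplied by the platform."""

    def now(self):
        return time.time()


class FixedClock:
    """Substitute that a test or a sandbox can inject instead."""

    def __init__(self, instant):
        self.instant = instant

    def now(self):
        return self.instant


class Billing:
    def __init__(self, clock):
        self.clock = clock

    def stamp(self, item):
        return (item, self.clock.now())


class Reporting:
    def __init__(self, clock):
        self.clock = clock


class Application:
    """Hypothetical top-level module: its instance variables are the 'singletons'."""

    def __init__(self, clock):
        self.clock = clock                      # one copy per application instance
        self.billing = Billing(self.clock)      # both parts share that one copy
        self.reporting = Reporting(self.clock)


production = Application(SystemClock())
under_test = Application(FixedClock(1000.0))
stamped = under_test.billing.stamp("invoice")
```

Within one `Application` the clock is unique, which is all a singleton requires, yet a second application (here, the one under test) can be wired with a different clock without touching any global.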