Archives
Tags
- General (18)
- Food (1)
- Cooking (1)
- Ruby (6)
- Rails (2)
- Svn (2)
- Linux (9)
- Git (1)
- Firefox (8)
- Porn (1)
- Freyja (1)
- Witchhammer (6)
- Music (1)
- Merb (3)
- Poetry (0)
- Bolverk (3)
- Sinatra (1)
- Discogs (1)
- Centos (1)
- Python (1)
- Whinging (2)
- Travel (2)
- Scheme (7)
- Lisp (8)
- Sicp (1)
- Rot13 (1)
- Czech (2)
- Metal (3)
- Passenger (1)
- Fun (5)
- Fractals (2)
- Plt (2)
- Clojure (1)
- Continuations (1)
- Javascript (1)
- Presentation (1)
Comprehending first-class continuations
The notion of continuations as first-class values has been a tricky subject for me to understand to a comfortable level of certainty. I think this is probably true for many PLT-laymen like myself. This article represents my attempt at collecting and presenting my thoughts in a coherant manner! I'd be happy to receive corrections and comments.
I will start by defining a few key terms in my own style.
Continuation
We can observe that every expression, regardless of complexity, has the ultimate goal of returning a value to some surrounding execution context. That context is known as a continuation - it represents everything that is left to compute. Therefore, every computation has an associated continuation, which specifies the place from which execution should continue once control has been returned.
As a simple example, consider the following Scheme form:
The continuation associated with the sub-expression "1" is an anonymous function, whose body consists of a call to the + primitive procedure whose first argument is the local variable n and whose second argument is being sought. With this understanding we can say expressions deliver values and continuations receive values.
Current continuation
By "current", we are referring to the continuation that would be derived from the current point in a programs execution. In other words, the current point of execution, lexical environment and state of the call stack (You should be familiar with the stack and the heap).
First-class object
For an object to be a "first-class citizen" in a programming language, it needs a few properties. Specifically, it must support being:
- passed as a parameter to a function;
- returned from a function;
- stored in a variable or within a data structure;
- constructed at runtime.
Objects like numbers and strings are first-class in most programming languages. Functions are first-class in all functional programming languages. And in Scheme, my language of choice for this article, nearly everything can be considered first-class - including continuations as we'll see shortly. To avoid confusion, note that I am using the term "object" to refer to any entity that can be controlled within the realm of a particular programming language, rather than to the OO-sense of the word (the instance of a class).
Continuations in Scheme
In order to render the abstract notion of the current execution context (the current continuation) to the programmer, Scheme needs some way to represent it. It would have been possible to implement specific objects and syntax into the language to handle this, but it's much easier to simply represent the current continuation as a first-class function. This process of transforming something implicit into something that can be explicitely expressed is known as "reification". So we can say Scheme reifies the current continuation as a function object.
Scheme provides the function call-with-current-continuation (aliased as call/cc) to furnish continuations to the programmer. call/cc is a unary function that accepts a further unary function, f, as it's argument. When invoked, Scheme will reify the current continuation, c, as a function and apply f to c. Therefore:
When c is applied to an argument, v, the existing continuation is terminated and the one represented by c is reinstated. So program flow will continue on from where the continuation was captured and v will become the overall value of the call/cc invokation.
Confusing? Take a look at this simple example:
At first glance, you may think this expression will evaluate to 7, but infact, it will evaluate to 4. This is because we've captured the current continuation in the formal parameter return and then applied it to the number 4. This causes program flow to return instantly to the point at which the continuation was taken. Yep -- when applied, c will never return! We are effectively emulating the "return" control operator that allows early-escape in many popular programming languages.
To demonstrate the concept of early-escape in a more effective manner, let me show you two versions of a small Scheme procedure - one that makes use of continuations and one that does not. The procedure has-sym? accepts as arguments an S-expression (in this case a list of symbols and/or nested lists of symbols) and a symbol. It returns true if the given symbol is present somewhere within the S-expression. First of all, take a look at the non-continuation version:
Whilst concise and easy to understand, this implementation (because it's defined as a recursive process) suffers from the slight hassle in that even if we find a match, we need to wait until the call-stack unwinds before the actual result gets back to the continuation in which the has-sym? procedure was called. If, perhaps, we captured the continuation of the has-sym? procedure with call/cc and then defined a local helper procedure (to perform the search) inside our function argument, we'd be able to invoke the continuation object and jump straight out of the procedure as soon as we found a match. For example:
At this point it may be worth pointing out that continuations in Scheme, whilst far more powerful, are also more conservative than the goto statement of other languages for we can only revert control back to a place we have already visited (obviously, this is a good thing!). Also note that applying a continuation, unlike lexical closures, does not reinstate the referencing environment! If you change the values of variables - they will remain changed.
Continuations as first-class citizens
So far I've tried to explain the notions of both first-class objects and continuations in Scheme, but when we combine the two and start considering continuations themselves as being of first-class status, we open up a treasure trove of cool ideas. For example, consider the following mind-bender:
We can demonstrate the indefinite extent of continuations in Scheme by capturing the continuation (reified as a first-class function) in the local variable k. This definition of factorial works by capturing the continuation object, modifying the parameter n and local variable total in place, and then invoking the continuation with itself as an argument. This causes something of a local GOTO and rebinds the original continuation to the local variable k. This process continues until n reaches 1 at which point we return the total back to the user. The let* form is important here as it will expand into a series of nested let forms. Without this, we'd rebind total to the original value of n on each invokation of the continuation object. With let* we can store a running total.
As you can see, we have used our power over control flow to compute factorial without using recursion or any built in looping construct. Of course, this implementation also suffers from two obvious downsides: 1) It's somewhat esoteric and hard to understand and; 2) It mutates local state, which makes it harder to prove that this function will maintain referential transparency.
Conclusion
With the power of first-class continuation handling at our disposal, we have the ability to define several control flow constructs common to other programming languages, such as backtracking, try/catch exception handling and "green" threads.
Studying a topic like continuations helps us to solidify our knowledge of several concepts we may generally grow to take for granted. The idea of managing control flow is so rudimentary to our field that we may never stop to think about how such an implicitely understood idea can be explained effectively.
In terms of further reading, I'd suggest the call-with-current-continuation, Continuation and Call stack articles on Wikipedia. The R5RS standard documents call/cc and other control-handling procedures that are available in Scheme. The discussion on understanding continuations at Lambda the Ultimate also helped me in writing this article.
And, finally, I think when such high-level control of continuations is given to a programmer, the words of the great Uncle Ben must be remembered:
"WITH GREAT POWER THERE MUST ALSO COME - GREAT RESPONSIBILITY!"
- Amazing Fantasy #15 (August 1962) - The first Spider-Man story.
Understanding lexical closures
A while ago I was having a discussion with a programmer friend of mine. He'd been writing some Javascript at work and asked me about closures. Whilst I understood the concept enough to use it in my own code, I found it difficult to explain effectively. Although common knowledge in functional programming circles, closures (and, more importantly, their usefulness) are often hard to grasp for us younger programmers who are trained to think in terms of OOP. This article represents my attempt at explaining lexical closures in the simplest possible terms. I hope my language choice of Clojure is not too esoteric.
Ok, lets get started...
Definition
A closure is a language feature in which a lambda expression (read: function) is bound to the referencing environment that was available when it was first defined. The function is said to close over it's referencing environment, hence the name. If you are unfamiliar with the term "referencing environment" (some people call it the "lexical environment"), we can think of it as a simple dictionary (key -> value) mapping free variables to their respective values. By "free variables", I refer to bindings that have occured in an outer scope (such as the surrounding function). In lexically scoped languages, a functions referencing environment can generally be known simply by looking at the source code of a program.
I don't expect you have any idea what that really means yet - just keep reading!
Explained simply
A simple way to think of a closure in practical terms is as a package that contains two things:
1) A body of code
2) The referencing environment that was active when said code was defined
So, when we invoke a closure (execute the body of code in the package), rather than creating a new referencing environment the closure will grab it's saved environment from the package and use that instead.
The following example demonstrates how we can use closures to hide helper functions that would serve no purpose outside their defining context. Sorry, I could not help but add a factorial example:
The function "fact-helper" is now private to the letfn expression. The named function "fact" is created in the global scope, but retains access to the referencing environment that was available when it was defined. In OOP, we generally achieve this same effect by making a class and defining private functions within it.
Language prerequisites
In order to allow for closures, a language needs to atleast support nested functions. Most functional languages and many popular imperative languages meet (well, surpass) this criteria. Python, all Lisps, Javascript, Haskell, etc all support true first-class functions and the block construct of Smalltalk and Ruby is sufficient. Without the means to lexically nest scopes (C, from memory), then closures serve no purpose - their local environment will remain the same on each invokation and the global environment will be either identical or larger. A few languages, such as Pascal and Algol, allow for nested functions but do not allow functions to be returned as values. This means that the closure is only available in the scope it was defined. Languages like these are said to have environments with "limited extent", which means that lexical bindings are destroyed upon leaving their scope.
On the flip-side, languages that support first-class and/or anonymous functions must also ensure that a referencing environment has "unlimited extent". That is, the interpreter or compiler must guarantee that the environment will not be destroyed until the garbage collector can prove that there is no reference to it. This is important because a closure may be returned from a function and thus may outlive the scope in which it was defined. The somewhat contrived example below illustrates this point:
It is important to note that the environment that the closure reinstates will contain the same bindings, but their values may not be the same. This is especially true in imperative languages in which assignment (changing the value of a variable) is key.
Conclusion
Closure is a simple concept that all programmers should atleast have a basic understanding of. Many modern programming languages are adding support for first-class functions, which means that this style of programming may well become more popular in mainstream languages. As more multi-paradigm languages hit the mainstream, it will become increasingly more important that effective programmers can reason about their programs from multiple perspectives.
In terms of further reading, I suggest you take a look at the Wikipedia page and/or read SICP. Finally, I cannot claim to be an expert on this subject, so if you have any feedback and or corrections, please let me know. Thanks!