Archives
Tags
- General (18)
- Food (1)
- Cooking (1)
- Ruby (6)
- Rails (2)
- Svn (2)
- Linux (9)
- Git (1)
- Firefox (8)
- Porn (1)
- Freyja (1)
- Witchhammer (6)
- Music (1)
- Merb (3)
- Poetry (0)
- Bolverk (3)
- Sinatra (1)
- Discogs (1)
- Centos (1)
- Python (1)
- Whinging (2)
- Travel (2)
- Scheme (7)
- Lisp (8)
- Sicp (1)
- Rot13 (1)
- Czech (2)
- Metal (3)
- Passenger (1)
- Fun (5)
- Fractals (2)
- Plt (2)
- Clojure (1)
- Continuations (1)
- Javascript (1)
- Presentation (1)
Comprehending first-class continuations
The notion of continuations as first-class values has been a tricky subject for me to understand to a comfortable level of certainty. I think this is probably true for many PLT-laymen like myself. This article represents my attempt at collecting and presenting my thoughts in a coherant manner! I'd be happy to receive corrections and comments.
I will start by defining a few key terms in my own style.
Continuation
We can observe that every expression, regardless of complexity, has the ultimate goal of returning a value to some surrounding execution context. That context is known as a continuation - it represents everything that is left to compute. Therefore, every computation has an associated continuation, which specifies the place from which execution should continue once control has been returned.
As a simple example, consider the following Scheme form:
The continuation associated with the sub-expression "1" is an anonymous function, whose body consists of a call to the + primitive procedure whose first argument is the local variable n and whose second argument is being sought. With this understanding we can say expressions deliver values and continuations receive values.
Current continuation
By "current", we are referring to the continuation that would be derived from the current point in a programs execution. In other words, the current point of execution, lexical environment and state of the call stack (You should be familiar with the stack and the heap).
First-class object
For an object to be a "first-class citizen" in a programming language, it needs a few properties. Specifically, it must support being:
- passed as a parameter to a function;
- returned from a function;
- stored in a variable or within a data structure;
- constructed at runtime.
Objects like numbers and strings are first-class in most programming languages. Functions are first-class in all functional programming languages. And in Scheme, my language of choice for this article, nearly everything can be considered first-class - including continuations as we'll see shortly. To avoid confusion, note that I am using the term "object" to refer to any entity that can be controlled within the realm of a particular programming language, rather than to the OO-sense of the word (the instance of a class).
Continuations in Scheme
In order to render the abstract notion of the current execution context (the current continuation) to the programmer, Scheme needs some way to represent it. It would have been possible to implement specific objects and syntax into the language to handle this, but it's much easier to simply represent the current continuation as a first-class function. This process of transforming something implicit into something that can be explicitely expressed is known as "reification". So we can say Scheme reifies the current continuation as a function object.
Scheme provides the function call-with-current-continuation (aliased as call/cc) to furnish continuations to the programmer. call/cc is a unary function that accepts a further unary function, f, as it's argument. When invoked, Scheme will reify the current continuation, c, as a function and apply f to c. Therefore:
When c is applied to an argument, v, the existing continuation is terminated and the one represented by c is reinstated. So program flow will continue on from where the continuation was captured and v will become the overall value of the call/cc invokation.
Confusing? Take a look at this simple example:
At first glance, you may think this expression will evaluate to 7, but infact, it will evaluate to 4. This is because we've captured the current continuation in the formal parameter return and then applied it to the number 4. This causes program flow to return instantly to the point at which the continuation was taken. Yep -- when applied, c will never return! We are effectively emulating the "return" control operator that allows early-escape in many popular programming languages.
To demonstrate the concept of early-escape in a more effective manner, let me show you two versions of a small Scheme procedure - one that makes use of continuations and one that does not. The procedure has-sym? accepts as arguments an S-expression (in this case a list of symbols and/or nested lists of symbols) and a symbol. It returns true if the given symbol is present somewhere within the S-expression. First of all, take a look at the non-continuation version:
Whilst concise and easy to understand, this implementation (because it's defined as a recursive process) suffers from the slight hassle in that even if we find a match, we need to wait until the call-stack unwinds before the actual result gets back to the continuation in which the has-sym? procedure was called. If, perhaps, we captured the continuation of the has-sym? procedure with call/cc and then defined a local helper procedure (to perform the search) inside our function argument, we'd be able to invoke the continuation object and jump straight out of the procedure as soon as we found a match. For example:
At this point it may be worth pointing out that continuations in Scheme, whilst far more powerful, are also more conservative than the goto statement of other languages for we can only revert control back to a place we have already visited (obviously, this is a good thing!). Also note that applying a continuation, unlike lexical closures, does not reinstate the referencing environment! If you change the values of variables - they will remain changed.
Continuations as first-class citizens
So far I've tried to explain the notions of both first-class objects and continuations in Scheme, but when we combine the two and start considering continuations themselves as being of first-class status, we open up a treasure trove of cool ideas. For example, consider the following mind-bender:
We can demonstrate the indefinite extent of continuations in Scheme by capturing the continuation (reified as a first-class function) in the local variable k. This definition of factorial works by capturing the continuation object, modifying the parameter n and local variable total in place, and then invoking the continuation with itself as an argument. This causes something of a local GOTO and rebinds the original continuation to the local variable k. This process continues until n reaches 1 at which point we return the total back to the user. The let* form is important here as it will expand into a series of nested let forms. Without this, we'd rebind total to the original value of n on each invokation of the continuation object. With let* we can store a running total.
As you can see, we have used our power over control flow to compute factorial without using recursion or any built in looping construct. Of course, this implementation also suffers from two obvious downsides: 1) It's somewhat esoteric and hard to understand and; 2) It mutates local state, which makes it harder to prove that this function will maintain referential transparency.
Conclusion
With the power of first-class continuation handling at our disposal, we have the ability to define several control flow constructs common to other programming languages, such as backtracking, try/catch exception handling and "green" threads.
Studying a topic like continuations helps us to solidify our knowledge of several concepts we may generally grow to take for granted. The idea of managing control flow is so rudimentary to our field that we may never stop to think about how such an implicitely understood idea can be explained effectively.
In terms of further reading, I'd suggest the call-with-current-continuation, Continuation and Call stack articles on Wikipedia. The R5RS standard documents call/cc and other control-handling procedures that are available in Scheme. The discussion on understanding continuations at Lambda the Ultimate also helped me in writing this article.
And, finally, I think when such high-level control of continuations is given to a programmer, the words of the great Uncle Ben must be remembered:
"WITH GREAT POWER THERE MUST ALSO COME - GREAT RESPONSIBILITY!"
- Amazing Fantasy #15 (August 1962) - The first Spider-Man story.
Understanding lexical closures
A while ago I was having a discussion with a programmer friend of mine. He'd been writing some Javascript at work and asked me about closures. Whilst I understood the concept enough to use it in my own code, I found it difficult to explain effectively. Although common knowledge in functional programming circles, closures (and, more importantly, their usefulness) are often hard to grasp for us younger programmers who are trained to think in terms of OOP. This article represents my attempt at explaining lexical closures in the simplest possible terms. I hope my language choice of Clojure is not too esoteric.
Ok, lets get started...
Definition
A closure is a language feature in which a lambda expression (read: function) is bound to the referencing environment that was available when it was first defined. The function is said to close over it's referencing environment, hence the name. If you are unfamiliar with the term "referencing environment" (some people call it the "lexical environment"), we can think of it as a simple dictionary (key -> value) mapping free variables to their respective values. By "free variables", I refer to bindings that have occured in an outer scope (such as the surrounding function). In lexically scoped languages, a functions referencing environment can generally be known simply by looking at the source code of a program.
I don't expect you have any idea what that really means yet - just keep reading!
Explained simply
A simple way to think of a closure in practical terms is as a package that contains two things:
1) A body of code
2) The referencing environment that was active when said code was defined
So, when we invoke a closure (execute the body of code in the package), rather than creating a new referencing environment the closure will grab it's saved environment from the package and use that instead.
The following example demonstrates how we can use closures to hide helper functions that would serve no purpose outside their defining context. Sorry, I could not help but add a factorial example:
The function "fact-helper" is now private to the letfn expression. The named function "fact" is created in the global scope, but retains access to the referencing environment that was available when it was defined. In OOP, we generally achieve this same effect by making a class and defining private functions within it.
Language prerequisites
In order to allow for closures, a language needs to atleast support nested functions. Most functional languages and many popular imperative languages meet (well, surpass) this criteria. Python, all Lisps, Javascript, Haskell, etc all support true first-class functions and the block construct of Smalltalk and Ruby is sufficient. Without the means to lexically nest scopes (C, from memory), then closures serve no purpose - their local environment will remain the same on each invokation and the global environment will be either identical or larger. A few languages, such as Pascal and Algol, allow for nested functions but do not allow functions to be returned as values. This means that the closure is only available in the scope it was defined. Languages like these are said to have environments with "limited extent", which means that lexical bindings are destroyed upon leaving their scope.
On the flip-side, languages that support first-class and/or anonymous functions must also ensure that a referencing environment has "unlimited extent". That is, the interpreter or compiler must guarantee that the environment will not be destroyed until the garbage collector can prove that there is no reference to it. This is important because a closure may be returned from a function and thus may outlive the scope in which it was defined. The somewhat contrived example below illustrates this point:
It is important to note that the environment that the closure reinstates will contain the same bindings, but their values may not be the same. This is especially true in imperative languages in which assignment (changing the value of a variable) is key.
Conclusion
Closure is a simple concept that all programmers should atleast have a basic understanding of. Many modern programming languages are adding support for first-class functions, which means that this style of programming may well become more popular in mainstream languages. As more multi-paradigm languages hit the mainstream, it will become increasingly more important that effective programmers can reason about their programs from multiple perspectives.
In terms of further reading, I suggest you take a look at the Wikipedia page and/or read SICP. Finally, I cannot claim to be an expert on this subject, so if you have any feedback and or corrections, please let me know. Thanks!
Fun with fractals #2
Following on from my last article, I'd like to show you a common fractal pattern known as the Sierpinski Triangle. It was first described formally in 1915 by Waclaw Sierpinski. You've probably seen it before -- it's basically a triangle, which contains three smaller triangles within it. This process can be repeated ad infinitum.
There are many ways to construct the Sierpinski Triangle, but one method that really grabbed my attention involved an approximation from the rows of Pascal's Triangle. It's really quite simple, but I'll briefly explain Pascal's Triangle first.
Pascal's Triangle is an ancient (discovered long before Pascals time) pattern of numbers in which each number within the triangle is the sum of the two numbers directly above it. The first row is simply 1, and the sides of the triangle are made up of 1's. So it grows like this: [1], [1, 1], [1, 2, 1], [1, 3, 3, 1], etc. Take a look at the Wikipedia article for a proper introduction.
As it turns out, if we generate Pascal's Triangle to 64 rows (or, more formally, any 2^n) and then colour all of the odd numbers in, we get a bloody good representation of the Sierpinski Triangle! I'd think that if we continued this process infinitely (or to some very large n), you'd be unable to tell the difference between the two.
Writing a program to demonstrate this cool little trick was pretty simple. It's just a matter of generating the triangle as a list of expanding lists (one for each row) and then painting each row of numbers to the window by traversing it and selecting the correct colour based on the each value (dark blue for odd, light blue for even). It looks like this:

You can take a look at the source code here: http://github.com/buntine/Fractals/blob/master/pascals-triangle.scm
Fun with fractals #1
A few weeks ago I stumbled upon this rather eccentric documentary, narrated by Arthur C. Clarke, about the discovery of the Mandlebrot set and fractal geometry in general: Fractals - The colors of infinity.
A fractal, described very simply, is a geometric pattern that is repeatable at every level of magnification. Fractals, therefore, cannot be described by our traditional models of geometry. Typically, if you inspect a fractal at differing levels of magnification, you will see identical (or very similar) copies of the whole pattern. Of course, such phenomena has always existed, but it hasn't been until the advent of sufficiently powerful computers that we have been able to generate fractals on a large scale of complexity.
In nature, one of the most simple examples of a fractal-like shape is a tree. Using a recursive algorithm and a simple graphics library, we are able to easily generate a fractal tree. I chose the Lisp language PLT Scheme (or Racket, as it's known these days) to write my programs. Here is my first attempt:

Not bad. Well, perhaps a little too uniform. It works by implementing a recursive pattern known (not surprisingly) as tree recursion. That is, each invokation of the function triggers two further recursive calls. Each call to the function plots a new X/Y position by translating a radian (angle * (PI / 180)) and a length into "cartesian coordinates". It then calls itself again with the new X/Y as a starting position and a new angle, once for (angle - 20) and again for (angle + 20) to simulate the splitting of a branch. This process is repeated a finite number of times and the length of each branch is relative to the current depth of recursion.
By simply adding some randomisation into the angles of each branch (instead of always bending 20 degrees), we start to see a slightly more realistic looking tree! See my second attempt below, which draws four trees with some randomised angles to simulate a burnt-out looking forest:

The source code for both of these programs is available on my Github account here http://github.com/buntine/Fractals/blob/master/uniform-tree.scm; and here: http://github.com/buntine/Fractals/blob/master/random-tree.scm.
SICP exercise 1.12 - Pascal's Triangle
Exercise 1.12 in SICP is proposed as follows:
The following pattern of numbers is called Pascal’s triangle.
| Row | Values |
|---|---|
| 1 | 1 |
| 2 | 1 1 |
| 3 | 1 2 1 |
| 4 | 1 3 3 1 |
| 5 | 1 4 6 4 1 |
| ... | |
The numbers at the edge of the triangle are all 1, and each number inside the triangle is the sum of the two numbers above it. Write a procedure that computes elements of Pascal’s triangle by means of a recursive process.
My brain saw it as:
...
Write a procedure that computes THE elements of Pascal’s triangle by means of a recursive process.
I spent about 30 minutes working on a solution that would actually return a structure containing all of the elements of Pascals Triangle up to a particular row/width. I was wondering the whole time how any non-experienced programmer could ever solve the problem at such an early stage of the book! D'oh!
Here is what I came up with:
(define (pascal width)
(reverse (pascal-helper '((1)) width)))
(define (pascal-helper rows width)
(if (= (length rows) width)
rows
(pascal-helper (cons (next-row (car rows))
rows)
width)))
(define (next-row previous)
(let ((body (row-body '() previous)))
(if (null? body)
'(1 1)
(cons 1 (append body '(1))))))
(define (row-body body previous)
(if (null? (cdr previous))
body
(row-body (cons (+ (car previous) (cadr previous))
body)
(cdr previous)))) And lets see it in action:
> (pascal 5) > ((1) (1 1) (1 2 1) (1 3 3 1) (1 4 6 4 1))
It works by building each row of the triangle in reverse order. Each row implements a linearly recursive process to build itself based on the contents of the previous row. We consume the previous row by taking it's cdr on each resursion. This means that the value of element y in row x is the sum of the first two elements in the previous row. This process is continued until the previous row has only one element left. At this point we throw a 1 on either side and we are done!
From now on I will be sure to pay more attention when interpreting the requirements for each exercise (although I secretly think this was much more fun)!
Trees in Scheme: Parsing
This is the second article in a two (maybe three) part series on tree data structures in Scheme. See the "representation article", also.
Previously, I've spoken about one way of representing a tree data structure in Scheme. This time, I'll talk about a simple algorithm for parsing such a structure, which uses a technique called mutual recursion.
Wikipedia defines mutual recursion as "a form of recursion where two mathematical or computational functions are defined in terms of each other". This is an appropriate explanation for our case, as we'll have two procedures which rely upon each other to produce a single result.
The first procedure will extract the child nodes from a tree structure. The second procedure will accept those children, a list of trees, and extract and parse each one. This process is performed recursively.
As an example, lets write an accumulative procedure that consumes a tree and returns a number representing the amount of parent nodes (that is, nodes which contain further children):
; Using the tree we created in the first article > (count-parents family-tree) 3
We can define this as follows:
(define (leaf? node)
(null? (cdr node)))
(define (count-parents tree)
(if (leaf? tree)
0
(+ 1 (count-parents-in-trees (cdr tree)))))
(define (count-parents-in-trees trees)
(if (null? trees)
0
(+ (count-parents (car trees))
(count-parents-in-trees (cdr trees)))))It works by allowing the two procedures to only deal with the information they need. count-parents deals with trees, whilst count-parents-in-trees deals with a list of trees. leaf? is just a simple helper. Each branch of each tree is searched downwards until we reach a dead-end - a node with no children.
The recursive pattern in count-parents-in-trees should be fairly familiar. Basically, we take the first element (car) for inspection, and recurse with the remaining elements (cdr). Because of this, our base-case is the empty list, at which point we've hit every node in the branch and should have our result.
This general pattern can be applied to many tasks involving trees, from representing them graphically to mutating their structure. For example, to write a procedure that takes a tree and returns a new tree without the leaf nodes, all we need to do is use mutual recursion to build up a copy of the tree in question and ignore every leaf node.
Trees in Scheme: Representation
This is the first part of a two (maybe three) part series on tree data structures in Scheme.
In many programming problems, an efficient way of organising data is in a tree-like structure. In essence, a tree in this sense consists of a root node, followed by zero or more sub-nodes, each of which can be followed by a set of further nodes. Common terminology for this node/subnodes relationship is mother and children, respectively. Other common terms are root/branch/leaf.
A picture is a great way to provide an example of such a structure:

The tree above is quite simple. It consists of a root node, followed by two children. The first child contains no further children (a leaf), and the second child is the parent to one child. An important thing to realise at this point is that every child is infact a perfectly valid tree in and of itself. We can see this in more detail a bit further on.
Representing such a structure in Scheme (or any implementation of LISP) is really quite easy. We use the obvious building block: lists. The tree in the image above would be encoded as:
'(a (b) (c (d)))
As I mentioned before, if we snap off the branches from the root node, we are left with a set of new trees; a forest. In LISP terms, the cdr of a tree is a list of lists:
'((b) (c (d)))
The first item in this list is a tree with only a root node. Not a very useful tree, but a tree none-the-less. The second; a tree with a root node and a single child element. The reason I've brought this up is because it shows us that there is most certainly a convenient recursive way to deal with such a structure (this I will talk about in the next article).
To help us with creating trees and get our terminology contextually proper, we can define the following helper procedures:
(define (tree root . children) (cons root children)) (define (leaf-node node) (tree node))
So, with these new procedures in place, lets create a basic tree structure:
> (define family-tree
(tree 'family
(tree 'karl
(leaf-node 'oak))
(tree 'colin
(leaf-node 'celine)
(leaf-node 'ashin))))
> family-tree
(family (karl (oak)) (colin (celine) (ashin))) In the next article I'll talk about a cool way of recursively parsing such a structure.
Simply Scheme: SICP for the rest of us
If you're like me, you've probably read the first 150 pages of SICP (The Structure and Interpretation of Computer Programs) about three times. Personally, I find that it starts out extremely easy, but quickly becomes difficult. After a while I find that I am still reading, but not a whole lot is making sense to me. It's as though there is an underlying peice of knowledge that the authors assume I possess, which I don't.
I am of the school of "SICP should be read by all programmers". But I think my exact statement would be: "SICP should be read, and understood, by all programmers".
If we take a look on Amazon, we will see the strange mix of reviews SICP has received thus far (stars/reviews): 5/89, 4/8, 3/8, 2/3, 1/53
What?? So, I guess this implies that people either LOVE or HATE this book. If you go through the reviews in detail, you can see that a lot of the 1-star reviewers are upset with the text because it's simply too abstract and they do not leave with any confidence. I feel it is also worth mentioning that this text has received an elitist badge in our industry. If you read it, watch the lectures, and brainwash yourself into thinking you understood everything, you are now part of the genius club and can start quoting Paul Graham and Peter Norvig.
Enter Simply Scheme (Brain Harvey, Matthew Wright. 2nd ed, 1999, MIT press). A fundamental computer programming textbook for those of us who have trouble tackling SICP (and also for the people who, out of fear, pretend they could). According to the authors, SS was actually written as a pre-cursor to SICP. This textbook presents all the major concepts in a much more digestible way than SICP. This is good news!
As a word of warning, it's worth knowing that SS does use Scheme, but it teaches a slightly modified version of it. You will learn effective program design, but not the inner workings of Scheme itself. I already knew scheme when I started this book, and it annoyed me a little at the beginning, but I quickly got used to it.
The entire text is available free on the web. Take a look: http://www.eecs.berkeley.edu/~bh/ss-toc2.html.
I have also spent a lot of my freetime publishing my answers to all of the exercises at the end of each chapter. You can take a look at my solutions here: http://github.com/buntine/Simply-Scheme-Exercises (perhaps you can fork me and juxtapose your own solutions??). I hope this can help someone in their travels.
So, once you get through Simply Scheme, you may be much better suited to properly tackle SICP and finally become a master craftsman!