Under A Boddhi Tree: How does scheme do tail call optimization?

Last night, I was curious how to implement tail-call optimization for Shi (the language I am going to work on). I was curious how current scheme implementations did this. Since many schemes are implemented in C how do you do tail call optimization if C itself doesn't do tail call optimization?

But first, what does TCO really do anyway? And why do so many C(++) programmers lambast functional style recursion? I have to laugh when FW engineers at my company poo-poo recursion. Unfortunately, engineers who are not familiar with other languages aren't even aware that it's not recursion that is at fault, but the lack of sophistication of the C(++) compiler.

The unenlightened think that all languages suffer stack overflows. This is not true however. In brief, when a function call is made, a stack frame is allocated on the call stack, and the call stack has a limited amount of memory. One of the duties of a stack frame is to provide a return address so that as one function call completes, the stack frame is popped off, and the program can return to where the execution was left off. So in recursion without TCO, a new stack frame is allocated for every function call, and this is why you can "blow the stack". This is one reason why C(++) programmers (claim) recursion is so bad. In truth, I think most imperative style language programmers simply don't want to wrap their brains around recursion (or if you think lazily, induction) . The other mythical reason imperative style programmers claim recursion is bad is because procedure calls are expensive (because you have to push a new frame onto the call stack). This too is erroneous.

They are myths, however they are correct from a C(++) point of view. But again, don't make the mistake that recursion or function calls themselves are bad which may dissuade open-minded programmers from attempting to learn functional style programming if they only listen to their unenlightened peers. For example, the D programming language DOES do tail call optimization. But how did these myths come about in the first place?

In one part of the important "Lambda Papers" by the legendary Guy Steele called somewhat verbosely, "Debunking the 'Expensive Procedure Call' Myth, or, Procedure Call Implementations Considered Harmful, or, Lambda: The Ultimate GOTO", Steele explains why function calls are not expensive as believed. As the title somewhat indicates, this gives a historical account over how function calls got a bad rap in the first place and gives an interesting perspective on the attitudes towards GOTO (from way back in the day). I recommend people read this, as it is an interesting account. But germane to this discussion, Steele debunks why function calls are considered "expensive". Steele basically points out three things:

1) GOTO statements are "universal" control flow statements
2) GOTO's are cheap, because in machine code, they are just a branch or a jump (as opposed to a switch or a case for example which becomes many machine code ops
3) Procedure calls are in essence GOTOs that can pass in arguments

Given the above three, function calls are therefore "cheap" and also become control flow in their own right. Interestingly, the paper mentions that stack space does not have to be consumed when using this "GOTO" method for tail recursion when lexical scoping is used (as opposed to dynamic scoping as in lisp). I presume this is because when variables are lexically scoped, the stack frame itself carries the reference(s) to the variables, as opposed to having variables being passed around dynamically. I could be mistaken on this point though.

So, TCO makes it possible to not consume a stack frame on every function call. And although the paper hints at how to do this from an assembly point of view, that still didn't explain how schemes that use C as the Intermediate Represenation performs TCO, since C itself can't do TCO. At first, I thought maybe they used setjmp/longjmp to save off the stack frame, and then on the recursive call use longjmp to go back. The problem is the unwinding of the stack frames (which may point to no longer valid frames). Still, it seems at least possible to do it this way.

I then came across something called a "trampoline", which I recognized somewhat from Clojure. A trampoline in clojure can be used for mutual recursion, but the trampoline described here is a function which "jumps" to other functions. Also after reading this, I came across an abstract discussing how to use the heap to perform tail call recursions.

This is all pretty fascinating, and I wish I had read the lambda papers before. I am even starting to read the legendary SICP book so that I can better understand Scheme. I guess Clojure was kind of like the gateway drug into functional programming and lisps :) It's even made me look a little at haskell....but first, I want to get Shi rolling.

Under A Boddhi Tree

Saturday, May 5, 2012

How does scheme do tail call optimization?

No comments:

Post a Comment