Understanding Evlis Tail Recursion
Fred Akalin
While reading about proper tail recursion in Scheme, I encountered a similar but obscure optimization called evlis tail recursion. In the paper where it was first described, the author claims it "dramatically improve the space performance of many programs," which sounded promising.
However, the few places where its mentioned don't do much more than state its definition and claim its usefulness. Hopefully I can provide a more detailed analysis here.
(define (fact n) (if (<= n 1) 1 (* n (fact (- n 1)))))
It is not tail-recursive, since the recursive call is nested in
another procedure call. However, it's almost tail-recursive;
the call to *
is a tail call, and the recursive call is
its last subexpression, so it will be the last subexpression to be
evaluated.Recall what happens when a procedure call (represented as a list of subexpressions) is evaluated: each subexpression is evaluated, and the first result (the procedure) is passed the other results as arguments.[2]
Evlis tail recursion can be described as follows: when performing a procedure call and during the evaluation of the last subexpression, the calling environment is discarded as soon as it is not required.[3] The distinction between evlis tail recursion and proper tail recursion is subtle. Proper tail recursion requires only that the calling environment be discarded before the actual procedure call; evlis tail recursion discards the calling environment even sooner, if possible.
fact
as
defined above, say you evaluate (fact 10)
and you're in
the procedure call with n = 5
. The call stack of a
properly tail-recursive interpreter would look like this:
evalExpr -------- env = { n: 10 } -> <top-level environment> expr = '(* n (fact (- n 1)))' proc = <native function: *> args = [10, <pending evalExpr('(fact (- n 1))', env)>] evalExpr -------- env = { n: 9 } -> <top-level environment> expr = '(* n (fact (- n 1)))' proc = <native function: *> args = [9, <pending evalExpr('(fact (- n 1))', env)>] ... evalExpr -------- env = { n: 6 } -> <top-level environment> expr = '(* n (fact (- n 1)))' proc = <native function: *> args = [6, <pending evalExpr('(fact (- n 1))', env)>] evalExpr -------- env = { n: 5 } -> <top-level environment> expr = '(if ...)'whereas the call stack of an evlis tail-recursive interpreter would look like this:
evalExpr -------- env = { n: 5 } -> <top-level environment> pendingProcedureCalls = [ [<native function: *>, 10], [<native function: *>, 9], ... [<native function: *>, 6] ] expr = (if ...)In this implementation, the last subexpression of a procedure call is evaluated exactly like a tail expression, but the procedure call and non-last subexpressions are pushed onto a stack. Whenever an expression is reduced to a simple one and the stack is non-empty, a pending procedure call with its other args are popped off, and it is called with the reduced expression as the final argument.
Note that this didn't change the asymptotic behavior of the procedure; it still takes \(Θ(n)\) memory to evaluate. However, only the bare minimum of information is saved: the list of pending functions and their arguments. Other auxiliary variables, and crucially the nested calling environments, aren't preserved, leading to a significant constant-factor reduction in memory.
(define (fact2 n)
(define v (make-vector n))
(* (n (fact2 (- n 1)))))
Before the main body of the function, a vector of size \(n\) is
defined. This means that the environments in the call stack of a
properly tail-recursive interpreter would look like this:[5]
env = { n: 10, v: <vector of size 10> } -> <top-level environment> env = { n: 9, v: <vector of size 9> } -> <top-level environment> env = { n: 8, v: <vector of size 8> } -> <top-level environment> env = { n: 7, v: <vector of size 7> } -> <top-level environment> ...whereas the an evlis tail-recursive interpreter would keep around only the current environment. Therefore, the properly tail-recursive interpreter would require \(Θ(n^2)\) memory to evaluate
(fact2 n)
while the evlis tail-recursive
interpreter would require only \(Θ(n)\)Studying examples like the one above enabled me to finally understand how evlin tail recursion worked and what sort of savings it gives. However, I have yet to find a practical example where evlis tail recursion gives the same sort of asymptotic gains as described above, and I'd be interested to receive some. But perhaps the "large gains" mentioned in the various papers describing it are only constant-factor reductions in memory.
In any case, another important difference in Scheme between proper tail recursion and evlis tail recursion is that the former is a language feature and the latter is an optimization. That means that it is acceptable and even encouraged to write Scheme programs that take advantage of proper tail recursion, but it would be unwise to rely on evlis tail recursion for the asymptotic performance of your function. Instead, one should treat it just as a nice constant-factor speed gain.
Note that it is easy to make evlis tail recursion "smarter." Since
Scheme doesn't specify the order of argument evaluation, an
interpreter could evaluate arguments to maximize the gains from evlis
tail recursion. As an easy example, if we had switched the arguments
to +
in fact
above, making it
non-evlis-tail-recursive, a smart compiler could still treat it as
such. A possible rule of thumb would be to pick a non-trivial
function call to evaluate last.
To complete the picture, I will outline below the evaluation function for a simple evlis tail-recursive Scheme interpreter in Javascript. All of the sources I've found describe it in terms of compilers, so I think it'll be useful to have a reference implementation for an interpreter.
function evalExpr(expr, env) {
// Fake tail calls with a while loop and continue.
while (true) {
// Symbols, constants, quoted expressions, and lambdas.
if (isSimpleExpr(expr)) {
// The only exit point.
return evalSimpleExpr(expr, env);
}
// (if test conseq alt)
if (isSpecialForm(expr, 'if')) {
expr = evalExpr(expr[1], env) ? expr[2] : expr[3];
continue;
}
// (set! var expr)
if (isSpecialForm(expr, 'set!')) {
env.set(expr[1], evalExpr(expr[2], env));
expr = null;
continue;
}
// (define var expr?)
if (isSpecialForm(expr, 'define')) {
env.define(expr[1], evalExpr(expr[2], env));
expr = null;
continue;
}
// (begin expr*)
if (isSpecialForm(expr, 'begin')) {
if (expr.length == 1) {
expr = null;
continue;
}
// Evaluate all but the last subexpression.
for (var i = 1; i < expr.length - 1; ++i) {
evalExpr(expr[i], env);
}
expr = expr[expr.length - 1];
continue;
}
// (proc expr*)
var proc = evalExpr(expr.shift(), env);
var args = expr.map(function(subExpr) { return evalExpr(subExpr, env); });
// proc.run() returns its body in result.expr and the environment
// in which to evaluate it (with all its arguments bound) in
// result.env.
var result = proc.run(args);
expr = result.expr;
// The only time when env is changed.
env = result.env;
continue;
}
}
Then implementing evlis tail recursion requires only a few
changes:
function evalExpr(expr, env) {
// This is a stack of procedures and their non-final arguments that
// are waiting for their final argument to be evaluated.
var pendingProcedureCalls = [];
while (true) {
if (isSimpleExpr(expr)) {
expr = evalSimpleExpr(expr, env);
// Discard calling environment.
env = null;
if (pendingProcedureCalls.length == 0) {
// No pending procedure calls, so we're done (the only exit
// point).
return expr;
}
var args = pendingProcedureCalls.pop();
var proc = args.shift();
args.push(expr);
var result = proc.run(args);
expr = result.expr;
// Change to new environment (the only time when env is
// changed).
env = result.env;
continue;
}
...
// Everything else remains the same.
...
// (proc expr*)
var nonFinalSubExprs =
exprs.slice(0, -1).map(
function(subExpr) { return evalExpr(subExpr, env); });
pendingProcecureCalls.push(nonFinalSubExprs);
// Evaluate the last subexpression as a tail call.
expr = expr[expr.length - 1];
continue;
}
}
Like this post? Subscribe to my feed or follow me on Twitter .
Footnotes
[1] Assume a left-to-right evaluation order for now. ↩
[2] The function that takes a list of expressions, evaluates them,
and returns the results as a list is traditionally
called evlis
, hence the name of the optimization.
↩
[3] This assumes that the calling environment isn't stored somewhere else. ↩
[4] This was adapted from an example in Proper Tail Recursion and Space Efficiency. ↩
[5] Assume that the interpreter isn't smart enough to deduce that \(v\) can be optimized out since it's never used. ↩