Tail call optimization

What is recursion? #

Recursion is a fundamental concept in computer science where a function calls itself. It can be an elegant solution, especially for problems that can be broken down into smaller, similar subproblems. However, a naive implementation of recursion can lead to significant drawbacks, such as excessive memory usage and potential stack overflow errors, especially when dealing with deep recursion. This is where Tail Call Optimization (TCO) comes into play.

So, how does recursion work in a call stack? #

Before diving into tail call optimization, it’s essential to understand how the call stack works during function execution.

When a function is called, the computer stores the context of that function (its arguments, local variables, and the location to return to) on the call stack. As more functions are called (including recursive ones), additional stack frames are pushed onto the stack. When a function completes, its stack frame is popped off the stack, and the control is returned to the previous function.

In the case of recursion, each function call adds a new frame to the stack. Deep recursion can lead to a large number of frames on the stack, potentially exhausting the available memory, leading to a stack overflow error. This is particularly problematic in languages or environments that do not handle large stacks efficiently.

How can we achieve good performance in recursion? #

Guess what, that’s where tail call optimization comes to the table.

Tail Call Optimization (TCO) is a technique used by some compilers and interpreters to optimize recursive functions and reduce the memory footprint of recursion. It occurs when the last action of a function is to call another function (or itself), and there is nothing left to do after that call returns. This is known as a tail call.

In such cases, the current function’s stack frame is no longer needed and can be replaced with the stack frame of the next function call. Instead of adding a new frame to the stack, the existing frame is reused, which prevents the stack from growing and effectively allows recursion to run in constant space.

What compilers and interpreters do not use Tail Call Optimization? #

Python:

The CPython interpreter (the standard and most widely used Python implementation) does not support TCO. This is a design choice to maintain the readability and debuggability of stack traces in Python.

Java:

The Java Virtual Machine (JVM) does not perform TCO. This is primarily because Java was designed with a different philosophy, emphasizing explicit control structures like loops for iteration.

Ruby:

The standard Ruby interpreter (MRI, also known as CRuby) does not support TCO. Like Python, Ruby strongly emphasizes stack traces for debugging purposes.

PHP:

PHP, like Python and Ruby, does not implement TCO in its standard interpreter. Recursive calls in PHP will add to the call stack, and deep recursion can result in stack overflow errors.

👀 C/C++:

While C and C++ do not guarantee TCO, some compilers (like GCC and Clang) may perform it under certain conditions when optimizations are enabled. However, TCO is not a part of the C/C++ standards, so it cannot be relied upon consistently across different compilers or even different compiler versions. Developers often need to use specific compiler flags to enable such optimizations, and it’s not applied in all cases.

How does Tail Call Optimization look? #

First, let’s take a look to a function that is NOT tail call optimized.

1
2
3
4
5
6
7
8
9
function sum(n) {
    if (n === 0) {
        return 0;
    } else {
        return n + sum(n - 1);
    }
}

console.log(sum(5)); // Output: 15

In the preovious example, the recursive call has a sum along with it, that means the function has to make an aditional operation after the recursive call returns, so every recursive call must wait the next one to complete, adding more frames to the stack.

Now, let’s take a look to a function that is tail call optimized.

1
2
3
4
5
6
7
8
9
function sum(n, accumulator = 0) {
    if (n === 0) {
        return accumulator;
    } else {
        return sum(n - 1, accumulator + n);
    }
}

console.log(sum(5)); // Output: 15

In the previous code there is nothing else to do at the end of the function, so the optimization comes by reusing the stack frame instead of adding more as related operations as we have seen in the previous example.

Conclusions #

It is important to understand the impact of our code, runtimes and compilers help use with a lot of stuff that we don’t need to even think about but as much we know about what we write, more powerful, resource optimized and clean will be our software.