Editing Tail call

{{Short description|Subroutine call performed as final action of a procedure}}
In [[computer science]], a '''tail call''' is a [[subroutine]] call performed as the final action of a procedure.<ref name="MuchnickAssociates1997">{{cite book|author1=Steven Muchnick|author2=Muchnick and Associates|title=Advanced Compiler Design Implementation|url=https://archive.org/details/advancedcompiler00much|url-access=registration|date=15 August 1997|publisher=Morgan Kaufmann|isbn=978-1-55860-320-2}}</ref> If the target of a tail is the same subroutine, the subroutine is said to be '''tail recursive''', which is a special case of direct [[recursion (computer science)|recursion]]. '''Tail recursion''' (or '''tail-end recursion''') is particularly useful, and is often easy to optimize in implementations. 

Tail calls can be implemented without adding a new [[stack frame]] to the [[call stack]]. Most of the frame of the current procedure is no longer needed, and can be replaced by the frame of the tail call, modified as appropriate (similar to [[Exec (system call)|overlay]] for processes, but for function calls). The program can then [[jump (computer science)|jump]] to the called subroutine. Producing such code instead of a standard call sequence is called '''tail-call elimination''' or '''tail-call optimization'''. Tail-call elimination allows procedure calls in tail position to be implemented as efficiently as [[goto]] statements, thus allowing efficient [[structured programming]]. In the words of [[Guy L. Steele]], "in general, procedure calls may be usefully thought of as GOTO statements which also pass parameters, and can be uniformly coded as [machine code] JUMP instructions."<ref name="aim-443" />

Not all programming languages require tail-call elimination. However, in [[functional programming language]]s, tail-call elimination is often guaranteed by the [[Programming language specification|language standard]], allowing tail recursion to use a similar amount of memory as an equivalent [[loop (computing)|loop]]. The special case of tail-recursive calls, when a function calls itself, may be more amenable to call elimination than general tail calls. When the language semantics do not explicitly support general tail calls, a compiler can often still optimize '''sibling calls''', or tail calls to functions which take and return the same types as the caller.<ref name="llvm.org">{{cite web|url=http://llvm.org/docs/CodeGenerator.html#sibling-call-optimization|title=The LLVM Target-Independent Code Generator — LLVM 7 documentation|website=llvm.org}}</ref>

==Description==
When a function is called, the computer must "remember" the place it was called from, the ''[[Return address (computing)|return address]]'', so that it can return to that location with the result once the call is complete. Typically, this information is saved on the [[call stack]], a list of return locations in the order that the call locations were reached. In addition, compiler allocates memory for local variables of the called function and pushes register content (if any and/or relevant) onto stack. Typically, it is done by allocating a stack frame including saved registers, space allocated for non-register local variables, return address and call parameters (unless they are passed in registers). For tail calls, there is no need to remember the caller or preserve content of registers – instead, tail-call elimination avoids allocation of new stack frames and makes only the minimum necessary changes to the existing stack frame before passing it on,<ref>{{cite web|url=https://cstheory.stackexchange.com/q/7540 |title=recursion - Stack memory usage for tail calls - Theoretical Computer Science |publisher=Cstheory.stackexchange.com |date=2011-07-29 |access-date=2013-03-21}}</ref> and the tail-called function will return directly to the ''original'' caller. This, however, leads to complete loss of the caller's stack frame, which is sometimes considered as a hindrance in debugging. The tail call doesn't have to appear lexically after all other statements in the source code; it is only important that the calling function return immediately after the tail call, returning the tail call's result if any, since the calling function is bypassed when the optimization is performed.

For non-recursive function calls, this is usually an [[Program optimization|optimization]] that saves only a little time and space, since there are not that many different functions available to call. When dealing with recursive or [[mutually recursive]] functions where recursion happens through tail calls, however, the stack space and the number of returns saved can grow to be very significant, since a function can call itself, directly or indirectly, creating a new call stack frame each time. Tail-call elimination often reduces asymptotic stack space requirements from linear, or [[Big-O notation|O]](n), to constant, or [[Big-O notation|O]](1). Tail-call elimination is thus required by the standard definitions of some programming languages, such as [[Scheme (programming language)|Scheme]],<ref name='SchemeProperTailRec'>{{cite web|url=http://www.r6rs.org/final/html/r6rs/r6rs-Z-H-8.html#node_sec_5.11 |title=Revised^6 Report on the Algorithmic Language Scheme |publisher=R6rs.org |access-date=2013-03-21}}</ref><ref>{{cite web|url=http://www.r6rs.org/final/html/r6rs-rationale/r6rs-rationale-Z-H-7.html#node_sec_5.3 |title=Revised^6 Report on the Algorithmic Language Scheme - Rationale |publisher=R6rs.org |access-date=2013-03-21}}</ref> and languages in the [[ML (programming language)|ML]] family among others. The Scheme language definition formalizes the intuitive notion of tail position exactly, by specifying which syntactic forms allow having results in tail context.<ref>{{cite web|url=http://www.r6rs.org/final/html/r6rs/r6rs-Z-H-14.html#node_sec_11.20 |title=Revised^6 Report on the Algorithmic Language Scheme |publisher=R6rs.org |access-date=2013-03-21}}</ref> Implementations allowing an unlimited number of tail calls to be active at the same moment, thanks to tail-call elimination, can also be called 'properly tail recursive'.<ref name='SchemeProperTailRec' />

Besides space and execution efficiency, tail-call elimination is important in the [[functional programming]] idiom known as [[continuation-passing style]] (CPS), which would otherwise quickly run out of stack space.

==Syntactic form==
A tail call can be located just before the syntactical end of a function:

<syntaxhighlight lang="javascript">
function foo(data) {
    a(data);
    return b(data);
}
</syntaxhighlight>

Here, both <code>a(data)</code> and <code>b(data)</code> are calls, but <code>b</code> is the last thing the procedure executes before returning and is thus in tail position. However, not all tail calls are necessarily located at the syntactical end of a subroutine:

<syntaxhighlight lang="javascript">
function bar(data) {
    if (a(data)) {
        return b(data);
    }
    return c(data);
}
</syntaxhighlight>

Here, both calls to <code>b</code> and <code>c</code> are in tail position. This is because each of them lies in the end of if-branch respectively, even though the first one is not syntactically at the end of <code>bar</code>'s body.

In this code:

<syntaxhighlight lang="javascript">
function foo1(data) {
    return a(data) + 1;
}
</syntaxhighlight>
<syntaxhighlight lang="javascript">
function foo2(data) {
    var ret = a(data);
    return ret;
}
</syntaxhighlight>
<syntaxhighlight lang="javascript">
function foo3(data) {
    var ret = a(data);
    return (ret == 0) ? 1 : ret;
}
</syntaxhighlight>

the call to <code>a(data)</code> is in tail position in <code>foo2</code>, but it is '''not''' in tail position either in <code>foo1</code> or in <code>foo3</code>, because control must return to the caller to allow it to inspect or modify the return value before returning it.

==Example programs==
The following program is an example in [[Scheme (programming language)|Scheme]]:<ref name="sicp">{{cite book|last1=Sussman|first1=G. J.|last2=Abelson|first2=Hal|title=Structure and Interpretation of Computer Programs|date=1984|publisher=MIT Press|location=Cambridge, MA|isbn=0-262-01077-1|url=https://archive.org/details/structureinterpr00abel}}</ref>

<syntaxhighlight lang="scheme">
;; factorial : number -> number
;; to calculate the product of all positive
;; integers less than or equal to n.
(define (factorial n)
 (if (= n 0)
    1
    (* n (factorial (- n 1)))))
</syntaxhighlight>

This is not written in a tail-recursive style, because the multiplication function ("*") is in the tail position. This can be compared to:

<syntaxhighlight lang="scheme">
;; factorial : number -> number
;; to calculate the product of all positive
;; integers less than or equal to n.
(define (factorial n)
  (fact-iter 1 n))
(define (fact-iter product n)
  (if (= n 0)
      product
      (fact-iter (* product n)
                 (- n 1))))
</syntaxhighlight>

This program assumes [[Evaluation strategy#Applicative order|applicative-order]] evaluation. The inner procedure <code>fact-iter</code> calls itself ''last'' in the control flow. This allows an [[interpreter (computer software)|interpreter]] or [[compiler]] to reorganize the execution which would ordinarily look like this:<ref name="sicp" />

   call factorial (4)
    call fact-iter (1 4)
     call fact-iter (4 3)
      call fact-iter (12 2)
       call fact-iter (24 1)
       return 24
      return 24
     return 24
    return 24
   return 24

into the more [[Algorithmic efficiency|efficient]] variant, in terms of both space and time:

   call factorial (4)
    call fact-iter (1 4)
    replace arguments with (4 3)
    replace arguments with (12 2)
    replace arguments with (24 1)
    return 24
   return 24

This reorganization saves space because no state except for the calling function's address needs to be saved, either on the stack or on the heap, and the call stack frame for <code>fact-iter</code> is reused for the intermediate results storage. This also means that the programmer need not worry about running out of stack or heap space for extremely deep recursions. In typical implementations, the tail-recursive variant will be substantially faster than the other variant, but only by a constant factor.

Some programmers working in functional languages will rewrite recursive code to be tail recursive so they can take advantage of this feature. This often requires addition of an "accumulator" argument (<code>product</code> in the above example) to the function. 

==Tail recursion modulo cons==
'''Tail recursion modulo cons''' is a generalization of tail-recursion optimization introduced by [[David H. D. Warren]]<ref>D. H. D. Warren, ''DAI Research Report 141'', University of Edinburgh, 1980.</ref> in the context of [[compiler|compilation]] of [[Prolog]], seen as an ''explicitly'' [[Single assignment#Single assignment|''set once'']] language. It was described (though not named) by [[Daniel P. Friedman]] and [[David S. Wise]] in 1974<ref>Daniel P. Friedman and David S. Wise, [http://www.cs.indiana.edu/cgi-bin/techreports/TRNNN.cgi?trnum=TR19 ''Technical Report TR19: Unwinding Structured Recursions into Iterations''], Indiana University, Dec. 1974. PDF available [https://legacy.cs.indiana.edu/ftp/techreports/TR19.pdf ''here''] (webarchived copy [https://web.archive.org/web/20221023082940/https://legacy.cs.indiana.edu/ftp/techreports/TR19.pdf ''here'']).</ref> as a [[LISP]] compilation technique. As the name suggests, it applies when the only operation left to perform after a recursive call is to prepend a known value in front of the list returned from it (or to perform a constant number of simple data-constructing operations, in general). This call would thus be a ''tail call'' save for ("[[modulo (jargon)|modulo]]") the said ''[[cons]]'' operation. But prefixing a value at the start of a list ''on exit'' from a recursive call is the same as appending this value at the end of the growing list ''on entry'' into the recursive call, thus building the list as a [[side effect (computer science)|side effect]], as if in an implicit accumulator parameter. The following Prolog fragment illustrates the concept:

===Example code===
{|
|-
|<syntaxhighlight lang="prolog">
% Prolog, tail recursive modulo cons:
partition([], _, [], []).
partition([X|Xs], Pivot, [X|Rest], Bigs) :-
  X @< Pivot, !,
  partition(Xs, Pivot, Rest, Bigs).
partition([X|Xs], Pivot, Smalls, [X|Rest]) :-
  partition(Xs, Pivot, Smalls, Rest).
</syntaxhighlight>
|<syntaxhighlight lang="haskell">
-- In Haskell, guarded recursion:
partition [] _ = ([],[])
partition (x:xs) p 
        | x < p     = (x:a,b)
        | otherwise = (a,x:b)
   where
      (a,b) = partition xs p

</syntaxhighlight>
|-
|<syntaxhighlight lang="prolog">
% Prolog, with explicit unifications:
%     non-tail recursive translation:
partition([], _, [], []).
partition(L, Pivot, Smalls, Bigs) :- L=[X|Xs],
 (  X @< Pivot
 -> partition(Xs,Pivot,Rest,Bigs), Smalls=[X|Rest]
 ;  partition(Xs,Pivot,Smalls,Rest), Bigs=[X|Rest]
 ).
</syntaxhighlight>
|<syntaxhighlight lang="prolog">
% Prolog, with explicit unifications:
%     tail-recursive translation:
partition([], _, [], []).
partition(L, Pivot, Smalls, Bigs) :- L=[X|Xs],
 (  X @< Pivot
 -> Smalls=[X|Rest], partition(Xs,Pivot,Rest,Bigs)
 ;  Bigs=[X|Rest], partition(Xs,Pivot,Smalls,Rest)
 ).

</syntaxhighlight>
|}

Thus in tail-recursive translation such a call is transformed into first creating a new [[Node (computer science)|list node]] and setting its <code>first</code> field, and ''then'' making the tail call with the pointer to the node's <code>rest</code> field as argument, to be filled recursively. The same effect is achieved when the recursion is ''guarded'' under a lazily evaluated data constructor, which is automatically achieved in lazy programming languages like Haskell.

===C example===
The following fragment defines a recursive function in [[C (programming language)|C]] that duplicates a linked list (with some equivalent Scheme and Prolog code as comments, for comparison):
{|
|-valign="top"
|rowspan="2"|
<syntaxhighlight lang="c">
typedef struct list {
    void *value;
    struct list *next;
} list;

list *duplicate(const list *ls)
{
    list *head = NULL;

    if (ls != NULL) {
        list *p = duplicate(ls->next);
        head = malloc(sizeof *head);
        head->value = ls->value;
        head->next = p;
    }
    return head;
}
</syntaxhighlight>
|<syntaxhighlight lang="scheme">
;; in Scheme,
(define (duplicate ls)
  (if (not (null? ls))
    (cons (car ls)
          (duplicate (cdr ls)))
    '()))
</syntaxhighlight>
|-
|<syntaxhighlight lang="prolog">
%% in Prolog,
duplicate([X|Xs],R):-
  duplicate(Xs,Ys),
  R=[X|Ys].
duplicate([],[]).
</syntaxhighlight>
|}

In this form the function is not tail recursive, because control returns to the caller after the recursive call duplicates the rest of the input list. Even if it were to allocate the ''head'' node before duplicating the rest, it would still need to plug in the result of the recursive call into the <code>next</code> field ''after'' the call.{{efn|Like this:
<syntaxhighlight lang="c">
if (ls != NULL) {
  head = malloc( sizeof *head);
  head->value = ls->value;
  head->next = duplicate( ls->next);
}</syntaxhighlight>
}}
So the function is ''almost'' tail recursive. Warren's method pushes the responsibility of filling the <code>next</code> field into the recursive call itself, which thus becomes tail call.{{efn|Like this:
<syntaxhighlight lang="c">
if (ls != NULL) {
  head = malloc( sizeof *head);
  head->value = ls->value;
  duplicate( ls->next, &(head->next));
}</syntaxhighlight>
}} Using  sentinel head node to simplify the code, 

{|
|-valign="top"
|rowspan="2"|
<syntaxhighlight lang="c">
typedef struct list {
    void *value;
    struct list *next;
} list;
void duplicate_aux(const list *ls, list *end);

list *duplicate(const list *ls)
{  
    list head;

    duplicate_aux(ls, &head);
    return head.next;
}

void duplicate_aux(const list *ls, list *end)
{
    if (ls != NULL) {
        end->next = malloc(sizeof *end);
        end->next->value = ls->value;
        duplicate_aux(ls->next, end->next);
    } else {
        end->next = NULL;
    }
}
</syntaxhighlight>
|<syntaxhighlight lang="scheme">
;; in Scheme,
(define (duplicate ls)
  (let ((head (list 1)))
    (let dup ((ls  ls)
              (end head))
      (cond
        ((not (null? ls))
         (set-cdr! end (list (car ls)))
         (dup (cdr ls) (cdr end)))))
    (cdr head)))
</syntaxhighlight>
|-
|<syntaxhighlight lang="prolog">
%% in Prolog,
duplicate([X|Xs],R):-
   R=[X|Ys],
   duplicate(Xs,Ys).
duplicate([],[]).
</syntaxhighlight>
|}

The callee now appends to the end of the growing list, rather than have the caller prepend to the beginning of the returned list. The work is now done on the way ''forward'' from the list's start, ''before'' the recursive call which then proceeds further, instead of ''backward'' from the list's end, ''after'' the recursive call has returned its result. It is thus similar to the accumulating parameter technique, turning a recursive computation into an iterative one.

Characteristically for this technique, a parent [[call frame|frame]] is created on the execution call stack, which the tail-recursive callee can reuse as its own call frame if the tail-call optimization is present.

The tail-recursive implementation can now be converted into an explicitly iterative implementation, as an accumulating [[Loop (computing)#Loops|loop]]:
{|
|-valign="top"
|rowspan="2"|
<syntaxhighlight lang="c">
typedef struct list {
    void *value;
    struct list *next;
} list;

list *duplicate(const list *ls)
{
    list head, *end;
    end = &head;
    while (ls != NULL)
    {
        end->next        = malloc(sizeof *end);
        end->next->value = ls->value;
        ls = ls->next;
        end = end->next;
    }
    end->next = NULL;
    return head.next;
}
</syntaxhighlight>
|<syntaxhighlight lang="scheme">
 ;; in Scheme,
 (define (duplicate ls)
   (let ((head (list 1)))
     (do ((end head (cdr end))
          (ls  ls   (cdr ls )))
         ((null? ls) (cdr head))
       (set-cdr! end (list (car ls))))))
</syntaxhighlight>
|-
|<syntaxhighlight lang="prolog">
%% in Prolog,
%% N/A
</syntaxhighlight>
|}

==History==
In a paper delivered to the [[Association for Computing Machinery|ACM]] conference in Seattle in 1977, [[Guy L. Steele]] summarized the debate over the [[GOTO]] and [[structured programming]], and observed that procedure calls in the tail position of a procedure can be best treated as a direct transfer of control to the called procedure, typically eliminating unnecessary stack manipulation operations.<ref name="aim-443">{{Cite book|doi=10.1145/800179.810196|isbn=978-1-4503-2308-6|hdl=1721.1/5753|chapter=Debunking the “expensive procedure call” myth or, procedure call implementations considered harmful or, LAMBDA: The Ultimate GOTO|title=Proceedings of the 1977 annual conference on - ACM '77|year=1977|last1=Steele|first1=Guy Lewis|pages=153–162 |s2cid=9807843}}</ref> Since such "tail calls" are very common in [[Lisp (programming language)|Lisp]], a language where procedure calls are ubiquitous, this form of optimization considerably reduces the cost of a procedure call compared to other implementations. Steele argued that poorly-implemented procedure calls had led to an artificial perception that the GOTO was cheap compared to the procedure call. Steele further argued that "in general procedure calls may be usefully thought of as GOTO statements which also pass parameters, and can be uniformly coded as [machine code] JUMP instructions", with the machine code stack manipulation instructions "considered an optimization (rather than vice versa!)".<ref name="aim-443"/> Steele cited evidence that well-optimized numerical algorithms in Lisp could execute faster than code produced by then-available commercial Fortran compilers because the cost of a procedure call in Lisp was much lower. In [[Scheme (programming language)|Scheme]], a Lisp dialect developed by Steele with [[Gerald Jay Sussman]], tail-call elimination is guaranteed to be implemented in any interpreter.<ref name="r5rs">R5RS Sec. 3.5, {{Cite journal
|author1=Richard Kelsey |author2=William Clinger |author3=Jonathan Rees |date=August 1998
| title = Revised<sup>5</sup> Report on the Algorithmic Language Scheme
| url = http://www.schemers.org/Documents/Standards/R5RS/
| journal = Higher-Order and Symbolic Computation
| volume = 11
| issue = 1
| pages = 7–105
| doi = 10.1023/A:1010051815785
|s2cid=14069423 |display-authors=etal| url-access = subscription
}}</ref>

==Implementation methods==
Tail recursion is important to some [[high-level programming language|high-level languages]], especially [[functional programming|functional]] and [[logic programming|logic]] languages and members of the [[Lisp programming language|Lisp]] family. In these languages, tail recursion is the most commonly used way (and sometimes the only way available) of implementing iteration. The language specification of Scheme requires that tail calls are to be optimized so as not to grow the stack. Tail calls can be made explicitly in [[Perl]], with a variant of the "goto" statement that takes a function name: <code>goto &NAME;</code><ref>{{cite web|author=Contact details |url=http://perldoc.perl.org/functions/goto.html |title=goto |publisher=perldoc.perl.org |access-date=2013-03-21}}</ref>

However, for language implementations which store function arguments and local variables on a [[call stack]] (which is the default implementation for many languages, at least on systems with a [[hardware stack]], such as the [[x86]]), implementing generalized tail-call optimization (including mutual tail recursion) presents an issue: if the size of the callee's activation record is different from that of the caller, then additional cleanup or resizing of the stack frame may be required. For these cases, optimizing tail recursion remains trivial, but general tail-call optimization may be harder to implement efficiently.

For example, in the [[Java virtual machine]] (JVM), tail-recursive calls can be eliminated (as this reuses the existing call stack), but general tail calls cannot be (as this changes the call stack).<ref>"[https://stackoverflow.com/questions/12045299/what-is-difference-between-tail-calls-and-tail-recursion What is difference between tail calls and tail recursion?]", ''Stack Overflow''</ref><ref>"[http://programmers.stackexchange.com/questions/157684/what-limitations-does-the-jvm-impose-on-tail-call-optimization What limitations does the JVM impose on tail-call optimization]", ''Programmers Stack Exchange''</ref> As a result, functional languages such as [[Scala (programming language)|Scala]] that target the JVM can efficiently implement direct tail recursion, but not mutual tail recursion.

The [[GNU Compiler Collection|GCC]], [[Clang|LLVM/Clang]], and [[Intel C Compiler|Intel]] compiler suites perform tail-call optimization for [[C (programming language)|C]] and other languages at higher optimization levels or when the <code>-foptimize-sibling-calls</code> option is passed.<ref name="llvm-documentation-tco">{{cite web |last1=Lattner |first1=Chris |title=LLVM Language Reference Manual, section: The LLVM Target-Independent Code Generator, sub: Tail Call Optimization |url=http://llvm.org/docs/CodeGenerator.html#tail-call-optimization |website=The LLVM Compiler Infrastructure |publisher=The LLVM Project |access-date=24 June 2018 |ref=llvm.org}}</ref><ref>{{cite web|url=https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html|title=Using the GNU Compiler Collection (GCC): Optimize Options|website=gcc.gnu.org}}</ref><ref>{{cite web |url=https://software.intel.com/en-us/node/522809 |title=foptimize-sibling-calls|website=software.intel.com}}</ref> Though the given language syntax may not explicitly support it, the compiler can make this optimization whenever it can determine that the return types for the caller and callee are equivalent, and that the argument types passed to both function are either the same, or require the same amount of total storage space on the call stack.<ref>{{cite web|url=http://www.drdobbs.com/tackling-c-tail-calls/184401756|title=Tackling C++ Tail Calls}}</ref>

Various implementation methods are available.

===In assembly===
{{unreferenced section|date=June 2014}}
Tail calls are often optimized by [[interpreter (computing)|interpreters]] and [[compiler]]s of [[functional programming]] and [[logic programming]] languages to more efficient forms of [[iteration]]. For example, [[Scheme (programming language)|Scheme]] programmers commonly express [[while loop]]s as calls to procedures in tail position and rely on the Scheme compiler or interpreter to substitute the tail calls with more efficient [[jump (computer science)|jump]] instructions.<ref>{{cite web | url=https://gcc.gnu.org/ml/gcc/2000-07/msg00595.html | title=proper tail recursion for gcc | publisher=GCC Project | date=20 July 2000 | access-date=10 March 2015 | author=Probst, Mark}}</ref>

For compilers generating assembly directly, tail-call elimination is easy: it suffices to replace a call opcode with a jump one, after fixing parameters on the stack. From a compiler's perspective, the first example above is initially translated into pseudo-[[assembly language]] (in fact, this is valid [[x86 assembly language|x86 assembly]]):

<syntaxhighlight lang="asm">
 foo:
  call B
  call A
  ret
</syntaxhighlight>

Tail-call elimination replaces the last two lines with a single jump instruction:

<syntaxhighlight lang="asm">

 foo:
  call B
  jmp  A
</syntaxhighlight>

After subroutine <code>A</code> completes, it will then return directly to the return address of <code>foo</code>, omitting the unnecessary <code>ret</code> statement.

Typically, the subroutines being called need to be supplied with [[parameter (computer science)|parameter]]s. The generated code thus needs to make sure that the [[call frame]] for A is properly set up before jumping to the tail-called subroutine. For instance, on [[computing platform|platform]]s where the [[call stack]] does not just contain the [[return statement|return address]], but also the parameters for the subroutine, the compiler may need to emit instructions to adjust the call stack. On such a platform, for the code:

 '''function''' foo(data1, data2)
    B(data1)
    '''return''' A(data2)

(where <code>data1</code> and <code>data2</code> are parameters) a compiler might translate that as:{{efn| The <code>call</code> instruction first pushes the current code location onto the stack and then performs an unconditional jump to the code location indicated by the label. The <code>ret</code> instruction first pops a code location off the stack, then performs an unconditional jump to the retrieved code location. }}

<syntaxhighlight lang="nasm" line>
 foo:
   mov  reg,[sp+data1] ; fetch data1 from stack (sp) parameter into a scratch register.
   push reg            ; put data1 on stack where B expects it
   call B              ; B uses data1
   pop                 ; remove data1 from stack
   mov  reg,[sp+data2] ; fetch data2 from stack (sp) parameter into a scratch register.
   push reg            ; put data2 on stack where A expects it
   call A              ; A uses data2
   pop                 ; remove data2 from stack.
   ret
</syntaxhighlight>

A tail-call optimizer could then change the code to:

<syntaxhighlight lang="nasm" line>
 foo:
   mov  reg,[sp+data1] ; fetch data1 from stack (sp) parameter into a scratch register.
   push reg            ; put data1 on stack where B expects it
   call B              ; B uses data1
   pop                 ; remove data1 from stack
   mov  reg,[sp+data2] ; fetch data2 from stack (sp) parameter into a scratch register.
   mov  [sp+data1],reg ; put data2 where A expects it
   jmp  A              ; A uses data2 and returns immediately to caller.
</syntaxhighlight>

This code is more efficient both in terms of execution speed and use of stack space.

===Through trampolining===
Since many [[Scheme (programming language)|Scheme]] compilers use [[C (programming language)|C]] as an intermediate target code, the tail recursion must be encoded in C without growing the stack, even if the C compiler does not optimize tail calls. Many implementations achieve this by using a device known as a [[Trampoline (computers)|trampoline]], a piece of code that repeatedly calls functions. All functions are entered via the trampoline. When a function has to tail-call another, instead of calling it directly and then returning the result, it returns the address of the function to be called and the call parameters back to the trampoline (from which it was called itself), and the trampoline takes care of calling this function next with the specified parameters. This ensures that the C stack does not grow and iteration can continue indefinitely.

It is possible to implement trampolines using [[higher-order function]]s in languages that support them, such as [[Groovy (programming language)|Groovy]], [[Visual Basic .NET]] and [[C Sharp (programming language)|C#]].<ref name="onyourtail">Samuel Jack, [http://blog.functionalfun.net/2008/04/bouncing-on-your-tail.html Bouncing on your tail]. ''Functional Fun''. April 9, 2008.</ref>

Using a trampoline for all function calls is rather more expensive than the normal C function call, so at least one Scheme compiler, [[Chicken (Scheme implementation)|Chicken]], uses a technique first described by [[Henry Baker (computer scientist)|Henry Baker]] from an unpublished suggestion by [[Andrew Appel]],<ref name="Chicken">Henry Baker, [http://home.pipeline.com/~hbaker1/CheneyMTA.html "CONS Should Not CONS Its Arguments, Part II: Cheney on the M.T.A."]</ref> in which normal C&nbsp;calls are used but the stack size is checked before every call. When the stack reaches its maximum permitted size, objects on the stack are [[garbage collection (computer science)|garbage-collected]] using the [[Cheney algorithm]] by moving all live data into a separate heap. Following this, the stack is unwound ("popped") and the program resumes from the state saved just before the garbage collection. Baker says "Appel's method avoids making a large number of small trampoline bounces by occasionally jumping off the Empire State Building."<ref name="Chicken" /> The garbage collection ensures that mutual tail recursion can continue indefinitely. However, this approach requires that no C function call ever returns, since there is no guarantee that its caller's stack frame still exists; therefore, it involves a much more dramatic internal rewriting of the program code: [[continuation-passing style]].

==Relation to the ''while'' statement==
Tail recursion can be related to the [[while loop|''while'' statement]], an explicit iteration, for instance by transforming

 '''procedure''' foo(''x'')
     '''if''' ''p''(''x'')
         '''return''' bar(''x'')
     '''else'''
         '''return''' foo(baz(''x''))

into

 '''procedure''' foo(''x'')
     '''while''' '''true'''
         '''if''' ''p''(''x'')
             '''return''' bar(''x'')
         '''else'''
             ''x'' ← baz(''x'')

where ''x'' may be a tuple involving more than one variable: if so, care must be taken in implementing the [[Assignment (computer science)|assignment statement]] ''x'' ← baz(''x'') so that dependencies are respected. One may need to introduce auxiliary variables or use a ''[[Swap (computer science)|swap]]'' construct.

More generally,

 '''procedure''' foo(''x'')
     '''if''' ''p''(''x'')
         '''return''' bar(''x'')
     '''else if''' ''q''(''x'')
         '''return''' baz(''x'')
     ...
     '''else if''' ''r''(''x'')
         '''return''' foo(qux(''x''))
     ...
     '''else'''
         '''return''' foo(quux(''x''))

can be transformed into

 '''procedure''' foo(''x'')
     '''while''' '''true'''
         '''if''' ''p''(''x'')
             '''return''' bar(''x'')
         '''else if''' ''q''(''x'')
             '''return''' baz(''x'')
         ...
         '''else if''' ''r''(''x'')
             ''x'' ← qux(''x'')
         ...
         '''else'''
             ''x'' ← quux(''x'')

For instance, this [[Julia (programming language)|Julia]] program gives a non-tail recursive definition <code>fact</code> of the factorial:

<syntaxhighlight lang="julia" line="1">
function factorial(n)
    if n == 0
        return 1
    else
        return n * factorial(n - 1)
    end
end 
</syntaxhighlight>

Indeed, <code>n * factorial(n - 1)</code> wraps the call to <code>factorial</code>. But it can be transformed into a tail-recursive definition by adding an argument <code>a</code> called an ''accumulator''.<ref name="sicp" />

This Julia program gives a tail-recursive definition <code>fact_iter</code> of the factorial:

<syntaxhighlight lang="julia" line="1">
function factorial(n::Integer, a::Integer)
    if n == 0:
        return a
    else
        return factorial(n - 1, n * a)
    end
end

function factorial(n::Integer)
    return factorial(n, 1)
end
</syntaxhighlight>

This Julia program gives an iterative definition <code>fact_iter</code> of the factorial:

<syntaxhighlight lang="julia">
function fact_iter(n::Integer, a::Integer)
    while n > 0
        a = n * a
        n = n - 1
    end
    return a
end

function factorial(n::Integer)
    return fact_iter(n, one(n))
end
</syntaxhighlight>

==Language support==
* [[C++ (programming language)|C++]] {{En dash}} C and C++ both do tail-call optimization.<ref>{{cite web|url=https://stackoverflow.com/questions/34125/which-if-any-c-compilers-do-tail-recursion-optimization|title=Which, if any, C++ compilers do tail-recursion optimization?|website=stackoverflow}}</ref>
* [[Clojure (programming language)|Clojure]] {{En dash}} Clojure has <code>recur</code> special form.<ref>{{cite web|url=https://clojure.org/reference/special_forms#recur|title=(recur expr*)|website=clojure.org}}</ref>
* [[Common Lisp]] {{En dash}} Some implementations perform tail-call optimization during compilation if optimizing for speed
* [[Elixir (programming language)|Elixir]] {{En dash}} Elixir implements tail-call optimization, <ref>{{cite web|url=http://elixir-lang.org/getting-started/recursion.html|title=Recursion|website=elixir-lang.github.com}}</ref>as do all languages currently targeting the BEAM VM.
* [[Elm (programming language)|Elm]] {{En dash}} Yes<ref>{{ cite web|url=https://functional-programming-in-elm.netlify.app/recursion/tail-call-elimination.html|title=Functional Programming in Elm: Tail-Call Elimination|first=Evan|last=Czaplicki}}</ref>
* [[Erlang (programming language)|Erlang]] {{En dash}} Yes
* [[F Sharp (programming language)|F#]] {{En dash}} F# implements TCO by default where possible <ref>{{cite web|url=https://blogs.msdn.microsoft.com/fsharpteam/2011/07/08/tail-calls-in-f/|title=Tail Calls in F#|website=msdn|date=8 July 2011 |publisher=Microsoft}}</ref>
* [[Go (programming language)|Go]] {{En dash}} No support<ref>{{cite web|url=https://github.com/golang/go/issues/22624|title=proposal: Go 2: add become statement to support tail calls|website=github.com}}</ref>
* [[Haskell (programming language)|Haskell]] {{En dash}} Yes<ref>{{Cite web|url=https://wiki.haskell.org/Tail_recursion|title=Tail recursion - HaskellWiki|website=wiki.haskell.org|access-date=2019-06-08}}</ref>
* [[JavaScript]] {{En dash}} [[ECMAScript]] 6.0 compliant engines should have tail calls<ref>{{cite web|url=http://bdadam.com/blog/video-douglas-crockford-about-the-new-good-parts.html|title=Worth watching: Douglas Crockford speaking about the good new parts of JavaScript in 2014|first=Adam|last=Beres-Deak|website=bdadam.com}}</ref> which is now implemented on [[Safari (browser)|Safari]]/[[WebKit]]<ref>{{cite web|url=https://www.webkit.org/blog/4054/es6-in-webkit/|title=ECMAScript 6 in WebKit| date=13 October 2015}}</ref> but rejected by V8 and SpiderMonkey
* [[Kotlin (programming language)|Kotlin]] {{En dash}} Has <code>tailrec</code> modifier for functions<ref>{{cite web| url=https://kotlinlang.org/docs/reference/functions.html#tail-recursive-functions|title=Functions: infix, vararg, tailrec - Kotlin Programming Language|website=Kotlin}}</ref>
* [[Lua (programming language)|Lua]] {{En dash}} Tail recursion is required by the language definition<ref>{{cite web| url=https://www.lua.org/manual/5.3/manual.html#3.4.10|title=Lua 5.3 Reference Manual|website=www.lua.org}}</ref>
* [[Objective-C]] {{En dash}} Compiler optimizes tail calls when -O1 (or higher) option specified, but it is easily disturbed by calls added by [[Automatic Reference Counting]] (ARC).
* [[OCaml]] {{En dash}} Yes
* [[Perl (programming language)|Perl]] {{En dash}} Explicit with a variant of the "goto" statement that takes a function name: <code>goto &NAME;</code><ref>{{cite web|url=http://perldoc.perl.org/functions/goto.html|title=goto - perldoc.perl.org| website=perldoc.perl.org}}</ref>
* [[PureScript]] {{En dash}} Yes
* [[Python (programming language)|Python]] {{En dash}} Stock Python implementations do not perform tail-call optimization, though a third-party module is available to do this.<ref>{{cite web|url=https://github.com/baruchel/tco|title=baruchel/tco|website=GitHub|date=29 March 2022}}</ref> Language inventor [[Guido van Rossum]] contended that [[stack traces]] are altered by tail-call elimination making debugging harder, and preferred that programmers use explicit [[iteration]] instead.<ref>{{cite web|url=http://neopythonic.blogspot.com/2009/04/tail-recursion-elimination.html|title=Neopythonic: Tail Recursion Elimination|first=Guido Van|last=Rossum|date=22 April 2009}}</ref> In Python 3.14, a new interpreter was introduced that uses tail-call based dispatch of Python opcodes.<ref name="Python Tail-Call Interpreter Dev Discussion">{{cite web |date=2024-01-08 |title=Tail-calling interpreter |url=https://github.com/faster-cpython/ideas/issues/642 |access-date=2025-03-08 |website=GitHub}}</ref> This resulted in overall improved performance when compared to Python 3.13.<ref>{{Cite web |title=What's new in Python 3.14 |url=https://docs.python.org/3.14/whatsnew/3.14.html#a-new-type-of-interpreter |access-date=2025-02-19 |website=Python documentation |language=en}}</ref><ref name="Python Tail-Calling Interpreter Proposal">{{cite web |date=2025-01-06 |title=A new tail-calling interpreter for significantly better interpreter performance |url=https://github.com/python/cpython/issues/128563 |access-date=2025-03-08 |website=GitHub}}</ref>
* [[R (programming language)|R]] {{En dash}} Yes, {{Code|tailcall()}} function introduced in R.4.4.0<ref>{{Cite web |date=2024-04-25 |title=What's new in R 4.4.0? |url=https://www.jumpingrivers.com/blog/whats-new-r44/ |access-date=2024-04-28 |website=www.jumpingrivers.com |language=en-gb}}</ref>
* [[Racket (programming language)|Racket]] {{En dash}} Yes<ref>{{cite web|url=https://docs.racket-lang.org/reference/eval-model.html#(part._.Tail_.Position)|title=The Racket Reference|website=docs.racket-lang.org}}</ref>
* [[Ruby (programming language)|Ruby]] {{En dash}} Yes, but disabled by default <ref>{{cite web|url=https://docs.ruby-lang.org/en/master/RubyVM/InstructionSequence.html#method-c-compile_option-3D|title=Ruby Tail Call Optimisation}}</ref>
* [[Rust (programming language)|Rust]] {{En dash}} tail-call optimization may be done in limited circumstances, but is not guaranteed<ref>{{cite web|url=https://prev.rust-lang.org/en-US/faq.html#does-rust-do-tail-call-optimization|title=Rust FAQ|website=prev.rust-lang.org}}</ref>
* [[Scala (programming language)|Scala]] {{En dash}} Tail-recursive functions are automatically optimized by the compiler. Such functions can also optionally be marked with a <code>@tailrec</code> annotation, which makes it a compilation error if the function is not tail recursive<ref>{{Cite web|url=https://www.scala-lang.org/api/2.13.0/scala/annotation/tailrec.html|title=Scala Standard Library 2.13.0 - scala.annotation.tailrec|website=www.scala-lang.org|access-date=2019-06-20}}</ref>
* [[Scheme (programming language)|Scheme]] {{En dash}} Required by the language definition<ref>{{cite web|url=http://www.schemers.org/Documents/Standards/R5RS/HTML/r5rs-Z-H-6.html#%25_sec_3.5|title=Revised^5 Report on the Algorithmic Language Scheme| website=www.schemers.org}}</ref><ref>{{cite web|url=http://www.r6rs.org/final/html/r6rs/r6rs-Z-H-8.html#node_sec_5.11| title=Revised^6 Report on the Algorithmic Language Scheme|website=www.r6rs.org}}</ref>
* [[Swift_(programming_language)|Swift]] {{En dash}} In some cases (as of 2014).<ref>{{cite web |title=Does Swift implement tail call optimization? |url=https://stackoverflow.com/questions/24023580/does-swift-implement-tail-call-optimization-and-in-mutual-recursion-case |access-date=13 March 2024 |date=2014}}</ref>
* [[Tcl]] {{En dash}} Since Tcl 8.6, Tcl has a {{Code|tailcall}} command<ref>{{cite web|url=http://www.tcl.tk/man/tcl/TclCmd/tailcall.htm| title=tailcall manual page - Tcl Built-In Commands|website=www.tcl.tk}}</ref>
* [[Zig (programming language)|Zig]] {{En dash}} Yes<ref>{{cite web | url=https://ziglang.org/documentation/master/#call | title=Documentation - the Zig Programming Language }}</ref>

==See also==
{{Portal|Computer programming}}
{{Wiktionary|tail recursion}}
* [[Course-of-values recursion]]
* [[Recursion (computer science)]]
* [[Primitive recursive function]]
* [[Inline expansion]]
* [[Leaf subroutine]]
* [[Corecursion]]

==Notes==
{{Notelist}}

==References==
{{Reflist}}

{{DEFAULTSORT:Tail Call}}
[[Category:Programming language implementation]]
[[Category:Implementation of functional programming languages]]
[[Category:Subroutines]]
[[Category:Control flow]]
[[Category:Recursion]]
[[Category:Scheme (programming language)]]
[[Category:Articles with example C code]]
[[Category:Articles with example Scheme (programming language) code]]
[[pt:Recursividade (ciência da computação)#Funções recursivas em cauda]]
[[es:Recursión (ciencias de computación)#Funciones de recursión de cola]]