Editing Cycle detection (section)

==Algorithms==

If the input is given as a subroutine for calculating {{mvar|f}}, the cycle detection problem may be trivially solved using only {{math|''λ'' + ''μ''}} function applications, simply by computing the sequence of values {{math|''x<sub>i</sub>''}} and using a [[data structure]] such as a [[hash table]] to store these values and test whether each subsequent value has already been stored. However, the space complexity of this algorithm is proportional to {{math|''λ'' + ''μ''}}, unnecessarily large. Additionally, to implement this method as a [[pointer algorithm]] would require applying the equality test to each pair of values, resulting in quadratic time overall. Thus, research in this area has concentrated on two goals: using less space than this naive algorithm, and finding pointer algorithms that use fewer equality tests.

{{Anchor|Tortoise and hare}}
===Floyd's tortoise and hare===
[[File:Tortoise and hare algorithm.svg|thumb|upright=1.25|Floyd's "tortoise and hare" cycle detection algorithm, applied to the sequence 2, 0, 6, 3, 1, 6, 3, 1, ...]]
'''Floyd's cycle-finding algorithm''' is a pointer algorithm that uses only two pointers, which move through the sequence at different speeds. It is also called the "tortoise and the hare algorithm", alluding to Aesop's fable of [[The Tortoise and the Hare]].

The algorithm is named after [[Robert W. Floyd]], who was credited with its invention by [[Donald Knuth]].<ref name="knuth">{{citation|first=Donald E.|last=Knuth|author-link=Donald Knuth|title=The Art of Computer Programming, vol. II: Seminumerical Algorithms|publisher=Addison-Wesley|year=1969|page=7, exercises 6 and 7}}</ref><ref>''Handbook of Applied Cryptography,'' by Alfred J. Menezes, Paul C. van Oorschot, Scott A. Vanstone, [https://books.google.com/books?id=nSzoG72E93MC&pg=PA125 p. 125], describes this algorithm and others</ref> However, the algorithm does not appear in Floyd's published work, and this may be a misattribution: Floyd describes algorithms for listing all simple cycles in a [[directed graph]] in a 1967 paper,<ref>{{citation|last=Floyd|first=R.W.|author-link=Robert W. Floyd|title=Nondeterministic Algorithms|journal=J. ACM|pages=636–644|volume=14|issue=4|year=1967|doi=10.1145/321420.321422|s2cid=1990464|doi-access=free}}</ref> but this paper does not describe the cycle-finding problem in functional graphs that is the subject of this article. In fact, Knuth's statement (in 1969), attributing it to Floyd, without citation, is the first known appearance in print, and it thus may be a [[Mathematical folklore|folk theorem]], not attributable to a single individual.<ref>''The Hash Function BLAKE,'' by Jean-Philippe Aumasson, Willi Meier, Raphael C.-W. Phan, Luca Henzen (2015), [https://books.google.com/books?id=nhPmBQAAQBAJ&pg=PA21 p. 21], footnote 8</ref>

The key insight in the algorithm is as follows. If there is a cycle, then, for any integers {{math|''i'' ≥ ''μ''}} and {{math|''k'' ≥ 0}}, {{math|1=''x<sub>i</sub>'' = ''x''<sub>''i'' + ''kλ''</sub>}}, where {{mvar|λ}} is the length of the loop to be found, {{mvar|μ}} is the index of the first element of the cycle, and {{mvar|k}} is a whole integer representing the number of loops. Based on this, it can then be shown that {{math|1=''i'' = ''kλ'' ≥ ''μ''}} for some {{math|''k''}} if and only if {{math|1=''x<sub>i</sub>'' = ''x''<sub>2''i''</sub>}} (if {{math|1=''x<sub>i</sub>'' = ''x''<sub>2''i''</sub>}} in the cycle, then there exists some {{mvar|k}} such that {{math|1=2''i'' = ''i'' + ''kλ''}}, which implies that {{math|1=''i'' = ''kλ''}}; and if there are some {{mvar|i}} and {{mvar|k}} such that {{math|1=''i'' = ''kλ''}}, then {{math|1=''2i'' = ''i'' + ''kλ''}} and {{math|1= ''x''<sub>2''i''</sub> = ''x''<sub>''i'' + ''kλ''</sub>}}).  Thus, the algorithm only needs to check for repeated values of this special form, one twice as far from the start of the sequence as the other, to find a period {{mvar|ν}} of a repetition that is a multiple of {{mvar|λ}}. Once {{mvar|ν}} is found, the algorithm retraces the sequence from its start to find the first repeated value {{math|''x''<sub>''μ''</sub>}} in the sequence, using the fact that {{mvar|λ}} divides {{mvar|ν}} and therefore that {{math|1=''x''<sub>''μ''</sub> = ''x''<sub>''μ'' + ''v''</sub>}}.  Finally, once the value of {{mvar|μ}} is known it is trivial to find the length {{mvar|λ}} of the shortest repeating cycle, by searching for the first position {{math|''μ'' + ''λ''}} for which {{math|1=''x''<sub>''μ'' + ''λ''</sub> = ''x''<sub>''μ''</sub>}}.

The algorithm thus maintains two pointers into the given sequence, one (the tortoise) at {{math|''x<sub>i</sub>''}}, and the other (the hare) at {{math|''x''<sub>2''i''</sub>}}. At each step of the algorithm, it increases {{mvar|i}} by one, moving the tortoise one step forward and the hare two steps forward in the sequence, and then compares the sequence values at these two pointers. The smallest value of {{math|''i'' &gt; 0}} for which the tortoise and hare point to equal values is the desired value {{mvar|ν}}.

The following [[Python (programming language)|Python]] code shows how this idea may be implemented as an algorithm.

<syntaxhighlight lang="python">
def floyd(f, x0) -> (int, int):
    """Floyd's cycle detection algorithm."""
    # Main phase of algorithm: finding a repetition x_i = x_2i.
    # The hare moves twice as quickly as the tortoise and
    # the distance between them increases by 1 at each step.
    # Eventually they will both be inside the cycle and then,
    # at some point, the distance between them will be
    # divisible by the period λ.
    tortoise = f(x0) # f(x0) is the element/node next to x0.
    hare = f(f(x0))
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(f(hare))
  
    # At this point the tortoise position, ν, which is also equal
    # to the distance between hare and tortoise, is divisible by
    # the period λ. So hare moving in cycle one step at a time, 
    # and tortoise (reset to x0) moving towards the cycle, will 
    # intersect at the beginning of the cycle. Because the 
    # distance between them is constant at 2ν, a multiple of λ,
    # they will agree as soon as the tortoise reaches index μ.

    # Find the position μ of first repetition.    
    mu = 0
    tortoise = x0
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(hare)   # Hare and tortoise move at same speed
        mu += 1
 
    # Find the length of the shortest cycle starting from x_μ
    # The hare moves one step at a time while tortoise is still.
    # lam is incremented until λ is found.
    lam = 1
    hare = f(tortoise)
    while tortoise != hare:
        hare = f(hare)
        lam += 1
 
    return lam, mu
</syntaxhighlight>

This code only accesses the sequence by storing and copying pointers, function evaluations, and equality tests; therefore, it qualifies as a pointer algorithm. The algorithm uses {{math|''O''(''λ'' + ''μ'')}} operations of these types, and {{math|''O''(1)}} storage space.<ref>{{harvtxt|Joux|2009}}, Section 7.1.1, Floyd's cycle-finding algorithm, pp. 225–226.</ref>

===Brent's algorithm===

[[Richard Brent (scientist)|Richard P. Brent]] described an alternative cycle detection algorithm that, like the tortoise and hare algorithm, requires only two pointers into the sequence.<ref name="brent">{{citation|first=R. P.|last=Brent|author-link=Richard Brent (scientist)|title=An improved Monte Carlo factorization algorithm|journal=[[BIT Numerical Mathematics ]]|volume=20|year=1980|pages=176–184|url=https://maths-people.anu.edu.au/~brent/pd/rpb051i.pdf|doi=10.1007/BF01933190|issue=2|s2cid=17181286}}.</ref> However, it is based on a different principle: searching for the smallest [[power of two]] {{math|2<sup>''i''</sup>}} that is larger than both {{mvar|λ}} and {{mvar|μ}}. For {{math|1=''i'' = 0, 1, 2, ...}}, the algorithm compares {{math|''x''<sub>2<sup>''i''</sup>&minus;1</sub>}} with each subsequent sequence value up to the next power of two, stopping when it finds a match. It has two advantages compared to the tortoise and hare algorithm: it finds the correct length {{mvar|λ}} of the cycle directly, rather than needing to search for it in a subsequent stage, and its steps involve only one evaluation of the function {{mvar|f}} rather than three.<ref>{{harvtxt|Joux|2009}}, Section 7.1.2, Brent's cycle-finding algorithm, pp. 226–227.</ref>

The following Python code shows how this technique works in more detail.
<syntaxhighlight lang="python">
def brent(f, x0) -> (int, int):
    """Brent's cycle detection algorithm."""
    # main phase: search successive powers of two
    power = lam = 1
    tortoise = x0
    hare = f(x0)  # f(x0) is the element/node next to x0.
    # this assumes there is a cycle; otherwise this loop won't terminate
    while tortoise != hare:
        if power == lam:  # time to start a new power of two?
            tortoise = hare
            power *= 2
            lam = 0
        hare = f(hare)
        lam += 1

    # Find the position of the first repetition of length λ
    tortoise = hare = x0
    for i in range(lam):
        hare = f(hare)
    # The distance between the hare and tortoise is now λ.

    # Next, the hare and tortoise move at same speed until they agree
    mu = 0
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(hare)
        mu += 1
 
    return lam, mu
</syntaxhighlight>

Like the tortoise and hare algorithm, this is a pointer algorithm that uses {{math|''O''(''λ'' + ''μ'')}} tests and function evaluations and {{math|''O''(1)}} storage space. It is not difficult to show that the number of function evaluations can never be higher than for Floyd's algorithm. Brent claims that, on average, his cycle finding algorithm runs around 36% more quickly than Floyd's and that it speeds up the [[Pollard rho algorithm]] by around 24%. He also performs an [[Average-case complexity|average case]] analysis for a randomized version of the algorithm in which the sequence of indices traced by the slower of the two pointers is not the powers of two themselves, but rather a randomized multiple of the powers of two. Although his main intended application was in integer factorization algorithms, Brent also discusses applications in testing pseudorandom number generators.<ref name="brent"/>

===Gosper's algorithm===
[[Bill Gosper|R. W. Gosper]]'s algorithm<ref name="gosper-impl">{{Cite web |title=Loop detectors of Floyd and Gosper |url=http://www.hackersdelight.org/hdcodetxt/loopdet.c.txt |website=[[Hacker's Delight]] |first=Henry S. Jr. |last=Warren |archive-url=https://web.archive.org/web/20160414011322/http://hackersdelight.org/hdcodetxt/loopdet.c.txt |archive-date=2016-04-14 |access-date=2017-02-08}}</ref><ref name="hakmem-132">{{Cite web |title=Hakmem -- Flows and Iterated Functions -- Draft, Not Yet Proofed |url=http://www.inwap.com/pdp10/hbaker/hakmem/flows.html |archive-url=https://web.archive.org/web/20200318230848/http://www.inwap.com/pdp10/hbaker/hakmem/flows.html |archive-date=2020-03-18 |access-date=2024-05-02}}</ref> finds the period <math>\lambda</math>, and the lower and upper bound of the starting point, <math>\mu_l</math> and <math>\mu_u</math>, of the first cycle. The difference between the lower and upper bound is of the same order as the period, i.e. <math>\mu_l + \lambda \approx \mu_h</math>.

The algorithm maintains an array of tortoises <math>T_j</math>.  For each <math>x_i</math>:
* For each <math>0 \le j \le \log_2 i,</math> compare <math>x_i</math> to <math>T_j</math>.
* If <math>x_i = T_j</math>, a cycle has been detected, of length <math>\lambda = (i - 2^j) \bmod 2^{j+1} + 1.</math>
* If no match is found, set <math>T_k \leftarrow x_i</math>, where <math>k</math> is the [[Count trailing zeros|number of trailing zeros]] in the binary representation of <math>i+1</math>.  I.e. the greatest power of 2 which divides <math>i+1</math>.

If it is inconvenient to vary the number of comparisons as <math>i</math> increases, you may initialize all of the <math>T_j = x_0</math>, but must then return <math>\lambda = i</math> if <math>x_i = T_j</math> while <math>i < 2^j</math>.

====Advantages====

The main features of Gosper's algorithm are that it is economical in space, very economical in evaluations of the generator function, and always finds the exact cycle length (never a multiple). The cost is a large number of equality comparisons.  It could be roughly described as a concurrent version of Brent's algorithm. While Brent's algorithm uses a single tortoise, repositioned every time the hare passes a power of two, Gosper's algorithm uses several tortoises (several previous values are saved), which are roughly exponentially spaced. According to the note in [[HAKMEM]] item 132,<ref name="hakmem-132" /> this algorithm will detect repetition before the third occurrence of any value, i.e. the cycle will be iterated at most twice. HAKMEM also states that it is sufficient to store <math>\lceil\log_2\lambda\rceil</math> previous values; however, this only offers a saving if we know ''a priori'' that <math>\lambda</math> is significantly smaller than <math>\mu</math>.  The standard implementations<ref name="gosper-impl"/> store <math>\lceil\log_2 (\mu + 2\lambda)\rceil</math> values. For example, assume the function values are 32-bit integers, so <math>\mu + \lambda \le 2^{32}</math> and <math>\mu + 2\lambda \le 2^{33}.</math>  Then Gosper's algorithm will find the cycle after less than <math>\mu + 2\lambda</math> function evaluations (in fact, the most possible is <math>3\cdot 2^{31} - 1</math>), while consuming the space of 33 values (each value being a 32-bit integer).

====Complexity====

Upon the <math>i</math>-th evaluation of the generator function, the algorithm compares the generated value with <math>\log_2 i</math> previous values; observe that <math>i</math> goes up to at least <math>\mu + \lambda</math> and at most <math>\mu + 2\lambda</math>. Therefore, the time complexity of this algorithm is <math>O((\mu + \lambda) \cdot \log (\mu + \lambda))</math>. Since it stores <math>\log_2 (\mu + 2\lambda)</math> values, its space complexity is <math>\Theta(\log (\mu + \lambda))</math>. This is under the usual [[transdichotomous model]], assumed throughout this article, in which the size of the function values is constant. Without this assumption, we know it requires <math>\Omega(\log (\mu + \lambda))</math> space to store <math>\mu + \lambda</math> distinct values, so the overall space complexity is <math>\Omega(\log^2 (\mu + \lambda)).</math>

===Time–space tradeoffs===
A number of authors have studied techniques for cycle detection that use more memory than Floyd's and Brent's methods, but detect cycles more quickly. In general these methods store several previously-computed sequence values, and test whether each new value equals one of the previously-computed values. In order to do so quickly, they typically use a [[hash table]] or similar data structure for storing the previously-computed values, and therefore are not pointer algorithms: in particular, they usually cannot be applied to Pollard's rho algorithm. Where these methods differ is in how they determine which values to store. Following Nivasch,<ref name="nivasch">{{citation|first=Gabriel|last=Nivasch|title=Cycle detection using a stack|journal=[[Information Processing Letters]]|volume=90|year=2004|pages=135–140|doi=10.1016/j.ipl.2004.01.016|issue=3}}.</ref> we survey these techniques briefly.

*Brent<ref name="brent"/> already describes variations of his technique in which the indices of saved sequence values are powers of a number {{mvar|R}} other than two. By choosing {{mvar|R}} to be a number close to one, and storing the sequence values at indices that are near a sequence of consecutive powers of {{mvar|R}}, a cycle detection algorithm can use a number of function evaluations that is within an arbitrarily small factor of the optimum {{math|''λ'' + ''μ''}}.<ref>{{citation|first1=Claus P.|last1=Schnorr|author-link1=Claus P. Schnorr|first2=Hendrik W.|last2=Lenstra|author-link2=Hendrik Lenstra|title=A Monte Carlo factoring algorithm with linear storage|journal=[[Mathematics of Computation]]|volume=43|issue=167|year=1984|pages=289–311|doi=10.2307/2007414|jstor=2007414|hdl=1887/3815|hdl-access=free}}.</ref><ref name="teske">{{citation|first=Edlyn|last=Teske|title=A space-efficient algorithm for group structure computation|journal=[[Mathematics of Computation]]|volume=67|issue=224|year=1998|pages=1637–1663|doi=10.1090/S0025-5718-98-00968-5|bibcode=1998MaCom..67.1637T|doi-access=free}}.</ref>
*Sedgewick, Szymanski, and Yao<ref>{{citation|first1=Robert|last1=Sedgewick|author-link1=Robert Sedgewick (computer scientist)|first2=Thomas G.|last2=Szymanski|first3=Andrew C.-C.|last3=Yao|author-link3=Andrew Yao|title=The complexity of finding cycles in periodic functions|journal=[[SIAM Journal on Computing]]|volume=11|issue=2|year=1982|pages=376–390|doi=10.1137/0211030}}.</ref>  provide a method that uses {{mvar|M}} memory cells and requires in the worst case only <math>(\lambda+\mu)(1+cM^{-1/2})</math> function evaluations, for some constant {{mvar|c}}, which they show to be optimal. The technique involves maintaining a numerical parameter {{mvar|d}}, storing in a table only those positions in the sequence that are multiples of {{mvar|d}}, and clearing the table and doubling {{mvar|d}} whenever too many values have been stored.
*Several authors have described ''distinguished point'' methods that store function values in a table based on a criterion involving the values, rather than (as in the method of Sedgewick et al.) based on their positions. For instance, values equal to zero modulo some value {{mvar|d}} might be stored.<ref>{{citation|first1=Paul C.|last1=van Oorschot|first2=Michael J.|last2=Wiener|title=Parallel collision search with cryptanalytic applications|journal=[[Journal of Cryptology]]|volume=12|issue=1|year=1999|pages=1–28|doi=10.1007/PL00003816|s2cid=5091635|url=https://ir.library.carleton.ca/pub/19076|doi-access=free}}.</ref><ref name="qd">{{citation|first1=J.-J.|last1=Quisquater|first2=J.-P.|last2=Delescaille|contribution=How easy is collision search? Application to DES|title=Advances in Cryptology – EUROCRYPT '89, Workshop on the Theory and Application of Cryptographic Techniques|series=Lecture Notes in Computer Science|date=1990 |volume=434|publisher=Springer-Verlag|pages=429–434|doi=10.1007/3-540-46885-4_43|doi-access=free|isbn=978-3-540-53433-4 }}.</ref> More simply, Nivasch<ref name="nivasch"/> credits D. P. Woodruff with the suggestion of storing a random sample of previously seen values, making an appropriate random choice at each step so that the sample remains random.
*Nivasch<ref name="nivasch"/> describes an algorithm that does not use a fixed amount of memory, but for which the expected amount of memory used (under the assumption that the input function is random) is logarithmic in the sequence length. An item is stored in the memory table, with this technique, when no later item has a smaller value. As Nivasch shows, the items with this technique can be maintained using a [[Stack (data structure)|stack data structure]], and each successive sequence value need be compared only to the top of the stack. The algorithm terminates when the repeated sequence element with smallest value is found. Running the same algorithm with multiple stacks, using random permutations of the values to reorder the values within each stack, allows a time–space tradeoff similar to the previous algorithms. However, even the version of this algorithm with a single stack is not a pointer algorithm, due to the comparisons needed to determine which of two values is smaller.

Any cycle detection algorithm that stores at most {{mvar|M}} values from the input sequence must perform at least <math>(\lambda+\mu)\left(1+\frac{1}{M-1}\right)</math> function evaluations.<ref name="fich">{{citation|first=Faith Ellen|last=Fich|author-link=Faith Ellen|contribution=Lower bounds for the cycle detection problem|title=Proc. 13th ACM [[Symposium on Theory of Computing]]|series=Stoc '81 |year=1981|doi=10.1145/800076.802462|pages=96–105|isbn=978-1-4503-7392-0 |s2cid=119742106 }}.</ref><ref>{{citation|first1=Eric W.|last1=Allender|author-link1=Eric Allender|first2=Maria M.|last2=Klawe|author-link2=Maria Klawe|title=Improved lower bounds for the cycle detection problem|journal=[[Theoretical Computer Science (journal)|Theoretical Computer Science]]|volume=36|issue=2–3|pages=231–237|year=1985|doi=10.1016/0304-3975(85)90044-1|doi-access=free}}.</ref>