== Advantages ==

=== Solving difficult problems ===
Divide and conquer is a powerful tool for solving conceptually difficult problems: all it requires is a way of breaking the problem into sub-problems, of solving the trivial cases, and of combining the solutions of the sub-problems into a solution of the original problem. Similarly, decrease and conquer only requires reducing the problem to a single smaller problem, as in the classic [[Tower of Hanoi]] puzzle, where moving a tower of height <math>n</math> reduces to moving a tower of height <math>n-1</math>.
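
The Tower of Hanoi reduction can be sketched in a few lines of Python. This is only an illustrative sketch: the function name <code>hanoi</code>, the peg labels, and the choice to return the moves as a list are assumptions made for the example.

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Return the list of (from_peg, to_peg) moves that transfers a tower
    of n disks from `source` to `target`, using `spare` as a helper peg."""
    if n == 0:
        # Trivial case: an empty tower needs no moves.
        return []
    # Reduce to the smaller problem: move the top n-1 disks out of the way,
    # move the largest disk, then move the n-1 disks on top of it.
    return (
        hanoi(n - 1, source, spare, target)
        + [(source, target)]
        + hanoi(n - 1, spare, target, source)
    )
```

For example, <code>hanoi(2)</code> yields the three moves A→B, A→C, B→C; in general the sketch produces <math>2^n - 1</math> moves.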

=== Algorithm efficiency ===
The divide-and-conquer paradigm often helps in the discovery of efficient algorithms.  It was the key, for example, to [[Karatsuba algorithm|Karatsuba]]'s fast multiplication method, the quicksort and mergesort algorithms, the [[Strassen algorithm]] for [[matrix multiplication]], and fast Fourier transforms.
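
As an illustration of one of these methods, the following Python sketch multiplies two non-negative integers in the style of Karatsuba, replacing four half-size products with three. The function name, the base-case threshold of 10, and the binary splitting point are assumptions chosen for brevity, not a tuned implementation.

```python
def karatsuba(x, y):
    """Multiply two non-negative integers using three recursive
    half-size products instead of four."""
    if x < 10 or y < 10:
        # Base case: small operands are multiplied directly.
        return x * y
    # Split both operands at bit position m (roughly half the larger length).
    m = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << m) - 1
    x_hi, x_lo = x >> m, x & mask
    y_hi, y_lo = y >> m, y & mask
    z0 = karatsuba(x_lo, y_lo)                          # low  * low
    z2 = karatsuba(x_hi, y_hi)                          # high * high
    # The cross terms come from a single extra product.
    z1 = karatsuba(x_lo + x_hi, y_lo + y_hi) - z0 - z2
    return (z2 << (2 * m)) + (z1 << m) + z0
```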

In all these examples, the D&C approach led to an improvement in the [[asymptotic complexity|asymptotic cost]] of the solution. For example, if (a) the [[Recursion (computer science)|base cases]] have constant-bounded size, (b) the work of splitting the problem and combining the partial solutions is proportional to the problem's size <math>n</math>, and (c) there is a bounded number <math>p</math> of sub-problems of size ~ <math>\frac{n}{p}</math> at each stage, then the cost of the divide-and-conquer algorithm will be <math>O(n\log_{p}n)</math>.
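
Merge sort is a standard instance of these conditions with <math>p = 2</math>: the base cases have size at most one, and splitting and merging take time proportional to <math>n</math>. A minimal Python sketch, with illustrative function names, might look as follows.

```python
def merge(left, right):
    """Combine two sorted lists into one sorted list in linear time."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of the two inputs still has elements left.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Return a sorted copy of `items` using two half-size sub-problems."""
    if len(items) <= 1:
        # Base case of constant-bounded size.
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))
```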

Running times can also be derived for other types of divide-and-conquer approaches. For example, suppose that (a) the work of splitting the problem and combining the partial solutions takes <math>cn</math> time, where <math>n</math> is the input size and <math>c</math> is some constant; (b) when <math>n < 2</math>, the algorithm takes time upper-bounded by <math>c</math>; and (c) there are <math>q</math> subproblems, each of size ~ <math>\frac{n}{2}</math>. Then the running times are as follows:

* if the number of subproblems <math>q > 2</math>, then the divide-and-conquer algorithm's running time is bounded by <math>O(n^{\log_{2}q})</math>.
* if the number of subproblems is exactly one, then the divide-and-conquer algorithm's running time is bounded by <math>O(n)</math>.<ref name="kleinberg&Tardos">{{cite book |last1=Kleinberg |first1=Jon |last2=Tardos |first2=Eva |title=Algorithm Design |date=March 16, 2005 |publisher=[[Pearson Education]] |isbn=9780321295354 |pages=214-220 |edition=1 |url=https://www.pearson.com/en-us/subject-catalog/p/algorithm-design/P200000003259/9780137546350 |access-date=26 January 2025}}</ref>
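
Both bounds can be seen by unrolling the recurrence, assuming for simplicity that <math>n</math> is a power of two and writing <math>T(n)</math> for the running time (a symbol introduced here for illustration). Level <math>j</math> of the recursion has <math>q^j</math> subproblems of size <math>n/2^j</math>, so

<math display="block">T(n) \le q\,T\!\left(\frac{n}{2}\right) + cn \quad\Longrightarrow\quad T(n) \le \sum_{j=0}^{\log_2 n} q^j \cdot c\,\frac{n}{2^j} = cn \sum_{j=0}^{\log_2 n} \left(\frac{q}{2}\right)^j.</math>

For <math>q = 1</math> the geometric sum is less than 2, giving <math>T(n) \le 2cn = O(n)</math>. For <math>q > 2</math> the sum is dominated, up to a constant factor, by its last term <math>(q/2)^{\log_2 n} = n^{\log_2 q}/n</math>, giving <math>T(n) = O(n^{\log_2 q})</math>.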

If, instead, the work of splitting the problem and combining the partial solutions takes <math>cn^2</math> time, and there are two subproblems, each of size <math>\frac{n}{2}</math>, then the running time of the divide-and-conquer algorithm is bounded by <math>O(n^2)</math>.<ref name="kleinberg&Tardos"/>
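
A short derivation of this bound, assuming <math>n</math> is a power of two and writing <math>T(n)</math> for the running time (notation introduced here for illustration): level <math>j</math> of the recursion has <math>2^j</math> subproblems of size <math>n/2^j</math>, each costing at most <math>c(n/2^j)^2</math>, so

<math display="block">T(n) \le \sum_{j=0}^{\log_2 n} 2^j \cdot c\left(\frac{n}{2^j}\right)^2 = cn^2 \sum_{j=0}^{\log_2 n} \frac{1}{2^j} \le 2cn^2.</math>

The work is therefore dominated by the top level of the recursion.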

=== Parallelism ===
Divide-and-conquer algorithms are naturally suited to execution on [[Multiprocessing|multi-processor]] machines, because distinct sub-problems can be executed on different processors. This is especially true of [[shared-memory]] systems, where the communication of data between [[Central processing unit|processors]] does not need to be planned in advance.
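
The following Python sketch illustrates the idea by summing a list: at each of the first few levels of recursion, one half is handed to another thread while the current thread handles the other half. The function name and the <code>depth</code> parameter are assumptions for the example. In CPython, the global interpreter lock means that threads will generally not speed up pure-Python arithmetic, so the sketch shows the structure rather than a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_sum(data, lo, hi, depth):
    """Sum data[lo:hi], running the two halves concurrently for the
    first `depth` levels of the recursion."""
    if depth == 0 or hi - lo <= 1:
        # Small or deep enough: solve this sub-problem sequentially.
        return sum(data[lo:hi])
    mid = (lo + hi) // 2
    # Each call owns a one-worker pool, so waiting on the result below
    # cannot starve other sub-problems of a thread.
    with ThreadPoolExecutor(max_workers=1) as pool:
        left = pool.submit(parallel_sum, data, lo, mid, depth - 1)
        right = parallel_sum(data, mid, hi, depth - 1)
        return left.result() + right
```

Because the two halves read disjoint ranges of the shared list, no communication between them needs to be arranged beyond returning the partial sums.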

=== Memory access ===
Divide-and-conquer algorithms naturally tend to make efficient use of [[memory cache]]s. The reason is that once a sub-problem is small enough, it and all its sub-problems can, in principle, be solved within the [[Cache (computing)|cache]], without accessing the slower [[Computer data storage|main memory]]. An algorithm designed to exploit the cache in this way is called ''[[cache-oblivious algorithm|cache-oblivious]]'', because it does not contain the cache size as an explicit [[Parameter (computer programming)|parameter]].<ref name="cahob">{{cite book | author = M. Frigo |author2=C. E. Leiserson |author3=H. Prokop |title=40th Annual Symposium on Foundations of Computer Science (Cat. No.99CB37039) |chapter=Cache-oblivious algorithms |pages=285–297 | year = 1999|chapter-url=https://dspace.mit.edu/bitstream/handle/1721.1/80568/43558192-MIT.pdf;sequence=2|doi=10.1109/SFFCS.1999.814600 |isbn=0-7695-0409-4 |s2cid=62758836 }}</ref> Moreover, D&C algorithms can be designed for important problems (e.g., sorting, FFTs, and matrix multiplication) to be ''optimal'' cache-oblivious algorithms: they use the cache in a provably optimal way, in an asymptotic sense, regardless of the cache size. In contrast, the traditional approach to exploiting the cache is ''blocking'', as in [[loop nest optimization]], where the problem is explicitly divided into chunks of the appropriate size. Blocking can also use the cache optimally, but only when the algorithm is tuned for the specific cache sizes of a particular machine.

The same advantage exists with regard to other hierarchical storage systems, such as [[Non-uniform memory access|NUMA]] or [[virtual memory]], as well as for multiple levels of cache: once a sub-problem is small enough, it can be solved within a given level of the hierarchy, without accessing the higher (slower) levels.
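
The recursive structure behind a cache-oblivious algorithm can be sketched with matrix transposition: the matrix is repeatedly halved along its longer dimension until the blocks are small. The function names and the <code>cutoff</code> parameter below are assumptions for the example; the cutoff only limits recursion overhead and is not a cache size. Python lists of lists are not stored contiguously, so this sketch shows the access pattern only and is not expected to exhibit the cache behaviour itself.

```python
def transpose_into(src, dst, r0, r1, c0, c1, cutoff=8):
    """Write the transpose of src[r0:r1][c0:c1] into dst.
    `cutoff` must be at least 1."""
    rows, cols = r1 - r0, c1 - c0
    if rows <= cutoff and cols <= cutoff:
        # Small block: copy it directly.
        for i in range(r0, r1):
            for j in range(c0, c1):
                dst[j][i] = src[i][j]
    elif rows >= cols:
        # Halve the longer dimension (rows) and recurse on both parts.
        mid = r0 + rows // 2
        transpose_into(src, dst, r0, mid, c0, c1, cutoff)
        transpose_into(src, dst, mid, r1, c0, c1, cutoff)
    else:
        # Halve the longer dimension (columns) and recurse on both parts.
        mid = c0 + cols // 2
        transpose_into(src, dst, r0, r1, c0, mid, cutoff)
        transpose_into(src, dst, r0, r1, mid, c1, cutoff)


def transposed(src, cutoff=8):
    """Return a new matrix (list of lists) that is the transpose of src."""
    rows = len(src)
    cols = len(src[0]) if rows else 0
    dst = [[None] * rows for _ in range(cols)]
    transpose_into(src, dst, 0, rows, 0, cols, cutoff)
    return dst
```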

=== Roundoff control ===
In computations with rounded arithmetic, e.g. with [[floating-point]] numbers, a divide-and-conquer algorithm may yield more accurate results than a superficially equivalent iterative method. For example, one can add ''N'' numbers either by a simple loop that adds each datum to a single variable, or by a D&C algorithm called [[pairwise summation]] that breaks the data set into two halves, recursively computes the sum of each half, and then adds the two sums.  While the second method performs the same number of additions as the first and pays the overhead of the recursive calls, it is usually more accurate.<ref>Nicholas J. Higham, "[https://pdfs.semanticscholar.org/5c17/9d447a27c40a54b2bf8b1b2d6819e63c1a69.pdf The accuracy of floating-point summation]", ''SIAM J. Scientific Computing'' '''14''' (4), 783–799 (1993).</ref>
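
The two summation methods can be compared with a short Python sketch. The function names are illustrative, and <code>math.fsum</code> from the standard library is used here only as an accurate reference value against which to measure the error of each method.

```python
import math


def naive_sum(values):
    """Add each value to a single running total."""
    total = 0.0
    for v in values:
        total += v
    return total


def pairwise_sum(values, lo=0, hi=None):
    """Sum values[lo:hi] by recursively summing each half."""
    if hi is None:
        hi = len(values)
    n = hi - lo
    if n == 0:
        return 0.0
    if n == 1:
        return values[lo]
    mid = lo + n // 2
    return pairwise_sum(values, lo, mid) + pairwise_sum(values, mid, hi)


def summation_errors(values):
    """Return (naive_error, pairwise_error) relative to math.fsum."""
    reference = math.fsum(values)
    return (abs(naive_sum(values) - reference),
            abs(pairwise_sum(values) - reference))
```

Practical implementations of pairwise summation usually switch to a simple loop for small blocks to reduce the overhead of the recursive calls; the sketch recurses down to single elements for clarity.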