Merge algorithm

Merge algorithms are a family of algorithms that take multiple sorted lists as input and produce a single list as output, containing all the elements of the input lists in sorted order. These algorithms are used as subroutines in various sorting algorithms, most famously merge sort.

Application

Figure (merge sort algorithm diagram): A graph exemplifying merge sort. Two red arrows starting from the same node indicate a split, while two green arrows ending at the same node correspond to an execution of the merge algorithm.

The merge algorithm plays a critical role in the merge sort algorithm, a comparison-based sorting algorithm. Conceptually, the merge sort algorithm consists of two steps:

1. Recursively divide the list into sublists of (roughly) equal length until each sublist contains only one element; in iterative (bottom-up) merge sort, instead treat a list of n elements as n sublists of size 1. A list containing a single element is, by definition, sorted.
2. Repeatedly merge sublists to create new sorted sublists until a single list contains all elements. That single list is the sorted list.

The merge algorithm is used repeatedly in the merge sort algorithm.

An example merge sort is given in the illustration. It starts with an unsorted array of 7 integers. The array is divided into 7 partitions; each partition contains 1 element and is therefore sorted. The sorted partitions are then merged to produce larger sorted partitions until 1 partition, the sorted array, is left.
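
The two steps can be sketched in Python as a bottom-up merge sort. This is a minimal illustration, not a tuned implementation: the function name `bottom_up_merge_sort` and the sample values are chosen here for illustration, and the standard library's `heapq.merge` stands in for the two-way merge step.

```python
from heapq import merge  # standard-library merge of sorted iterables (stand-in for the merge step)


def bottom_up_merge_sort(items):
    """Sort by repeatedly merging neighbouring sorted runs."""
    # Step 1: treat the n elements as n sorted runs of length 1.
    runs = [[x] for x in items]
    # Step 2: merge neighbouring runs until a single run remains.
    while len(runs) > 1:
        merged_runs = []
        for i in range(0, len(runs), 2):
            if i + 1 < len(runs):
                merged_runs.append(list(merge(runs[i], runs[i + 1])))
            else:
                merged_runs.append(runs[i])  # odd run out, carried to the next round
        runs = merged_runs
    return runs[0] if runs else []


print(bottom_up_merge_sort([38, 27, 43, 3, 9, 82, 10]))  # [3, 9, 10, 27, 38, 43, 82]
```

With 7 input values the loop performs three rounds of merging (7 runs, then 4, then 2, then 1).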

Merging two lists

Merging two sorted lists into one can be done in linear time and linear or constant space (depending on the data access model). The following pseudocode demonstrates an algorithm that merges input lists (either linked lists or arrays) A and B into a new list C. The function head yields the first element of a list; "dropping" an element means removing it from its list, typically by incrementing a pointer or index.

algorithm merge(A, B) is
    inputs A, B : list
    returns list

    C := new empty list
    while A is not empty and B is not empty do
        if head(A) ≤ head(B) then
            append head(A) to C
            drop the head of A
        else
            append head(B) to C
            drop the head of B

    // By now, either A or B is empty. It remains to empty the other input list.
    while A is not empty do
        append head(A) to C
        drop the head of A
    while B is not empty do
        append head(B) to C
        drop the head of B

    return C
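
A fairly direct Python rendering of the pseudocode might look as follows. It is a sketch under one assumption: `collections.deque` is used so that "head" (`a[0]`) and "drop" (`popleft`) are both constant-time; the inputs are copied so the caller's lists are left untouched.

```python
from collections import deque


def merge(a, b):
    """Merge two sorted sequences into a new sorted list."""
    a, b = deque(a), deque(b)  # copies, so the callers' inputs are not consumed
    c = []
    while a and b:
        if a[0] <= b[0]:       # head(A) <= head(B); ties take from A first
            c.append(a.popleft())
        else:
            c.append(b.popleft())
    # At most one of a, b is still non-empty; its remainder is already sorted.
    c.extend(a)
    c.extend(b)
    return c


print(merge([1, 4, 7], [2, 4, 9]))  # [1, 2, 4, 4, 7, 9]
```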

When the inputs are linked lists, this algorithm can be implemented to use only a constant amount of working space; the pointers in the lists' nodes can be reused for bookkeeping and for constructing the final merged list.
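
As an illustration of the constant-space variant, the sketch below relinks the existing nodes instead of allocating a new list. The `Node` class and the helpers `from_list` and `to_list` are hypothetical names introduced only for this example; the single placeholder node is the only extra allocation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    next: Optional["Node"] = None


def merge_linked(a, b):
    """Merge two sorted linked lists by relinking their nodes; returns the new head."""
    dummy = Node(0)  # placeholder head, so the first append needs no special case
    tail = dummy
    while a is not None and b is not None:
        if a.value <= b.value:
            tail.next, a = a, a.next  # link a's node, then advance a
        else:
            tail.next, b = b, b.next  # link b's node, then advance b
        tail = tail.next
    tail.next = a if a is not None else b  # attach whatever remains
    return dummy.next


def from_list(values):
    """Build a linked list from a Python list (helper for the example)."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def to_list(node):
    """Collect linked-list values into a Python list (helper for the example)."""
    out = []
    while node is not None:
        out.append(node.value)
        node = node.next
    return out


print(to_list(merge_linked(from_list([1, 3, 5]), from_list([2, 4, 6, 8]))))  # [1, 2, 3, 4, 5, 6, 8]
```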

In the merge sort algorithm, this subroutine is typically used to merge two adjacent sorted sub-arrays of a single array. This can be done by copying the sub-arrays into a temporary array, then applying the merge algorithm above. The allocation of a temporary array can be avoided, but at the expense of speed and programming ease. Various in-place merge algorithms have been devised, sometimes sacrificing the linear-time bound to produce an O(n log n) algorithm; see the article on merge sort for discussion.
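
The temporary-copy approach can be sketched as below. The function name `merge_adjacent` and the half-open index convention (`arr[lo:mid]` and `arr[mid:hi]`) are assumptions made for this example, not notation taken from the text above.

```python
def merge_adjacent(arr, lo, mid, hi):
    """Merge the sorted runs arr[lo:mid] and arr[mid:hi] back into arr[lo:hi]."""
    left = arr[lo:mid]   # temporary copies of both runs
    right = arr[mid:hi]
    i = j = 0
    k = lo
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            arr[k] = left[i]
            i += 1
        else:
            arr[k] = right[j]
            j += 1
        k += 1
    # Exactly hi - k elements remain, all in one of the two copies.
    arr[k:hi] = left[i:] + right[j:]


data = [9, 1, 4, 7, 2, 3, 8, 0]
merge_adjacent(data, 1, 4, 7)  # runs [1, 4, 7] and [2, 3, 8]
print(data)  # [9, 1, 2, 3, 4, 7, 8, 0]
```

Elements outside `arr[lo:hi]` are left untouched, which is what lets merge sort call such a routine on sub-ranges of one shared array.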

K-way merging

k-way merging generalizes binary merging to an arbitrary number k of sorted input lists. Applications of k-way merging arise in various sorting algorithms, including patience sorting and an external sorting algorithm that divides its input into k blocks that fit in memory, sorts these one by one, then merges these blocks.

Several solutions to this problem exist. A naive solution is to loop over the k lists to pick off the minimum element each time, and repeat this loop until all lists are empty:


Input: a list of k lists.
While any of the lists is non-empty:
- Loop over the lists to find the one with the minimum first element.
- Output the minimum element and remove it from its list.
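
A minimal Python sketch of this naive loop follows. The name `naive_k_way_merge` is hypothetical, and a per-list position index plays the role of "removing" an element from its list.

```python
def naive_k_way_merge(lists):
    """Merge k sorted lists by scanning all list heads for each output element."""
    positions = [0] * len(lists)  # index of the current first element of each list
    output = []
    while True:
        best = None  # index of the list whose head is smallest so far
        for idx, lst in enumerate(lists):
            if positions[idx] < len(lst):
                if best is None or lst[positions[idx]] < lists[best][positions[best]]:
                    best = idx
        if best is None:  # every list is exhausted
            return output
        output.append(lists[best][positions[best]])
        positions[best] += 1  # "remove" the element from its list


print(naive_k_way_merge([[1, 5, 9], [2, 3], [], [0, 7]]))  # [0, 1, 2, 3, 5, 7, 9]
```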


In the worst case, this algorithm performs (k−1)(n−k/2) element comparisons if the lists contain a total of n elements. It can be improved by storing the lists in a priority queue (min-heap) keyed by their first element:


Build a min-heap h of the k lists, using the first element as the key.
While any of the lists is non-empty:
- Let i = find-min(h).
- Output the first element of list i and remove it from its list.
- Re-heapify h.
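
The heap-based steps can be sketched with Python's `heapq` module. This is an illustrative version, with the hypothetical name `heap_k_way_merge`: rather than storing the lists themselves in the heap, it stores tuples of (first element, list index, position), which gives the same keying by first element.

```python
import heapq


def heap_k_way_merge(lists):
    """Merge k sorted lists using a min-heap keyed by each list's current first element."""
    # One heap entry per non-empty list: (current first element, list index, position).
    heap = [(lst[0], i, 0) for i, lst in enumerate(lists) if lst]
    heapq.heapify(heap)
    output = []
    while heap:
        value, i, pos = heap[0]  # find-min: the smallest current head
        output.append(value)
        if pos + 1 < len(lists[i]):
            # Replace the root with list i's next element and restore heap order.
            heapq.heapreplace(heap, (lists[i][pos + 1], i, pos + 1))
        else:
            heapq.heappop(heap)  # list i is exhausted
    return output


print(heap_k_way_merge([[1, 5, 9], [2, 3], [], [0, 7]]))  # [0, 1, 2, 3, 5, 7, 9]
```

Including the list index in each tuple breaks ties between equal values without comparing anything beyond the index.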


Searching for the next smallest element to be output (find-min) and restoring heap order can now be done in O(log k) time (more specifically, 2⌊log k⌋ comparisons), and the full problem can be solved in O(n log k) time (approximately 2n⌊log k⌋ comparisons).

A third algorithm for the problem is a divide and conquer solution that builds on the binary merge algorithm:


If k = 1, output the single input list.
If k = 2, perform a binary merge.
Else, recursively merge the first ⌊k/2⌋ lists and the final ⌈k/2⌉ lists, then binary merge these.
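
A short Python sketch of the divide-and-conquer scheme is given below. The name `dc_k_way_merge` is hypothetical, the empty-input case is an addition for robustness, and `heapq.merge` called with two arguments is used as a stand-in for the binary merge.

```python
from heapq import merge as binary_merge  # used with two inputs as a stand-in binary merge


def dc_k_way_merge(lists):
    """Merge k sorted lists by recursively merging halves of the list of lists."""
    k = len(lists)
    if k == 0:
        return []  # not part of the scheme above; added so empty input is handled
    if k == 1:
        return list(lists[0])
    if k == 2:
        return list(binary_merge(lists[0], lists[1]))
    half = k // 2  # first floor(k/2) lists, final ceil(k/2) lists
    left = dc_k_way_merge(lists[:half])
    right = dc_k_way_merge(lists[half:])
    return list(binary_merge(left, right))


print(dc_k_way_merge([[3], [1, 8], [2, 4, 6], [0, 5, 7, 9]]))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```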


When the input lists to this algorithm are ordered by length, shortest first, it requires fewer than n⌈log k⌉ comparisons, i.e., less than half the number used by the heap-based algorithm; in practice, it may be about as fast or slow as the heap-based algorithm.

Parallel merge

A parallel version of the binary merge algorithm can serve as a building block of a parallel merge sort. The following pseudocode demonstrates this algorithm in a parallel divide-and-conquer style (adapted from Cormen et al., Introduction to Algorithms). It operates on two sorted arrays A and B and writes the sorted output to array C. The notation A[i...j] denotes the part of A from index i through j, exclusive.

algorithm merge(A[i...j], B[k...ℓ], C[p...q]) is
    inputs A, B, C : array
           i, j, k, ℓ, p, q : indices

    let m = j - i,
        n = ℓ - k

    if m < n then
        swap A and B  // ensure that A is the larger array: i, j still belong to A; k, ℓ to B
        swap m and n

    if m ≤ 0 then
        return  // base case, nothing to merge

    let r = ⌊(i + j)/2⌋
    let s = binary-search(A[r], B[k...ℓ])
    let t = p + (r - i) + (s - k)
    C[t] = A[r]

    in parallel do
        merge(A[i...r], B[k...s], C[p...t])
        merge(A[r+1...j], B[s...ℓ], C[t+1...q])

The algorithm operates by splitting either A or B, whichever is larger, into (nearly) equal halves. It then splits the other array into a part with values smaller than the midpoint of the first, and a part with larger or equal values. (The binary search subroutine returns the index in B where A[r] would be, if it were in B; this is always a number between k and ℓ.) Finally, each pair of halves is merged recursively, and since the recursive calls are independent of each other, they can be done in parallel. A hybrid approach, in which a serial algorithm is used for the recursion base case, has been shown to perform well in practice.
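
The splitting logic can be sketched in Python as below. This version runs the two recursive calls one after the other, so it only illustrates the decomposition, not actual parallel execution; the name `dc_merge` and the argument layout are assumptions for this example, and `bisect.bisect_left` stands in for the binary search subroutine.

```python
from bisect import bisect_left


def dc_merge(a, i, j, b, k, l, c, p):
    """Merge sorted a[i:j] and b[k:l] into c, starting at index p."""
    m, n = j - i, l - k
    if m < n:
        # Make a[i:j] the longer run; the index bounds are swapped along with the arrays.
        a, i, j, b, k, l = b, k, l, a, i, j
        m, n = n, m
    if m <= 0:
        return  # both runs are empty
    r = (i + j) // 2                # midpoint of the longer run
    s = bisect_left(b, a[r], k, l)  # first index in b[k:l] whose value is >= a[r]
    t = p + (r - i) + (s - k)       # final position of a[r] in c
    c[t] = a[r]
    # The two calls below touch disjoint parts of c, so a parallel
    # runtime could execute them concurrently; here they run serially.
    dc_merge(a, i, r, b, k, s, c, p)
    dc_merge(a, r + 1, j, b, s, l, c, t + 1)


a = [1, 3, 5, 7, 9]
b = [2, 3, 4, 10]
c = [None] * (len(a) + len(b))
dc_merge(a, 0, len(a), b, 0, len(b), c, 0)
print(c)  # [1, 2, 3, 3, 4, 5, 7, 9, 10]
```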

The work performed by the algorithm for two arrays holding a total of n elements, i.e., the running time of a serial version of it, is O(n). This is optimal since n elements need to be copied into C. To calculate the span of the algorithm, it is necessary to derive a recurrence relation. Since the two recursive calls of merge are in parallel, only the costlier of the two calls needs to be considered. In the worst case, the maximum number of elements in one of the recursive calls is at most <math display="inline">\frac 3 4 n</math> since the array with more elements is perfectly split in half. Adding the <math>\Theta\left( \log(n)\right)</math> cost of the binary search, we obtain this recurrence as an upper bound:

<math>T_{\infty}^\text{merge}(n) = T_{\infty}^\text{merge}\left(\frac {3} {4} n\right) + \Theta\left( \log(n)\right)</math>

The solution is <math>T_{\infty}^\text{merge}(n) = \Theta\left(\log(n)^2\right)</math>, meaning that it takes that much time on an ideal machine with an unbounded number of processors.
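
One way to see this, as an informal sketch rather than a full proof, is to unroll the recurrence. The constant <math display="inline">c</math> below is introduced here only as a bound on the per-level binary-search cost. Each level shrinks the problem by a factor of <math display="inline">\frac 3 4</math>, so there are about <math display="inline">\log_{4/3}(n)</math> levels, each costing at most <math display="inline">c \log(n)</math>:

<math display="block">T_{\infty}^\text{merge}(n) \le \sum_{i=0}^{\log_{4/3}(n)} c \log\left(\left(\tfrac{3}{4}\right)^{i} n\right) \le \left(\log_{4/3}(n) + 1\right) \cdot c \log(n) = O\left(\log(n)^2\right)</math>

For the matching lower bound, note that in the first half of the levels the subproblem size is still at least <math display="inline">\sqrt{n}</math>, so each of those levels contributes a term of order <math display="inline">\tfrac{1}{2}\log(n)</math>, and there are <math display="inline">\Theta(\log(n))</math> such levels.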

Note: The routine is not stable: if equal items are separated by splitting A and B, they will become interleaved in C; also, swapping A and B will destroy the order if equal items are spread among both input arrays. As a result, when used for sorting, this algorithm produces a sort that is not stable.

Parallel merge of two lists

There are also algorithms that introduce parallelism within a single instance of merging of two sorted lists. These can be used in field-programmable gate arrays (FPGAs), specialized sorting circuits, as well as in modern processors with single-instruction multiple-data (SIMD) instructions.

Existing parallel algorithms are based on modifications of the merge part of either the bitonic sorter or odd-even mergesort. In 2018, Saitoh M. et al. introduced MMS for FPGAs, which focused on removing a multi-cycle feedback datapath that prevented efficient pipelining in hardware. Also in 2018, Papaphilippou P. et al. introduced FLiMS, which improved the hardware utilization and performance by only requiring <math>\log_2(P)+1</math> pipeline stages of P/2 compare-and-swap units to merge with a parallelism of P elements per FPGA cycle.

Language support

Some computer languages provide built-in or library support for merging sorted collections.

C++

The C++ Standard Template Library has the function std::merge, which merges two sorted ranges of iterators, and std::inplace_merge, which merges two consecutive sorted ranges in-place. In addition, the std::list (linked list) class has its own merge method which merges another list into itself. The type of the elements merged must support the less-than (<) operator, or it must be provided with a custom comparator.

C++17 allows for differing execution policies, namely sequential, parallel, and parallel-unsequenced.

Python

Python's standard library (since 2.6) also has a merge function in the heapq module, which takes multiple sorted iterables and merges them into a single iterator.
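
A brief usage example follows; the input values are made up for illustration. Because the result is a lazy iterator, it is wrapped in `list` to materialize it. The `key` parameter shown in the second call is available in Python 3.5 and later.

```python
import heapq

evens = [0, 2, 4, 6]
odds = [1, 3, 5]
tens = [10, 20]

# Merge any number of already-sorted iterables into one sorted stream.
merged = list(heapq.merge(evens, odds, tens))
print(merged)  # [0, 1, 2, 3, 4, 5, 6, 10, 20]

# With key=..., the inputs must be sorted according to that same key.
by_length = list(heapq.merge(["a", "ccc"], ["bb", "dddd"], key=len))
print(by_length)  # ['a', 'bb', 'ccc', 'dddd']
```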


External links

High Performance Implementation of Parallel and Serial Merge in C# with source in GitHub and in C++ GitHub
