Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
XOR swap algorithm
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
{{Short description|Binary arithmetic algorithm}} [[Image:XOR Swap.svg|thumb|upright=2|alt=With three XOR operations the binary values 1010 and 0011 are exchanged between variables.|Using the XOR swap algorithm to exchange [[nibble]]s between variables without the use of temporary storage]] In [[computer programming]], the '''exclusive or swap''' (sometimes shortened to '''XOR swap''') is an [[algorithm]] that uses the [[exclusive or]] [[bitwise operation]] to [[swap (computer science)|swap]] the values of two [[variable (programming)|variable]]s without using the temporary variable which is normally required. The algorithm is primarily a novelty and a way of demonstrating properties of the ''exclusive or'' operation. It is sometimes discussed as a [[program optimization]], but there are almost no cases where swapping via ''exclusive or'' provides benefit over the standard, obvious technique. ==The algorithm== Conventional swapping requires the use of a temporary storage variable. Using the XOR swap algorithm, however, no temporary storage is needed. The algorithm is as follows:<ref>{{cite web|url=http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/BitOp/xor.html|archive-url=https://web.archive.org/web/20140401081922/http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/BitOp/xor.html|url-status=dead|archive-date=2014-04-01 |title=The Magic of XOR |publisher=Cs.umd.edu |access-date=2014-04-02}}</ref><ref>{{cite web|url=http://graphics.stanford.edu/~seander/bithacks.html#SwappingValuesXOR|title=Swapping Values with XOR|publisher=graphics.stanford.edu|access-date=2014-05-02}}</ref> <syntaxhighlight lang="pascal"> X := Y XOR X; // XOR the values and store the result in X Y := X XOR Y; // XOR the values and store the result in Y X := Y XOR X; // XOR the values and store the result in X </syntaxhighlight> Since XOR is a [[commutative operation]], either X XOR Y or Y XOR X can be used interchangeably in any of the foregoing three lines. Note that on some architectures the first operand of the XOR instruction specifies the target location at which the result of the operation is stored, preventing this interchangeability. The algorithm typically corresponds to three [[machine code|machine-code]] instructions, represented by corresponding pseudocode and assembly instructions in the three rows of the following table: {| class="wikitable" style="width: 45em;" |- ! Pseudocode !! IBM [[System/370]] assembly !! x86 assembly !! RISC-V assembly |- | {{code|1=X := X XOR Y|2=pascal}} || {{code|1=XR R1,R2|2=asm}} || {{code|1=xor eax, ebx|2=asm}} || {{code|1=xor x10, x11 |2=asm}} |- | {{code|1=Y := Y XOR X|2=pascal}} || {{code|1=XR R2,R1|2=asm}} || {{code|1=xor ebx, eax|2=asm}} || {{code|1=xor x11, x10 |2=asm}} |- | {{code|1=X := X XOR Y|2=pascal}} || {{code|1=XR R1,R2|2=asm}} || {{code|1=xor eax, ebx|2=asm}} || {{code|1=xor x10, x11 |2=asm}} |} In the above System/370 assembly code sample, R1 and R2 are distinct [[processor register|register]]s, and each {{code|XR|2=asm}} operation leaves its result in the register named in the first argument. Using x86 assembly, values X and Y are in registers eax and ebx (respectively), and {{code|xor|2=asm}} places the result of the operation in the first register. In RISC-V assembly, value X and Y are in registers X10 and X11, and {{code|xor|2=asm}} places the result of the operation in the first register (same as X86) However, in the pseudocode or high-level language version or implementation, the algorithm fails if ''x'' and ''y'' use the same storage location, since the value stored in that location will be zeroed out by the first XOR instruction, and then remain zero; it will not be "swapped with itself". This is ''not'' the same as if ''x'' and ''y'' have the same values. The trouble only comes when ''x'' and ''y'' use the same storage location, in which case their values must already be equal. That is, if ''x'' and ''y'' use the same storage location, then the line: <syntaxhighlight lang="pascal"> X := X XOR Y </syntaxhighlight> sets ''x'' to zero (because ''x'' = ''y'' so X XOR Y is zero) ''and'' sets ''y'' to zero (since it uses the same storage location), causing ''x'' and ''y'' to lose their original values. ==Proof of correctness== The [[binary operation]] XOR over bit strings of length <math>N</math> exhibits the following properties (where <math>\oplus</math> denotes XOR):{{Efn|The first three properties, along with the existence of an inverse for each element, are the definition of an [[abelian group]]. The last property is the statement that every element is an [[Involution (mathematics)|involution]], that is, having [[Order (group theory)|order]] 2, which is not true of all abelian groups.}} * '''L1.''' [[Commutative operation|Commutativity]]: <math>A \oplus B = B \oplus A</math> * '''L2.''' [[Associativity]]: <math>(A \oplus B) \oplus C = A \oplus (B \oplus C)</math> * '''L3.''' [[Identity element|Identity exists]]: there is a bit string, 0, (of length ''N'') such that <math>A \oplus 0 = A</math> for any <math>A</math> * '''L4.''' Each element is its own [[inverse element|inverse]]: for each <math>A</math>, <math>A \oplus A = 0</math>. Suppose that we have two distinct registers <code>R1</code> and <code>R2</code> as in the table below, with initial values ''A'' and ''B'' respectively. We perform the operations below in sequence, and reduce our results using the properties listed above. {| class="wikitable" |- ! Step ! Operation ! Register 1 ! Register 2 ! Reduction |- | 0 || Initial value || <math>A</math> || <math>B</math> || — |- | 1 || <code>R1 := R1 XOR R2</code> || <math>A \oplus B</math> ||<math>B</math>|| — |- | 2 || <code>R2 := R1 XOR R2</code> || <math>A \oplus B</math> || <math>(A \oplus B) \oplus B = A \oplus (B \oplus B)</math><br><math>= A \oplus 0</math><br><math>=A</math> || '''L2<br> L4<br> L3''' |- | 3 || <code>R1 := R1 XOR R2</code> || <math>(A \oplus B) \oplus A = (B \oplus A) \oplus A</math><br><math> = B \oplus (A \oplus A)</math><br><math> = B \oplus 0 </math><br><math> = B </math> || <math>\ A</math> || '''L1<br>L2<br>L4<br>L3''' |} === Linear algebra interpretation === As XOR can be interpreted as binary addition and a pair of bits can be interpreted as a vector in a two-dimensional [[vector space]] over the [[field with two elements]], the steps in the algorithm can be interpreted as multiplication by 2×2 matrices over the field with two elements. For simplicity, assume initially that ''x'' and ''y'' are each single bits, not bit vectors. For example, the step: <syntaxhighlight lang="pascal"> X := X XOR Y </syntaxhighlight> which also has the implicit: <syntaxhighlight lang="pascal"> Y := Y </syntaxhighlight> corresponds to the matrix <math>\left(\begin{smallmatrix}1 & 1\\0 & 1\end{smallmatrix}\right)</math> as :<math>\begin{pmatrix}1 & 1\\0 & 1\end{pmatrix} \begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}x+y\\y\end{pmatrix}. </math> The sequence of operations is then expressed as: :<math> \begin{pmatrix}1 & 1\\0 & 1\end{pmatrix} \begin{pmatrix}1 & 0\\1 & 1\end{pmatrix} \begin{pmatrix}1 & 1\\0 & 1\end{pmatrix} = \begin{pmatrix}0 & 1\\1 & 0\end{pmatrix} </math> (working with binary values, so <math>1 + 1 = 0</math>), which expresses the [[elementary matrix]] of switching two rows (or columns) in terms of the [[Shear mapping|transvections]] (shears) of adding one element to the other. To generalize to where X and Y are not single bits, but instead bit vectors of length ''n'', these 2×2 matrices are replaced by 2''n''×2''n'' [[block matrices]] such as <math>\left(\begin{smallmatrix}I_n & I_n\\0 & I_n\end{smallmatrix}\right).</math> These matrices are operating on ''values,'' not on ''variables'' (with storage locations), hence this interpretation abstracts away from issues of storage location and the problem of both variables sharing the same storage location. ==Code example== A [[C (programming language)|C]] function that implements the XOR swap algorithm: <!-- This display template is broken in IE 6, the top half of the pretty bordered box is cut off. --> <syntaxhighlight lang="c"> void XorSwap(int *x, int *y) { if (x == y) return; *x ^= *y; *y ^= *x; *x ^= *y; } </syntaxhighlight> The code first checks if the addresses are distinct and uses a [[Guard (computer science)#Flatter code with less nesting|guard clause]] to exit the function early if they are equal. Without that check, if they were equal, the algorithm would fold to a triple <code>*x ^= *x</code> resulting in zero. The XOR swap algorithm can also be defined with a macro: <syntaxhighlight lang="c"> #define XORSWAP_UNSAFE(a, b) \ ((a) ^= (b), (b) ^= (a), \ (a) ^= (b)) /* Doesn't work when a and b are the same object - assigns zero \ (0) to the object in that case */ #define XORSWAP(a, b) \ ((&(a) == &(b)) ? (a) /* Check for distinct addresses */ \ : XORSWAP_UNSAFE(a, b)) </syntaxhighlight> ==Reasons for avoidance in practice== On modern [[CPU architecture]]s, the XOR technique can be slower than using a temporary variable to do swapping. At least on recent x86 CPUs, both by AMD and Intel, moving between registers regularly incurs zero latency. (This is called MOV-elimination.) Even if there is not any architectural register available to use, the <code>XCHG</code> instruction will be at least as fast as the three XORs taken together. Another reason is that modern CPUs strive to execute instructions in parallel via [[instruction pipeline]]s. In the XOR technique, the inputs to each operation depend on the results of the previous operation, so they must be executed in strictly sequential order, negating any benefits of [[instruction-level parallelism]].<ref>{{cite web |first1=Saman |last1=Amarasinghe |first2=Charles |last2=Leiserson |title=6.172 Performance Engineering of Software Systems, Lecture 2 |year=2010 |publisher=Massachusetts Institute of Technology |website=MIT OpenCourseWare |url=http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-172-performance-engineering-of-software-systems-fall-2010/video-lectures/lecture-2-bit-hacks/ |archive-url=https://web.archive.org/web/20150125102630/http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-172-performance-engineering-of-software-systems-fall-2010/video-lectures/lecture-2-bit-hacks/|url-status=dead|archive-date=2015-01-25 |access-date=27 January 2015}}</ref> ===Aliasing=== The XOR swap is also complicated in practice by [[aliasing (computing)|aliasing]]. If an attempt is made to XOR-swap the contents of some location with itself, the result is that the location is zeroed out and its value lost. Therefore, XOR swapping must not be used blindly in a high-level language if aliasing is possible. This issue does not apply if the technique is used in assembly to swap the contents of two registers. Similar problems occur with [[call by name]], as in [[Jensen's Device]], where swapping <code>i</code> and <code>A[i]</code> via a temporary variable yields incorrect results due to the arguments being related: swapping via <code>temp = i; i = A[i]; A[i] = temp</code> changes the value for <code>i</code> in the second statement, which then results in the incorrect <code>i</code> value for <code>A[i]</code> in the third statement. ==Variations== The underlying principle of the XOR swap algorithm can be applied to any operation meeting criteria L1 through L4 above. Replacing XOR by addition and subtraction gives various slightly different, but largely equivalent, formulations. For example:<ref>{{cite book |last1=Warren |first1=Henry S. |title=Hacker's delight |date=2003 |publisher=Addison-Wesley |location=Boston |isbn=0201914654 |page=39}}</ref> <syntaxhighlight lang="c"> void AddSwap( unsigned int* x, unsigned int* y ) { *x = *x + *y; *y = *x - *y; *x = *x - *y; } </syntaxhighlight> Unlike the XOR swap, this variation requires that the underlying processor or programming language uses a method such as [[modular arithmetic]] or [[bignum]]s to guarantee that the computation of <code>X + Y</code> cannot cause an error due to [[integer overflow]]. Therefore, it is seen even more rarely in practice than the XOR swap. However, the implementation of <code>AddSwap</code> above in the C programming language always works even in case of integer overflow, since, according to the C standard, addition and subtraction of unsigned integers follow the rules of [[modular arithmetic]], i. e. are done in the [[cyclic group]] <math>\mathbb Z/2^s\mathbb Z</math> where <math>s</math> is the number of bits of <code>unsigned int</code>. Indeed, the correctness of the algorithm follows from the fact that the formulas <math>(x + y) - y = x</math> and <math>(x + y) - ((x + y) - y) = y</math> hold in any [[abelian group]]. This generalizes the proof for the XOR swap algorithm: XOR is both the addition and subtraction in the abelian group <math>(\mathbb Z/2\mathbb Z)^{s}</math> (which is the [[direct sum]] of ''s'' copies of <math>\mathbb Z/2\mathbb Z</math>). This doesn't hold when dealing with the <code>signed int</code> type (the default for <code>int</code>). Signed integer overflow is an undefined behavior in C and thus modular arithmetic is not guaranteed by the standard, which may lead to incorrect results. The sequence of operations in <code>AddSwap</code> can be expressed via matrix multiplication as: :<math> \begin{pmatrix}1 & -1\\0 & 1\end{pmatrix} \begin{pmatrix}1 & 0\\1 & -1\end{pmatrix} \begin{pmatrix}1 & 1\\0 & 1\end{pmatrix} = \begin{pmatrix}0 & 1\\1 & 0\end{pmatrix} </math> == Application to register allocation == On architectures lacking a dedicated swap instruction, because it avoids the extra temporary register, the XOR swap algorithm is required for optimal [[register allocation]]. This is particularly important for [[compilers]] using [[static single assignment form]] for register allocation; these compilers occasionally produce programs that need to swap two registers when no registers are free. The XOR swap algorithm avoids the need to reserve an extra register or to spill any registers to main memory.<ref>{{cite book |last1=Pereira |first1=Fernando Magno Quintão |last2=Palsberg |first2=Jens |chapter=SSA Elimination after Register Allocation |title=Compiler Construction |series=Lecture Notes in Computer Science |date=2009 |volume=5501 |pages=158–173 |doi=10.1007/978-3-642-00722-4_12 |isbn=978-3-642-00721-7 |chapter-url=http://compilers.cs.ucla.edu/fernando/publications/papers/CC09.pdf |access-date=17 April 2022}}</ref> The addition/subtraction variant can also be used for the same purpose.<ref>{{cite book |last1=Hack |first1=Sebastian |last2=Grund |first2=Daniel |last3=Goos |first3=Gerhard |chapter=Register Allocation for Programs in SSA-Form |title=Compiler Construction |series=Lecture Notes in Computer Science |date=2006 |volume=3923 |pages=247–262 |doi=10.1007/11688839_20|isbn=978-3-540-33050-9 |doi-access=free }}</ref> This method of register allocation is particularly relevant to [[GPU]] shader compilers. On modern GPU architectures, spilling variables is expensive due to limited memory bandwidth and high memory latency, while limiting register usage can improve performance due to dynamic partitioning of the [[register file]]. The XOR swap algorithm is therefore required by some GPU compilers.<ref>{{cite web |last1=Abbott |first1=Connor |last2=Schürmann |first2=Daniel |title=SSA-based Register Allocation for GPU Architectures |url=https://indico.freedesktop.org/event/1/contributions/7/attachments/8/11/SSA-based%20Allocation%20for%20GPU%20Architectures.pdf |access-date=17 April 2022}}</ref> == See also == * [[Symmetric difference]] * [[XOR linked list]] * [[Feistel cipher]] (the XOR swap algorithm is a degenerate form of a Feistel cipher) == Notes == {{Notelist}} == References == {{Reflist}} [[Category:Algorithms]] [[Category:Articles with example C code]] [[Category:Binary arithmetic]] [[fr:Échange (informatique)#En utilisant l.27op.C3.A9ration XOR]] [[pl:Zamiana wartości zmiennych]]
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)
Pages transcluded onto the current version of this page
(
help
)
:
Template:Cite book
(
edit
)
Template:Cite web
(
edit
)
Template:Code
(
edit
)
Template:Efn
(
edit
)
Template:Notelist
(
edit
)
Template:Reflist
(
edit
)
Template:Short description
(
edit
)