
== Characteristics ==
[[file:32bit-Endianess.svg|thumb|upright=2|Diagram demonstrating big- versus little-endianness]]

[[Computer memory]] consists of a sequence of storage cells (smallest [[address space|addressable]] units); in machines that support [[byte addressing]], those units are called ''[[byte]]s''. Each byte is identified and accessed in hardware and software by its [[memory address]]. If the total number of bytes in memory is ''n'', then addresses are enumerated from 0 to ''n''&nbsp;−&nbsp;1.

Computer programs often use data structures or [[Field (computer science)|fields]] that may consist of more data than can be stored in one byte. In this article, a "field" is a consecutive sequence of bytes that represents a "simple data value", that is, a value that can, at least potentially, be manipulated by a ''single'' [[Instruction set architecture|hardware instruction]]; fields of arbitrarily complicated type are not considered. On most systems, the address of a multi-byte simple data value is the address of its first byte (the byte with the lowest address). There are exceptions to this rule: for example, the Add instruction of the [[IBM 1401]] addresses variable-length fields at their low-order (highest-addressed) position, with their lengths defined by a [[Word mark (computer hardware)|word mark]] set at their high-order (lowest-addressed) position. When an operation such as addition is performed, the processor begins at the low-order positions at the high addresses of the two fields and works its way down to the high-order positions.{{cn|date=November 2023}}

Another important attribute of a byte that is part of a "field" is its "significance".
Together, the address and the significance of each byte determine the order in which the computer hardware accesses the bytes of a field or, more precisely, the order used by the low-level algorithms that produce the result of a machine instruction.

=== Numbers ===
[[Positional notation|Positional number systems]] (mostly base 2, or less often base 10) are the predominant way of representing, and particularly of manipulating, [[Integer (computer science)|integer data]] in computers. In pure form this holds for moderately sized non-negative integers, such as values of the C data type <code>[[Signedness|unsigned]]</code>. In such a number system, the ''value'' that a digit contributes to the whole number is determined not only by its value as a single digit, but also by the position it holds in the complete number, called its significance. These positions can be mapped to memory mainly in two ways:<ref name="TanenbaumAustin2012">{{cite book |first1=Andrew S. |last1=Tanenbaum |first2=Todd M. |last2=Austin |title=Structured Computer Organization |url=https://books.google.com/books?id=m0HHygAACAAJ |access-date=18 May 2013 |date=4 August 2012 |publisher=Prentice Hall PTR |isbn=978-0-13-291652-3}} </ref>
* Decreasing numeric significance with increasing memory addresses, known as ''big-endian'' and
* Increasing numeric significance with increasing memory addresses, known as ''little-endian''.

In both terms, the ''end'' refers to the end of the value that is stored at the lowest memory address: the ''big'' (most significant) end in ''big-endian'', and the ''little'' (least significant) end in ''little-endian''.
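
As an illustration, the following Python sketch lays out one 32-bit value in both byte orders using the built-in <code>int.to_bytes</code> method. The value <code>0x0A0B0C0D</code> is an arbitrary example, and the list index stands in for the memory address.

```python
value = 0x0A0B0C0D  # arbitrary 32-bit example value

# Index 0 of each bytes object plays the role of the lowest memory address.
big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first

for address in range(4):
    print(f"address {address}: big-endian {big[address]:02X}, "
          f"little-endian {little[address]:02X}")
```

Reading either sequence back with the matching byte order (<code>int.from_bytes</code>) recovers the original value; reading it with the opposite order does not.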

The integer data that are directly supported by the [[Arithmetic logic unit|computer hardware]] have a fixed width that is a small power of 2, e.g. 8 bits ≙ 1 byte, 16 bits ≙ 2 bytes, 32 bits ≙ 4 bytes, 64 bits ≙ 8 bytes, 128 bits ≙ 16 bytes. The low-level access sequence to the bytes of such a field depends on the operation to be performed. The least-significant byte is accessed first for [[addition]], [[subtraction]] and [[multiplication]]. The most-significant byte is accessed first for [[Division (mathematics)|division]] and [[Natural number#Order|comparison]]. See {{section link||Calculation order}}.
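
The reason addition starts at the least-significant byte is that carries move toward higher significance. This can be sketched in Python with a byte-wise adder for little-endian fields; the function name <code>add_little_endian</code> is hypothetical, and the code models the principle rather than any particular processor.

```python
def add_little_endian(a: bytes, b: bytes) -> bytes:
    """Add two equal-width unsigned little-endian fields, wrapping on overflow."""
    if len(a) != len(b):
        raise ValueError("fields must have the same width")
    result = bytearray(len(a))
    carry = 0
    # Index 0 holds the least significant byte, so the loop runs in
    # address order while the carry moves toward higher significance.
    for i in range(len(a)):
        total = a[i] + b[i] + carry
        result[i] = total & 0xFF   # low 8 bits stay in this byte
        carry = total >> 8         # overflow goes to the next byte
    return bytes(result)


x = (0x01FF).to_bytes(2, "little")
y = (0x0001).to_bytes(2, "little")
total = add_little_endian(x, y)
print(hex(int.from_bytes(total, "little")))  # 0x200
```

For big-endian fields the same loop would have to run from the highest address down to the lowest.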

=== Text ===

When character (text) strings are to be compared with one another, e.g. in order to support some mechanism like [[Sorting algorithm|sorting]], this is very frequently done [[lexicographically]], where each positional element (character) also has a positional value. In almost all lexicographical orderings the first character ranks highest, as in a telephone book. Almost all machines that can perform this comparison with a single instruction are big-endian or at least mixed-endian.{{cn|date=November 2023}}
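
The connection between lexicographical comparison and big-endian order can be illustrated in Python, whose <code>bytes</code> objects compare lexicographically. The sketch below uses two arbitrary 16-bit unsigned values and shows that byte-wise comparison matches numeric order only for the big-endian encoding.

```python
a, b = 0x0100, 0x00FF  # arbitrary example values; numerically a > b

a_big, b_big = a.to_bytes(2, "big"), b.to_bytes(2, "big")
a_little, b_little = a.to_bytes(2, "little"), b.to_bytes(2, "little")

# bytes objects compare element by element, first element ranking highest.
print(a > b)                # True  (numeric order)
print(a_big > b_big)        # True  (01 00 vs 00 FF agrees)
print(a_little > b_little)  # False (00 01 vs FF 00 disagrees)
```

This holds for unsigned fields of equal width; signed or variable-width values need additional handling.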

Integer numbers written as text are always represented most significant digit first in memory, which is similar to big-endian, independently of [[text direction]].
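
A short Python sketch of this layout: the decimal string below is an arbitrary example, and the loop reads its bytes in address order, meeting the most significant digit first.

```python
text = "1234".encode("ascii")  # bytes 31 32 33 34 (hex), most significant digit first

value = 0
for byte in text:              # iterate in address order
    # Each earlier digit is shifted one decimal place up before the next is added.
    value = value * 10 + (byte - ord("0"))

print(value)  # 1234
```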

=== Byte addressing ===
{{See also|Byte addressing}}

When memory bytes are printed sequentially from left to right (e.g. in a [[hex dump]]), little-endian representation of integers has the significance increasing from right to left. In other words, it appears backwards when visualized, which can be counter-intuitive.
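
The effect can be reproduced with Python's standard <code>struct</code> module, which packs an integer with an explicit byte order. The value <code>0x12345678</code> is an arbitrary example.

```python
import struct

value = 0x12345678  # arbitrary 32-bit example value

little = struct.pack("<I", value)  # "<" selects little-endian
big = struct.pack(">I", value)     # ">" selects big-endian

# bytes.hex(" ") prints the bytes left to right in address order.
print(little.hex(" "))  # 78 56 34 12
print(big.hex(" "))     # 12 34 56 78
```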

This behavior arises, for example, in [[FourCC]] or similar techniques that involve packing characters into an integer, so that it becomes a sequence of specific characters in memory. For example, take the string "JOHN", encoded in [[ASCII]] as the hexadecimal bytes 4A 4F 48 4E and packed into a 32-bit integer with "J" as the most significant byte. On big-endian machines, the value appears left-to-right, coinciding with the correct string order for reading the result ("J O H N"). But on a little-endian machine, one would see "N H O J". Middle-endian machines complicate this even further; for example, on the [[PDP-11]], the 32-bit value is stored as two 16-bit words "JO" "HN" in big-endian, with the characters in the 16-bit words being stored in little-endian, resulting in "O J N H".<ref name=":0" />
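
The three layouts of "JOHN" can be reproduced with the following Python sketch. The variable names are illustrative, and the PDP-11 layout is modelled simply as two 16-bit words, high word first, each stored little-endian.

```python
packed = int.from_bytes(b"JOHN", "big")  # 0x4A4F484E, "J" most significant

big = packed.to_bytes(4, "big")          # b"JOHN"
little = packed.to_bytes(4, "little")    # b"NHOJ"

# PDP-11 style: high 16-bit word first, bytes within each word little-endian.
high, low = packed >> 16, packed & 0xFFFF
pdp = high.to_bytes(2, "little") + low.to_bytes(2, "little")  # b"OJNH"

print(big.decode("ascii"), little.decode("ascii"), pdp.decode("ascii"))
```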

=== Byte swapping ===

Byte-swapping consists of rearranging bytes to change endianness. Many compilers provide [[Intrinsic function|built-ins]] that are likely to be compiled into native processor instructions ({{code|bswap}}/{{code|movbe}}), such as {{code|__builtin_bswap32}}. Software interfaces for swapping include:

* Standard [[#Networking|network endianness]] functions (from/to BE, up to 32-bit).<ref>{{man|3|byteorder|Linux}}</ref> Windows has a 64-bit extension in {{code|winsock2.h}}.
* BSD and Glibc {{code|endian.h}} functions (from/to BE and LE, up to 64-bit).<ref>{{man|3|endian|Linux}}</ref>
* [[macOS]] {{code|OSByteOrder.h}} macros (from/to BE and LE, up to 64-bit).
* The {{code|std::byteswap}} function in [[C++23]].<ref>{{cite web |title=std::byteswap |url=https://en.cppreference.com/w/cpp/numeric/byteswap |website=en.cppreference.com |access-date=3 October 2023 |archive-date=20 November 2023 |archive-url=https://web.archive.org/web/20231120095109/https://en.cppreference.com/w/cpp/numeric/byteswap |url-status=live }}</ref>
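
The operation these interfaces perform can be sketched in Python with shifts and masks. The helper <code>bswap32</code> below is a hypothetical illustration, not one of the library functions listed above.

```python
def bswap32(x: int) -> int:
    """Reverse the byte order of an unsigned 32-bit value."""
    x &= 0xFFFFFFFF  # confine the input to 32 bits
    return (
        ((x & 0x000000FF) << 24)    # byte 0 moves to byte 3
        | ((x & 0x0000FF00) << 8)   # byte 1 moves to byte 2
        | ((x & 0x00FF0000) >> 8)   # byte 2 moves to byte 1
        | ((x & 0xFF000000) >> 24)  # byte 3 moves to byte 0
    )


print(hex(bswap32(0x12345678)))  # 0x78563412
```

Applying the swap twice returns the original value, which is why the same routine serves for conversion in both directions.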

Some [[CPU]] instruction sets provide native support for endian byte swapping, such as {{code|bswap}}<ref>{{cite web|url=http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf |archive-url=https://ghostarchive.org/archive/20221009/http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf |archive-date=2022-10-09 |url-status=live|title=Intel 64 and IA-32 Architectures Software Developer's Manual Volume 2 (2A, 2B & 2C): Instruction Set Reference, A-Z|at=p. 3–112|publisher=Intel|date=September 2016|access-date=2017-02-05}}</ref> ([[x86]] — [[Intel 80486|486]] and later, [[i960]] — i960Jx and later<ref>{{cite web|url=https://datasheets.chipdb.org/Intel/80960/manuals/27317301.PDF|title=i960® VH Processor Developer's Manual|publisher=Intel|date=October 1998|access-date=2024-04-02|archive-date=2024-04-02|archive-url=https://web.archive.org/web/20240402165236/https://datasheets.chipdb.org/Intel/80960/manuals/27317301.PDF|url-status=live}}</ref>), and {{code|rev}}<ref>{{cite web|url=http://infocenter.arm.com/help/topic/com.arm.doc.ddi0487a.k_10775/index.html|title=ARMv8-A Reference Manual|publisher=[[ARM Holdings]]|access-date=2017-02-05|archive-date=2019-01-19|archive-url=https://web.archive.org/web/20190119214452/http://infocenter.arm.com/help/topic/com.arm.doc.ddi0487a.k_10775/index.html|url-status=live}}</ref> ([[ARM architecture|ARMv6]] and later).

Some [[compiler]]s have built-in facilities for byte swapping. For example, the [[Intel]] [[Fortran]] compiler supports the non-standard {{code|CONVERT}} specifier when opening a file, e.g.: {{code|1=OPEN(unit, CONVERT='BIG_ENDIAN',...)|2=fortran|class=nowrap}}. Other compilers have options for generating code that globally enables the conversion for all file I/O operations. This permits code to be reused, without modification, on a system with the opposite endianness.
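
A comparable effect, declaring the byte order of the data instead of swapping by hand, can be sketched in Python with the standard <code>struct</code> module. This is an analogue in a different language, not the Fortran facility itself, and the in-memory buffer stands in for a hypothetical data file.

```python
import io
import struct
import sys

# Stand-in for a file holding two 32-bit signed integers written big-endian.
raw = io.BytesIO(struct.pack(">ii", 1, -2))

# ">" fixes the byte order of the data, so the same code gives the same
# result whether the host is little-endian or big-endian.
first, second = struct.unpack(">ii", raw.read(8))

print(sys.byteorder)   # byte order of the host running the code
print(first, second)   # 1 -2
```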