Magic number (programming)
== Format indicators {{anchor|Format indicator}} ==

=== Origin {{anchor|Magic number origin}} ===
Format indicators were first used in early [[Version 7 Unix]] source code.{{citation needed|date=November 2019}}

[[Unix]] was ported to one of the first [[Digital Equipment Corporation|DEC]] [[PDP-11]]/20s, which did not have [[memory protection]]. Early versions of Unix therefore used the [[Position-independent code|relocatable memory reference]] model.<ref name="dmr">{{cite web |url=http://cm.bell-labs.com/cm/cs/who/dmr/odd.html |title=Odd Comments and Strange Doings in Unix |date=22 June 2002 |website=[[Bell Labs]] |url-status=dead |archive-url=https://web.archive.org/web/20061104034450/http://cm.bell-labs.com/cm/cs/who/dmr/odd.html |archive-date=2006-11-04}}</ref> Pre-[[Sixth Edition Unix]] versions read an executable file into [[magnetic-core memory|memory]] and jumped to the first low memory address of the program, [[relative address]] zero. With the development of [[Memory page|paged]] versions of Unix, a [[header (computing)|header]] was created to describe the [[executable|executable image]] components. Also, a [[branch instruction]] was inserted as the first word of the header to skip the header and start the program. In this way a program could be run in the older relocatable memory reference (regular) mode or in paged mode. As more executable formats were developed, new constants were added by incrementing the branch [[Offset (computer science)|offset]].<ref>Personal communication with Dennis M. Ritchie.</ref>

In the [[Version 6 Unix|Sixth Edition]] [[Lions' Commentary on UNIX 6th Edition, with Source Code|source code]] of the Unix program loader, the exec() function read the executable ([[Binary numeral system|binary]]) image from the file system. The first 8 [[byte]]s of the file were a [[header (computing)|header]] containing the sizes of the program (text) and initialized (global) data areas.
Also, the first 16-bit word of the header was compared to two [[constant (programming)|constant]]s to determine if the [[Executable|executable image]] contained [[Position-independent code|relocatable memory references]] (normal), the newly implemented [[Memory page|paged]] read-only executable image, or the separated instruction and data paged image.<ref name="V6sys1">{{cite web |url=https://minnie.tuhs.org/cgi-bin/utree.pl?file=V6/usr/sys/ken/sys1.c |title=The Unix Tree V6/usr/sys/ken/sys1.c |work=[[The Unix Heritage Society]] |archive-url=https://web.archive.org/web/20230326024616/https://minnie.tuhs.org/cgi-bin/utree.pl?file=V6/usr/sys/ken/sys1.c |archive-date=26 March 2023 |url-status=live }}</ref> There was no mention of the dual role of the header constant, but the high order byte of the constant was, in fact, the [[operation code]] for the PDP-11 branch instruction ([[octal]] 000407 or [[Hexadecimal|hex]] 0107). The low order byte is a branch offset of seven words, so if this constant was [[Executable|executed]], it would branch over the eight-word (16-byte) executable image header and start the program. Since the Sixth and Seventh Editions of Unix employed paging code, the dual role of the header constant was hidden. That is, the exec() service read the executable file header ([[Meta (prefix)|meta]]) data into a [[kernel space]] buffer, but read the executable image into [[user space]], thereby not using the constant's branching feature. Magic number creation was implemented in the Unix [[Linker (computing)|linker]] and [[Loader (computing)|loader]] and magic number branching was probably still used in the suite of [[Standalone program|stand-alone]] [[diagnostic program]]s that came with the Sixth and Seventh Editions. Thus, the header constant did provide an illusion and met the criteria for [[Magic (programming)|magic]].
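
The dual role of the constant can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the standard PDP-11 encoding of the unconditional branch (high byte 001, low byte a signed offset counted in 16-bit words and applied after the program counter has advanced past the instruction), and the function names are hypothetical, not taken from the Unix sources.

```python
def decode_pdp11_br(word):
    """Return the signed word offset if `word` is a PDP-11 BR instruction, else None.

    Assumes the usual BR encoding: high byte 0x01, low byte a signed 8-bit offset.
    """
    if (word >> 8) != 0x01:
        return None
    offset = word & 0xFF
    # Interpret the low byte as a two's-complement value.
    return offset - 0x100 if offset >= 0x80 else offset


def branch_target(address, word):
    """Byte address reached when the BR instruction `word` located at `address` runs."""
    offset = decode_pdp11_br(word)
    if offset is None:
        raise ValueError("not a BR instruction")
    # The PC already points 2 bytes past the instruction; offsets count words.
    return address + 2 + 2 * offset


MAGIC = 0o407                          # the header constant, hex 0107
print(hex(MAGIC))                      # 0x107
print(decode_pdp11_br(MAGIC))          # 7
print(branch_target(0, MAGIC))         # 16
```

Under these assumptions a branch placed at relative address zero lands at byte 16, which is where the program text begins if the header occupies eight 16-bit words.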

In Version Seven Unix, the header constant was not tested directly, but assigned to a variable labeled '''ux_mag'''<ref name="V7sys1">{{cite web |url=https://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/sys/sys/sys1.c |title=The Unix Tree V7/usr/sys/sys/sys1.c |work=[[The Unix Heritage Society]] |archive-url=https://web.archive.org/web/20230326024632/https://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/sys/sys/sys1.c |archive-date=26 March 2023 |url-status=live }}</ref> and subsequently referred to as the '''magic number'''. Probably because of its uniqueness, the term '''magic number''' came to mean executable format type, then expanded to mean file system type, and expanded again to mean any type of file.

=== In files {{anchor|Magic numbers in files}} ===
<!-- Courtesy note per [[MOS:LINK2SECT]]: [[File format#Magic number]] links here. -->
{{Main|File format#Magic number}}
{{See also|List of file signatures}}
Magic numbers are common in programs across many operating systems. Magic numbers implement [[strongly typed]] data and are a form of [[in-band signaling]] to the controlling program that reads the data type(s) at program run-time. Many files have such constants that identify the contained data. Detecting such constants in files is a simple and effective way of distinguishing between many [[file format]]s and can yield further run-time [[information]].

;Examples
* [[Compiler|Compiled]] [[Java class file]]s ([[Java bytecode|bytecode]]) and [[Mach (kernel)|Mach-O]] binaries start with hex <code>CAFEBABE</code>. When compressed with [[Pack200]] the bytes are changed to <code>CAFED00D</code>.
* [[GIF]] image files have the [[ASCII]] code for "GIF89a" (<code>47</code> <code>49</code> <code>46</code> <code>38</code> <code>39</code> <code>61</code>) or "GIF87a" (<code>47</code> <code>49</code> <code>46</code> <code>38</code> <code>37</code> <code>61</code>).
* [[JPEG]] image files begin with <code>FF</code> <code>D8</code> and end with <code>FF</code> <code>D9</code>. JPEG/[[JFIF]] files contain the [[Null-terminated string|null terminated string]] "JFIF" (<code>4A</code> <code>46</code> <code>49</code> <code>46</code> <code>00</code>). JPEG/[[Exif]] files contain the [[Null-terminated string|null terminated string]] "Exif" (<code>45</code> <code>78</code> <code>69</code> <code>66</code> <code>00</code>), followed by more [[Metadata (computing)|metadata]] about the file.
* [[PNG]] image files begin with an 8-[[byte]] signature which identifies the file as a PNG file and allows detection of common file transfer problems: "\211PNG\r\n\032\n" (<code>89</code> <code>50</code> <code>4E</code> <code>47</code> <code>0D</code> <code>0A</code> <code>1A</code> <code>0A</code>). That signature contains various [[newline]] characters to permit detecting unwarranted automated newline conversions, such as transferring the file using [[File Transfer Protocol|FTP]] with the ''ASCII'' [[File Transfer Protocol#Protocol overview|transfer mode]] instead of the ''binary'' mode.<ref>{{cite web |url=http://www.libpng.org/pub/png/spec/1.0/PNG-Rationale.html#R.PNG-file-signature |title=PNG (Portable Network Graphics) Specification Version 1.0: 12.11.
PNG file signature |date=1 October 1996 |work=MIT |archive-url=https://web.archive.org/web/20230326024630/http://www.libpng.org/pub/png/spec/1.0/PNG-Rationale.html#R.PNG-file-signature |archive-date=26 March 2023 |url-status=live }}</ref>
* Standard [[MIDI]] audio files have the [[ASCII]] code for "MThd" ('''M'''IDI '''T'''rack '''h'''ea'''d'''er, <code>4D</code> <code>54</code> <code>68</code> <code>64</code>) followed by more metadata.
* [[Unix]] or [[Linux]] scripts may start with a [[shebang (Unix)|shebang]] ("#!", <code>23</code> <code>21</code>) followed by the path to an [[interpreter directive|interpreter]], if the interpreter is likely to be different from the one from which the script was invoked.
* [[Executable and Linkable Format|ELF]] executables start with the byte <code>7F</code> followed by "ELF" (<code>7F</code> <code>45</code> <code>4C</code> <code>46</code>).
* [[PostScript]] files and programs start with "%!" (<code>25</code> <code>21</code>).
* [[PDF]] files start with "%PDF" (hex <code>25</code> <code>50</code> <code>44</code> <code>46</code>).
* [[DOS MZ executable]] files and the [[EXE#Other|EXE stub]] of the [[Microsoft Windows]] [[Portable Executable|PE]] (Portable Executable) files start with the characters "MZ" (<code>4D</code> <code>5A</code>), the initials of the designer of the file format, [[Mark Zbikowski]]. The definition allows the uncommon "ZM" (<code>5A</code> <code>4D</code>) as well for dosZMXP, a non-PE EXE.<ref name="doszmxp">{{cite web |url=https://blogs.msdn.microsoft.com/oldnewthing/20080324-00/?p=23033 |title=What's the difference between the COM and EXE extensions?
|first=Raymond |last=Chen |date=24 March 2008 |work=The Old New Thing |url-status=dead |archive-url=https://web.archive.org/web/20190218083526/https://blogs.msdn.microsoft.com/oldnewthing/20080324-00/?p=23033 |archive-date=18 February 2019}}</ref>
* The [[Berkeley Fast File System]] superblock format is identified as either <code>19</code> <code>54</code> <code>01</code> <code>19</code> or <code>01</code> <code>19</code> <code>54</code> depending on version; both represent the birthday of the author, [[Marshall Kirk McKusick]].
* The [[Master Boot Record]] of bootable storage devices on almost all [[IA-32]] [[IBM PC compatible]]s has a code of <code>55</code> <code>AA</code> as its last two bytes.
* Executables for the [[Game Boy]] and [[Game Boy Advance]] handheld video game systems have a 48-byte or 156-byte magic number, respectively, at a fixed spot in the header. This magic number encodes a bitmap of the [[Nintendo]] logo.
* [[Amiga]] software executable [[Amiga Hunk|Hunk]] files running on Amiga classic [[68000]] machines all started with the hexadecimal number $000003f3, nicknamed the "Magic Cookie."
* In the Amiga, the only absolute address in the system is hex $0000 0004 (memory location 4), which contains the start location called SysBase, a pointer to exec.library, the so-called [[kernel (operating system)|kernel]] of Amiga.
* [[Preferred Executable Format|PEF]] files, used by the [[classic Mac OS]] and [[BeOS]] for [[PowerPC]] executables, contain the [[ASCII]] code for "Joy!" (<code>4A</code> <code>6F</code> <code>79</code> <code>21</code>) as a prefix.
* [[TIFF]] files begin with either "II" or "MM" followed by [[Answer to Life, the Universe, and Everything|42]] as a two-byte integer in little or big [[endianness|endian]] byte ordering. "II" is for Intel, which uses [[Endianness|little endian]] byte ordering, so the magic number is <code>49</code> <code>49</code> <code>2A</code> <code>00</code>. "MM" is for Motorola, which uses [[Endianness|big endian]] byte ordering, so the magic number is <code>4D</code> <code>4D</code> <code>00</code> <code>2A</code>.
* [[Unicode]] text files encoded in [[UTF-16]] often start with the [[Byte Order Mark]] to detect [[endianness]] (<code>FE</code> <code>FF</code> for big endian and <code>FF</code> <code>FE</code> for little endian). On [[Microsoft Windows]], [[UTF-8]] text files often start with the UTF-8 encoding of the same character, <code>EF</code> <code>BB</code> <code>BF</code>.
* [[LLVM]] Bitcode files start with "BC" (<code>42</code> <code>43</code>).
* [[Doom WAD|WAD]] files start with "IWAD" or "PWAD" (for ''[[Doom (1993 video game)|Doom]]''), "WAD2" (for ''[[Quake (video game)|Quake]]'') and "WAD3" (for ''[[Half-Life (video game)|Half-Life]]'').
* Microsoft [[Compound File Binary Format]] (mostly known as one of the older formats of [[Microsoft Office]] documents) files start with <code>D0</code> <code>CF</code> <code>11</code> <code>E0</code>, which is visually suggestive of the word "DOCFILE0".
* Headers in [[ZIP (file format)|ZIP]] files often show up in text editors as "PK♥♦" (<code>50</code> <code>4B</code> <code>03</code> <code>04</code>), where "PK" are the initials of [[Phil Katz]], author of [[DOS]] compression utility [[PKZIP]].
* Headers in [[7z]] files begin with "7z" (full magic number: <code>37</code> <code>7A</code> <code>BC</code> <code>AF</code> <code>27</code> <code>1C</code>).

;Detection
The Unix utility program <code>[[File (command)|file]]</code> can read and interpret magic numbers from files, and the file which is used to parse the information is called ''magic''. The Windows utility TrID has a similar purpose.

=== In protocols {{anchor|Magic numbers in protocols}} ===
;Examples
* The [[OSCAR protocol]], used in [[AOL Instant Messenger|AIM]]/[[ICQ]], prefixes requests with <code>2A</code>.
* In the [[RFB protocol]] used by [[VNC]], a client starts its conversation with a server by sending "RFB" (<code>52</code> <code>46</code> <code>42</code>, for "Remote Frame Buffer") followed by the client's protocol version number.
* In the [[Server Message Block|SMB]] protocol used by Microsoft Windows, each SMB request or server reply begins with <code>FF</code> <code>53</code> <code>4D</code> <code>42</code>, or <code>"\xFFSMB"</code>.
* In the [[MSRPC]] protocol used by Microsoft Windows, each TCP-based request begins with <code>05</code> (representing Microsoft DCE/RPC Version 5), followed immediately by a <code>00</code> or <code>01</code> for the minor version. In UDP-based MSRPC requests the first byte is always <code>04</code>.
* [[Component Object Model|COM]] and [[Distributed Component Object Model|DCOM]] marshalled interfaces, called [[OBJREF]]s, always start with the byte sequence "MEOW" (<code>4D</code> <code>45</code> <code>4F</code> <code>57</code>). Debugging extensions (used for DCOM channel hooking) are prefaced with the byte sequence "MARB" (<code>4D</code> <code>41</code> <code>52</code> <code>42</code>).
* Unencrypted [[BitTorrent tracker]] requests begin with a single byte containing the value <code>19</code> representing the header length, followed immediately by the phrase "BitTorrent protocol" at byte position 1.
* [[eDonkey2000]]/[[eMule]] traffic begins with a single byte representing the client version. Currently <code>E3</code> represents an eDonkey client, <code>C5</code> represents eMule, and <code>D4</code> represents compressed eMule.
* The first 4 bytes of a block in the [[Bitcoin]] Blockchain contain a magic number which serves as the network identifier. The constant <code>0xD9B4BEF9</code> indicates the main network, while the constant <code>0xDAB5BFFA</code> indicates the testnet.
* [[Secure Sockets Layer|SSL]] transactions always begin with a "client hello" message. The record encapsulation scheme used to prefix all SSL packets consists of two-byte and three-byte header forms. Typically an SSL version 2 client hello message is prefixed with an <code>80</code> and an SSLv3 server response to a client hello begins with <code>16</code> (though this may vary).
* [[DHCP]] packets use a "magic cookie" value of <code>0x63</code> <code>0x82</code> <code>0x53</code> <code>0x63</code> at the start of the options section of the packet. This value is included in all DHCP packet types.
* [[HTTP/2]] connections are opened with the preface <code>0x505249202a20485454502f322e300d0a0d0a534d0d0a0d0a</code>, or "<code>PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n</code>". The preface is designed to avoid the processing of frames by servers and intermediaries which support earlier versions of HTTP but not 2.0.
* The [[WebSocket#Opening handshake|WebSocket opening handshake]] uses the string <code>''258EAFA5-E914-47DA-95CA-C5AB0DC85B11''</code>.

=== In interfaces {{anchor|Magic numbers in interfaces}} ===
Magic numbers are common in [[API function]]s and [[interface (computing)|interface]]s across many [[operating system]]s, including [[DOS]], [[Windows]] and [[NetWare]]:

;Examples
* [[IBM PC]]-compatible [[BIOS]]es use magic values <code>0000</code> and <code>1234</code> to decide if the system should count up memory or not on reboot, thereby performing a cold or a warm boot.
These values are also used by [[EMM386]] memory managers intercepting boot requests.<ref name="Paul_2002_MAGIC"/> BIOSes also use magic values <code>55 AA</code> to determine if a disk is bootable.<ref>{{Cite web |url=http://neosmart.net/wiki/mbr-boot-process/ |title=The BIOS/MBR Boot Process |date=2015-01-25 |website=NeoSmart Knowledgebase |language=en-US |access-date=2019-02-03 |archive-url=https://web.archive.org/web/20230326024702/https://neosmart.net/wiki/mbr-boot-process/ |archive-date=26 March 2023 |url-status=live }}</ref>
* The [[MS-DOS]] disk cache [[SMARTDRV]] (codenamed "Bambi") uses magic values BABE and EBAB in API functions.<ref name="Paul_2002_MAGIC"/>
* Many [[DR-DOS]], [[Novell DOS]] and [[OpenDOS]] drivers developed in the former ''European Development Centre'' in the UK use the value 0EDC as a magic token when invoking or providing additional functionality sitting on top of the (emulated) standard DOS functions, NWCACHE being one example.<ref name="Paul_2002_MAGIC"/>

=== Other uses {{anchor|Magic numbers in other uses}} ===
;Examples
* The default [[MAC address]] on Texas Instruments [[System on a chip|SOCs]] is DE:AD:BE:EF:00:00.<ref>{{cite web |url=http://e2e.ti.com/support/wireless_connectivity/f/307/p/131036/589272.aspx |title=TI E2E Community: Does anyone know if the following configurations can be done with MCP CLI Tool? |date=27 August 2011 |work=Texas Instruments |archive-url=https://web.archive.org/web/20221007161243/https://e2e.ti.com/support/processors-group/processors/f/processors-forum/589272/ccs-tms320c5545-c5545-uart_test-to-run-without-msp430 |archive-date=7 October 2022 |url-status=live }}</ref>
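
As a rough illustration of the signature matching described under ''In files'', the following sketch compares the leading bytes of a buffer against a handful of the signatures listed there. It is a minimal example, not how the <code>file</code> utility works internally; the table <code>SIGNATURES</code> and the function <code>identify</code> are hypothetical names chosen for this sketch, and only prefix signatures are handled.

```python
# A few leading-byte signatures taken from the examples listed under "In files".
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF89a", "GIF"),
    (b"GIF87a", "GIF"),
    (b"\xff\xd8", "JPEG"),
    (b"\x7fELF", "ELF"),
    (b"%PDF", "PDF"),
    (b"PK\x03\x04", "ZIP"),
    (b"II\x2a\x00", "TIFF (little endian)"),
    (b"MM\x00\x2a", "TIFF (big endian)"),
    (b"7z\xbc\xaf\x27\x1c", "7z"),
]


def identify(data):
    """Return the format name whose signature prefixes `data`, or None if none match."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return None


print(identify(bytes.fromhex("89504E470D0A1A0A") + b"..."))  # PNG
print(identify(b"MM\x00\x2a\x00\x00\x00\x08"))               # TIFF (big endian)
print(identify(b"plain text"))                               # None
```

A fuller detector would also need offsets and trailing markers (for example the <code>55</code> <code>AA</code> at the end of a Master Boot Record), which this prefix-only table does not cover.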