{{Short description|Family of instruction set architectures}} {{About|the Intel microprocessor architecture in general|the 32-bit generation of this architecture that is also referred to as "x86"|IA-32}} {{Use mdy dates|date=June 2016}} {{Lowercase title}} {{Infobox CPU architecture | name = x86 | designer = [[Intel]], [[Advanced Micro Devices|AMD]] | bits = [[16-bit computing|16-bit]], [[32-bit computing|32-bit]] and [[64-bit computing|64-bit]] | introduced = 1978 (16-bit), 1985 (32-bit), 2003 (64-bit) | version = | design = [[Complex instruction set computer|CISC]] | type = [[Register–memory architecture|Register–memory]] | encoding = Variable (1 to 15 bytes) | branching = [[Status register|Condition code]] | endianness = Little | page size = [[8086]]–[[i286]]: None<br/>[[Intel 386|i386]], [[Intel 486|i486]]: 4 KB pages<br/>[[P5 (microarchitecture)|P5]] [[Pentium]]: added 4 MB pages<br/>(Legacy [[Physical Address Extension|PAE]]: 4 KB→2 MB)<br/>[[Long mode|x86-64]]: added 1 GB pages | extensions = [[x87]], [[IA-32]], [[x86-64]], [[MMX (instruction set)|MMX]], [[3DNow!]], [[Streaming SIMD Extensions|SSE]], [[Machine Check Architecture|MCA]], [[Advanced Configuration and Power Interface|ACPI]], [[SSE2]], [[NX bit]], [[simultaneous multithreading|SMT]], [[SSE3]], [[SSSE3]], [[SSE4]], [[SSE4.2]], [[AES-NI]], [[CLMUL instruction set|CLMUL]], [[SM3 (hash function)|SM3]], [[SM4 (cipher)|SM4]], [[RDRAND]], [[Intel SHA extensions|SHA]], [[Intel MPX|MPX]], [[Secure Memory Encryption|SME]], [[Software Guard Extensions|SGX]], [[XOP instruction set|XOP]], [[F16C]], [[Intel ADX|ADX]], [[Bit Manipulation Instruction Sets|BMI]], [[FMA instruction set|FMA]], [[Advanced Vector Extensions|AVX]], [[AVX2]], [[AVX-VNNI]], [[AVX512]], [[AVX10]], [[Advanced Matrix Extensions|AMX]], [[VT-x]], [[VT-d]], [[AMD-V]], [[AMD-Vi]], [[Transactional Synchronization Extensions|TSX]], [[Advanced Synchronization Facility|ASF]], [[Trusted Execution Technology|TXT]], [[Advanced Performance 
Extensions|APX]] | open = Partly. For some advanced features, x86 may require license from Intel, though some do not need it;{{cn|date=December 2024}} x86-64 may require an additional license from AMD. The [[Pentium Pro]] processor (and [[NetBurst]]) has been on the market for more than 21 years<ref>{{cite press release |last=Pryce |first=Dave |date=May 11, 1989 |title=80486 32-bit CPU breaks new ground in chip density and operating performance. (Intel Corp.) (product announcement) EDN}}</ref> and so cannot be subject to patent claims. The [[i686]] subset of the x86 architecture is therefore fully open. The [[Opteron]] 1000 series processors have been on the market for more than 21 years<ref>{{cite web|url=https://www.hpcwire.com/2003/09/12/amd-announces-new-amd-opteron-processors/ |last=Swoyer |first=Stephen |date=April 24, 2003 |title=AMD introduces 64-bit Opteron Chip (ESJ) (news article)}}</ref> and so cannot be subject to patent claims. The [[AMD K8]] subset of the x86 architecture is therefore fully open. | gpr = {{blist|16-bit: 6 semi-dedicated registers, BP and SP are not general-purpose|32-bit: 8 GPRs, including EBP and ESP|64-bit: 16 GPRs, including RBP and RSP}} | fpr = {{blist|16-bit: optional separate [[x87]] FPU|32-bit: optional separate or integrated [[x87]] FPU, integrated [[Streaming SIMD Extensions|SSE]] units in later processors|64-bit: integrated [[x87]] and [[SSE2]] units, later implementations extended to [[AVX2]] and [[AVX512]]}} }} '''x86''' (also known as '''80x86'''<ref>{{cite book |last1=Rao |first1=P.V.S. |title=Computer System Architecture |date=2009 |publisher=Prentice-Hall of India |isbn=978-81-203-3594-3 |page=402 (Section 19.1, ''The x86 family of processors'')}}</ref> or the '''8086 family''')<ref>{{cite book |last1=Mhatre |first1=Swapneel Chandrakant |title=Microprocessors and Interfacing Techniques: For S. E. 
(Computer Engineering) Semester II of University of Pune |date=2012 |publisher=Jaico Publishing House |isbn=978-81-8495-325-1}}</ref> is a family of [[complex instruction set computer]] (CISC) [[instruction set architecture]]s{{Efn|Unlike the [[microarchitecture]] (and specific electronic and physical implementation) used for a specific microprocessor design.}} initially developed by [[Intel]], based on the [[Intel 8086|8086]] microprocessor and its 8-bit-external-bus variant, the [[Intel 8088|8088]]. The 8086 was introduced in 1978 as a fully [[16-bit computing|16-bit]] extension of Intel's [[8-bit computing|8-bit]] [[Intel 8080|8080]] microprocessor, with [[x86 memory segmentation|memory segmentation]] as a solution for addressing more memory than can be covered by a plain 16-bit address. The term "x86" came into being because the names of several successors to Intel's 8086 processor end in "86", including the [[Intel 80186|80186]], [[Intel 80286|80286]], [[i386|80386]] and [[i486|80486]]. Colloquially, their names were "186", "286", "386" and "486".

The term is not synonymous with [[IBM PC compatible|IBM PC compatibility]], as the latter implies a multitude of other [[computer hardware]]. [[Embedded system]]s and general-purpose computers used x86 chips [[Influence of the IBM PC on the personal computer market#Before the IBM PC's introduction|before the PC-compatible market started]],{{Efn|The [[GRID Compass]] laptop, for instance.}} some of them before the debut of the [[IBM PC]] in 1981.
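As a rough illustration of the memory segmentation mentioned above: in the 8086's real mode, a 16-bit segment value is shifted left by four bits and added to a 16-bit offset, giving a 20-bit physical address. The following is a minimal sketch in Python rather than vendor code; the function name <code>real_mode_address</code> is hypothetical and chosen only for this example.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Return the 20-bit physical address for a 16-bit segment:offset pair.

    Illustrative sketch only; the name is an assumption of this example.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shift the segment left by 4 bits (multiply by 16), add the offset,
    # and wrap the sum to 20 bits, matching the 8086's 20 address lines.
    return ((segment << 4) + offset) & 0xFFFFF
```

Because segments overlap every 16 bytes, different pairs can name the same byte: under this sketch, 1234:0010 and 1235:0000 both resolve to physical address 0x12350.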
{{As of|2022|06}}, most [[desktop computer|desktop]] and [[laptop]] computers sold are based on the x86 architecture family,<ref>{{cite web|url=https://www.tomshardware.com/news/intel-amd-4q-2021-2022-market-share-desktop-notebook-server-x86|title=AMD Sets All-Time CPU Market Share Record as Intel Gains in Desktop and Notebook PCs|first=Paul|last=Alcorn|date=February 9, 2022|website=[[Tom's Hardware]]}}</ref> while mobile categories such as [[smartphone]]s or [[tablet computer|tablets]] are dominated by [[ARM architecture|ARM]]. At the high end, x86 continues to dominate computation-intensive [[workstation]] and [[cloud computing]] segments.<ref>{{cite web |url=https://icloud.pe/blog/the-cloud-beyond-x86-how-old-architectures-are-making-a-comeback/ |title=The cloud beyond x86: How old architectures are making a comeback |last1=Brandon |first1=Jonathan |date=15 April 2015 |website=ICloud PE |publisher=Business Cloud News |access-date=23 November 2020 |quote=Despite the dominance of x86 in the datacentre it is difficult to ignore the noise vendors have been making over the past couple of years around non-x86 architectures like ARM... |archive-date=August 19, 2021 |archive-url=https://web.archive.org/web/20210819235820/https://icloud.pe/blog/the-cloud-beyond-x86-how-old-architectures-are-making-a-comeback/ |url-status=live }}</ref>

<gallery>
File:KL Intel D8086.jpg|The x86 architectures were based on the Intel 8086 microprocessor chip, initially released in 1978.
File:Intel Core i7 13700K.jpg|Intel Core i7, a modern x86-compatible, 64-bit multicore processor
File:Slot-A Athlon.jpg|AMD Athlon (early version), a technically different but fully compatible x86 implementation
</gallery>

==Overview==
In the 1980s and early 1990s, when the [[8088]] and [[80286]] were still in common use, the term x86 usually represented any 8086-compatible CPU.
Today, however, x86 usually implies binary compatibility with the [[32-bit computing|32-bit]] [[instruction set]] of the [[I386|80386]]. This is due to the fact that this instruction set has become something of a lowest common denominator for many modern operating systems and also probably because the term became common after the introduction of the 80386 in 1985. A few years after the introduction of the 8086 and 8088, Intel added some complexity to its naming scheme and terminology as the "iAPX" of the ambitious but ill-fated [[Intel iAPX 432]] processor was tried on the more successful 8086 family of chips,{{Efn|Including the [[8088]], [[80186]], [[80188]] and [[80286]] processors.}} applied as a kind of system-level prefix. An 8086 system, including [[coprocessor]]s such as [[8087]] and [[8089]], and simpler Intel-specific system chips,{{Efn|Such a system also contained the usual mix of standard [[7400 series]] support components, including [[multiplexer]]s, buffers, and [[glue logic]].}} was thereby described as an iAPX 86 system.<ref>{{cite web |last=Dvorak |first=John C. |url=http://www.dvorak.org/blog/whatever-happened-to-the-intel-iapx432/ |title=Whatever Happened to the Intel iAPX432? 
|publisher=Dvorak.org |access-date=April 18, 2014 |archive-date=November 25, 2017 |archive-url=https://web.archive.org/web/20171125031112/http://www.dvorak.org/blog/whatever-happened-to-the-intel-iapx432/ |url-status=live }}</ref>{{Efn|The actual meaning of [[iAPX]] was ''Intel Advanced Performance Architecture'', or sometimes ''Intel Advanced Processor Architecture''.}} There were also terms ''iRMX'' (for operating systems), ''iSBC'' (for single-board computers), and ''iSBX'' (for multimodule boards based on the 8086 architecture), all together under the heading ''Microsystem 80''.<ref name="i286">{{cite book|url=http://bitsavers.org/components/intel/80286/210498-001_iAPX_286_Programmers_Reference_1983.pdf|title=iAPX 286 Programmer's Reference|publisher=Intel|year=1983|access-date=August 28, 2017|archive-date=August 28, 2017|archive-url=https://web.archive.org/web/20170828232803/http://www.bitsavers.org/components/intel/80286/210498-001_iAPX_286_Programmers_Reference_1983.pdf|url-status=live}}</ref><ref name="i86">{{cite book|url=http://bitsavers.org/components/intel/_dataBooks/1981_iAPX_86_88_Users_Manual.pdf|title=iAPX 86, 88 User's Manual|publisher=Intel|date=August 1981|access-date=August 28, 2017|archive-date=August 28, 2017|archive-url=https://web.archive.org/web/20170828231811/http://bitsavers.org/components/intel/_dataBooks/1981_iAPX_86_88_Users_Manual.pdf|url-status=live}}</ref> However, this naming scheme was quite temporary, lasting for a few years during the early 1980s.{{Efn|late 1981 to early 1984, approximately}} Although the 8086 was primarily developed for [[embedded systems]] and small multi-user or single-user computers, largely as a response to the successful 8080-compatible [[Zilog Z80]],<ref>{{cite web |url=http://www.pcworld.com/article/146957/birth_of_a_standard_the_intel_8086_microprocessor.html |title=Birth of a Standard: The Intel 8086 Microprocessor |last=Edwards |first=Benj |date=June 16, 2008 |work=PCWorld |access-date=September 14, 
2014 |archive-date=September 26, 2010 |archive-url=https://web.archive.org/web/20100926200936/http://www.pcworld.com/article/146957/birth_of_a_standard_the_intel_8086_microprocessor.html |url-status=dead }}</ref> the x86 line soon grew in features and processing power. Today, x86 is ubiquitous in both stationary and portable personal computers, and is also used in [[midrange computer]]s, [[workstation]]s, servers, and most new [[supercomputer]] [[computer cluster|cluster]]s of the [[TOP500]] list. A large amount of [[software]], including a large list of {{Cl|x86 operating systems|x86 operating systems}}, runs on x86-based hardware.

Modern x86 is relatively uncommon in [[embedded system]]s, however; small [[Low-power electronics|low-power]] applications (using tiny batteries) and low-cost microprocessor markets, such as [[home appliance]]s and toys, lack significant x86 presence.{{Efn|The embedded processor market is populated by more than 25 different [[instruction set|architectures]], which, due to the price sensitivity, low power, and hardware simplicity requirements, outnumber the x86.}} Simple 8- and 16-bit based architectures are common here, as well as simpler RISC architectures like [[RISC-V]], although the x86-compatible [[VIA C7]], [[VIA Nano]], [[Advanced Micro Devices|AMD]]'s [[Geode (processor)|Geode]], [[Athlon Neo]] and [[Intel Atom]] are examples of 32- and [[64-bit computing|64-bit]] designs used in some relatively low-power and low-cost segments.

There have been several attempts, including by Intel, to end the market dominance of the "inelegant" x86 architecture, which descends directly from the first simple 8-bit microprocessors.
Examples include the [[iAPX 432]] (a project originally named the ''Intel 8800''<ref>{{Cite journal |doi=10.1109/MAHC.2010.22 |title=Intel's 8086 |author=Stanley Mazor |journal=IEEE Annals of the History of Computing |volume=32 |number=1 |date=January–March 2010 |pages=75–79|s2cid=16451604 }}</ref>), the [[Intel 960]], [[Intel 860]] and the Intel/Hewlett-Packard [[Itanium]] architecture. However, the continuous refinement of x86 [[microarchitecture]]s, [[electronic circuit|circuitry]] and [[semiconductor manufacturing]] made it hard to replace x86 in many segments. AMD's 64-bit extension of x86 (which Intel eventually responded to with a compatible design)<ref>{{cite press release |url= http://www1.amd.com/newsroom/display/1,1528,435,00.html |title=AMD Discloses New Technologies At Microprocessor Forum |quote="Time and again, processor architects have looked at the inelegant x86 architecture and declared it cannot be stretched to accommodate the latest innovations," said Nathan Brookwood, principal analyst, Insight 64. |archive-url=https://web.archive.org/web/20000302151607/http://www1.amd.com/newsroom/display/1,1528,435,00.html |date=October 5, 1999 |publisher=[[Advanced Micro Devices|AMD]] |archive-date=March 2, 2000}}</ref> and the scalability of x86 chips in the form of modern multi-core CPUs underline x86 as an example of how continuous refinement of established industry standards can resist the competition from completely new architectures.<ref>{{cite web |url=https://www.eweek.com/networking/microsoft-to-end-intel-itanium-support/ |title=Microsoft to End Intel Itanium Support |first=Jeff |last=Burt |date=April 5, 2010 |website=[[eWeek]] |access-date=June 2, 2022}}</ref>

==Chronology==
{{More citations needed section|date=March 2020}}
The table below lists processor models and model series implementing various architectures in the x86 family, in chronological order.
Each line item is characterized by significantly improved or commercially successful processor microarchitecture designs. {{mw-datatable}} {| class="wikitable mw-datatable" style="text-align:center" |+Chronology of x86 processors |- ! colspan="2" rowspan="2"|Era ! rowspan="2" |Introduction ! rowspan="2" |Prominent CPU models ! colspan="3" |[[Address space]] ! rowspan="2" |Notable features |- ![[Linear address space|Linear]] ![[Virtual address space|Virtual]] ![[Physical address|Physical]] |- | rowspan="3" style="vertical-align: middle; font-size: smaller;"|x86-16 ||rowspan="2" style="width:80px"|'''1st''' || 1978 || [[Intel 8086]], [[Intel 8088]] (1979) || rowspan="3" style="background: #FAECC8;" |16-bit ||rowspan="2" style="background: #FAECC8;" |NA ||rowspan="2" style="background: #FAECC8;" |20-bit || [[16-bit computing|16-bit]] [[instruction set architecture|ISA]], [[IBM PC]] (8088), [[IBM PC/XT]] (8088) |- | rowspan="2" |1982 || [[Intel 80186]], [[Intel 80188]]<br/>[[NEC V20]]/V30 (1983) || 8086-2 ISA, embedded (80186/80188) |- | '''2nd''' || [[Intel 80286]] and clones || style="background: #FAECC8;" |30-bit ||style="background: #FAECC8;" |24-bit || [[protected mode]], [[IBM Personal Computer XT|IBM PC/XT 286]], [[IBM Personal Computer/AT|IBM PC/AT]] |- | rowspan="13" style="vertical-align: middle; font-size: smaller;" |[[IA-32]] ||rowspan="1" style="width:80px"|'''3rd''' || 1985 || [[Intel 80386]], [[AMD Am386]] (1991) || rowspan="13" style="background: #CEE0F2;" |32-bit || rowspan="13" style="background: #CEE0F2;" |46-bit ||rowspan="5" style="background: #CEE0F2;" |32-bit || [[32-bit computing|32-bit]] [[instruction set architecture|ISA]], paging, [[IBM Personal System/2|IBM PS/2]] |- | '''4th''' (pipelining, cache) || 1989 || [[Intel 80486]]<br/>[[Cyrix]] [[Cyrix Cx486SLC|Cx486S]], [[Cyrix Cx486DLC|DLC]] (1992)<br/>[[AMD Am486]] (1993), [[AMD Am5x86|Am5x86]] (1995) || [[Instruction pipelining|pipelining]], on-die [[x87]] [[floating-point unit|FPU]] (486DX), 
on-die [[CPU cache|cache]] |- | rowspan="3" |'''5th'''<br/>([[Superscalar]]) || 1993 || Intel [[P5 (microarchitecture)|Pentium]], [[Pentium MMX]] (1996) || [[Superscalar]], [[64-bit computing|64-bit]] [[Bus (computing)|databus]], faster FPU, [[MMX (instruction set)|MMX]] (Pentium MMX), [[Advanced Programmable Interrupt Controller|APIC]], [[Symmetric multiprocessing|SMP]] |- | 1994 || [[NexGen]] Nx586<br/>AMD [[AMD K5|5k86]]/[[AMD K5|K5]] (1996) || Discrete microarchitecture ([[Micro-operation|μ-op]] translation) |- | 1995 || [[Cyrix Cx5x86]]<br/>[[Cyrix 6x86]]/MX (1997)/[[Cyrix MII|MII]] (1998) || [[dynamic execution]] |- | rowspan="3" |'''6th'''<br/>([[Physical Address Extension|PAE]], μ-op translation)|| 1995 || Intel [[Pentium Pro]] || rowspan="2" style="background: #CEE0F2;" |36-bit ([[Physical Address Extension|PAE]])|| μ-op translation, conditional move instructions, [[dynamic execution]], [[speculative execution]], 3-way x86 superscalar, superscalar FPU, [[Physical Address Extension|PAE]], on-chip [[L2 cache]] |- | 1997 ||Intel [[Pentium II]], [[Pentium III]] (1999)<br/>[[Celeron]] (1998), [[Xeon]] (1998) || on-package (Pentium II) or on-die (Celeron) L2 Cache, [[Streaming SIMD Extensions|SSE]] (Pentium III), [[Slot 1]], [[Socket 370]] or [[Slot 2]] (Xeon) |- | 1997 || [[AMD K6]]/[[AMD K6-2|K6-2]] (1998)/[[AMD K6-III|K6-III]] (1999)|| style="background: #CEE0F2;" |32-bit ||[[3DNow!]], 3-level cache system (K6-III) |- | rowspan="5" |Enhanced Platform|| 1999 || AMD [[Athlon]]<br>[[Athlon XP]]/[[Athlon MP|MP]] (2001)<br/>[[Duron]] (2000)<br>[[Sempron]] (2004) || style="background: #CEE0F2;" |36-bit || MMX+, 3DNow!+, double-pumped bus, [[Slot A]] or [[Socket A]] |- | rowspan = "2" |2000 || [[Transmeta Crusoe]] || style="background: #CEE0F2;" |32-bit || [[Code Morphing Software|CMS]] powered x86 platform processor, [[Very long instruction word|VLIW]]-128 core, on-die memory controller, on-die PCI bridge logic |- |Intel [[Pentium 4]] || rowspan="3" 
style="background: #CEE0F2;" |36-bit || [[SSE2]], [[Hyper-Threading|HTT]] (Northwood), NetBurst, quad-pumped bus, Trace Cache, [[Socket 478]] |- | rowspan="2" |2003 || Intel [[Pentium M]]<br/>[[Intel Core#Core|Intel Core]] (2006)<br>[[Pentium Dual-Core]] (2007) || [[Micro-op fusion|μ-op fusion]], [[XD bit]] (Dothan) (Intel Core "Yonah") |- |[[Transmeta Efficeon]] || [[Code Morphing Software|CMS]] 6.0.4, [[Very long instruction word|VLIW]]-256, [[NX bit]], [[Hyper Transport|HT]] |- |style="background: #ececec; color: grey; vertical-align: middle; font-size: smaller;" |[[IA-64]]||style="background: #ececec; color: grey; vertical-align: middle; font-size: smaller;" |64-bit Transition<br/>1999–2005 || 2001 || Intel [[Itanium]] (2001–2017) || colspan="3" style="background: #ececec;" |52-bit || 64-bit [[Explicitly parallel instruction computing|EPIC]] architecture, 128-bit VLIW instruction bundle, on-die hardware IA-32 H/W enabling x86 OSes & x86 applications (early generations), software IA-32 EL enabling x86 applications (Itanium 2), Itanium register files are remapped to x86 registers |- | rowspan="30" style="background: #ececec; color: grey; vertical-align: middle; font-size: smaller;" | [[x86-64]] || rowspan="30" style="background: #ececec; color: grey; vertical-align: middle; font-size: smaller;" |64-bit Extended<br/>since 2001 || colspan="6" style="background: #ececec; color: grey; vertical-align: middle; font-size: smaller;" |x86-64 is the 64-bit extended architecture of x86, its Legacy Mode preserves the entire and unaltered x86 architecture. 
The native architecture of x86-64 processors, residing in 64-bit Mode, largely dispenses with segmentation and presents a 64-bit architectural linear address space; an adapted IA-32 architecture residing in Compatibility Mode, alongside 64-bit Mode, is provided to support most x86 applications
|-
| 2003 || [[Athlon 64]]/[[Athlon 64 FX|FX]]/[[Athlon 64 X2|X2]] (2005), [[Opteron]]<br/>[[Sempron]] (2004)/[[Turion 64 X2|X2]] (2008)<br/>[[Turion 64]] (2005)/[[Turion 64 X2|X2]] (2006) || colspan="3" style="background: #ececec;" |40-bit || [[AMD64]] (except some Sempron processors presented as purely x86 processors), on-die memory controller, [[HyperTransport]], on-die dual-core (X2), [[AMD-V]] (Athlon 64 Orleans), [[Socket 754]]/[[Socket 939|939]]/[[Socket 940|940]] or [[Socket AM2|AM2]]
|-
| 2004 || [[Pentium4#Prescott|Pentium 4]] (Prescott)<br/>[[Celeron D]], [[Pentium D]] (2005) ||rowspan="2" colspan="3" style="background: #ececec;" |36-bit || [[EM64T]] (enabled on selected models of Pentium 4 and Celeron D), [[SSE3]], 2nd gen.
NetBurst pipelining, dual-core (on-die: Pentium D 8xx, on-chip: Pentium D 9xx), [[Intel VT]] (Pentium 4 6x2), socket [[LGA 775]] |- | 2006 || [[Intel Core 2]]<br/>[[Pentium Dual-Core]] (2007)<br/>[[Celeron Dual-Core]] (2008) ||[[Intel 64]] (<<== EM64T), [[SSSE3]] (65 nm), wide dynamic execution, μ-op fusion, macro-op fusion in 16-bit and 32-bit mode,<ref name="intel-optimization-for-macro-fusion">{{cite web|url=https://software.intel.com/sites/default/files/managed/9e/bc/64-ia-32-architectures-optimization-manual.pdf|title=Intel 64 and IA-32 Architectures Optimization Reference Manual|at=3.4.2.2 Optimizing for Macro-fusion|date=September 2019|publisher=Intel|access-date=March 7, 2020|archive-date=February 14, 2020|archive-url=https://web.archive.org/web/20200214191947/https://software.intel.com/sites/default/files/managed/9e/bc/64-ia-32-architectures-optimization-manual.pdf|url-status=live}}</ref><ref name="agner-fog-microarchitecture">{{cite web|url=https://www.agner.org/optimize/microarchitecture.pdf|title=The microarchitecture of Intel, AMD and VIA CPUs|last=Fog|first=Agner|page=107|quote=Core2 can do macro-op fusion only in 16-bit and 32-bit mode. 
Core Nehalem can also do this in 64-bit mode.|access-date=March 7, 2020|archive-date=March 22, 2019|archive-url=https://web.archive.org/web/20190322145155/https://www.agner.org/optimize/microarchitecture.pdf|url-status=live}}</ref> on-chip quad-core(Core 2 Quad), Smart Shared L2 Cache (Intel Core 2 "Merom") |- | 2007 || [[AMD Phenom]]/[[AMD Phenom II|II]] (2008)<br/>[[AMD Athlon II|Athlon II]] (2009)<br>[[AMD Turion#Turion II|Turion II]] (2009)|| colspan="3" style="background: #ececec;" |48-bit || Monolithic quad-core (X4)/triple-core (X3), [[SSE4a]], [[Rapid Virtualization Indexing]] (RVI), HyperTransport 3, [[AM2+]] or [[AM3]] |- | rowspan="4" |2008 || [[Intel Core 2]] (45 nm) ||rowspan = "4" colspan="3" style="background: #ececec;" |40-bit || [[SSE4.1]] |- | [[Intel Atom]] || netbook or low power smart device processor, P54C core reused |- | Intel [[Core i7]]<br/>[[Core i5]] (2009)<br>[[Intel Core i3|Core i3]] (2010)|| QuickPath, on-chip GMCH ([[Clarkdale (microprocessor)|Clarkdale]]), [[SSE4|SSE4.2]], [[Second Level Address Translation#Extended Page Tables|Extended Page Tables]] (EPT) for virtualization, macro-op fusion in 64-bit mode,<ref name="intel-optimization-for-macro-fusion"/><ref name="agner-fog-microarchitecture"/> (Intel Xeon "Bloomfield" with Nehalem microarchitecture) |- | [[VIA Nano]] || [[hardware-based encryption]]; adaptive [[power management]] |- | 2010 ||[[Bulldozer (microarchitecture)|AMD FX]] || colspan="3" style="background: #ececec;" |48-bit || octa-core, CMT(Clustered Multi-Thread), FMA, OpenCL, AM3+ |- | rowspan="3" |2011 || AMD APU A and E Series ([[AMD Accelerated Processing Unit|Llano]]) || colspan="3" style="background: #ececec;" |40-bit || on-die GPGPU, PCI Express 2.0, [[Socket FM1]] |- | AMD APU C, E and Z Series ([[Bobcat (processor)|Bobcat]]) || rowspan = "2" colspan="3" style="background: #ececec;" |36-bit || low power smart device APU |- | [[Intel Core i3]], [[Core i5]] and [[Core i7]]<br/>([[Sandy Bridge 
(microarchitecture)|Sandy Bridge]]/[[Ivy Bridge (microarchitecture)|Ivy Bridge]]) || Internal Ring connection, decoded μ-op cache, [[LGA 1155]] socket |- | rowspan = "2" |2012 || AMD APU A Series ([[Bulldozer (processor)|Bulldozer, Trinity]] and later) || rowspan = "3" colspan="3" style="background: #ececec;" |48-bit || [[Advanced Vector Extensions|AVX]], Bulldozer based APU, [[Socket FM2]] or [[Socket FM2+]] |- | Intel [[Xeon Phi]] ([[Knights Corner]]) || PCI-E add-on card coprocessor for XEON based system, Manycore Chip, In-order [[P5 (microarchitecture)|P54C]], very wide VPU (512-bit SSE), LRBni instructions (8× 64-bit) |- | rowspan = "3" |2013 || |AMD [[Jaguar (microarchitecture)|Jaguar]]<br/>(Athlon, Sempron) || [[System on a chip|SoC]], game console and low power smart device processor |- | Intel [[Silvermont]]<br/>(Atom, Celeron, Pentium) ||colspan="3" style="background: #ececec;" |36-bit || [[System on a chip|SoC]], low/ultra-low power smart device processor |- |[[Intel Core i3]], [[Core i5]] and [[Core i7]] ([[Haswell (microarchitecture)|Haswell]]/[[Broadwell (microarchitecture)|Broadwell]]) || rowspan = "2" colspan="3" style="background: #ececec;" |39-bit || [[Advanced Vector Extensions 2|AVX2]], [[FMA instruction set|FMA3]], [[Transactional Synchronization Extensions|TSX]], [[Bit Manipulation Instruction Sets|BMI1, and BMI2]] instructions, [[LGA 1150]] socket |- | 2015 || Intel [[Broadwell (microarchitecture)|Broadwell-U]]<br/>([[Intel Core i3]], [[Core i5]], [[Core i7]], [[List of Intel Core M microprocessors|Core M]], [[Pentium]], [[Celeron]]) || SoC, on-chip Broadwell-U PCH-LP (Multi-chip module) |- |2015–2020 || Intel [[Skylake (microarchitecture)|Skylake]]/[[Kaby Lake]]/[[Cannon Lake (microarchitecture)|Cannon Lake]]/[[Coffee Lake (microarchitecture)|Coffee Lake]]/[[Rocket Lake (microarchitecture)|Rocket Lake]]<br/>(Intel Pentium/Celeron Gold, [[Core i3]], [[Core i5]], [[Core i7]], [[Core i9]]) || colspan="3" style="background: #ececec;" |46-bit || 
AVX-512 (restricted to Cannon Lake-U and workstation/server variants of Skylake) |- | 2016 || Intel [[Xeon Phi]] ([[Knights Landing (microarchitecture)|Knights Landing]]) || colspan="3" rowspan="4" style="background: #ececec" | 48-bit || Manycore CPU and coprocessor for Xeon systems, Airmont (Atom) based core |- |2016 || AMD [[Bristol Ridge]]<br/>(AMD (Pro) A6/A8/A10/A12) || Integrated FCH on die, SoC, AM4 socket |- |2017 || AMD [[Ryzen]] Series/AMD [[Epyc]] Series || AMD's implementation of SMT, on-chip multiple dies |- |2017 || Zhaoxin WuDaoKou (KX-5000, KH-20000) || [[Zhaoxin]]'s first brand new x86-64 architecture |- |2018–2021 || Intel [[Sunny Cove (microarchitecture)|Sunny Cove]] ([[Ice Lake (microprocessor)|Ice Lake]]-U and Y), [[Cypress Cove (microarchitecture)|Cypress Cove]] ([[Rocket Lake]]) || colspan="3" style="background: #ececec;" |57-bit || Intel's first implementation of AVX-512 for the consumer segment. Addition of Vector Neural Network Instructions (VNNI) |- |2019 |AMD [[AMD Matisse|Matisse]] | colspan="3" style="background: #ececec" |48-bit |Multiple Chip Module design with I/O die separate from CPU die(s), Support for PCIe Gen4 |- |2020 || Intel [[Willow Cove]] ([[Tiger Lake]]-Y/U/H) | colspan="3" rowspan="2" style="background: #ececec" |57-bit|| Dual ring interconnect architecture, updated Gaussian Neural Accelerator (GNA2), new AVX-512 Vector Intersection Instructions, addition of Control-Flow Enforcement Technology (CET) |- |2021 || Intel [[Alder Lake (microarchitecture)|Alder Lake]] || Hybrid design with performance (Golden Cove) and efficiency cores (Gracemont), support for PCIe Gen5 and DDR5, updated Gaussian Neural Accelerator (GNA3). 
AVX-512 not officially supported
|-
|2022
|AMD [[AMD Vermeer|Vermeer]] (5800X3D)
| colspan="3" rowspan="2" style="background: #ececec" |48-bit
|X3D chips have an additional 64 MB 3D vertically stacked L3 cache (3D V-Cache) for up to 96 MB L3 Cache
|-
|2022
|AMD [[AMD Raphael|Raphael]]
|AMD's first implementation of AVX-512 for the consumer segment, iGPU now standard on Ryzen CPUs with 2 [[RDNA 2]] compute cores
|}

==History==
===Designers and manufacturers===
[[File:Am386SXL-25cropped.jpg|thumb|[[Am386]], released by AMD in 1991]]
{{Further|List of former IA-32 compatible processor manufacturers}}
At various times, companies such as [[IBM]], [[VIA Technologies|VIA]], [[NEC]],{{Efn|The NEC V20 and V30 also provided the older 8080 instruction set, allowing PCs equipped with these microprocessors to operate CP/M applications at full speed (i.e., without the need to simulate an 8080 by software).}} [[AMD]], [[Texas Instruments|TI]], [[STMicroelectronics|STM]], [[Fujitsu]], [[Oki Electric Industry|OKI]], [[Siemens]], [[Cyrix]], [[Intersil]], [[Chips and Technologies|C&T]], [[NexGen]], [[United Microelectronics Corporation|UMC]], and [[Vortex86|DM&P]] started to design or manufacture{{Efn|[[Fabless]] companies designed the chip and contracted another company to manufacture it, while fabbed companies would do both the design and the manufacturing themselves. Some companies started as fabbed manufacturers and later became fabless designers, one such example being AMD.}} x86 [[central processing unit|processors]] (CPUs) intended for personal computers and embedded systems. Other companies that designed or manufactured x86 or [[x87]] processors include [[ITT Corporation]], [[National Semiconductor]], ULSI System Technology, and [[Weitek]].

Such x86 implementations were seldom simple copies but often employed different internal [[microarchitecture]]s and different solutions at the electronic and physical levels.
Quite naturally, early compatible microprocessors were 16-bit, while 32-bit designs were developed much later. For the [[personal computer]] market, real quantities started to appear around 1990 with [[i386]] and [[i486]] compatible processors, often named similarly to Intel's original chips. After the fully [[Instruction pipelining|pipelined]] [[i486]], in 1993 [[Intel]] introduced the [[Pentium]] brand name (which, unlike numbers, could be [[trademark]]ed) for their new set of [[superscalar]] x86 designs. With the x86 naming scheme now legally cleared, other x86 vendors had to choose different names for their x86-compatible products, and initially some chose to continue with variations of the numbering scheme: [[IBM]] partnered with [[Cyrix]] to produce the [[Cyrix Cx5x86|5x86]] and then the very efficient [[6x86]] (M1) and [[6x86]]MX ([[Cyrix 6x86|MII]]) lines of Cyrix designs, which were the first x86 microprocessors implementing [[register renaming]] to enable [[speculative execution]]. AMD meanwhile designed and manufactured the advanced but delayed [[5k86]] ([[AMD K5|K5]]), which, internally, was closely based on AMD's earlier [[29K]] [[RISC]] design; similar to [[NexGen]]'s [[Nx586]], it used a strategy such that dedicated pipeline stages decode x86 instructions into uniform and easily handled [[micro-operation]]s, a method that has remained the basis for most x86 designs to this day. Some early versions of these microprocessors had heat dissipation problems. The 6x86 was also affected by a few minor compatibility problems, the [[Nx586]] lacked a [[floating-point unit]] (FPU) and (the then crucial) pin-compatibility, while the [[AMD K5|K5]] had somewhat disappointing performance when it was (eventually) introduced. 
Customer ignorance of alternatives to the Pentium series further contributed to these designs being comparatively unsuccessful, despite the fact that the [[AMD K5|K5]] had very good Pentium compatibility and the [[6x86]] was significantly faster than the Pentium on integer code.{{Efn|It had a slower FPU however, which is slightly ironic as Cyrix started out as a designer of fast floating-point units for x86 processors.}} [[AMD]] later managed to grow into a serious contender with the [[AMD K6|K6]] set of processors, which gave way to the very successful [[Athlon]] and [[Opteron]]. There were also other contenders, such as [[Centaur Technology]] (formerly [[Integrated Device Technology|IDT]]), [[Rise Technology]], and [[Transmeta]]. [[VIA Technologies]]' energy efficient [[VIA C3|C3]] and [[VIA C7|C7]] processors, which were designed by the [[Centaur Technology|Centaur]] company, were sold for many years following their release in 2005. Centaur's 2008 design, the [[VIA Nano]], was their first processor with [[superscalar]] and [[speculative execution]]. It was introduced at about the same time (in 2008) as Intel introduced the [[Intel Atom]], its first "in-order" processor after the [[P5 (microarchitecture)|P5]] [[Pentium]]. Many additions and extensions have been added to the original x86 instruction set over the years, almost consistently with full [[backward compatibility]].{{Efn|Intel abandoned its "x86" naming scheme with the ''[[P5 (microarchitecture)|P5]] [[Pentium]]'' during 1993 (as numbers could not be trademarked). 
However, the term x86 was already established among technicians, compiler writers etc.}} The architecture family has been implemented in processors from Intel, [[Cyrix]], [[AMD]], [[VIA Technologies]] and many other companies; there are also open implementations, such as the Zet SoC platform (currently inactive).<ref>{{cite web |date=November 4, 2013 |title=Zet: The x86 (IA-32) open implementation: Overview |url=http://opencores.org/project,zet86 |url-status=live |archive-url=https://web.archive.org/web/20180211072830/https://opencores.org/project,zet86 |archive-date=February 11, 2018 |access-date=January 5, 2014 |website=OpenCores}}</ref> Nevertheless, of those, only Intel, AMD, VIA Technologies, and [[Vortex86|DM&P Electronics]] hold x86 architectural licenses, and from these, only the first two actively produce modern 64-bit designs, leading to what has been called a "duopoly" of Intel and AMD in x86 processors. However, in 2014 the Shanghai-based Chinese company [[Zhaoxin]], a joint venture between a Chinese company and VIA Technologies, began designing VIA based x86 processors for desktops and laptops. The release of its newest "7" family<ref>{{Cite web |title=Zhaoxin Preparing Linux Kernel Support For 7-Series Centaur CPUs |url=https://www.phoronix.com/scan.php?page=news_item&px=Zhaoxin-Centaur-Family-7-Bits |access-date=2022-04-05 |website=www.phoronix.com |language=en}}</ref> of x86 processors (e.g. 
KX-7000), which are not quite as fast as AMD or Intel chips but are still state of the art,<ref>{{Cite web |title=Zhaoxin aiming at 2021 release for its 7nm x86 CPUs - CPU - News - HEXUS.net |url=https://m.hexus.net/tech/news/cpu/137873-zhaoxin-aiming-2021-release-7nm-x86-cpus/ |access-date=2022-04-05 |website=m.hexus.net}}</ref> had been planned for 2021; as of March 2022 the release had not taken place, however.<ref>{{Cite web |title=Zhaoxin Finally Adding "Lujiazui" x86_64 CPU Tuning To GCC |url=https://www.phoronix.com/scan.php?page=news_item&px=Zhaoxin-lujiazui-GCC |access-date=2022-04-05 |website=www.phoronix.com |language=en}}</ref> ===From 16-bit and 32-bit to 64-bit architecture=== The [[instruction set architecture]] has twice been extended to a larger [[Word (computer architecture)|word]] size. In 1985, Intel released the 32-bit 80386 (later known as i386), which gradually replaced the earlier 16-bit chips in computers (although typically not in [[embedded system]]s) during the following years; this extended programming model was originally referred to as ''the i386 architecture'' (like its first implementation) but Intel later dubbed it [[IA-32]] when introducing its (unrelated) [[IA-64]] architecture. In 1999–2003, [[Advanced Micro Devices|AMD]] extended this 32-bit architecture to 64 bits and referred to it as [[x86-64]] in early documents and later as [[AMD64]]. Intel soon adopted AMD's architectural extensions under the name IA-32e, later using the name EM64T and finally using Intel 64. [[Microsoft]] and [[Sun Microsystems]]/[[Oracle Corporation|Oracle]] also use the term "x64", while many [[Linux distribution]]s and the [[BSD]]s use the term "amd64". 
Microsoft Windows, for example, designates its 32-bit versions as "x86" and 64-bit versions as "x64", while installation files of 64-bit Windows versions are required to be placed into a directory called "AMD64".<ref>{{cite web|url=http://support.microsoft.com/kb/896334|title=Setup and installation considerations for Windows x64 Edition-based computers|access-date=September 14, 2014|archive-date=September 11, 2014|archive-url=https://web.archive.org/web/20140911011914/http://support.microsoft.com/kb/896334|url-status=live}}</ref> In 2023, Intel proposed a major change to the architecture referred to as [[X86-64#X86S|X86S]] (formerly known as X86-S). The S in X86S stood for "simplification", which aimed to remove support for legacy execution modes and instructions. A processor implementing this proposal would start execution directly in [[long mode]] and would only support 64-bit operating systems. 32-bit code would only be supported for user applications running in ring 3, and would use the same simplified segmentation as long mode.<ref>{{Cite web |title=Envisioning a Simplified Intel Architecture |url=https://www.intel.com/content/www/us/en/developer/articles/technical/envisioning-future-simplified-architecture.html |website=Intel}}</ref><ref>{{Cite news |last=Larabel |first=Michael |date=2023-05-20 |title=Intel Publishes "X86-S" Specification For 64-bit Only Architecture |work=[[Phoronix]] |url=https://www.phoronix.com/news/Intel-X86-S-64-bit-Only |access-date=2023-05-20}}</ref> In December 2024 Intel cancelled this project.<ref>{{Cite web |title=toms hardware X86S |url=https://www.tomshardware.com/pc-components/cpus/intel-terminates-x86s-initiative-unilateral-quest-to-de-bloat-x86-instruction-set-comes-to-an-end}}</ref> ==Basic properties of the architecture== The x86 architecture is a variable instruction length, primarily "[[Complex instruction set computer|CISC]]" design with emphasis on [[backward compatibility]]. 
The instruction set is not typical CISC, however, but basically an extended version of the simple eight-bit [[Intel 8008|8008]] and [[Intel 8080|8080]] architectures. Memory is byte-addressable and words are stored in memory with [[endianness|little-endian]] byte order. Memory access to unaligned addresses is allowed for almost all instructions. The largest native size for [[integer (computing)|integer]] arithmetic and memory addresses (or [[offset (computer science)|offset]]s) is 16, 32 or 64 bits depending on architecture generation (newer processors include direct support for smaller integers as well). Multiple scalar values can be handled simultaneously via the SIMD unit present in later generations, as described below.{{Efn|16-bit and 32-bit microprocessors were introduced in 1978 and 1985 respectively; plans for a 64-bit extension were announced in 1999, and 64-bit processors were gradually introduced from 2003 onwards.}} Immediate addressing offsets and immediate data may be expressed as 8-bit quantities for the frequently occurring cases or contexts where a −128..127 range is enough. Typical instructions are therefore 2 or 3 bytes in length (although some are much longer, and some are single-byte). To further conserve encoding space, most registers are expressed in [[opcode]]s using three or four bits, the latter via an opcode prefix in 64-bit mode, while at most one operand to an instruction can be a memory location.{{Efn|Some "CISC" designs, such as the [[PDP-11]], may use two.}} However, this memory operand may also be the destination (or a combined source and destination), while the other operand, the source, can be either a register or an immediate. Among other factors, this contributes to a code size that rivals eight-bit machines and enables efficient use of instruction cache memory. 
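
The little-endian byte order and the sign-extended 8-bit immediate range described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical and are not taken from any Intel or AMD documentation, and the code models the arithmetic only, not an actual instruction encoder.

```python
def to_little_endian(value: int, size: int) -> bytes:
    """Return an unsigned integer as bytes in little-endian (x86 memory) order."""
    return value.to_bytes(size, byteorder="little")


def fits_imm8(value: int) -> bool:
    """True if a signed value fits the 8-bit immediate range -128..127."""
    return -128 <= value <= 127


def sign_extend_imm8(byte: int, width_bits: int) -> int:
    """Sign-extend an 8-bit immediate to width_bits.

    The result is returned as an unsigned bit pattern of that width,
    which is how the operand would appear in a register.
    """
    signed = byte - 0x100 if byte & 0x80 else byte
    return signed & ((1 << width_bits) - 1)


if __name__ == "__main__":
    # The least significant byte is stored at the lowest address.
    print(to_little_endian(0x12345678, 4).hex())
    # A displacement of -1 needs only one byte (0xFF) in the instruction.
    print(fits_imm8(-1), hex(sign_extend_imm8(0xFF, 32)))
```

A displacement such as −1 therefore occupies a single byte in the instruction stream, while a value of 128 or more requires the longer 16-bit or 32-bit form.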
The relatively small number of general registers (also inherited from its 8-bit ancestors) has made register-relative addressing (using small immediate offsets) an important method of accessing operands, especially on the stack. Much work has therefore been invested in making such accesses as fast as register accesses, i.e., a one-cycle instruction throughput, in most circumstances where the accessed data is available in the top-level cache. ===Floating point and SIMD=== A dedicated [[floating-point unit|floating-point processor]] with 80-bit internal registers, the [[Intel 8087|8087]], was developed for the original [[8086]]. This coprocessor was subsequently developed into the extended [[80387]], and later processors incorporated a [[backward compatible]] version of this functionality on the same chip as the main processor. In addition to this, modern x86 designs also contain a [[Single instruction, multiple data|SIMD]] unit (see [[Streaming SIMD Extensions|SSE]] below) where instructions can work in parallel on (one or two) 128-bit words, each containing two or four [[floating-point arithmetic|floating-point number]]s (each 64 or 32 bits wide respectively), or alternatively, 2, 4, 8 or 16 integers (each 64, 32, 16 or 8 bits wide respectively). The presence of wide SIMD registers means that existing x86 processors can load or store up to 128 bits of memory data in a single instruction and also perform bitwise operations (although not integer arithmetic{{Efn|That is because integer arithmetic generates carries between subsequent bits (unlike simple bitwise operations).}}) on full 128-bit quantities in parallel. Intel's [[Sandy Bridge]] processors added the [[Advanced Vector Extensions]] (AVX) instructions, widening the SIMD registers to 256 bits. 
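
The division of a 128-bit SIMD word into independent integer lanes can be modelled in a few lines of Python. This is a minimal sketch, assuming a register is represented as a plain integer; the helper names <code>split_lanes</code> and <code>packed_add</code> are hypothetical and do not correspond to any real instruction mnemonic. It shows why a packed addition differs from a full-width 128-bit addition: carries do not propagate from one lane into the next.

```python
def split_lanes(value: int, lane_bits: int, total_bits: int = 128) -> list:
    """Split a SIMD word into lanes, lowest lane first."""
    if total_bits % lane_bits != 0:
        raise ValueError("lane width must divide the register width")
    mask = (1 << lane_bits) - 1
    count = total_bits // lane_bits
    return [(value >> (i * lane_bits)) & mask for i in range(count)]


def packed_add(a: int, b: int, lane_bits: int, total_bits: int = 128) -> int:
    """Lane-wise wrapping addition: each lane is added independently,
    so a carry out of one lane is discarded instead of entering the next."""
    mask = (1 << lane_bits) - 1
    lanes_a = split_lanes(a, lane_bits, total_bits)
    lanes_b = split_lanes(b, lane_bits, total_bits)
    result = 0
    for i, (x, y) in enumerate(zip(lanes_a, lanes_b)):
        result |= ((x + y) & mask) << (i * lane_bits)
    return result


if __name__ == "__main__":
    # 128 bits hold 2, 4, 8 or 16 integer lanes of 64, 32, 16 or 8 bits.
    for width in (64, 32, 16, 8):
        print(width, len(split_lanes(0, width)))
    # With 8-bit lanes, 0xFF + 0x01 wraps to 0x00 within its own lane.
    print(hex(packed_add(0x00FF, 0x0001, 8)))
```

A full 128-bit integer addition of the same two operands would instead produce <code>0x0100</code>, because the carry would pass from bit 7 into bit 8.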
The Intel Initial Many Core Instructions implemented by the Knights Corner [[Xeon Phi]] processors, and the [[AVX-512]] instructions implemented by the Knights Landing Xeon Phi processors and by [[Skylake (microarchitecture)#High-end desktop processors (Skylake-X)|Skylake-X]] processors, use 512-bit wide SIMD registers. <!-- The fact that "MOV" has been extended to cope with 128-bit words does not make the 128-bit SSE registers general purpose. The bitwise instructions extended for 128-bit SSE registers and memory locations is just SSE/SIMD, plain and simple. The fact that 128-bit registers can be pushed and popped to/from the stack with "normal instructions" is nothing more remarkable than the "MOV" mentioned above (although very useful). Only 128-bit SSE words (not 128-bit integers or addresses) are enabled by the single-instruction-single-data core. What opcodes are used are irrelevant here.--> ==Current implementations== During [[Execution (computers)|execution]], current x86 processors employ a few extra decoding steps to split most instructions into smaller pieces called micro-operations. These are then handed to a [[control unit]] that buffers and schedules them in compliance with x86-semantics so that they can be executed, partly in parallel, by one of several (more or less specialized) [[execution units]]. 
These modern x86 designs are thus [[Instruction pipelining|pipelined]], [[superscalar]], and also capable of [[out-of-order execution|out of order]] and [[speculative execution]] (via [[branch prediction]], [[register renaming]], and [[memory dependence prediction]]), which means they may execute multiple (partial or complete) x86 instructions simultaneously, and not necessarily in the same order as given in the instruction stream.<ref>{{cite web|url=http://www.intel.com/support/processors/sb/CS-030169.htm?wapkw=8086+processor|title=Processors — What mode of addressing do the Intel Processors use?|access-date=September 14, 2014|archive-date=September 11, 2014|archive-url=https://web.archive.org/web/20140911003022/http://www.intel.com/support/processors/sb/CS-030169.htm?wapkw=8086+processor|url-status=live}}</ref> Some Intel CPUs ([[Xeon#Foster|Xeon Foster MP]], some [[Pentium 4]], and some [[Nehalem (microarchitecture)|Nehalem]] and later [[Intel Core]] processors) and AMD CPUs (starting from [[Zen (microarchitecture)|Zen]]) are also capable of [[simultaneous multithreading]] with two [[thread (computer science)|threads]] per [[multi-core processor|core]] ([[Xeon Phi]] has four threads per core). Some Intel CPUs support [[transactional memory]] ([[Transactional Synchronization Extensions|TSX]]). When introduced, in the mid-1990s, this method was sometimes referred to as a "RISC core" or as "RISC translation", partly for marketing reasons, but also because these micro-operations share some properties with certain types of RISC instructions. However, traditional [[microcode]] (used since the 1950s) also inherently shares many of the same properties; the new method differs mainly in that the translation to micro-operations now occurs asynchronously. 
Not having to synchronize the execution units with the decode steps opens up possibilities for more analysis of the (buffered) code stream, and therefore permits detection of operations that can be performed in parallel, simultaneously feeding more than one execution unit. The latest processors also do the opposite when appropriate; they combine certain x86 sequences (such as a compare followed by a conditional jump) into a more complex micro-op which fits the execution model better and thus can be executed faster or with fewer machine resources involved. Another way to try to improve performance is to cache the decoded micro-operations, so the processor can directly access the decoded micro-operations from a special cache, instead of decoding them again. Intel followed this approach with the Execution Trace Cache feature in their [[NetBurst]] microarchitecture (for Pentium 4 processors) and later in the Decoded Stream Buffer (for Core-branded processors since Sandy Bridge).<ref>{{cite web|url=http://software.intel.com/sites/products/documentation/doclib/iss/2013/amplifier/lin/ug_docs/GUID-143D1B76-D97F-454F-9B4B-91F2D791B66D.htm|title=DSB Switches|work=Intel VTune Amplifier 2013|publisher=Intel|access-date=August 26, 2013|archive-date=December 2, 2013|archive-url=https://web.archive.org/web/20131202232818/http://software.intel.com/sites/products/documentation/doclib/iss/2013/amplifier/lin/ug_docs/GUID-143D1B76-D97F-454F-9B4B-91F2D791B66D.htm|url-status=live}}</ref> [[Transmeta]] used a completely different method in their [[Transmeta Crusoe|Crusoe]] x86 compatible CPUs. They used [[Just-in-time compilation|just-in-time]] translation to convert x86 instructions to the CPU's native [[VLIW]] instruction set. Transmeta argued that their approach allows for more power efficient designs since the CPU can forgo the complicated decode step of more traditional x86 implementations. 
==Addressing modes== [[Addressing mode]]s for 16-bit processor modes can be summarized by the formula:<ref>{{cite web|title=The 8086 Family User's Manual|page=2{{hyp}}68|date=October 1979|publisher=Intel Corporation|url=http://bitsavers.org/components/intel/8086/9800722-03_The_8086_Family_Users_Manual_Oct79.pdf|access-date=March 28, 2018|archive-date=April 4, 2018|archive-url=https://web.archive.org/web/20180404223644/http://www.bitsavers.org/components/intel/8086/9800722-03_The_8086_Family_Users_Manual_Oct79.pdf|url-status=live}}</ref><ref>{{cite web|title=iAPX 286 Programmer's Reference Manual|at=2.4.3 Memory Addressing Modes|year=1983|publisher=Intel Corporation|url=http://bitsavers.org/components/intel/80286/210498-001_iAPX_286_Programmers_Reference_1983.pdf|access-date=August 28, 2017|archive-date=August 28, 2017|archive-url=https://web.archive.org/web/20170828232803/http://www.bitsavers.org/components/intel/80286/210498-001_iAPX_286_Programmers_Reference_1983.pdf|url-status=live}}</ref> :<math> \begin{matrix} \mathtt{CS}: \\ \mathtt{DS}: \\ \mathtt{SS}: \\ \mathtt{ES}: \end{matrix}\ \ \begin{pmatrix} \\ \begin{bmatrix} \mathtt{BX} \\ \mathtt{BP} \end{bmatrix} + \begin{bmatrix} \mathtt{SI} \\ \mathtt{DI} \end{bmatrix} \\ \\ \end{pmatrix} + \rm displacement </math> Addressing modes for 32-bit x86 processor modes<ref>{{cite book|title=80386 Programmer's Reference Manual|at=2.5.3.2 EFFECTIVE-ADDRESS COMPUTATION|year=1986|publisher=Intel Corporation|url=http://bitsavers.org/components/intel/80386/230985-001_80386_Programmers_Reference_Manual_1986.pdf|access-date=March 28, 2018|archive-date=December 28, 2018|archive-url=https://web.archive.org/web/20181228110138/http://bitsavers.org/components/intel/80386/230985-001_80386_Programmers_Reference_Manual_1986.pdf|url-status=live}}</ref> can be summarized by the formula:<ref name="addrmodes">{{cite book|title=Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 1: Basic Architecture|at=Chapter 
3|date=March 2018|publisher=Intel Corporation|url=http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html|access-date=March 19, 2014|archive-date=January 26, 2012|archive-url=https://web.archive.org/web/20120126002939/http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html|url-status=live}}</ref> :<math> \begin{matrix} \mathtt{CS}: \\ \mathtt{DS}: \\ \mathtt{SS}: \\ \mathtt{ES}: \\ \mathtt{FS}: \\ \mathtt{GS}: \end{matrix}\ \ \begin{bmatrix} \mathtt{EAX} \\ \mathtt{EBX} \\ \mathtt{ECX} \\ \mathtt{EDX} \\ \mathtt{ESP} \\ \mathtt{EBP} \\ \mathtt{ESI} \\ \mathtt{EDI} \end{bmatrix} + \begin{pmatrix}\\ \begin{bmatrix} \mathtt{EAX} \\ \mathtt{EBX} \\ \mathtt{ECX} \\ \mathtt{EDX} \\ \mathtt{EBP} \\ \mathtt{ESI} \\ \mathtt{EDI} \end{bmatrix} * \begin{bmatrix} 1 \\ 2 \\ 4 \\ 8 \end{bmatrix} \\ \\ \end{pmatrix} + \rm displacement </math> Addressing modes for the 64-bit processor mode can be summarized by the formula:<ref name="addrmodes"/> :<math> \begin{Bmatrix} \\ \begin{matrix} \mathtt{FS}: \\ \mathtt{GS}: \end{matrix}\ \ \begin{bmatrix} \vdots \\ \mathtt{GPR} \\ \vdots \end{bmatrix} + \begin{pmatrix} \\ \begin{bmatrix} \vdots \\ \mathtt{GPR} \\ \vdots \\ \end{bmatrix} * \begin{bmatrix} 1\\2\\4\\8 \end{bmatrix} \\ \\ \end{pmatrix} \\ \\ \hline \\ \begin{matrix} \mathtt{RIP} \end{matrix} \\ \\ \end{Bmatrix} + \rm displacement </math> Instruction relative addressing in 64-bit code (RIP + displacement, where RIP is the [[instruction pointer|instruction pointer register]]) simplifies the implementation of [[position-independent code]] (as used in [[shared libraries]] in some operating systems).<ref name="Andriesse 2019">{{cite book | last=Andriesse | first=Dennis | title=Practical binary analysis: build your own Linux tools for binary instrumentation, analysis, and disassembly | publisher=No Starch Press, Inc | publication-place=San Francisco, CA | year=2019 | isbn=978-1-59327-913-4 | 
oclc=1050453850 | section=6.5 Effects of Compiler Settings on Disassembly}}</ref> The 8086 had {{val|64|u=KB}} of eight-bit (or alternatively {{val|32|u=K-word of 16-bit}}) [[I/O]] space, and a {{val|64|u=KB}} (one segment) [[Stack (data structure)|stack]] in memory supported by [[computer hardware]]. Only words (two bytes) can be pushed to the stack. The stack grows toward numerically lower addresses, with {{mono|SS:SP}} pointing to the most recently pushed item. There are 256 [[interrupt]]s, which can be invoked by both hardware and software. The interrupts can cascade, using the stack to store the [[Return statement|return address]]. ==x86 registers== {{Hatnote|For a description of the general notion of a CPU register, see [[Processor register]].}} ===16-bit=== The original [[Intel 8086]] and [[Intel 8088|8088]] have fourteen 16-[[bit]] registers. Four of them (AX, BX, CX, DX) are general-purpose registers (GPRs), although each may have an additional purpose; for example, only CX can be used as a counter with the loop instruction. Each can be accessed as two separate bytes (thus BX's high byte can be accessed as BH and low byte as BL). Two pointer registers have special roles: SP (stack pointer) points to the "top" of the [[stack (data structure)|stack]], and BP (base pointer) is often used to point at some other place in the stack, typically above the local variables (see [[frame pointer]]). The registers SI, DI, BX and BP are [[address register]]s, and may also be used for array indexing. One of four possible 'segment registers' (CS, DS, SS and ES) is used to form a memory address. In the original 8086 / 8088 / 80186 / 80188 every address was built from a segment register and one of the general purpose registers. For example ds:si is the notation for an address formed as [16 * ds + si] to allow 20-bit addressing rather than 16 bits, although this changed in later processors. At that time only certain combinations were supported. 
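
The segment-based address formation and the behaviour of the 8086 stack described above can be combined in a small model. The following Python sketch is an illustration only: the class name <code>Stack8086</code> and its methods are hypothetical, memory is simplified to a flat 1 MB byte array, and no interrupts or other processor state are modelled. It forms the physical address as 16 × segment + offset, pushes only 16-bit words, and lets the stack grow toward lower addresses with SS:SP pointing at the most recently pushed item.

```python
class Stack8086:
    """Toy model of the 8086 hardware stack (hypothetical, for illustration)."""

    def __init__(self, ss: int, sp: int) -> None:
        self.ss = ss & 0xFFFF
        self.sp = sp & 0xFFFF
        self.memory = bytearray(1 << 20)  # 20-bit physical address space

    def _address(self, offset: int) -> int:
        # Physical address = 16 * segment + 16-bit offset, truncated to 20 bits.
        return ((self.ss << 4) + (offset & 0xFFFF)) & 0xFFFFF

    def push(self, word: int) -> None:
        # SP is decremented first, then the word is stored little-endian.
        self.sp = (self.sp - 2) & 0xFFFF
        self.memory[self._address(self.sp)] = word & 0xFF
        self.memory[self._address(self.sp + 1)] = (word >> 8) & 0xFF

    def pop(self) -> int:
        # The word at SS:SP is read, then SP is incremented by two.
        low = self.memory[self._address(self.sp)]
        high = self.memory[self._address(self.sp + 1)]
        self.sp = (self.sp + 2) & 0xFFFF
        return (high << 8) | low


if __name__ == "__main__":
    stack = Stack8086(ss=0x2000, sp=0x0100)
    stack.push(0x1234)
    print(hex(stack.sp), hex(stack.pop()), hex(stack.sp))
```

Because SP is a 16-bit register and the offset wraps at 64 KB, the stack in this model cannot extend beyond the one segment selected by SS.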
The [[FLAGS register (computing)|FLAGS register]] contains [[Flag (computing)|flag]]s such as [[carry flag]], [[overflow flag]] and [[zero flag]]. Finally, the [[instruction pointer]] (IP) points to the next instruction that will be fetched from memory and then executed; this register cannot be directly accessed (read or written) by a program.<ref>{{cite web |url=http://www.cs.virginia.edu/~evans/cs216/guides/x86.html |title=Guide to x86 Assembly |publisher=Cs.virginia.edu |date=September 11, 2013 |access-date=February 6, 2014 |archive-date=March 24, 2020 |archive-url=https://web.archive.org/web/20200324154938/http://www.cs.virginia.edu/~evans/cs216/guides/x86.html |url-status=live }}</ref> The [[Intel 80186]] and [[Intel 80188|80188]] are essentially an upgraded 8086 or 8088 CPU, respectively, with on-chip peripherals added, and they have the same CPU registers as the 8086 and 8088 (in addition to interface registers for the peripherals). The 8086, 8088, 80186, and 80188 can use an optional floating-point coprocessor, the [[Intel 8087|8087]]. The 8087 appears to the programmer as part of the CPU and adds eight 80-bit wide registers, st(0) to st(7), each of which can hold numeric data in one of seven formats: 32-, 64-, or 80-bit floating point, 16-, 32-, or 64-bit (binary) integer, and 80-bit packed decimal integer.<ref name="i86"/>{{rp|S-6, S-13..S-15}} It also has its own 16-bit status register accessible through the {{mono|fstsw}} instruction, and it is common to simply use some of its bits for branching by copying it into the normal FLAGS.<ref>{{cite web |title=FSTSW/FNSTSW — Store x87 FPU Status Word |url=https://www.felixcloutier.com/x86/fstsw:fnstsw |quote=The FNSTSW AX form of the instruction is used primarily in conditional branching... 
|access-date=January 15, 2020 |archive-date=January 25, 2022 |archive-url=https://web.archive.org/web/20220125121653/https://www.felixcloutier.com/x86/fstsw:fnstsw |url-status=live }}</ref> In the [[Intel 80286]], to support [[protected mode]], three special registers hold descriptor table addresses (GDTR, LDTR, [[Interrupt descriptor table|IDTR]]), and a fourth task register (TR) is used for task switching. The [[Intel 80287|80287]] is the floating-point coprocessor for the 80286 and has the same registers as the 8087 with the same data formats. ===32-bit=== [[File:Table of x86 Registers svg.svg|thumb|upright=2.2|Registers available in the x86-64 instruction set]] With the advent of the 32-bit [[Intel 80386|80386]] processor, the 16-bit general-purpose registers, base registers, index registers, instruction pointer, and [[FLAGS register]], but not the segment registers, were expanded to 32 bits. The nomenclature represented this by prefixing an "'''E'''" (for "extended") to the register names in [[x86 assembly language]]. Thus, the AX register corresponds to the lower 16 bits of the new 32-bit EAX register, SI corresponds to the lower 16 bits of ESI, and so on. The general-purpose registers, base registers, and index registers can all be used as the base in addressing modes, and all of those registers except for the stack pointer can be used as the index in addressing modes. Two new segment registers (FS and GS) were added. With a greater number of registers, instructions and operands, the [[machine code]] format was expanded. To provide backward compatibility, segments with executable code can be marked as containing either 16-bit or 32-bit instructions. Special prefixes allow inclusion of 32-bit instructions in a 16-bit segment or vice versa. 
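
The way the 16-bit and 8-bit register names alias parts of an extended register can be sketched as follows. This Python model is a hedged illustration, with a hypothetical class name and property names chosen to mirror EAX, AX, AH and AL; it represents the register as one 32-bit value and exposes the narrower names as views onto its low bits, so that writing a narrow part leaves the remaining bits unchanged.

```python
class RegisterA:
    """Toy model of the 32-bit accumulator and its sub-registers."""

    def __init__(self, value: int = 0) -> None:
        self.eax = value & 0xFFFFFFFF

    @property
    def ax(self) -> int:
        # AX is the lower 16 bits of EAX.
        return self.eax & 0xFFFF

    @ax.setter
    def ax(self, value: int) -> None:
        self.eax = (self.eax & 0xFFFF0000) | (value & 0xFFFF)

    @property
    def ah(self) -> int:
        # AH is bits 8..15, the high byte of AX.
        return (self.eax >> 8) & 0xFF

    @ah.setter
    def ah(self, value: int) -> None:
        self.eax = (self.eax & 0xFFFF00FF) | ((value & 0xFF) << 8)

    @property
    def al(self) -> int:
        # AL is bits 0..7, the low byte of AX.
        return self.eax & 0xFF

    @al.setter
    def al(self, value: int) -> None:
        self.eax = (self.eax & 0xFFFFFF00) | (value & 0xFF)


if __name__ == "__main__":
    reg = RegisterA(0x12345678)
    print(hex(reg.ax), hex(reg.ah), hex(reg.al))
    reg.ax = 0xABCD  # the upper 16 bits of EAX are preserved
    print(hex(reg.eax))
```

The same layout applies to the B, C and D registers; the pointer and index registers have 16-bit names (SP, BP, SI, DI) but no separately addressable high byte.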
The 80386 had an optional floating-point coprocessor, the [[80387]]; it had eight 80-bit wide registers: st(0) to st(7),<ref>{{cite book|url= http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf|title= Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture|at= Chapter 8|publisher= Intel|date= March 2013|access-date= April 23, 2013|archive-date= April 2, 2013|archive-url= https://web.archive.org/web/20130402233513/http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf|url-status= live}}</ref> like the 8087 and 80287. The 80386 could also use an 80287 coprocessor.<ref>{{cite web|url=http://www.cpu-world.com/CPUs/80287/|title=Intel 80287 family|website=CPU-world|access-date=July 21, 2016|archive-date=August 9, 2016|archive-url=https://web.archive.org/web/20160809185320/http://www.cpu-world.com/CPUs/80287/|url-status=live}}</ref> With the [[80486]] and all subsequent x86 models, the floating-point processing unit (FPU) is integrated on-chip. 
The [[Pentium MMX]] added eight 64-bit [[MMX (instruction set)|MMX]] integer vector registers (MM0 to MM7, which share lower bits with the 80-bit-wide FPU stack).<ref>{{cite book|url= http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf|title= Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture|at= Chapter 9|publisher= Intel|date= March 2013|access-date= April 23, 2013|archive-date= April 2, 2013|archive-url= https://web.archive.org/web/20130402233513/http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf|url-status= live}}</ref> With the [[Pentium III]], Intel added a 32-bit [[Streaming SIMD Extensions]] (SSE) control/status register (MXCSR) and eight 128-bit SSE floating-point registers (XMM0 to XMM7).<ref>{{cite book |url= http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf |title= Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture |at= Chapter 10 |publisher= Intel |date= March 2013 |access-date= April 23, 2013 |archive-date= April 2, 2013 |archive-url= https://web.archive.org/web/20130402233513/http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf |url-status= live }}</ref> ===64-bit=== {{Further|x86-64}} Starting with the [[AMD Opteron]] processor, the x86 architecture extended the 32-bit registers into 64-bit registers in a way similar to how the 16 to 32-bit extension took place. An '''R'''-prefix (for "register") identifies the 64-bit registers (RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP, RFLAGS, RIP), and eight additional 64-bit general registers (R8–R15) were also introduced in the creation of [[x86-64]]. Also, eight more SSE vector registers (XMM8–XMM15) were added. 
However, these extensions are only usable in 64-bit mode, one of the two sub-modes of [[long mode]] (the other being compatibility mode). The addressing modes were not dramatically changed from 32-bit mode, except that addressing was extended to 64 bits, virtual addresses are now sign-extended to 64 bits (in order to disallow mode bits in virtual addresses), and other selector details were dramatically reduced. In addition, an addressing mode was added to allow memory references relative to RIP (the [[instruction pointer]]), to ease the implementation of [[position-independent code]], used in shared libraries in some operating systems. ===128-bit=== {{See also|Streaming SIMD Extensions#Registers}} SIMD registers XMM0–XMM15 (XMM0–XMM31 when [[AVX-512]] is supported). ===256-bit=== {{See also|Advanced Vector Extensions#New features}} SIMD registers YMM0–YMM15 (YMM0–YMM31 when [[AVX-512]] is supported). The lower half of each of the YMM registers maps onto the corresponding XMM register. ===512-bit=== {{See also|Advanced Vector Extensions#AVX-512}} SIMD registers ZMM0–ZMM31. The lower half of each of the ZMM registers maps onto the corresponding YMM register. ===Miscellaneous/special purpose=== x86 processors that have a [[protected mode]], i.e. the 80286 and later processors, also have three descriptor registers (GDTR, LDTR, [[Interrupt descriptor table|IDTR]]) and a task register (TR). 
32-bit x86 processors (starting with the 80386) also include various special/miscellaneous registers such as [[control register]]s (CR0 through 4, CR8 for 64-bit only), [[debug register]]s (DR0 through 3, plus 6 and 7), [[test register]]s (TR3 through 7; 80486 only), and [[model-specific register]]s (MSRs, appearing with the Pentium{{Efn|Two MSRs of particular interest are SYSENTER_EIP_MSR<!-- (0x176) --> and SYSENTER_ESP_MSR<!-- (0x175) -->, introduced on the Pentium® II processor, which store the address of the kernel mode system service handler<!-- nt!KiFastCallEntry --> and corresponding kernel stack pointer. Initialized during system startup, SYSENTER_EIP_MSR and SYSENTER_ESP_MSR are used by the SYSENTER (Intel) or SYSCALL (AMD) instructions to achieve Fast System Calls, about three times faster<!-- http://www.codeguru.com/cpp/misc/misc/system/article.php/c8223/System-Call-Optimization-with-the-SYSENTER-Instruction.htm 266% as fast (166% faster) on a PIII Dual 800 MHz --> than the software interrupt method used previously.}}). [[AVX-512]] has eight extra 64-bit mask registers K0–K7 for selecting elements in a vector register. Depending on the vector register and element widths, only a subset of bits of the mask register may be used by a given instruction. ===Purpose=== Although the main registers (with the exception of the instruction pointer) are "general-purpose" in the 32-bit and 64-bit versions of the instruction set and can be used for anything, it was originally envisioned that they be used for the following purposes: * AL/AH/AX/EAX/RAX: Accumulator * CL/CH/CX/ECX/RCX: Counter (for use with loops and strings) * DL/DH/DX/EDX/RDX: Extend the precision of the accumulator (e.g. combine 32-bit EAX and EDX for 64-bit integer operations in 32-bit code) * BL/BH/BX/EBX/RBX: Base index (for use with arrays) * SP/ESP/RSP: Stack pointer for top address of the stack. * BP/EBP/RBP: Stack base pointer for holding the address of the current [[stack frame]]. 
* SI/ESI/RSI: ''Source index'' for [[string (computer science)|string]] operations. * DI/EDI/RDI: ''Destination index'' for string operations. * IP/EIP/RIP: Instruction pointer. Holds the [[program counter]], the address of next instruction. Segment registers: *CS: Code *DS: Data *SS: Stack *ES: Extra data *FS: Extra data #2 *GS: Extra data #3 No particular purposes were envisioned for the other 8 registers available only in 64-bit mode. Some instructions compile and execute more efficiently when using these registers for their designed purpose. For example, using AL as an [[Accumulator (computing)|accumulator]] and adding an immediate byte value to it produces the efficient ''add to AL'' [[opcode]] of 04h, whilst using the BL register produces the generic and longer ''add to register'' opcode of 80C3h. Another example is double precision division and multiplication that works specifically with the AX and DX registers. Modern compilers benefited from the introduction of the ''sib'' byte (''scale-index-base byte'') that allows registers to be treated uniformly ([[minicomputer]]-like). However, using the sib byte universally is non-optimal, as it produces longer encodings than only using it selectively when necessary. (The main benefit of the sib byte is the orthogonality and more powerful addressing modes it provides, which make it possible to save instructions and the use of registers for address calculations such as scaling an index.) Some special instructions lost priority in the hardware design and became slower than equivalent small code sequences. A notable example is the LODSW instruction. ===Structure=== {| class="wikitable" |+ General Purpose Registers (A, B, C and D) ! style="width:50pt;"| 64 ! style="width:50pt;"| 56 ! style="width:50pt;"| 48 ! style="width:50pt;"| 40 ! style="width:50pt;"| 32 ! style="width:50pt;"| 24 ! style="width:50pt;"| 16 ! 
style="width:50pt;"| 8 |- | colspan="8" style="text-align:center;"| R?X |- | colspan="4" style="background:lightgrey" | | colspan="4" style="text-align:center;"| E?X |- | colspan="6" style="background:lightgrey" | | colspan="2" style="text-align:center;"| ?X |- | colspan="6" style="background:lightgrey" | | style="text-align:center;"| ?H | style="text-align:center;"| ?L |} {| class="wikitable" |+ 64-bit mode-only General Purpose Registers (R8, R9, R10, R11, R12, R13, R14, R15) ! style="width:50pt;"| 64 ! style="width:50pt;"| 56 ! style="width:50pt;"| 48 ! style="width:50pt;"| 40 ! style="width:50pt;"| 32 ! style="width:50pt;"| 24 ! style="width:50pt;"| 16 ! style="width:50pt;"| 8 |- | colspan="8" style="text-align:center;"| ? |- | colspan="4" style="background:lightgrey" | | colspan="4" style="text-align:center;"| ?D |- | colspan="6" style="background:lightgrey" | | colspan="2" style="text-align:center;"| ?W |- | colspan="7" style="background:lightgrey" | | style="text-align:center;"| ?B |} {| class="wikitable" |+ Segment Registers (C, D, S, E, F and G) ! style="width:50pt;"| 16 ! style="width:50pt;"| 8 |- | colspan="2" style="text-align:center;"| ?S |} {| class="wikitable" |+ Pointer Registers (S and B) ! style="width:50pt;"| 64 ! style="width:50pt;"| 56 ! style="width:50pt;"| 48 ! style="width:50pt;"| 40 ! style="width:50pt;"| 32 ! style="width:50pt;"| 24 ! style="width:50pt;"| 16 ! style="width:50pt;"| 8 |- | colspan="8" style="text-align:center;"| R?P |- | colspan="4" style="background:lightgrey" | | colspan="4" style="text-align:center;"|E?P |- | colspan="6" style="background:lightgrey" | | colspan="2" style="text-align:center;"|?P |- | colspan="7" style="background:lightgrey" | | style="text-align:center;"| ?PL |} Note: The ?PL registers are only available in 64-bit mode. {| class="wikitable" |+ Index Registers (S and D) ! style="width:50pt;"| 64 ! style="width:50pt;"| 56 ! style="width:50pt;"| 48 ! style="width:50pt;"| 40 ! style="width:50pt;"| 32 ! 
style="width:50pt;"| 24 ! style="width:50pt;"| 16 ! style="width:50pt;"| 8 |- | colspan="8" style="text-align:center;"| R?I |- | colspan="4" style="background:lightgrey" | | colspan="4" style="text-align:center;"| E?I |- | colspan="6" style="background:lightgrey" | | colspan="2" style="text-align:center;"| ?I |- | colspan="7" style="background:lightgrey" | | style="text-align:center;"| ?IL |} Note: The ?IL registers are only available in 64-bit mode. {| class="wikitable" |+ Instruction Pointer Register (I) ! style="width:50pt;"| 64 ! style="width:50pt;"| 56 ! style="width:50pt;"| 48 ! style="width:50pt;"| 40 ! style="width:50pt;"| 32 ! style="width:50pt;"| 24 ! style="width:50pt;"| 16 ! style="width:50pt;"| 8 |- | colspan="8" style="text-align:center;"| RIP |- | colspan="4" style="background:lightgrey" | | colspan="4" style="text-align:center;"| EIP |- | colspan="6" style="background:lightgrey" | | colspan="2" style="text-align:center;"| IP |} ==Operating modes== ===Real mode=== {{Main|Real mode}} {{More citations needed section|date=January 2014}} Real Address mode,<ref>{{cite book|url=http://bitsavers.org/components/intel/80286/210498-001_iAPX_286_Programmers_Reference_1983.pdf|title=iAPX 286 Programmer's Reference|at=Section 1.2, "Modes of Operation"|publisher=Intel|year=1983|access-date=January 27, 2014|archive-date=August 28, 2017|archive-url=https://web.archive.org/web/20170828232803/http://www.bitsavers.org/components/intel/80286/210498-001_iAPX_286_Programmers_Reference_1983.pdf|url-status=live}}</ref> commonly called Real mode, is an operating mode of [[8086]] and later x86-compatible [[Central processing unit|CPUs]]. 
Real mode is characterized by a 20-bit segmented memory address space (meaning that only slightly more than 1 [[Mebibyte|MiB]] of memory can be addressed{{Efn|Because a segmented address is the sum of a 16-bit segment multiplied by 16 and a 16-bit offset, the maximum address is 1,114,095 (10FFEF hex), for an addressability of 1,114,096 bytes {{=}} 1 MB + 65,520 bytes. Before the 80286, x86 CPUs had only 20 physical address lines (address bit signals), so the 21st bit of the address, bit 20, was dropped and addresses past 1 MB were mirrors of the low end of the address space (starting from address zero). Since the 80286, all x86 CPUs have at least 24 physical address lines, and bit 20 of the computed address is brought out onto the address bus in real mode, allowing the CPU to address the full 1,114,096 bytes reachable with an x86 segmented address. On the popular IBM PC platform, switchable hardware to disable the 21st address bit was added to machines with an 80286 or later so that all programs designed for 8088/8086-based models could run, while newer software could take advantage of the "high" memory in real mode and the full 16 MB or larger address space in protected mode—see A20 gate.}}), direct software access to peripheral hardware, and no concept of [[memory protection]] or [[computer multitasking|multitasking]] at the hardware level. All x86 CPUs in the [[Intel 80286|80286]] series and later start up in real mode at power-on; [[Intel 80186|80186]] CPUs and earlier had only one operational mode, which is equivalent to real mode in later chips. (On the IBM PC platform, direct software access to the IBM [[BIOS]] routines is available only in real mode, since BIOS is written for real mode. However, this is not a property of the x86 CPU but of the IBM BIOS design.) In order to use more than 64 KB of memory, the segment registers must be used. 
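
The segment arithmetic described above can be illustrated with a short Python sketch. The function name <code>real_mode_address</code> and its <code>address_lines</code> parameter are hypothetical, chosen only for this illustration. Passing 20 models the address wrap-around of pre-80286 CPUs, and 24 models an 80286 with the 21st address bit enabled.

```python
def real_mode_address(segment: int, offset: int, address_lines: int = 20) -> int:
    """Physical address for a 16-bit segment:offset pair (illustrative model only)."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    linear = (segment << 4) + offset            # segment * 16 + offset
    return linear & ((1 << address_lines) - 1)  # keep only the bits the bus carries


# Different segment:offset pairs can name the same byte.
print(hex(real_mode_address(0x1234, 0x0005)))      # 0x12345
print(hex(real_mode_address(0x1000, 0x2345)))      # 0x12345
# Highest reachable address with bit 20 available (80286 and later).
print(hex(real_mode_address(0xFFFF, 0xFFFF, 24)))  # 0x10ffef
# The same pair on a 20-line bus wraps to the low end of memory.
print(hex(real_mode_address(0xFFFF, 0xFFFF, 20)))  # 0xffef
```

The last two calls correspond to the 1,114,095 (10FFEF hex) maximum and the mirroring behaviour mentioned in the note.
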
Managing segments explicitly created great complications for compiler implementors, who introduced pointer types such as "near", "far" and "huge" to exploit the implicit nature of the segmented architecture to different degrees: some pointers contain only a 16-bit offset within an implied segment, while others contain both a segment address and an offset within that segment. It is technically possible to use up to 256 KB of memory for code and data, with up to 64 KB for code, by setting all four segment registers once and then using only 16-bit offsets (optionally with default-segment override prefixes) to address memory. However, this puts substantial restrictions on the way data can be addressed and memory operands can be combined. It also violates the architectural intent of the Intel designers, which is that, in new programs not ported from earlier 8-bit processors with 16-bit address spaces, separate data items (e.g. arrays, structures, code units) are contained in separate segments and addressed by their own segment addresses.

===Unreal mode===
{{Main|Unreal mode}}
Unreal mode is used by some 16-bit [[operating system]]s and some 32-bit [[boot loader]]s.

===System Management Mode===
{{See also|System Management Mode}}
The System Management Mode (SMM) is used only by the system firmware ([[BIOS]]/[[UEFI]]), not by [[operating system]]s and applications software. SMM code runs in SMRAM.

===Protected mode===
{{Main|Protected mode}}
{{More citations needed section|date=January 2014}}
In addition to real mode, the Intel 80286 supports protected mode, expanding addressable [[physical memory]] to 16 [[megabyte|MB]] and addressable [[virtual memory]] to 1 [[gigabyte|GB]], and providing [[protected memory]], which prevents programs from corrupting one another. This is done by using the segment registers only for storing an index into a descriptor table that is stored in memory. 
There are two such tables, the [[Global Descriptor Table]] (GDT) and the [[Local Descriptor Table]] (LDT), each holding up to 8192 segment descriptors, each segment giving access to 64 KB of memory. In the 80286, a segment descriptor provides a 24-bit [[base address]], and this base address is added to a 16-bit offset to create an absolute address. The base address from the table fulfills the same role that the literal value of the segment register fulfills in real mode; the segment registers have been converted from direct registers to indirect registers. Each segment can be assigned one of four [[ring (computer security)|ring]] levels used for hardware-based [[computer security]]. Each segment descriptor also contains a segment limit field which specifies the maximum offset that may be used with the segment. Because offsets are 16 bits, segments are still limited to 64 KB each in 80286 protected mode.<ref>{{cite book|year=1983|url=http://bitsavers.org/components/intel/80286/210498-001_iAPX_286_Programmers_Reference_1983.pdf|title=iAPX 286 Programmer's Reference|at=Chapter 6, "Memory Management and Virtual Addressing"|publisher=Intel|access-date=January 27, 2014|archive-date=August 28, 2017|archive-url=https://web.archive.org/web/20170828232803/http://www.bitsavers.org/components/intel/80286/210498-001_iAPX_286_Programmers_Reference_1983.pdf|url-status=live}}</ref> Each time a segment register is loaded in protected mode, the 80286 must read a 6-byte segment descriptor from memory into a set of hidden internal registers. Thus, loading segment registers is much slower in protected mode than in real mode, and changing segments very frequently is to be avoided. Actual memory operations using protected mode segments are not slowed much because the 80286 and later have hardware to check the offset against the segment limit in parallel with instruction execution. 
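
The 80286 translation described above can be sketched in Python as follows. This is a simplified model, not a description of the hardware: the names <code>Descriptor</code>, <code>decode_selector</code> and <code>translate_286</code> are hypothetical, descriptor tables are represented as plain dictionaries, and access rights, privilege checks and expand-down segments are ignored. The selector layout assumed here (a 13-bit index, one table-indicator bit and a 2-bit requested privilege level) is consistent with the limit of 8192 descriptors per table.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    base: int   # 24-bit segment base address
    limit: int  # 16-bit limit: the largest offset allowed in the segment


def decode_selector(selector: int) -> dict:
    """Split a 16-bit segment selector into its three fields."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("selector must be a 16-bit value")
    return {
        "index": selector >> 3,                         # bits 15..3: 8192 possible entries
        "table": "LDT" if selector & 0b100 else "GDT",  # bit 2: table indicator
        "rpl": selector & 0b11,                         # bits 1..0: requested privilege level
    }


def translate_286(selector: int, offset: int, gdt: dict, ldt: dict) -> int:
    """Return base + offset after the null-selector and limit checks."""
    fields = decode_selector(selector)
    if fields["table"] == "GDT" and fields["index"] == 0:
        raise ValueError("null selector refers to no segment")
    table = gdt if fields["table"] == "GDT" else ldt
    descriptor = table[fields["index"]]
    if not 0 <= offset <= descriptor.limit:
        raise ValueError("offset exceeds the segment limit")
    return descriptor.base + offset


# Hypothetical table contents, for illustration only.
gdt = {1: Descriptor(base=0x123456, limit=0x0FFF)}
print(hex(translate_286(0x0008, 0x0010, gdt, {})))  # 0x123466
```

In this model a segment register holds only the selector. The base and limit come from the table entry, which is the indirection the text describes.
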
The [[Intel 80386]] extended offsets and also the segment limit field in each segment descriptor to 32 bits, enabling a segment to span the entire memory space. It also introduced support in protected mode for [[paging]], a mechanism making it possible to use paged [[virtual memory]] (with 4 KB page size). Paging allows the CPU to map any page of the virtual memory space to any page of the physical memory space. To do this, it uses additional mapping tables in memory called page tables. Protected mode on the 80386 can operate with paging either enabled or disabled; the segmentation mechanism is always active and generates virtual addresses that are then mapped by the paging mechanism if it is enabled. The segmentation mechanism can also be effectively disabled by setting all segments to have a base address of 0 and size limit equal to the whole address space; this also requires a minimally-sized segment descriptor table of only four descriptors (since the FS and GS segments need not be used).{{Efn|An extra descriptor record at the top of the table is also required, because the table starts at zero but the minimum descriptor index that can be loaded into a segment register is 1; the value 0 is reserved to represent a segment register that points to no segment.}} Paging is used extensively by modern multitasking operating systems. [[Linux]], [[386BSD]] and [[Windows NT]] were developed for the 386 because it was the first Intel architecture CPU to support paging and 32-bit segment offsets. The 386 architecture became the basis of all further development in the x86 series. x86 processors that support protected mode boot into [[real mode]] for backward compatibility with the older 8086 class of processors. Upon power-on (a.k.a. [[booting]]), the processor initializes in real mode, and then begins executing instructions. 
Operating system boot code, which might be stored in [[read-only memory]], may place the processor into the [[protected mode]] to enable paging and other features. On the other hand, segment arithmetic, a common practice in real mode code, is not allowed in protected mode.

====Virtual 8086 mode====
{{Main|Virtual 8086 mode}}
There is also a sub-mode of operation in 32-bit protected mode (a.k.a. 80386 protected mode) called ''[[virtual 8086 mode]]'', also known as ''V86 mode''. This is a hybrid operating mode that allows real mode programs and operating systems to run under the control of a protected mode supervisor operating system, which makes it possible to run protected mode programs and real mode programs simultaneously. This mode is available only in the 32-bit version of protected mode; it does not exist in the 16-bit version of protected mode, or in long mode.

===Long mode===
{{Main|Long mode}}
In the mid-1990s, it was obvious that the 32-bit address space of the x86 architecture was limiting its performance in applications requiring large data sets. A 32-bit address space allows the processor to directly address only 4 GB of data, a size surpassed by applications such as [[Video editing software|video processing]] and [[database engine]]s. Using 64-bit addresses, it is possible to directly address 16 [[Exbibyte|EiB]] of data, although most 64-bit architectures do not support access to the full 64-bit address space; for example, AMD64 as originally implemented uses only 48 bits of a 64-bit address, translated through four levels of paging. In 1999, [[AMD]] published a (nearly) complete specification for a [[64-bit computing|64-bit]] extension of the x86 architecture, which it called ''x86-64'', and stated its intention to produce processors implementing it. That design is currently used in almost all x86 processors, with some exceptions intended for [[embedded system]]s. 
Mass-produced ''x86-64'' chips for the general market became available four years later, in 2003, once working prototypes had been tested and refined; at about the same time, the initial name ''x86-64'' was changed to ''AMD64''. The success of the AMD64 line of processors, coupled with the lukewarm reception of the IA-64 architecture, forced Intel to release its own implementation of the AMD64 instruction set. Intel had previously implemented support for AMD64<ref>{{Cite web |url=http://www.geek.com/intels-yamhill-technology-x86-64-compatible/ |title=Intel's Yamhill Technology: x86-64 compatible {{!}}Geek.com<!-- Bot generated title --> |access-date=July 18, 2008 |archive-date=September 5, 2012 |archive-url=https://archive.today/20120905073732/http://www.geek.com/intels-yamhill-technology-x86-64-compatible/ |url-status=dead }}</ref> but opted not to enable it in hopes that AMD would not bring AMD64 to market before Itanium's new IA-64 instruction set was widely adopted. It branded its implementation of AMD64 as ''EM64T'', and later rebranded it ''Intel 64''. In their literature and product version names, Microsoft and Sun refer to AMD64/Intel 64 collectively as ''x64'' in the Windows and [[Solaris (operating system)|Solaris]] operating systems. [[Linux distribution]]s refer to it as "x86-64", its variant "x86_64", or "amd64". [[Berkeley Software Distribution|BSD]] systems use "amd64" while [[macOS]] uses "x86_64". Long mode is mostly an extension of the 32-bit instruction set, but unlike the 16–to–32-bit transition, many instructions were dropped in the 64-bit mode. This does not affect actual binary backward compatibility (legacy code executes in other modes that retain support for those instructions), but it changes the way assemblers and compilers for new code have to work. This was the first time that a major extension of the x86 architecture was initiated and originated by a manufacturer other than Intel. 
It was also the first time that Intel accepted technology of this nature from an outside source. ==Extensions== ===Floating-point unit=== {{Main|x87}} {{Further|Floating-point unit}} [[File:Intel chips 386 387.jpg|thumb|An Intel 386 with the 387 co-processor]] Early x86 processors could be extended with [[floating-point]] hardware in the form of a series of floating-point [[numerical analysis|numerical]] [[co-processor]]s with names like [[Intel 8087|8087]], 80287 and 80387, abbreviated x87. This was also known as the NPX (''Numeric Processor eXtension''), an apt name since the coprocessors, while used mainly for floating-point calculations, also performed integer operations on both binary and decimal formats. With very few exceptions, the 80486 and subsequent x86 processors then integrated this x87 functionality on chip which made the x87 instructions a [[de facto]] integral part of the x86 instruction set. Each x87 register, known as ST(0) through ST(7), is 80 bits wide and stores numbers in the [[IEEE floating-point standard]] double extended precision format. These registers are organized as a stack with ST(0) as the top. This was done in order to conserve opcode space, and the registers are therefore randomly accessible only for either operand in a register-to-register instruction; ST0 must always be one of the two operands, either the source or the destination, regardless of whether the other operand is ST(x) or a memory operand. However, random access to the stack registers can be obtained through an instruction which exchanges any specified ST(x) with ST(0). The operations include arithmetic and transcendental functions, including trigonometric and exponential functions, and instructions that load common constants (such as 0; 1; e, the base of the natural logarithm; log2(10); and log10(2)) into one of the stack registers. 
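
The stack organisation of the x87 registers can be illustrated with a toy Python model. The class <code>X87Stack</code> is hypothetical and greatly simplified: it tracks only register values and a top-of-stack index, uses ordinary Python floats instead of the 80-bit format, and omits the tag word, stack overflow detection and exceptions. Its method names follow the x87 mnemonics for load (FLD), store-and-pop (FSTP), exchange (FXCH) and add (FADD).

```python
class X87Stack:
    """Toy model of the x87 register stack (values only, no tags or exceptions)."""

    def __init__(self):
        self.regs = [0.0] * 8  # eight physical registers
        self.top = 0           # physical index currently named ST(0)

    def _phys(self, i: int) -> int:
        if not 0 <= i <= 7:
            raise IndexError("ST(i) requires 0 <= i <= 7")
        return (self.top + i) % 8  # ST(i) is relative to the top of the stack

    def st(self, i: int) -> float:
        return self.regs[self._phys(i)]

    def fld(self, value: float) -> None:
        """Push: the new value becomes ST(0) and the old ST(0) becomes ST(1)."""
        self.top = (self.top - 1) % 8
        self.regs[self.top] = float(value)

    def fstp(self) -> float:
        """Pop ST(0) and return it."""
        value = self.regs[self.top]
        self.top = (self.top + 1) % 8
        return value

    def fxch(self, i: int) -> None:
        """Exchange ST(0) with ST(i), giving random access to any register."""
        a, b = self._phys(0), self._phys(i)
        self.regs[a], self.regs[b] = self.regs[b], self.regs[a]

    def fadd(self, i: int, dest_is_st0: bool = True) -> None:
        """ST(0) is always one operand; the result goes to ST(0) or to ST(i)."""
        total = self.st(0) + self.st(i)
        self.regs[self._phys(0 if dest_is_st0 else i)] = total


fpu = X87Stack()
for v in (1.0, 2.0, 3.0):
    fpu.fld(v)
fpu.fxch(2)   # bring the value that was in ST(2) to the top
fpu.fadd(1)   # ST(0) = ST(0) + ST(1)
print(fpu.st(0), fpu.st(1), fpu.st(2))
```

Every arithmetic method takes a single register index because ST(0) is implicit, which mirrors the encoding constraint described above.
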
While its integer ability is often overlooked, the x87 can operate on larger integers with a single instruction than the 8086, 80286, 80386, or any x86 CPU without 64-bit extensions can. Repeated integer calculations, even on small values (e.g., 16-bit), can also be accelerated by executing integer instructions on the x86 CPU and the x87 in parallel. (The x86 CPU keeps running while the x87 coprocessor calculates, and the x87 signals the x86 when it is finished or interrupts the x86 if it needs attention because of an error.)

===MMX===
{{Main|MMX (instruction set)}}
MMX is a [[Single instruction, multiple data|SIMD]] instruction set designed by Intel and introduced in 1997 for the [[Pentium MMX]] microprocessor.<ref name="intel">{{cite web |title=Programming With the Intel MMX™ Technology |url=http://www.intel.com/design/intarch/techinfo/pentium/mmxprog.htm |website=Embedded Pentium® Processor Family Technical Information Center |publisher=Intel |access-date=5 June 2022 |archive-url=https://web.archive.org/web/20030725092803/http://www.intel.com/design/intarch/techinfo/pentium/mmxprog.htm |archive-date=25 July 2003 |url-status=dead}}</ref> The MMX instruction set was developed from a similar concept first used on the [[Intel i860]]. It is supported on most subsequent IA-32 processors by Intel and other vendors. MMX is typically used for video processing (in multimedia applications, for instance).<ref>{{cite journal |last1=Krishnaprasad |first1=S. |title=SIMD programming illustrated using Intel's MMX instruction set |journal=Journal of Computing Sciences in Colleges |date=1 January 2004 |volume=19 |issue=3 |pages=268–277 |url=https://dl.acm.org/doi/10.5555/948835.948862 |issn=1937-4771}}</ref> MMX added 8 new registers to the architecture, known as MM0 through MM7 (henceforth referred to as ''MMn''). In reality, these new registers were just aliases for the existing x87 FPU stack registers. 
Hence, anything that was done to the floating-point stack would also affect the MMX registers. Unlike the FP stack, these MMn registers were fixed, not relative, and therefore they were randomly accessible. The instruction set did not adopt the stack-like semantics so that existing operating systems could still correctly save and restore the register state when multitasking without modifications.<ref name="intel" /> Each MMn register holds a 64-bit integer. However, one of the main concepts of the MMX instruction set is that of ''packed data types'': instead of using the whole register for a single 64-bit integer ([[quadword]]), one may use it to contain two 32-bit integers ([[Integer (computer science)|doubleword]]), four 16-bit integers ([[Integer (computer science)|word]]) or eight 8-bit integers ([[Integer (computer science)|byte]]). Given that the 64-bit MMn registers are aliased to the FPU stack and each floating-point register is 80 bits wide, the upper 16 bits of the floating-point registers are unused in MMX. These bits are set to all ones by any MMX instruction, which corresponds to the floating-point representation of [[NaN]]s or infinities.<ref name="intel" />

===3DNow!===
{{Main|3DNow!}}
In 1997, AMD introduced 3DNow!.<ref>{{cite news |last1=Sexton |first1=Michael Justin Allen |title=The History Of AMD CPUs |url=https://www.tomshardware.com/picturestory/713-amd-cpu-history.html |access-date=5 June 2022 |work=Tom's Hardware |date=21 April 2017 |language=en}}</ref> The technology's introduction coincided with the rise of [[3D computer graphics|3D]] entertainment applications, and it was designed to improve the CPU's [[vector processing]] performance in graphics-intensive applications. 3D video game developers and 3D graphics hardware vendors use 3DNow! 
to enhance their performance on AMD's [[AMD K6|K6]] and [[Athlon]] series of processors.<ref>{{cite news |last1=Shimpi |first1=Anand Lal |title=AMD's K6-2 350: Something to do... |url=https://www.anandtech.com/show/161/2 |access-date=5 June 2022 |work=AnandTech |date=29 October 1998}}</ref> 3DNow! was designed to be the natural evolution of MMX from integers to floating point. As such, it uses exactly the same register naming convention as MMX, that is MM0 through MM7.<ref>{{cite web |title=Intel's MMX and AMD's 3DNow! SIMD Operations |url=https://web.mit.edu/rhel-doc/3/rhel-as-en-3/i386-simd.html |website=web.mit.edu |access-date=5 June 2022}}</ref> The only difference is that instead of packing integers into these registers, two [[single-precision floating-point format|single-precision floating-point]] numbers are packed into each register. The advantage of aliasing the FPU registers is that the same instruction and data structures used to save the state of the FPU registers can also be used to save 3DNow! register states. Thus no special modifications are required to be made to operating systems which would otherwise not know about them.<ref>{{cite web |title=3DNow!™ Technology Manual |url=https://www.amd.com/system/files/TechDocs/21928.pdf |publisher=Advanced Micro Devices |access-date=5 June 2022}}</ref> ==={{vanchor|SSE}} and AVX=== {{Main|Streaming SIMD Extensions|SSE2|SSE3|SSSE3|SSE4|SSE5}} In 1999, Intel introduced the Streaming SIMD Extensions (SSE) [[instruction set]], following in 2000 with SSE2. The first addition allowed offloading of basic floating-point operations from the x87 stack and the second made MMX almost obsolete and allowed the instructions to be realistically targeted by conventional compilers. 
Introduced in 2004 along with the [[Intel Prescott|''Prescott'']] revision of the [[Pentium 4]] processor, SSE3 added specific memory and [[Thread (computing)|thread]]-handling instructions to boost the performance of Intel's [[HyperThreading]] technology. AMD licensed the SSE3 instruction set and implemented most of the SSE3 instructions for its revision E and later Athlon 64 processors. The Athlon 64 does not support HyperThreading and lacks those SSE3 instructions used only for HyperThreading.<ref name="tomshardware">{{cite news |title=Upgrading And Repairing PCs 21st Edition: Processor Features |url=https://www.tomshardware.com/reviews/processors-cpu-apu-features-upgrade,3569-3.html |access-date=5 June 2022 |work=Tom's Hardware |date=31 October 2013 |language=en}}</ref> SSE discarded all legacy connections to the FPU stack. This also meant that this instruction set discarded all legacy connections to previous generations of SIMD instruction sets like MMX. But it freed the designers up, allowing them to use larger registers, not limited by the size of the FPU registers. The designers created eight 128-bit registers, named XMM0 through XMM7. (In [[x86-64|AMD64]], the number of SSE XMM registers has been increased from 8 to 16.) However, the downside was that operating systems had to have an awareness of this new set of instructions in order to be able to save their register states. So Intel created a slightly modified version of Protected mode, called Enhanced mode which enables the usage of SSE instructions, whereas they stay disabled in regular Protected mode. An OS that is aware of SSE will activate Enhanced mode, whereas an unaware OS will only enter into traditional Protected mode. SSE is a SIMD instruction set that works only on floating-point values, like 3DNow!. However, unlike 3DNow! it severs all legacy connection to the FPU stack. 
Because it has larger registers than 3DNow!, SSE can pack twice the number of [[single precision]] floats into its registers. The original SSE was limited to single-precision numbers, like 3DNow!. SSE2 introduced the capability to pack [[double precision]] numbers too, which 3DNow! could not do, since a double-precision number is 64 bits wide and would fill an entire 3DNow! MMn register. At 128 bits, the SSE XMMn registers can pack two double-precision floats into one register. Thus SSE2 is much more suitable for scientific calculations than either SSE1 or 3DNow!, which were limited to single precision. SSE3 does not introduce any additional registers.<ref name="tomshardware" />

{{main|Advanced Vector Extensions|AVX-512}}
The Advanced Vector Extensions (AVX) doubled the size of the SSE registers to 256-bit YMM registers. It also introduced the VEX coding scheme to accommodate the larger registers, plus a few instructions to permute elements. AVX2 did not introduce extra registers, but was notable for the addition of masking, [[Gather-scatter (vector addressing)|gather]], and shuffle instructions. AVX-512 features yet another expansion, to 32 512-bit ZMM registers, and a new EVEX scheme. Unlike its predecessors, each of which was a monolithic extension, it is divided into many subsets that specific models of CPUs can choose to implement. 
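
The relationship between register width and the number of packed elements can be sketched in Python. This is an emulation for illustration only and does not use any SIMD hardware: the helper names <code>lanes</code> and <code>packed_add_f32</code> are hypothetical, and a register is modelled as a little-endian byte string.

```python
import struct


def lanes(register_bits: int, element_bits: int) -> int:
    """Number of elements of a given width that fit in one register."""
    return register_bits // element_bits


def packed_add_f32(a: bytes, b: bytes) -> bytes:
    """Lane-wise addition of single-precision floats packed in two registers."""
    if len(a) != len(b) or len(a) % 4:
        raise ValueError("registers must have equal size, a multiple of 4 bytes")
    fmt = f"<{len(a) // 4}f"
    xs, ys = struct.unpack(fmt, a), struct.unpack(fmt, b)
    return struct.pack(fmt, *(x + y for x, y in zip(xs, ys)))


print(lanes(64, 32))    # 2 single-precision floats in a 64-bit MMn register
print(lanes(128, 32))   # 4 single-precision floats in a 128-bit XMM register
print(lanes(128, 64))   # 2 double-precision floats in a 128-bit XMM register
print(lanes(512, 64))   # 8 double-precision floats in a 512-bit ZMM register

xmm0 = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
xmm1 = struct.pack("<4f", 0.5, 0.5, 0.5, 0.5)
print(struct.unpack("<4f", packed_add_f32(xmm0, xmm1)))  # (1.5, 2.5, 3.5, 4.5)
```

A single call to <code>packed_add_f32</code> stands in for one packed instruction operating on all lanes at once.
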
===Physical Address Extension (PAE)===
{{Main|Physical Address Extension}}
[[Physical Address Extension]] or PAE was first implemented in the Intel [[Pentium Pro]], and later by [[AMD]] in the Athlon processors,<ref name="Athlon PAE">{{cite book|chapter-url=http://pdf.datasheetcatalog.com/datasheet/AdvancedMicroDevices/mXvyvs.pdf|access-date=2017-04-13|author=AMD, Inc.|title=AMD Athlon™ Processor x86 Code Optimization Guide|chapter=Appendix E|page=250|date=February 2002|edition=Revision K|quote=A 2-bit index consisting of PCD and PWT bits of the page table entry is used to select one of four PAT register fields when PAE (page address extensions) is enabled, or when the PDE doesn’t describe a large page.|archive-date=April 13, 2017|archive-url=https://web.archive.org/web/20170413235648/http://pdf.datasheetcatalog.com/datasheet/AdvancedMicroDevices/mXvyvs.pdf|url-status=live}}</ref> to allow up to 64 GB of RAM to be addressed. Without PAE, physical RAM in 32-bit protected mode is usually limited to 4 [[gigabyte|GB]]. PAE defines a different page table structure with wider page table entries and a third level of page table, allowing additional bits of physical address. Although the initial implementations on 32-bit processors theoretically supported up to 64 GB of RAM, chipset and other platform limitations often restricted what could actually be used. [[x86-64]] processors define page table structures that theoretically allow up to 52 bits of physical address, although again, chipset and other platform concerns (like the number of DIMM slots available, and the maximum RAM possible per DIMM) prevent such a large physical address space from being realized. On x86-64 processors, PAE mode must be active before the switch to [[long mode]], and must remain active while [[long mode]] is active, so while in long mode there is no "non-PAE" mode. PAE mode does not affect the width of linear or virtual addresses. 
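
The effect of the third page table level on a 32-bit linear address can be sketched in Python. The function names <code>split_legacy</code> and <code>split_pae</code> are hypothetical, and only 4 KB pages are modelled. The bit positions are assumptions taken from the commonly documented IA-32 layouts: 10 + 10 + 12 bits without PAE, and 2 + 9 + 9 + 12 bits with PAE.

```python
def split_legacy(linear: int) -> tuple:
    """Non-PAE 4 KB paging: (directory index, table index, page offset)."""
    return (linear >> 22) & 0x3FF, (linear >> 12) & 0x3FF, linear & 0xFFF


def split_pae(linear: int) -> tuple:
    """PAE 4 KB paging: (pointer-table index, directory index, table index, page offset)."""
    return (
        (linear >> 30) & 0x3,    # third level added by PAE: 4 entries
        (linear >> 21) & 0x1FF,  # 512 entries, because entries are twice as wide
        (linear >> 12) & 0x1FF,  # 512 entries
        linear & 0xFFF,          # byte offset within the 4 KB page
    )


address = 0xC0201003
print([hex(field) for field in split_legacy(address)])  # ['0x300', '0x201', '0x3']
print([hex(field) for field in split_pae(address)])     # ['0x3', '0x1', '0x1', '0x3']

# 36 physical address bits correspond to the 64 GB figure quoted above.
print(2**36 == 64 * 2**30)  # True
```

Both functions consume the same 32 bits, consistent with the statement that PAE does not change the width of linear addresses; only the width of the physical addresses stored in the table entries grows.
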
===x86-64=== {{More citations needed section|date=March 2016}} {{Main|x86-64}} [[File:Processor families in TOP500 supercomputers.svg|thumb|upright=1.25|In [[supercomputer]] [[computer cluster|cluster]]s (as tracked by [[TOP 500]] data and visualized on the diagram above, last updated 2013), the appearance of 64-bit extensions for the x86 architecture enabled 64-bit x86 processors by AMD and Intel (teal hatched and blue hatched, in the diagram, respectively) to replace most RISC processor architectures previously used in such systems (including [[PA-RISC]], [[SPARC]], [[DEC Alpha|Alpha]], and others), and 32-bit x86 (green on the diagram), even though Intel initially tried unsuccessfully to replace x86 with a new incompatible 64-bit architecture in the [[Itanium]] processor. The main non-x86 architecture which is still used, as of 2014, in supercomputing clusters is the [[Power ISA]] used by [[IBM Power microprocessors]] (blue with diamond tiling in the diagram), with SPARC as a distant second.]] By the 2000s, 32-bit x86 processors' limits in memory addressing were an obstacle to their use in high-performance computing clusters and powerful desktop workstations. The aged 32-bit x86 was competing with much more advanced 64-bit RISC architectures which could address much more memory. Intel and the whole x86 ecosystem needed 64-bit memory addressing if x86 was to survive the 64-bit computing era, as workstation and desktop software applications were soon to start hitting the limits of 32-bit memory addressing. However, Intel felt that it was the right time to make a bold step and use the transition to 64-bit desktop computers for a transition away from the x86 architecture in general, an experiment which ultimately failed. 
In 2001, Intel attempted to introduce a non-x86 64-bit architecture named [[IA-64]] in its [[Itanium]] processor, initially aiming for the [[high-performance computing]] market, hoping that it would eventually replace the 32-bit x86.<ref>{{cite web |url = http://features.techworld.com/operating-systems/2690/will-intel-abandon-the-itanium/ |title = Will Intel abandon the Itanium? |date = July 20, 2006 |author = Manek Dubash |quote = Once touted by Intel as a replacement for the x86 product line, expectations for Itanium have been throttled well back. |publisher = [[Techworld]] |access-date = December 19, 2010 |archive-date = February 19, 2011 |archive-url = https://web.archive.org/web/20110219212053/http://features.techworld.com/operating-systems/2690/will-intel-abandon-the-itanium/ |url-status = dead }}</ref> Although IA-64 was incompatible with x86, the Itanium processor did provide [[Emulator|emulation]] abilities for translating x86 instructions into IA-64. This affected the performance of x86 programs so badly that it was rarely, if ever, useful to users: programmers had to rewrite x86 programs for the IA-64 architecture, or their performance on Itanium would be orders of magnitude worse than on a true x86 processor. The market rejected the Itanium processor since it broke [[backward compatibility]] and preferred to continue using x86 chips, and very few programs were rewritten for IA-64. AMD decided to take another path toward 64-bit memory addressing, making sure backward compatibility would not suffer. In April 2003, AMD released the first x86 processor with 64-bit general-purpose registers, the [[Opteron]], capable of addressing much more than 4 [[Gigabyte|GB]] of virtual memory using the new [[x86-64]] extension (also known as AMD64 or x64). 
The 64-bit extensions to the x86 architecture were enabled only in the newly introduced [[long mode]], therefore 32-bit and 16-bit applications and operating systems could simply continue using an AMD64 processor in protected or other modes, without even the slightest sacrifice of performance<ref name="x86-compat-perf">{{cite web |url=https://public.dhe.ibm.com/software/webserver/appserv/was/64bitPerf.pdf |title=IBM WebSphere Application Server 64-bit Performance Demystified |page=14 |quote=Figures 5, 6 and 7 also show the 32-bit version of WAS runs applications at full native hardware performance on the POWER and x86-64 platforms. Unlike some 64-bit processor architectures, the POWER and x86-64 hardware does not emulate 32-bit mode. Therefore applications that do not benefit from 64-bit features can run with full performance on the 32-bit version of WebSphere running on the above mentioned 64-bit platforms. |publisher=IBM Corporation |date=September 6, 2007 |access-date=April 9, 2010 |archive-date=January 25, 2022 |archive-url=https://web.archive.org/web/20220125121650/ftp://ftp.software.ibm.com/software/webserver/appserv/was/64bitPerf.pdf |url-status=live }}</ref> and with full compatibility back to the original instructions of the 16-bit Intel 8086.<ref name="amd-24593">{{cite web |url=https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24593.pdf |title=Volume 2: System Programming |date=March 2024 |work=AMD64 Architecture Programmer's Manual |publisher=AMD Corporation |access-date=April 24, 2024 |archive-date=April 4, 2024 |archive-url=https://web.archive.org/web/20240404110900/https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24593.pdf |url-status=live }}</ref>{{rp|page=13–14|date=November 2012}} The market responded positively, adopting the 64-bit AMD processors for both high-performance applications and business or home computers. 
Seeing the market rejecting the incompatible Itanium processor and Microsoft supporting AMD64, Intel had to respond and introduced its own x86-64 processor, the ''[[Pentium 4#Prescott|Prescott]]'' Pentium 4, in July 2004.<ref>{{cite news |author= Charlie Demerjian |title=Why Intel's Prescott will use AMD64 extensions |url=http://www.theinquirer.net/inquirer/news/1029651/why-intels-prescott-will-use-amd64--extensions |archive-url=https://web.archive.org/web/20091010181925/http://www.theinquirer.net/inquirer/news/1029651/why-intels-prescott-will-use-amd64--extensions |url-status=dead |archive-date=October 10, 2009 |work=[[The Inquirer]] |date=September 26, 2003 |access-date=October 7, 2009 }}</ref> As a result, the Itanium processor with its IA-64 instruction set is rarely used and x86, through its x86-64 incarnation, is still the dominant CPU architecture in non-embedded computers. x86-64 also introduced the [[NX bit]], which offers some protection against security bugs caused by [[buffer overrun]]s. As a result of AMD's 64-bit contribution to the x86 lineage and its subsequent acceptance by Intel, the 64-bit RISC architectures ceased to be a threat to the x86 ecosystem and almost disappeared from the workstation market. x86-64 began to be utilized in powerful [[supercomputer]]s (in its [[AMD Opteron]] and [[Intel Xeon]] incarnations), a market which was previously the natural habitat for 64-bit RISC designs (such as the [[IBM Power microprocessors]] or [[SPARC]] processors). 
The move to 64-bit computing, combined with continued backward compatibility with 32-bit and 16-bit software, has made the x86 architecture an extremely flexible platform. x86 chips are used in small low-power systems (for example, [[Intel Quark]] and [[Intel Atom]]) and in fast gaming desktop computers (for example, [[Intel Core i7]] and [[AMD FX]]/[[Ryzen]]), and they even dominate large supercomputing [[computer cluster|cluster]]s, effectively leaving only the [[ARM architecture|ARM]] 32-bit and 64-bit RISC architecture as a competitor in the [[smartphone]] and [[tablet computer|tablet]] market.

===Virtualization===
{{Main|x86 virtualization}}
Prior to 2005, x86 architecture processors were unable to meet the [[Popek and Goldberg virtualization requirements|Popek and Goldberg requirements]] – a specification for virtualization created in 1974 by [[Gerald J. Popek]] and [[Robert P. Goldberg]]. However, both proprietary and open-source [[x86 virtualization]] hypervisor products were developed using [[Shadow page tables|software-based virtualization]]. Proprietary systems include [[Hyper-V]], [[Parallels Workstation]], [[VMware ESX]], [[VMware Workstation]], [[VMware Workstation Player]] and [[Windows Virtual PC]], while [[free and open-source]] systems include [[QEMU]], [[Kernel-based Virtual Machine]], [[VirtualBox]], and [[Xen]].
The introduction of the AMD-V and Intel VT-x instruction sets in 2005 allowed x86 processors to meet the Popek and Goldberg virtualization requirements.<ref>{{cite conference |last1=Adams |first1=Keith |last2=Agesen |first2=Ole |title=A Comparison of Software and Hardware Techniques for x86 Virtualization |conference=Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, CA, USA, 2006 |date=October 21–25, 2006 |url=http://www.vmware.com/pdf/asplos235_adams.pdf |id=ACM 1-59593-451-0/06/0010 |access-date=December 22, 2006 |archive-date=August 20, 2010 |archive-url=https://web.archive.org/web/20100820201944/http://www.vmware.com/pdf/asplos235_adams.pdf |url-status=live }}</ref>

===AES===
{{Main|AES instruction set}}

=== APX (Advanced Performance Extensions) ===
APX (Advanced Performance Extensions) is a set of extensions that doubles the number of general-purpose registers from 16 to 32 and adds new features to improve general-purpose performance.<ref>{{Cite web |last1=Winkel |first1=Sebastian |last2=Agron |first2=Jason |title=Advanced Performance Extensions (APX) |url=https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html |access-date=2023-10-22 |website=[[Intel]] |language=en}}</ref><ref>{{cite web |last1=Robinson |first1=Dan |title=Intel adds fresh x86 and vector instructions for future chips |url=https://www.theregister.com/2023/07/26/intel_x86_vector_instructions/ |website=The Register |access-date=22 October 2023}}</ref><ref>{{cite web |last1=Bonshor |first1=Gavin |title=Intel Unveils AVX10 and APX Instruction Sets: Unifying AVX-512 For Hybrid Architectures |url=https://www.anandtech.com/show/18975/intel-unveils-avx10-and-apx-isas-unifying-avx512-for-hybrid-architectures- |website=AnandTech |access-date=22 October 2023}}</ref><ref>{{cite web |last1=Alcorn |first1=Paul |title=Intel's New AVX10 Brings AVX-512 Capabilities to E-Cores 
|url=https://www.tomshardware.com/news/intels-new-avx10-brings-avx-512-capabilities-to-e-cores |website=Tom's Hardware |date=July 24, 2023 |access-date=22 October 2023}}</ref> These extensions have been called "generational"<ref>{{cite web |last1=Shah |first1=Agam |title=Intel's Generational On-Chip Change APX Will Make All the Apps Faster |url=https://thenewstack.io/intels-generational-on-chip-change-apx-will-make-all-the-apps-faster/ |website=The New Stack |date=August 9, 2023 |access-date=22 October 2023}}</ref> and "the biggest x86 addition since 64 bits".<ref>{{cite web |last1=Byrne |first1=Joseph |title=APX is Biggest x86 Addition Since 64 Bits |url=https://www.techinsights.com/blog/apx-biggest-x86-addition-64-bits |website=Tech Insights}}</ref> Intel contributed APX support to [[GNU Compiler Collection]] (GCC) 14.<ref>{{cite web |last1=Larabel |first1=Michael |title=Intel APX Code Begins Landing Within The GCC Compiler |url=https://www.phoronix.com/news/GCC-Intel-APX-Starts-Landing |website=Phoronix |access-date=22 October 2023}}</ref>

According to the architecture specification,<ref>{{Cite web |date=2023-07-21 |title=Intel® Advanced Performance Extensions (Intel® APX) Architecture Specification |url=https://www.intel.com/content/www/us/en/content-details/784266/intel-advanced-performance-extensions-intel-apx-architecture-specification.html |access-date=2023-10-22 |website=Intel}}</ref> the main features of APX are:
* 16 additional general-purpose registers, called the Extended GPRs (EGPRs)
* Three-operand instruction formats for many integer instructions
* New conditional instructions for loads, stores, and comparisons, together with an option to suppress status-flag writes in common instructions
* Optimized register save/restore operations
* A 64-bit absolute direct jump instruction

Extended GPRs for general-purpose instructions are encoded using a 2-byte [[REX prefix|REX2]] prefix, while new instructions and extended operands for existing [[Advanced Vector Extensions|AVX]]/[[AVX2]]/[[AVX-512]] instructions are encoded with an [[EVEX prefix#Extended EVEX prefix|extended EVEX]] prefix, which has four variants used for different groups of instructions.

==See also==
{{Div col|colwidth=25em}}
* [[x86 calling conventions]]
* [[x86 instruction listings]]
* [[CPUID]]
* [[680x0]], a competing architecture in the 16-bit and early 32-bit eras
* [[PowerPC]], a competing architecture in the later 32-bit and 64-bit eras
* [[List of AMD processors]]
* [[List of Intel processors]]
* [[List of Intel CPU microarchitectures]]
* [[List of VIA microprocessor cores]]
* [[List of x86 manufacturers]]
* [[Interrupt request]]
* [[Speculative execution CPU vulnerabilities]]
* [[Tick–tock model]]
* [[Virtual legacy wires]]
{{Div col end}}

==Notes==
{{Notelist|30em}}

==References==
{{Reflist|30em}}

==Further reading==
* {{cite journal |last1=Rosenblum |first1=Mendel |last2=Garfinkel |first2=Tal |title=Virtual machine monitors: current technology and future trends |pages=39–47 |journal=IEEE Computer |volume=38 |issue=5 |date=May 2005 |doi=10.1109/MC.2005.176 |citeseerx=10.1.1.614.9870|s2cid=10385623 }}

==External links==
{{Commons category|X86 architecture}}
{{Wikibooks|X86 Assembly/X86 Architecture}}
* [https://www.computerworld.com/article/2827767/why-intel-can-t-seem-to-retire-the-x86.html Why Intel can't seem to retire the x86]
* [http://www.felixcloutier.com/x86/ 32/64-bit x86 Instruction Reference]
* [https://software.intel.com/sites/landingpage/IntrinsicsGuide/ Intel Intrinsics Guide], an interactive reference tool for Intel intrinsic instructions
* [https://software.intel.com/en-us/articles/intel-sdm#combined Intel® 64 and IA-32 Architectures Software Developer's Manuals]
* [https://developer.amd.com/resources/developer-guides-manuals/#amd64_architecture AMD Developer Guides, Manuals & ISA Documents, AMD64 Architecture]

{{X86 assembly topics}}
{{Multimedia extensions}}
{{Intel}}
{{Processor technologies}}
{{Authority control}}
[[Category:Computer-related introductions in 1978]] [[Category:Intel products]] [[Category:IBM PC compatibles]] [[Category:X86 architecture| ]]