Editing Register allocation (section)

== Context ==
{{See also|Compiler|Interpreter (computing)}}

=== Principle ===
{{See also|Computer data storage|Memory hierarchy}}
<!-- please change this to wiki table and add numbers for 32-bit versions of SPARC and MIPS -->

<!-- {| class="wikitable" style="text-align:center; width:30%; float: right; margin=1em; border-spacing: 2px" -->
 {| class="wikitable floatright"
 |+ Different number of general-purpose registers in the most common architectures
 |-
 ! Architecture
 ! scope="col" | 32 bit
 ! scope="col" | 64 bit
 |-
 ! scope="row" | ARM
 | 15
 | 31
 |-
 ! scope="row" | Intel x86
 | 8
 | 16
 |-
 ! scope="row" | MIPS
 | 32
 | 32
 |-
 ! scope="row" | POWER/PowerPC
 | 32
 | 32
 |-
 ! scope="row" | RISC-V
 | 16/32
 | 32
 |-
 ! scope="row" | SPARC
 | 31
 | 31
 |-
 |}

<!-- I assume the lack of a 32-bit entry for the "SPARC" row in the table indicates that the architecture in question only supports 64-bit -->

In many [[programming language]]s, the programmer may use any number of [[Variable (computer science)|variables]]. The computer can quickly read and write [[Processor register|registers]] in the [[Central processing unit|CPU]], so the [[computer program]] runs faster when more variables can be in the CPU's registers.{{sfn|Ditzel|McLellan|1982|p=48}} Also, sometimes code accessing registers is more compact, so the code is smaller, and can be fetched faster if it uses registers rather than memory. However, the number of registers is limited. Therefore, when the [[Compiler (computing)|compiler]] is translating code to machine-language, it must decide how to allocate variables to the limited number of registers in the CPU.{{sfn|Runeson|Nyström|2003|p=242}}{{sfn|Eisl|Grimmer|Simon|Würthinger|2016|p=14:1}}

Not all variables are [[liveness analysis|in use]] (or "live") at the same time, so, over the lifetime of a program, a given register may be used to hold different variables. However, two variables in use at the ''same'' time cannot be assigned to the same register without corrupting one of the variables. If there are not enough registers to hold all the variables, some variables may be moved to and from [[random access memory|RAM]]. This process is called "spilling" the registers.{{sfn|Chaitin|Auslander|Chandra|Cocke|1981|p=47}} Over the lifetime of a program, a variable can be both spilled and stored in registers: this variable is then considered as "split".{{sfn|Eisl|Grimmer|Simon|Würthinger|2016|p=1}} Accessing RAM is significantly slower than accessing registers <ref>{{cite web |url=https://blog.morizyun.com/computer-science/basic-latency-comparison-numbers.html |title=Latency Comparison Numbers in computer/network |publisher=blog.morizyun.com |date=2018-01-06 |access-date=8 January 2019}}</ref> and so a compiled program runs slower. Therefore, an optimizing compiler aims to assign as many variables to registers as possible. A high "[[Register pressure]]" is a technical term that means that more spills and reloads are needed; it is defined by Braun et al. as "the number of simultaneously live variables at an instruction".{{sfn|Braun|Hack|2009|p=174}}

In addition, some computer designs [[Cache (computing)|cache]] frequently-accessed registers. So, programs can be further optimized by assigning the same register to a source and destination of a <code>move</code> instruction whenever possible. This is especially important if the compiler is using an [[intermediate representation]] such as [[Static single assignment form|static single-assignment form]] (SSA). In particular, when SSA is not fully optimized it can artificially generate additional <code>move</code> instructions.

=== Components of register allocation ===
Register allocation consists therefore of choosing where to store the variables at runtime, i.e. inside or outside registers. If the variable is to be stored in registers, then the allocator needs to determine in which register(s) this variable will be stored. Eventually, another challenge is to determine the duration for which a variable should stay at the same location.

A register allocator, disregarding the chosen allocation strategy, can rely on a set of core actions to address these challenges. These actions can be gathered in several different categories:{{sfn|Koes|Goldstein|2009|p=21}}
;Move insertion: This action consists of increasing the number of move instructions between registers, i.e. make a variable live in different registers during its lifetime, instead of one. This occurs in the split live range approach. 
;Spilling: This action consists of storing a variable into memory instead of registers.{{sfn|Bouchez|Darte|Rastello|2007b|p=103}}
;Assignment: This action consists of assigning a register to a variable.{{sfn|Colombet|Brandner|Darte|2011|p=26}}
;Coalescing: This action consists of limiting the number of moves between registers, thus limiting the total number of instructions. For instance, by identifying a variable live across different methods, and storing it into one register during its whole lifetime.{{sfn|Bouchez|Darte|Rastello|2007b|p=103}}
Many register allocation approaches optimize for one or more specific categories of actions.
[[File:Registers CPU i386.png|thumb|300px|Intel 386 registers]]

===Common problems raised in register allocation===
Register allocation raises several problems that can be tackled (or avoided) by different register allocation approaches. Three of the most common problems are identified as follows:

;Aliasing: In some architectures, assigning a value to one register can affect the value of another: this is called aliasing. For example, the [[x86]] architecture has four general purpose 32-bit registers that can also be used as 16-bit or 8-bit registers.<ref>{{cite web |date=May 2019 |publisher=Intel |url=https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf | title = Intel® 64 and IA-32 Architectures Software Developer's Manual, Section 3.4.1 |url-status=dead |archive-url=https://web.archive.org/web/20190525125151/https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf |archive-date=2019-05-25}}</ref> In this case, assigning a 32-bit value to the eax register will affect the value of the al register.

;Pre-coloring: This problem is an act to force some variables to be assigned to particular registers. For example, in [[PowerPC]] [[calling convention]]s, parameters are commonly passed in R3-R10 and the return value is passed in R3.<ref>{{cite web |title=32-bit PowerPC function calling conventions |url=https://developer.apple.com/library/archive/documentation/DeveloperTools/Conceptual/LowLevelABI/100-32-bit_PowerPC_Function_Calling_Conventions/32bitPowerPC.html}}</ref>

;NP-Problem: Chaitin et al. showed that register allocation is an [[NP-completeness|NP-complete]] problem. They reduce the [[graph coloring]] problem to the register allocation problem by showing that for an arbitrary graph, a program can be constructed such that the register allocation for the program (with registers representing nodes and machine registers representing available colors) would be a coloring for the original graph. As Graph Coloring is an NP-Hard problem and Register Allocation is in NP, this proves the NP-completeness of the problem.{{sfn|Bouchez|Darte|Rastello|2006|p=4}}