== Categorization ==

===Local vs. global scope===
[[Scope (computer science)|Scope]] describes how much of the input code an optimization considers at once. Local scope optimizations use only information local to a [[basic block]].<ref name="Cooper_2003">{{cite book|author-first1=Keith D.|author-last1=Cooper|author-link1=Keith D. Cooper|author-first2=Linda|author-last2=Torczon|title=Engineering a Compiler|publisher=[[Morgan Kaufmann]]|date=2003|orig-date=2002-01-01|pages=404, 407|isbn=978-1-55860-698-2}}</ref> Since basic blocks contain no control flow statements, these optimizations require minimal analysis, reducing time and storage requirements; however, no information is retained across jumps.

Global scope optimizations, also known as intra-procedural optimizations, operate on individual functions.<ref name="Cooper_2003"/> This gives them more information to work with, but often makes expensive computations necessary. Worst-case assumptions must be made when function calls occur or global variables are accessed, because little information about them is available.

=== Peephole optimization ===
[[Peephole optimization]]s are usually performed late in the compilation process, after [[machine code]] has been generated. This optimization examines a few adjacent instructions (similar to "looking through a peephole" at the code) to see whether they can be replaced by a single instruction or a shorter sequence of instructions.<ref name="aho-sethi-ullman" />{{rp|page=554}} For instance, a multiplication of a value by two might be executed more efficiently by [[Bit shift|left-shifting]] the value or by adding the value to itself (this example is also an instance of [[strength reduction]]).
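For illustration, the following C fragment shows the source-level equivalents of that rewrite. An actual peephole pass operates on the generated instructions rather than on the source, and the replacement chosen depends on the target instruction set; this is a sketch for illustration, not the output of any particular compiler.

<syntaxhighlight lang="c">
/* Multiplication by two, as a front end might naively emit it. */
unsigned twice_mul(unsigned x) {
    return x * 2u;   /* may compile to an integer multiply instruction */
}

/* Strength-reduced forms a peephole pass might substitute. */
unsigned twice_shift(unsigned x) {
    return x << 1;   /* left shift by one bit position */
}

unsigned twice_add(unsigned x) {
    return x + x;    /* add the value to itself */
}
</syntaxhighlight>

All three functions compute the same result for every unsigned argument; the shift and add forms avoid the multiply instruction, which is slower on many targets.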
===Inter-procedural optimization===
[[Interprocedural optimization]]s analyze all of a program's source code. The more information available, the more effective the optimizations can be. The information can be used for various optimizations, including function [[inline expansion|inlining]], where a call to a function is replaced by a copy of the function body.

===Link-time optimization===
[[Link-time optimization]] (LTO), or whole-program optimization, is a more general class of interprocedural optimization. During LTO, the compiler has visibility across translation units, which allows it to perform more aggressive optimizations such as cross-module inlining and [[Virtual method table|devirtualization]].

===Machine and object code optimization===
Machine code optimization involves using an [[object code optimizer]] to analyze the program after all machine code has been [[linker (computing)|linked]]. Techniques such as macro compression, which conserves space by condensing common instruction sequences, become more effective when the entire executable task image is available for analysis.<ref name="MCO">{{cite tech report|url=https://ClintGoss.com/mco/Goss_1986_MachineCodeOptimization.pdf|title=Machine Code Optimization – Improving Executable Object Code|author=Goss|first=Clinton F.|date=August 2013|access-date=22 Aug 2013|archive-url=https://ghostarchive.org/archive/20221009/https://ClintGoss.com/mco/Goss_1986_MachineCodeOptimization.pdf|url-status=live|archive-date=2022-10-09|publisher=Courant Institute, New York University|type=Ph.D. dissertation|orig-date=First published June 1986|volume=Computer Science Department Technical Report #246|arxiv=1308.4815|bibcode=2013arXiv1308.4815G}}
* {{cite thesis|author=Clinton F. Goss|title=Machine Code Optimization – Improving Executable Object Code|date=2013|degree=PhD|publisher=Courant Institute, New York University|url=https://ClintGoss.com/mco/|orig-date=1986}}</ref>

===Language-independent vs. language-dependent===
Most high-level [[programming language]]s share common programming constructs and abstractions, such as branching constructs (if, switch), looping constructs (for, while), and encapsulation constructs (structures, objects), so similar optimization techniques can be used across languages. However, certain language features make some optimizations difficult. For instance, pointers in [[C (programming language)|C]] and [[C++]] make array optimization difficult; see [[alias analysis]]. Nevertheless, languages such as [[PL/I]] that also support pointers implement optimizations for arrays.

Conversely, some language features make certain optimizations easier. For example, in some languages, functions are not permitted to have [[side effect (computer science)|side effects]]. Therefore, if a program makes several calls to the same function with the same arguments, the compiler can infer that the function's result only needs to be computed once. In languages where functions are allowed to have side effects, the compiler can restrict such optimization to functions that it can determine have no side effects.

===Machine-independent vs. machine-dependent===
Many optimizations that operate on abstract programming concepts (loops, objects, structures) are independent of the machine targeted by the compiler, but many of the most effective optimizations are those that best exploit special features of the target platform. Examples are instructions that do several things at once, such as decrement register and branch if not zero.

The following is an instance of a local machine-dependent optimization. To set a [[Processor register|register]] to 0, the obvious way is to use the constant '0' in an instruction that sets a register value to a constant. A less obvious way is to [[XOR]] a register with itself or to subtract it from itself; it is up to the compiler to know which variant to use. On many [[RISC]] machines, both instructions would be equally appropriate, since they would be the same length and take the same time. On many other [[microprocessor]]s, such as the [[Intel]] [[x86]] family, the XOR variant is shorter and probably faster, as there is no need to decode an immediate operand or use the internal "immediate operand register"; the same applies to the subtract variant on the [[IBM System/360]] and its successors.<ref>{{cite book |url=https://bitsavers.trailing-edge.com/pdf/ibm/360/training/GC20-1646-5_A_Programmers_Introduction_to_IBM_System360_Assembly_Language_196907.pdf |title=A Programmer's Introduction to IBM System/360 Assembly Language |page=42 |publisher=[[IBM]] |id=GC20-1646-5}}</ref>

A potential problem is that XOR or subtract may introduce a data dependency on the previous value of the register, causing a [[instruction pipeline|pipeline]] stall: the processor must delay execution of an instruction until the result it depends on is available. However, processors often treat the XOR of a register with itself, or the subtract of a register from itself, as a special case that does not cause stalls.
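To make the register-zeroing example concrete, the following C function and accompanying comments sketch the two x86 instruction sequences described above; actual code generation varies with the compiler, target, and options.

<syntaxhighlight lang="c">
/* The entire job of this function is to set the return register to 0. */
int zero(void) {
    return 0;
}

/*
 * Two 32-bit x86 instruction sequences that implement zero():
 *
 *     mov eax, 0      ; load the immediate constant 0 (5 bytes)
 *     xor eax, eax    ; XOR the register with itself  (2 bytes)
 *
 * Both leave EAX equal to 0. The XOR form is shorter and needs no
 * immediate operand, so x86 compilers typically emit it.
 */
</syntaxhighlight>

Modern x86 processors also generally recognize this XOR idiom as a zeroing operation with no dependency on the register's previous value, avoiding the pipeline stall described above.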