GNU Compiler Collection
== Design ==
[[File:GCC Translation Diagram.jpg|thumb|400x400px|Overview of GCC's extended compilation pipeline, including specialized programs like the [[preprocessor]], [[Assembly language|assembler]] and [[Linker (computing)|linker]].]]
[[File:Compiler design.svg|thumb|400x400px|GCC follows the 3-stage architecture typical of multi-language and multi-CPU [[compiler]]s. All [[Abstract syntax tree|program trees]] are converted to a common abstract representation at the "middle end", allowing [[Program optimization|code optimization]] and [[binary code]] generation facilities to be shared by all languages.]]

GCC's external interface follows [[Unix]] conventions. Users invoke a language-specific driver program (<code>gcc</code> for C, <code>g++</code> for C++, etc.), which interprets [[Command-line argument|command arguments]], calls the actual compiler, runs the [[Assembly language assembler|assembler]] on the output, and then optionally runs the [[Linker (computing)|linker]] to produce a complete [[executable]] binary.

Each of the language compilers is a separate program that reads source code and outputs [[machine code]]. All have a common internal structure. A per-language front end [[parsing|parses]] the source code in that language and produces an [[abstract syntax tree]] ("tree" for short).

These are, if necessary, converted to the middle end's input representation, called ''GENERIC'' form; the middle end then gradually transforms the program towards its final form.

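The front-end step described above, parsing source text into a tree that later stages consume, can be illustrated with a minimal sketch. This is not GCC's code: the node kinds, the function names (<code>parse_expr</code>, <code>eval_tree</code>, <code>parse_and_eval</code>) and the toy grammar of integers, <code>+</code>, <code>*</code> and parentheses are all assumptions made for illustration, and error handling for malformed input is deliberately omitted.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical tree node kinds; GCC's real tree codes are far richer. */
typedef enum { NODE_NUM, NODE_ADD, NODE_MUL } NodeKind;

typedef struct Node {
    NodeKind kind;
    long value;              /* used by NODE_NUM only */
    struct Node *lhs, *rhs;  /* used by NODE_ADD and NODE_MUL only */
} Node;

static Node *new_node(NodeKind kind, long value, Node *lhs, Node *rhs)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->kind = kind;
    n->value = value;
    n->lhs = lhs;
    n->rhs = rhs;
    return n;
}

static void skip_spaces(const char **s)
{
    while (isspace((unsigned char)**s))
        (*s)++;
}

static Node *parse_expr(const char **s);

/* factor := NUMBER | '(' expr ')' */
static Node *parse_factor(const char **s)
{
    skip_spaces(s);
    if (**s == '(') {
        (*s)++;
        Node *inner = parse_expr(s);
        skip_spaces(s);
        if (**s == ')')
            (*s)++;
        return inner;
    }
    char *end;
    long value = strtol(*s, &end, 10);
    *s = end;
    return new_node(NODE_NUM, value, NULL, NULL);
}

/* term := factor ('*' factor)* */
static Node *parse_term(const char **s)
{
    Node *left = parse_factor(s);
    for (;;) {
        skip_spaces(s);
        if (**s != '*')
            return left;
        (*s)++;
        left = new_node(NODE_MUL, 0, left, parse_factor(s));
    }
}

/* expr := term ('+' term)* */
static Node *parse_expr(const char **s)
{
    Node *left = parse_term(s);
    for (;;) {
        skip_spaces(s);
        if (**s != '+')
            return left;
        (*s)++;
        left = new_node(NODE_ADD, 0, left, parse_term(s));
    }
}

/* Walk the tree bottom-up, standing in for the later compiler stages. */
long eval_tree(const Node *n)
{
    switch (n->kind) {
    case NODE_ADD: return eval_tree(n->lhs) + eval_tree(n->rhs);
    case NODE_MUL: return eval_tree(n->lhs) * eval_tree(n->rhs);
    default:       return n->value;
    }
}

void free_tree(Node *n)
{
    if (n == NULL)
        return;
    free_tree(n->lhs);
    free_tree(n->rhs);
    free(n);
}

/* Parse `text`, evaluate the resulting tree, and release it. */
long parse_and_eval(const char *text)
{
    Node *root = parse_expr(&text);
    long result = eval_tree(root);
    free_tree(root);
    return result;
}
```

Each grammar rule maps to one function, which is the defining trait of a recursive-descent parser; operator precedence falls out of <code>parse_expr</code> calling <code>parse_term</code>, which in turn calls <code>parse_factor</code>.
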
[[Compiler optimization]]s and [[static code analysis]] techniques (such as FORTIFY_SOURCE,<ref>{{cite web |url=http://fedoraproject.org/wiki/Security/Features |title=Security Features: Compile Time Buffer Checks (FORTIFY_SOURCE) |publisher=fedoraproject.org |access-date=2009-03-11 |archive-date=January 7, 2007 |archive-url=https://web.archive.org/web/20070107153447/http://fedoraproject.org/wiki/Security/Features |url-status=live }}</ref> a compiler directive that attempts to discover some [[buffer overflow]]s) are applied to the code. These work on multiple representations, mostly the architecture-independent GIMPLE representation and the architecture-dependent [[register transfer language|RTL]] representation. Finally, [[machine code]] is produced using architecture-specific [[pattern matching]] originally based on an algorithm of Jack Davidson and Chris Fraser.

GCC was written primarily in [[C (programming language)|C]] except for parts of the [[Ada (programming language)|Ada]] front end. The distribution includes the standard libraries for Ada and [[C++]] whose code is mostly written in those languages.<ref>{{cite web |title=languages used to make GCC |url=http://www.ohloh.net/projects/gcc/analyses/latest |url-status=dead |archive-url=https://web.archive.org/web/20080527213819/http://www.ohloh.net/projects/gcc/analyses/latest |archive-date=May 27, 2008 |access-date=14 September 2008}}</ref>{{Update inline|date=January 2023|reason=The first time the reference was used is 2008. It seems it hasn't been updated for a long time. (A reference to the Java compiler was removed in 2021, but the rest of the paragraph hasn't been changed at least since 2013.)}} On some platforms, the distribution also includes a low-level runtime library, '''libgcc''', written in a combination of machine-independent C and processor-specific [[machine code]], designed primarily to handle arithmetic operations that the target processor cannot perform directly.<ref>{{cite web|url=https://gcc.gnu.org/onlinedocs/gccint/Libgcc.html|title=GCC Internals|publisher=GCC.org|access-date=March 1, 2010|archive-date=January 18, 2023|archive-url=https://web.archive.org/web/20230118185814/https://gcc.gnu.org/onlinedocs/gccint/Libgcc.html|url-status=live}}</ref>

GCC's build relies on several additional tools, many of which Unix and Linux distributions commonly install by default (but which are normally absent from Windows installations), including [[Perl]],{{Explain|reason=|date=January 2021}} [[Flex lexical analyser|Flex]], [[GNU bison|Bison]], and other common tools.
In addition, the build currently requires three libraries to be present: [[GNU Multi-Precision Library|GMP]], [[Multiple Precision Complex|MPC]], and [[MPFR]].<ref>{{Cite web|title=Prerequisites for GCC - GNU Project|url=https://gcc.gnu.org/install/prerequisites.html|access-date=2021-09-05|website=gcc.gnu.org|archive-date=January 18, 2023|archive-url=https://web.archive.org/web/20230118185814/https://gcc.gnu.org/install/prerequisites.html|url-status=live}}</ref>

In May 2010, the GCC steering committee decided to allow use of a [[C++]] compiler to compile GCC.<ref name="gcc-c++">{{cite news | title = GCC allows C++ – to some degree | url = http://www.h-online.com/open/news/item/GCC-allows-C-to-some-degree-1012611.html | publisher = [[Heinz Heise|The H]] | date = June 1, 2010 | access-date = June 9, 2010 | archive-date = September 26, 2022 | archive-url = https://web.archive.org/web/20220926160308/http://www.h-online.com/open/news/item/GCC-allows-C-to-some-degree-1012611.html | url-status = live }}</ref> The compiler was intended to be written mostly in C plus a subset of features from C++.
In particular, this was decided so that GCC's developers could use the [[Destructor (computer science)|destructors]] and [[Generic programming|generics]] features of C++.<ref>{{Cite web|url=https://lists.gnu.org/archive/html/emacs-devel/2010-07/msg00518.html|title=Re: Efforts to attract more users?|website=lists.gnu.org|access-date=September 24, 2021|archive-date=January 18, 2023|archive-url=https://web.archive.org/web/20230118185909/https://lists.gnu.org/archive/html/emacs-devel/2010-07/msg00518.html|url-status=live}}</ref>

In August 2012, the GCC steering committee announced that GCC now uses C++ as its implementation language.<ref>{{cite web|title=GCC 4.8 Release Series: Changes, New Features, and Fixes|url=https://gcc.gnu.org/gcc-4.8/changes.html|access-date=October 4, 2013|archive-date=December 8, 2015|archive-url=https://web.archive.org/web/20151208064435/https://gcc.gnu.org/gcc-4.8/changes.html|url-status=live}}</ref> This means that building GCC from sources requires a C++ compiler that understands the [[C++03|ISO/IEC C++03]] standard. On May 18, 2020, GCC moved from the [[C++03|ISO/IEC C++03]] standard to the [[C++11|ISO/IEC C++11]] standard (that is, the standard needed to compile and bootstrap the compiler itself; by default, however, GCC compiles later versions of C++).<ref>{{cite web|title=bootstrap: Update requirement to C++11.|website=[[GitHub]]|url=https://github.com/gcc-mirror/gcc/commit/5329b59a2e13dabbe2038af0fe2e3cf5fc7f98ed|access-date=May 18, 2020|archive-date=September 29, 2022|archive-url=https://web.archive.org/web/20220929120518/https://github.com/gcc-mirror/gcc/commit/5329b59a2e13dabbe2038af0fe2e3cf5fc7f98ed|url-status=live}}</ref>

=== Front ends ===
[[File:Xxx Scanner and parser example for C.gif|thumb|right|300px|Front ends consist of [[Preprocessor|preprocessing]], [[lexical analysis]], [[Parsing|syntactic analysis]] (parsing) and semantic analysis. The goals of compiler front ends are to either accept or reject candidate programs according to the language grammar and semantics, identify errors and hand valid program representations to later compiler stages. This example shows the lexer and parser steps performed for a simple program written in [[C (programming language)|C]].]]

Each [[front end (compiler)|front end]] uses a parser to produce the [[abstract syntax tree]] of a given [[source file]]. Due to the syntax tree abstraction, source files of any of the different supported languages can be processed by the same [[back end (Compiler)|back end]].

GCC started out using [[LALR parser]]s generated with [[GNU Bison|Bison]], but gradually switched to hand-written [[Recursive descent parser|recursive-descent parsers]] for C++ in 2004,<ref>{{Cite web|url=https://gcc.gnu.org/gcc-3.4/changes.html|title=GCC 3.4 Release Series – Changes, New Features, and Fixes - GNU Project|website=gcc.gnu.org|access-date=July 25, 2016|archive-date=January 18, 2023|archive-url=https://web.archive.org/web/20230118185814/https://gcc.gnu.org/gcc-3.4/changes.html|url-status=live}}</ref> and for C and Objective-C in 2006.<ref>{{Cite web|url=https://gcc.gnu.org/gcc-4.1/changes.html|title=GCC 4.1 Release Series – Changes, New Features, and Fixes - GNU Project|website=gcc.gnu.org|access-date=July 25, 2016|archive-date=January 18, 2023|archive-url=https://web.archive.org/web/20230118185814/https://gcc.gnu.org/gcc-4.1/changes.html|url-status=live}}</ref> As of 2021 all front ends use hand-written recursive-descent parsers.

Until GCC 4.0, the tree representation of the program was not fully independent of the processor being targeted. The meaning of a tree was somewhat different for different language front ends, and front ends could provide their own tree codes. This was simplified in GCC 4.0 with the introduction of GENERIC and GIMPLE, two new forms of language-independent trees.
GENERIC is more complex, based on the GCC 3.x Java front end's intermediate representation. GIMPLE is a simplified GENERIC, in which various constructs are ''[[lowering (computer science)|lowered]]'' to multiple GIMPLE instructions. The [[C (Programming Language)|C]], [[C++]], and [[Java (programming language)|Java]] front ends produce GENERIC directly in the front end. Other front ends instead have different intermediate representations after parsing and convert these to GENERIC.

In either case, the so-called "gimplifier" then converts this more complex form into the simpler [[static single-assignment form|SSA]]-based GIMPLE form that is the common language for a large number of language- and architecture-independent global (function scope) optimizations.

=== GENERIC and GIMPLE ===
''GENERIC'' is an [[intermediate representation]] language used in the "middle end" while compiling source code into [[executable|executable binaries]]. A subset, called ''GIMPLE'', is targeted by all the front ends of GCC.

The middle stage of GCC does all of the code analysis and [[optimizing compiler|optimization]], working independently of both the compiled language and the target architecture, starting from the GENERIC<ref>{{Cite web|url=https://gcc.gnu.org/onlinedocs/gccint/GENERIC.html|title=GENERIC (GNU Compiler Collection (GCC) Internals)|website=gcc.gnu.org|access-date=July 25, 2016|archive-date=January 18, 2023|archive-url=https://web.archive.org/web/20230118185814/https://gcc.gnu.org/onlinedocs/gccint/GENERIC.html|url-status=live}}</ref> representation and expanding it to [[register transfer language]] (RTL). The GENERIC representation contains only the subset of the imperative [[computer programming|programming]] constructs optimized by the middle end.

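The lowering performed by the gimplifier can be pictured by writing the same computation twice in plain C: once as a nested expression, and once with every operation assigned to its own temporary, so that each statement has at most two operands and one result. The sketch below is hand-written for illustration; the function and temporary names are assumptions, and an actual GIMPLE dump (obtainable with GCC's <code>-fdump-tree-gimple</code> option) uses its own naming and syntax.

```c
/* Nested expression, roughly as a programmer would write it. */
int original_expr(int a, int b, int c)
{
    return (a + b) * (a + b) - c;
}

/* The same computation lowered by hand into three-address style:
 * each statement performs one operation and stores it in a temporary.
 * The names t1..t4 are illustrative, not what GCC generates. */
int lowered_expr(int a, int b, int c)
{
    int t1 = a + b;
    int t2 = a + b;   /* repeated subexpression, left for later optimization */
    int t3 = t1 * t2;
    int t4 = t3 - c;
    return t4;
}
```

Because each temporary is assigned exactly once, this form also resembles the static single-assignment style on which many of the middle-end optimizations rely.
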
In transforming the source code to GIMPLE,<ref>{{Cite web|url=https://gcc.gnu.org/onlinedocs/gccint/GIMPLE.html|title=GIMPLE (GNU Compiler Collection (GCC) Internals)|website=gcc.gnu.org|access-date=July 25, 2016|archive-date=January 18, 2023|archive-url=https://web.archive.org/web/20230118185814/https://gcc.gnu.org/onlinedocs/gccint/GIMPLE.html|url-status=live}}</ref> complex [[Expression (programming)|expressions]] are split into a [[three-address code]] using [[temporary variable]]s. This representation was inspired by the SIMPLE representation proposed in the McCAT compiler<ref>{{cite web |url=http://www-acaps.cs.mcgill.ca/info/McCAT/McCAT.html |title=McCAT |access-date=2017-09-14 |url-status=bot: unknown |archive-url=https://web.archive.org/web/20040812030043/http://www-acaps.cs.mcgill.ca/info/McCAT/McCAT.html |archive-date=August 12, 2004 |df=mdy-all }}</ref> by Laurie J. Hendren<ref>{{Cite web|url=http://www.sable.mcgill.ca/~hendren/|title=Laurie Hendren's Home Page|website=www.sable.mcgill.ca|access-date=July 20, 2009|archive-date=September 27, 2022|archive-url=https://web.archive.org/web/20220927074148/http://www.sable.mcgill.ca/~hendren/|url-status=live}}</ref> for simplifying the analysis and [[Optimization (computer science)|optimization]] of [[Imperative programming|imperative programs]].

=== Optimization ===
Optimization can occur during any phase of compilation; however, the bulk of optimizations are performed after the syntax and [[Semantic analysis (compilers)|semantic analysis]] of the front end and before the [[Code generation (compiler)|code generation]] of the back end; thus a common, though somewhat self-contradictory, name for this part of the compiler is the "middle end."

The exact set of GCC optimizations varies from release to release as it develops, but includes the standard algorithms, such as [[loop optimization]], [[jump threading]], [[common subexpression elimination]], [[instruction scheduling]], and so forth.

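As an illustration of one of the standard algorithms named above, the following is a minimal sketch of common subexpression elimination over straight-line three-address code. It is not GCC's implementation: the <code>Instr</code> layout, the <code>MAX_TEMPS</code> limit and the function name are assumptions, and the sketch presumes that every temporary is assigned exactly once (as in SSA form) and that temporary numbers stay below <code>MAX_TEMPS</code>.

```c
#include <stddef.h>

#define MAX_TEMPS 64  /* assumed upper bound on temporary numbers */

/* One three-address instruction: dst = lhs op rhs.
 * Operands and results are temporary numbers, not values. */
typedef struct {
    char op;    /* '+', '-' or '*' */
    int dst;
    int lhs;
    int rhs;
    int dead;   /* set to 1 when the instruction has been eliminated */
} Instr;

/* Mark instructions that recompute an earlier (op, lhs, rhs) triple as dead
 * and rewrite later operands to use the earlier result instead.
 * Returns the number of instructions eliminated. */
size_t eliminate_common_subexpressions(Instr *code, size_t n)
{
    int alias[MAX_TEMPS];
    size_t removed = 0;

    /* Initially every temporary stands for itself. */
    for (int t = 0; t < MAX_TEMPS; t++)
        alias[t] = t;

    for (size_t i = 0; i < n; i++) {
        /* Redirect operands whose defining instruction was eliminated. */
        code[i].lhs = alias[code[i].lhs];
        code[i].rhs = alias[code[i].rhs];
        code[i].dead = 0;

        /* Look for an earlier live instruction computing the same value. */
        for (size_t j = 0; j < i; j++) {
            if (!code[j].dead &&
                code[j].op == code[i].op &&
                code[j].lhs == code[i].lhs &&
                code[j].rhs == code[i].rhs) {
                alias[code[i].dst] = code[j].dst;
                code[i].dead = 1;
                removed++;
                break;
            }
        }
    }
    return removed;
}
```

For the expression <code>(a + b) * (a + b) - c</code>, with <code>a</code>, <code>b</code>, <code>c</code> numbered 0 to 2 and the intermediate results numbered 3 to 6, the second addition is marked dead and the multiplication is rewritten to use the first addition's result for both operands. The quadratic search keeps the sketch short; production compilers typically use hashing for the lookup.
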
The [[Register transfer language|RTL]] optimizations are of less importance with the addition of global SSA-based optimizations on [[GIMPLE]] trees,<ref>{{cite web|url=http://www.redhat.com/magazine/002dec04/features/gcc/|title=From Source to Binary: The Inner Workings of GCC|last=Novillo|first=Diego|work=[[Red Hat#Opensource.com|Red Hat Magazine]]|date=December 2004|url-status=dead|archive-url=https://web.archive.org/web/20090401215553/http://www.redhat.com/magazine/002dec04/features/gcc/|archive-date=April 1, 2009|df=mdy-all}}</ref> as RTL optimizations have a much more limited scope, and have less high-level information.

Some of the optimizations performed at this level include [[dead-code elimination]], [[partial-redundancy elimination]], [[global value numbering]], [[sparse conditional constant propagation]], and [[scalar replacement of aggregates]]. Array dependence based optimizations such as [[automatic vectorization]] and [[automatic parallelization]] are also performed. [[Profile-guided optimization]] is also possible.<ref>{{Cite web|url=https://gcc.gnu.org/install/build.html#TOC4|title=Installing GCC: Building - GNU Project|website=gcc.gnu.org|access-date=July 25, 2016|archive-date=August 22, 2023|archive-url=https://web.archive.org/web/20230822141635/http://gcc.gnu.org/install/build.html#TOC4|url-status=live}}</ref>

=== C++ Standard Library (libstdc++) ===
The GCC project includes an implementation of the [[C++ Standard Library]] called libstdc++,<ref>{{cite web|url=https://gcc.gnu.org/onlinedocs/libstdc++|title=The GNU C++ Library|publisher=GNU Project|accessdate=2021-02-21|archive-date=December 25, 2022|archive-url=https://web.archive.org/web/20221225041607/https://gcc.gnu.org/onlinedocs/libstdc++/|url-status=live}}</ref> licensed under the GPLv3 License with an exception to link non-GPL applications when sources are built with GCC.<ref>{{cite web|url=https://gcc.gnu.org/onlinedocs/libstdc++/manual/license.html|title=License|publisher=GNU Project|accessdate=2021-02-21|archive-date=January 18, 2023|archive-url=https://web.archive.org/web/20230118185814/https://gcc.gnu.org/onlinedocs/libstdc++/manual/license.html|url-status=live}}</ref>

=== Other features ===
Some features of GCC include:
; Link-time optimization
: [[Link-time optimization]] optimizes across object file boundaries to directly improve the linked binary. Link-time optimization relies on an intermediate file containing the serialization of some ''GIMPLE'' representation included in the object file.{{Citation needed|date=January 2016}} The file is generated alongside the object file during source compilation. Each source compilation generates a separate object file and link-time helper file. When the object files are linked, the compiler is executed again and uses the helper files to optimize code across the separately compiled object files.
; Plugins
: [[Plug-in (computing)|Plugins]] extend the GCC compiler directly.<ref>{{cite web |title= Plugins |url= https://gcc.gnu.org/onlinedocs/gccint/Plugins.html |work= GCC online documentation |access-date= July 8, 2013 |archive-date= April 30, 2013 |archive-url= https://web.archive.org/web/20130430223330/http://gcc.gnu.org/onlinedocs/gccint/Plugins.html |url-status= live }}</ref> Plugins allow a stock compiler to be tailored to specific needs by external code loaded as plugins.
: For example, plugins can add, replace, or even remove middle-end passes operating on ''GIMPLE'' representations.<ref>{{cite web|last=Starynkevitch|first=Basile|title=GCC plugins thru the MELT example|url=http://gcc-melt.org/gcc-plugin-MELT-LinuxCollabSummit2014.pdf |archive-url=https://web.archive.org/web/20140413124801/http://gcc-melt.org/gcc-plugin-MELT-LinuxCollabSummit2014.pdf |archive-date=2014-04-13 |url-status=live|access-date=2014-04-10}}</ref> Several GCC plugins have already been published, notably:
:* The Python plugin, which links against libpython and allows arbitrary Python scripts to be invoked from inside the compiler. The aim is to allow GCC plugins to be written in Python.
:* The MELT plugin, which provides a high-level [[Lisp (programming language)|Lisp]]-like language to extend GCC.<ref>{{cite web|title=About GCC MELT|url=http://gcc-melt.org/|access-date=July 8, 2013|archive-url= https://archive.today/20130704015544/http://gcc-melt.org/|archive-date=July 4, 2013|url-status=live}}</ref>
: Support for plugins was a contentious issue in 2007.<ref>{{cite web |title=GCC unplugged [LWN.net] |url=https://lwn.net/Articles/259157/ |website=lwn.net |access-date=March 28, 2021 |archive-date=November 9, 2020 |archive-url=https://web.archive.org/web/20201109001410/https://lwn.net/Articles/259157/ |url-status=live }}</ref>
; C++ [[Software transactional memory|transactional memory]]
: The C++ language has an active proposal for transactional memory.
: It can be enabled in GCC 6 and newer when compiling with <code>-fgnu-tm</code>.<ref name="gcc6"/><ref>{{Cite web|url=https://gcc.gnu.org/wiki/TransactionalMemory|title=TransactionalMemory - GCC Wiki|website=gcc.gnu.org|access-date=September 19, 2016|archive-date=August 19, 2016|archive-url=https://web.archive.org/web/20160819055121/http://gcc.gnu.org/wiki/TransactionalMemory|url-status=live}}</ref>
; Unicode identifiers
: Although the C++ language requires support for non-ASCII [[Unicode characters]] in [[Identifier (computer languages)|identifiers]], the feature has only been supported since GCC 10. As with the existing handling of string literals, the source file is assumed to be encoded in [[UTF-8]]. The feature is optional in C, but has also been available there since this change.<ref>{{Cite web|url=https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg01667.html|title=Lewis Hyatt - [PATCH] wwwdocs: Document support for extended identifiers added to GCC|website=gcc.gnu.org|access-date=2020-03-27|archive-date=March 27, 2020|archive-url=https://web.archive.org/web/20200327153559/https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg01667.html|url-status=live}}</ref><ref>{{Cite web|url=http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3146.html|title=Recommendations for extended identifier characters for C and C++|website=www.open-std.org|access-date=2020-03-27|archive-date=September 30, 2020|archive-url=https://web.archive.org/web/20200930152408/http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3146.html|url-status=live}}</ref>
; C extensions
: GNU C extends the C programming language with several non-standard features, including [[nested function]]s.<ref>{{Cite web|title=C Extensions (Using the GNU Compiler Collection (GCC))|url=https://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html|access-date=2022-01-12|website=gcc.gnu.org|archive-date=January 12, 2022|archive-url=https://web.archive.org/web/20220112203037/https://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html|url-status=live}}</ref>
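The nested-function extension mentioned above can be illustrated with a short sketch. The function names <code>sum_scaled</code> and <code>scale</code> are hypothetical; the code relies on GNU C and is expected to compile with GCC in C mode only, not with a strictly standard-conforming C compiler and not as C++.

```c
#include <stddef.h>

/* Sum the elements of `values`, each multiplied by `factor`. */
int sum_scaled(const int *values, size_t count, int factor)
{
    /* Nested function (GNU C extension): defined inside sum_scaled and
     * able to read `factor` from the enclosing function's scope. */
    int scale(int x)
    {
        return x * factor;
    }

    int total = 0;
    for (size_t i = 0; i < count; i++)
        total += scale(values[i]);
    return total;
}
```

The nested function is visible only inside <code>sum_scaled</code>, which keeps the helper out of the file's global namespace at the cost of portability.
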