Name mangling
=== C++ ===
{{anchor|Name mangling in C++}}
[[C++]] compilers are the most widespread users of name mangling. The first C++ compilers were implemented as translators to [[C (programming language)|C]] source code, which would then be compiled by a C compiler to object code; because of this, symbol names had to conform to C identifier rules. Even later, with the emergence of compilers that produced machine code or assembly directly, the system's [[Linker (computing)|linker]] generally did not support C++ symbols, and mangling was still required.

The [[C++]] language does not define a standard decoration scheme, so each compiler uses its own. C++ also has complex language features, such as [[C++ class|classes]], [[C++ template|templates]], [[Namespace (C++)|namespaces]], and [[operator overloading]], that alter the meaning of specific symbols based on context or usage. Mangling (decorating) the name of a [[symbol (computing)|symbol]] encodes metadata about these features, so that symbols sharing a source-level name can be told apart. Because the name-mangling systems for such features are not standardized across compilers, few linkers can link object code that was produced by different compilers.

====Simple example====
A single C++ translation unit might define two functions named {{code|f()}}:

<syntaxhighlight lang="cpp">
int  f ()    { return 1; }
int  f (int) { return 0; }
void g ()    { int i = f(), j = f(0); }
</syntaxhighlight>

These are distinct functions, with no relation to each other apart from the name. The C++ compiler will therefore encode the type information in the symbol name, the result being something resembling:

<syntaxhighlight lang="cpp">
int  __f_v ()    { return 1; }
int  __f_i (int) { return 0; }
void __g_v ()    { int i = __f_v(), j = __f_i(0); }
</syntaxhighlight>

Even though its name is unique, {{code|g()}} is still mangled: name mangling applies to ''all'' C++ symbols (except for those in an <syntaxhighlight lang="cpp" inline>extern "C"{}</syntaxhighlight> block).
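The effect of the simple example can be imitated from the linker's point of view. The following is a minimal sketch, assuming a toy scheme modelled on the illustrative names above ({{code|__f_v}}, {{code|__f_i}}); the helpers {{code|toy_mangle}} and {{code|define_symbol}} are hypothetical names introduced here and do not correspond to any real compiler or linker interface.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy scheme modelled on the illustrative names __f_v and __f_i.
// It is NOT what any real compiler emits.
std::string toy_mangle(const std::string& name,
                       const std::vector<std::string>& params) {
    // One-letter codes for a few built-in types (an assumption of this sketch).
    static const std::map<std::string, char> codes = {
        {"int", 'i'}, {"char", 'c'}, {"double", 'd'}};
    std::string symbol = "__" + name + "_";
    if (params.empty()) {
        symbol += 'v';  // an empty parameter list is encoded like (void)
    }
    for (const std::string& p : params) {
        auto it = codes.find(p);
        symbol += (it != codes.end()) ? it->second : '?';  // '?' marks an unsupported type
    }
    return symbol;
}

// A linker-like symbol table in which each name may be defined only once.
// Returns false when the name is already taken.
bool define_symbol(std::map<std::string, int>& table,
                   const std::string& symbol, int address) {
    return table.emplace(symbol, address).second;
}
```

With undecorated names, defining {{code|f}} twice in the same table is rejected as a duplicate, whereas the decorated names produced by {{code|toy_mangle("f", {})}} and {{code|toy_mangle("f", {"int"})}} can coexist.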
====Complex example====
The mangled symbols in this example, in the comments next to the respective declarations, are those produced by the GNU GCC 3.x compilers, according to the [[IA-64]] (Itanium) ABI:

<syntaxhighlight lang="cpp">
#include <ostream>
#include <string>

namespace wikipedia
{
   class article
   {
   public:
      std::string format ();  // = _ZN9wikipedia7article6formatEv

      bool print_to (std::ostream&);  // = _ZN9wikipedia7article8print_toERSo

      class wikilink
      {
      public:
         wikilink (std::string const& name);  // = _ZN9wikipedia7article8wikilinkC1ERKSs
      };
   };
}
</syntaxhighlight>

All mangled symbols begin with {{code|_Z}} (note that an identifier beginning with an underscore followed by a capital letter is a [[reserved identifier]] in C, so conflict with user identifiers is avoided); for nested names (including both namespaces and classes), this is followed by {{code|N}}, then a series of <length, id> pairs (the length being the length of the next identifier), and finally {{code|E}}. For example, {{code|wikipedia::article::format}} becomes:

 _ZN9wikipedia7article6formatE

For functions, this is then followed by the parameter type information; as {{code|format()}} takes no parameters, its parameter list is encoded like {{code|(void)}}, which is simply {{code|v}}; hence:

 _ZN9wikipedia7article6formatEv

For {{code|print_to}}, the standard type {{code|std::ostream}} (which is a {{mono|[[typedef]]}} for {{code|std::basic_ostream<char, std::char_traits<char> >}}) is used, which has the special alias {{code|So}}; a reference to this type is therefore {{code|RSo}}, with the complete name for the function being:

 _ZN9wikipedia7article8print_toERSo

====How different compilers mangle the same functions====
There is no standardized scheme by which even trivial C++ identifiers are mangled, and consequently different compilers (or even different versions of the same compiler, or the same compiler on different platforms) mangle public symbols in radically different (and thus totally incompatible) ways.
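For the Itanium-style scheme walked through above, the encoding of simple names is regular enough to sketch in a few lines. The following is an illustration only, not a real mangler: it assumes plain (optionally nested) names and a handful of built-in parameter types, and it ignores templates, references, substitutions such as {{code|So}}, constructors and qualifiers. The function name {{code|itanium_mangle_sketch}} is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Sketch of a small subset of the Itanium C++ ABI name encoding.
// qualified_name holds the name split at "::", e.g. {"wikipedia", "article", "format"}.
std::string itanium_mangle_sketch(const std::vector<std::string>& qualified_name,
                                  const std::vector<std::string>& params) {
    // Built-in type codes used by the Itanium ABI (only a few are listed here).
    static const std::map<std::string, std::string> builtin = {
        {"void", "v"}, {"bool", "b"}, {"char", "c"},
        {"int", "i"}, {"long", "l"}, {"float", "f"}, {"double", "d"}};

    std::string out = "_Z";
    const bool nested = qualified_name.size() > 1;
    if (nested) out += 'N';  // nested names are wrapped in N ... E
    for (const std::string& part : qualified_name) {
        out += std::to_string(part.size()) + part;  // <length, id> pair
    }
    if (nested) out += 'E';

    if (params.empty()) {
        out += 'v';  // an empty parameter list is encoded like (void)
    }
    for (const std::string& p : params) {
        auto it = builtin.find(p);
        // "?" marks a type that this sketch does not know how to encode.
        out += (it != builtin.end()) ? it->second : std::string("?");
    }
    return out;
}
```

Under these assumptions, {{code|itanium_mangle_sketch({"wikipedia", "article", "format"}, {})}} yields {{code|_ZN9wikipedia7article6formatEv}}, and {{code|itanium_mangle_sketch({"h"}, {"int", "char"})}} yields {{code|_Z1hic}}.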
Consider how different C++ compilers mangle the same functions:

{| class="wikitable"
!Compiler
!{{codett|void h(int)}}
!{{codett|void h(int, char)}}
!{{codett|void h(void)}}
|-
|Intel C++ 8.0 for Linux
| rowspan=5 | {{codett|_Z1hi}}
| rowspan=5 | {{codett|_Z1hic}}
| rowspan=5 | {{codett|_Z1hv}}
|-
|HP aC++ A.05.55 [[IA-64]]
|-
|IAR EWARM C++
|-
|[[GNU Compiler Collection|GCC]] 3.''x'' and higher
|-
|[[Clang]] 1.''x'' and higher<ref>{{citation |url=http://clang.llvm.org/features.html#gcccompat |title=Clang - Features and Goals: GCC Compatibility |date=15 April 2013}}</ref>
|-
|[[GNU Compiler Collection|GCC]] 2.9.''x''
| rowspan=2 | {{codett|h__Fi}}
| rowspan=2 | {{codett|h__Fic}}
| rowspan=2 | {{codett|h__Fv}}
|-
|HP aC++ A.03.45 PA-RISC
|-
|[[Microsoft Visual C++]] v6-v10 ([[v:Visual C++ name mangling|mangling details]])
| rowspan=2 | {{codett|?h@@YAXH@Z}}
| rowspan=2 | {{codett|?h@@YAXHD@Z}}
| rowspan=2 | {{codett|?h@@YAXXZ}}
|-
|[[Digital Mars]] C++
|-
|[[Borland]] C++ v3.1
|{{codett|@h$qi}}
|{{codett|@h$qizc}}
|{{codett|@h$qv}}
|-
|[[OpenVMS]] C++ v6.5 (ARM mode)
|{{codett|H__XI}}
|{{codett|H__XIC}}
|{{codett|H__XV}}
|-
|OpenVMS C++ v6.5 (ANSI mode)
|
|{{codett|CXX$__7H__FIC26CDH77}}
|{{codett|CXX$__7H__FV2CB06E8}}
|-
|OpenVMS C++ X7.1 IA-64
|{{codett|CXX$_Z1HI2DSQ26A}}
|{{codett|CXX$_Z1HIC2NP3LI4}}
|{{codett|CXX$_Z1HV0BCA19V}}
|-
|SunPro CC
|{{codett|__1cBh6Fi_v_}}
|{{codett|__1cBh6Fic_v_}}
|{{codett|__1cBh6F_v_}}
|-
|Tru64 C++ v6.5 (ARM mode)
|{{codett|h__Xi}}
|{{codett|h__Xic}}
|{{codett|h__Xv}}
|-
|Tru64 C++ v6.5 (ANSI mode)
|{{codett|__7h__Fi}}
|{{codett|__7h__Fic}}
|{{codett|__7h__Fv}}
|-
|Watcom C++ 10.6
|{{codett|W?h$n(i)v}}
|{{codett|W?h$n(ia)v}}
|{{codett|W?h$n()v}}
|}

Notes:
*The [[Compaq]] C++ compiler on [[OpenVMS]] [[VAX]] and [[DEC Alpha|Alpha]] (but not IA-64) and [[Tru64 UNIX]] has two name mangling schemes. The original, pre-standard scheme is known as the ARM model, and is based on the name mangling described in the C++ Annotated Reference Manual (ARM). With the advent of new features in standard C++, particularly [[template (programming)|templates]], the ARM scheme became more and more unsuitable: it could not encode certain function types, or it produced identically mangled names for different functions. It was therefore replaced by the newer ''[[American National Standards Institute]]'' (ANSI) model, which supported all ANSI template features, but was not backward compatible.
*On IA-64, a standard [[application binary interface]] (ABI) exists (see [[#External links|external links]]), which defines (among other things) a standard name-mangling scheme, and which is used by all the IA-64 compilers. GNU GCC 3.''x'' has also adopted the name mangling scheme defined in this standard for use on other, non-Intel platforms.
*The [[Visual Studio]] and Windows SDK include the program {{code|undname}}, which prints the C-style function prototype for a given mangled name.
*On Microsoft Windows, the Intel compiler<ref>{{cite web|url=https://community.intel.com/t5/Intel-C-Compiler/OBJ-differences-between-Intel-Compiler-and-VC-Compiler/td-p/871131|title=OBJ differences between Intel Compiler and VC Compiler|website=software.intel.com|archive-url=https://web.archive.org/web/20220520203123/https://community.intel.com/t5/Intel-C-Compiler/OBJ-differences-between-Intel-Compiler-and-VC-Compiler/td-p/871131|archive-date=2022-05-20|url-status=dead}}</ref> and [[Clang]]<ref>{{cite web|url=http://clang.llvm.org/docs/MSVCCompatibility.html#abi-features|title=MSVC compatibility|access-date=13 May 2016}}</ref> use the Visual C++ name mangling for compatibility.

====Handling of C symbols when linking from C++====
The job of the common C++ idiom:

<syntaxhighlight lang="cpp">
#ifdef __cplusplus
extern "C" {
#endif
    /* ...
*/
#ifdef __cplusplus
}
#endif
</syntaxhighlight>

is to ensure that the symbols within are "unmangled", that is, that the compiler emits a binary file with their names undecorated, as a C compiler would do. As C language definitions are unmangled, the C++ compiler needs to avoid mangling references to these identifiers.

For example, the standard strings library, {{code|<string.h>}}, usually contains something resembling:

<syntaxhighlight lang="cpp">
#ifdef __cplusplus
extern "C" {
#endif

void *memset (void *, int, size_t);
char *strcat (char *, const char *);
int   strcmp (const char *, const char *);
char *strcpy (char *, const char *);

#ifdef __cplusplus
}
#endif
</syntaxhighlight>

Thus, code such as:

<syntaxhighlight lang="cpp">
if (strcmp(argv[1], "-x") == 0)
    strcpy(a, argv[2]);
else
    memset (a, 0, sizeof(a));
</syntaxhighlight>

uses the correct, unmangled {{code|strcmp}} and {{code|memset}}. If the {{code|extern "C"}} had not been used, the (SunPro) C++ compiler would produce code equivalent to:

<syntaxhighlight lang="cpp">
if (__1cGstrcmp6Fpkc1_i_(argv[1], "-x") == 0)
    __1cGstrcpy6Fpcpkc_0_(a, argv[2]);
else
    __1cGmemset6FpviI_0_ (a, 0, sizeof(a));
</syntaxhighlight>

Since those symbols do not exist in the C runtime library (''e.g.'' libc), link errors would result.

<!-- Useful links. I don't know which should end up in the finished article:
* http://www.ucmb.ulb.ac.be/documents/libg++_2.7.1/g++FAQ_35.html
* http://theoryx5.uwinnipeg.ca/gnu/gcc/gxxint_15.html This one is ancient, should not be used.
* http://www.kegel.com/mangle.html This one describes an old version of Visual C++.
* http://www.pgroup.com/ppro_docs/pgiws_ug/pgiug_14.htm
-->

====Standardized name mangling in C++====
It would seem that standardized name mangling in the C++ language would lead to greater interoperability between compiler implementations.
However, such a standardization by itself would not suffice to guarantee C++ compiler interoperability, and it might even create a false impression that interoperability is possible and safe when it is not. Name mangling is only one of several [[application binary interface]] (ABI) details that need to be decided and observed by a C++ implementation. Other ABI aspects, such as [[exception handling]], [[virtual table]] layout, and structure and stack frame [[Data structure alignment|padding]], also cause differing C++ implementations to be incompatible. Further, requiring a particular form of mangling would cause issues for systems where implementation limits (e.g., length of symbols) dictate a particular mangling scheme. A standardized ''requirement'' for name mangling would also prevent an implementation where mangling was not required at all, for example, a linker that understood the C++ language.

The [[ISO/IEC 14882|C++ standard]] therefore does not attempt to standardize name mangling. On the contrary, the ''Annotated C++ Reference Manual'' (also known as ''ARM'', {{ISBN|0-201-51459-1}}, section 7.2.1c) actively encourages the use of different mangling schemes to prevent linking when other aspects of the ABI are incompatible.

Nevertheless, as detailed in the section above, on some platforms<ref>{{cite web|url=https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling|title=Itanium C++ ABI, Section 5.1 External Names (a.k.a. Mangling)|access-date=16 May 2016}}</ref> the full C++ ABI has been standardized, including name mangling.

====Real-world effects of C++ name mangling====
Because C++ symbols are routinely exported from [[Dynamic-link library|DLL]] and [[shared object]] files, the name mangling scheme is not merely a compiler-internal matter.
Different compilers (or different versions of the same compiler, in many cases) produce such binaries under different name decoration schemes, meaning that symbols are frequently unresolved if the compilers used to create the library and the program using it employed different schemes. For example, if the [[Boost C++ Libraries]] were to be installed on a system with multiple C++ compilers (e.g., GNU GCC and the OS vendor's compiler), the libraries would have to be compiled multiple times (once for GCC and once for the vendor compiler).

It is good for safety purposes that compilers producing incompatible object code (code based on different ABIs, regarding, e.g., classes and exceptions) use different name mangling schemes. This guarantees that these incompatibilities are detected at the linking phase, not when executing the software (which could lead to obscure bugs and serious stability issues).

For this reason, name decoration is an important aspect of any C++-related [[Application binary interface|ABI]].

There are instances, particularly in large, complex code bases, where it can be difficult or impractical to map the mangled name emitted within a linker error message back to the particular corresponding token or variable name in the source. This problem can make identifying the relevant source file(s) very difficult for build or test engineers even if only one compiler and linker are in use. Demanglers (including those within the linker error reporting mechanisms) sometimes help, but the mangling mechanism itself may discard critical disambiguating information.
====Demangle via c++filt====
<syntaxhighlight lang="console">
$ c++filt -n _ZNK3MapI10StringName3RefI8GDScriptE10ComparatorIS0_E16DefaultAllocatorE3hasERKS0_
Map<StringName, Ref<GDScript>, Comparator<StringName>, DefaultAllocator>::has(StringName const&) const
</syntaxhighlight>

====Demangle via builtin GCC ABI====
<syntaxhighlight lang="cpp">
#include <stdio.h>
#include <stdlib.h>
#include <cxxabi.h>

int main() {
    const char *mangled_name = "_ZNK3MapI10StringName3RefI8GDScriptE10ComparatorIS0_E16DefaultAllocatorE3hasERKS0_";
    int status = -1;
    // On success (status == 0) the result is a malloc-allocated string owned by the caller.
    char *demangled_name = abi::__cxa_demangle(mangled_name, NULL, NULL, &status);
    if (status != 0) {
        // On failure the returned pointer is NULL and must not be printed.
        fprintf(stderr, "Demangling failed with status %d\n", status);
        return 1;
    }
    printf("Demangled: %s\n", demangled_name);
    free(demangled_name);
    return 0;
}
</syntaxhighlight>

Output:
{{codett|Demangled: Map<StringName, Ref<GDScript>, Comparator<StringName>, DefaultAllocator>::has(StringName const&) const}}
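In C++ code, the manual {{code|free}} call can be replaced by a smart pointer. The following is a minimal sketch, assuming a toolchain that provides {{code|<cxxabi.h>}} (such as GCC or Clang on an Itanium-ABI platform); the helper name {{code|demangle_or_original}} is hypothetical and chosen for this illustration.

```cpp
#include <cassert>
#include <cstdlib>    // std::free
#include <cxxabi.h>   // abi::__cxa_demangle
#include <memory>
#include <string>

// Returns the demangled form of `mangled`, or the input unchanged when
// demangling fails (for example, when the string is not a mangled name).
std::string demangle_or_original(const char* mangled) {
    int status = 0;
    // The buffer returned by __cxa_demangle is malloc-allocated, so the
    // unique_ptr releases it with std::free rather than delete.
    std::unique_ptr<char, void (*)(void*)> buffer(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), std::free);
    return (status == 0 && buffer) ? std::string(buffer.get())
                                   : std::string(mangled);
}
```

Falling back to the original string is a design choice of this sketch: it lets the helper be applied to a mix of C and C++ symbol names, such as {{code|strcmp}} and {{code|_Z1hi}}, without separate error handling.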