{{Short description|Computer programming identifier naming notation}}
'''Hungarian notation''' is an [[identifier naming convention]] in [[computer programming]] in which the name of a [[variable (computer science)|variable]] or [[subroutine|function]] indicates its intention or kind, or in some dialects, its [[data type|type]]. The original Hungarian notation uses only intention or kind in its naming convention and is sometimes called ''Apps Hungarian'' as it became popular in the [[Microsoft]] Apps division in the development of [[Microsoft Office]] applications. When the [[Microsoft Windows]] division adopted the naming convention, they based it on the actual data type, and this convention spread widely through the [[Windows API]]; this is sometimes called ''Systems Hungarian'' notation.

{{Quote box
|quote = '''Simonyi''': ...BCPL [had] a single type which was a 16-bit word... not that it matters.

'''Booch''': Unless you continue the Hungarian notation.

'''Simonyi''': Absolutely... we went over to the typed languages too later ... But ... we would look at one name and I would tell you exactly a lot about that...<ref>{{cite web|url=http://archive.computerhistory.org/resources/access/text/2015/06/102702232-05-01-acc.pdf |archive-url=https://web.archive.org/web/20150910103308/http://archive.computerhistory.org/resources/access/text/2015/06/102702232-05-01-acc.pdf |archive-date=2015-09-10 |url-status=live|title=Oral History of Charles Simonyi |website=Archive.computerhistory.org |access-date=5 August 2018}}</ref>
|width = 30%
|align = right
}}

Hungarian notation was designed to be language-independent, and found its first major use with the [[BCPL]] programming language. Because BCPL has no data types other than the machine [[Word (computer architecture)|word]], nothing in the language itself helps a [[programmer]] remember variables' types. Hungarian notation aims to remedy this by providing the programmer with explicit knowledge of each variable's data type.
In Hungarian notation, a variable name starts with a group of lower-case letters which are [[mnemonic]]s for the type or purpose of that variable, followed by whatever name the programmer has chosen; this last part is sometimes distinguished as the ''given name''. The first character of the given name can be capitalized to separate it from the type indicators (see also [[CamelCase]]). Otherwise the case of this character denotes scope.

==History==
The original Hungarian notation was invented by [[Charles Simonyi]], a programmer who worked at [[Xerox PARC]] circa 1972–1981, and who later became Chief Architect at [[Microsoft]]. The name of the notation is a reference to Simonyi's nation of origin, and also, according to [[Andy Hertzfeld]], because it made programs "look like they were written in some inscrutable foreign language".<ref name=Rosenberg>{{cite news |last1=Rosenberg |first1=Scott |title=Anything You Can Do, I Can Do Meta |url=https://www.technologyreview.com/2007/01/01/227178/anything-you-can-do-i-can-do-meta/ |access-date=21 July 2022 |work=MIT Technology Review |date=1 January 2007 |language=en}}</ref> [[Hungarian name|Hungarian people's names]] are "reversed" compared to most other European names; [[Name order|the family name precedes the given name]]. For example, the anglicized name "Charles Simonyi" in [[Hungarian language|Hungarian]] was originally "Simonyi Károly". In the same way, the type name precedes the "given name" in Hungarian notation. The similar [[Smalltalk]] "type last" naming style (e.g. aPoint and lastPoint) was common at Xerox PARC during Simonyi's tenure there.{{cn|date=July 2022}}

Simonyi's paper on the notation referred to prefixes used to indicate the "type" of information being stored.<ref name="simonyi"/><ref name="Spolsky"/> His proposal was largely concerned with decorating identifier names based upon the semantic information of what they store (in other words, the variable's ''purpose'').
Simonyi's notation came to be called Apps Hungarian, since the convention was used in the [[application software|applications]] division of Microsoft. Systems Hungarian developed later in the [[Microsoft Windows]] development team. Apps Hungarian is not entirely distinct from what became known as Systems Hungarian, as some of Simonyi's suggested prefixes contain little or no semantic information (see below for examples).<ref name="Spolsky"/>

=={{anchor|Systems|Apps}}Systems Hungarian vs. Apps Hungarian==
Systems notation and Apps notation differ in the purpose of the prefixes.

In Systems Hungarian notation, the prefix encodes the actual data type of the variable. For example:
*<code>lAccountNum</code> : variable is a ''long integer'' (<code>"l"</code>);
*<code>arru8NumberList</code> : variable is ''an '''arr'''ay of '''u'''nsigned '''8'''-bit integers'' (<code>"arru8"</code>);
*<code>bReadLine(bPort,&arru8NumberList)</code> : function with a byte-value return code;
*<code>strName</code> : variable represents a string (<code>"str"</code>) containing the name, but does not specify how that string is implemented.

Apps Hungarian notation strives to encode the logical data type rather than the physical data type; in this way, it gives a hint as to what the variable's purpose is, or what it represents.
*<code>rwPosition</code> : variable represents a ''row'' (<code>"rw"</code>);
*<code>usName</code> : variable represents an ''unsafe string'' (<code>"us"</code>), which needs to be "sanitized" before it is used (e.g. see [[code injection]] and [[cross-site scripting]] for examples of attacks that can be caused by using raw user input);
*<code>szName</code> : variable is a '''''z'''ero-terminated '''s'''tring'' (<code>"sz"</code>); this was one of Simonyi's original suggested prefixes.

Most, but not all, of the prefixes Simonyi suggested are semantic in nature. To modern eyes, some prefixes seem to represent physical data types, such as <code>sz</code> for strings.
However, such prefixes were still semantic, as Simonyi intended Hungarian notation for languages whose type systems could not distinguish some data types that modern languages take for granted.

The following are examples from the original paper:<ref name="simonyi">{{cite web | author = Charles Simonyi | title = Hungarian Notation |date=November 1999 | publisher = [[Microsoft Corp.]] | work = MSDN Library | url = http://msdn2.microsoft.com/en-us/library/aa260976(VS.60).aspx | author-link = Charles Simonyi }}</ref>
* <code>p''X''</code> is a pointer to another type ''X''; this contains very little semantic information.
* <code>''d''</code> is a prefix meaning difference between two values; for instance, ''dY'' might represent a distance along the Y-axis of a graph, while a variable just called ''y'' might be an absolute position. This is entirely semantic in nature.
* <code>''sz''</code> is a null- or zero-terminated string. In C, this contains some semantic information because it is not clear whether a variable of type ''char*'' is a pointer to a single character, an array of characters or a zero-terminated string.
* <code>''w''</code> marks a variable that is a word. This contains essentially no semantic information at all, and would probably be considered Systems Hungarian.
* <code>''b''</code> marks a byte, which in contrast to w might have semantic information, because in C the only byte-sized data type is the ''char'', so these are sometimes used to hold numeric values. This prefix might resolve the ambiguity over whether the variable is holding a value that should be treated as a character or a number.

While the notation always uses initial lower-case letters as mnemonics, it does not prescribe the mnemonics themselves. There are several widely used conventions (see examples below), but any set of letters can be used, as long as they are consistent within a given body of code.
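
As a minimal sketch of how the <code>d</code>, <code>sz</code>, <code>p</code> and <code>b</code> prefixes described above might look in C: every identifier below is hypothetical and not taken from Simonyi's paper, and the compound prefixes <code>cch</code> (count of characters) and <code>pch</code> (pointer to a character) are assumed combinations of the simpler prefixes, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical; they only illustrate the prefixes. */

/* d: a difference between two values of the same kind.
   yTop and yBottom are absolute positions; the result is a distance. */
static int DyBetween(int yTop, int yBottom)
{
    return yBottom - yTop;
}

/* sz: the argument must be a zero-terminated string, which the type
   const char * alone does not express.
   cch (assumed prefix): a count of characters. */
static size_t CchFromSz(const char *szText)
{
    assert(szText != NULL);
    return strlen(szText);
}

/* pch (assumed prefix): a pointer to one character, which need not be
   part of a terminated string.
   b: a byte holding a small number rather than a character. */
static unsigned char BFromPch(const char *pchDigit)
{
    assert(pchDigit != NULL);
    assert(*pchDigit >= '0' && *pchDigit <= '9');
    return (unsigned char)(*pchDigit - '0');
}
```

All three functions take or return plain <code>int</code>, <code>char</code> or <code>const char *</code> values, so the compiler cannot tell a position from a distance or a single character from a terminated string; in this sketch only the prefixes carry that distinction.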
Code using Apps Hungarian notation may sometimes contain Systems Hungarian prefixes when describing variables that are defined solely in terms of their type.

==Relation to sigils==
In some programming languages, a similar notation now called [[sigil (computer programming)|sigil]]s is built into the language and enforced by the [[compiler]]. For example, in some forms of [[BASIC programming language|BASIC]], <code>name$</code> names a [[string (computer science)|string]] and <code>count%</code> names an [[integer]]. The major difference between Hungarian notation and sigils is that sigils declare the type of the variable in the language, whereas Hungarian notation is purely a naming scheme with no effect on the machine interpretation of the program text.

==Examples==
*<code>bBusy</code> : [[Boolean data type|Boolean]]
*<code>chInitial</code> : [[character (computing)|char]]
*<code>cApples</code> : count of items
*<code>dwLightYears</code> : double [[Word (data type)|word]] (Systems)
*<code>fBusy</code> : [[Boolean data type|flag]] (or [[Floating-point|float]])
*<code>nSize</code> : [[Integer (computer science)|integer]] (Systems) or count (Apps)
*<code>iSize</code> : [[Integer (computer science)|integer]] (Systems) or index (Apps)
*<code>fpPrice</code> : [[floating-point]]
*<code>decPrice</code> : decimal
*<code>db[[Pi]]</code> : [[double precision|double]] (Systems)
*<code>p[[Foo]]</code> : [[pointer (computer programming)|pointer]]
*<code>rgStudents</code> : array, or range
*<code>szLastName</code> : zero-terminated string
*<code>u16Identifier</code> : unsigned 16-bit [[Integer (computer science)|integer]] (Systems)
*<code>u32Identifier</code> : unsigned 32-bit [[Integer (computer science)|integer]] (Systems)
*<code>stTime</code> : clock time structure
*<code>fnFunction</code> : function name

The mnemonics for pointers and [[Array data structure|arrays]], which are not actual data types, are usually followed by the type of the data element itself:
*<code>pszOwner</code> : pointer to zero-terminated string
*<code>rgfpBalances</code> : array of [[floating-point]] values
*<code>aulColors</code> : array of unsigned long (Systems)

While Hungarian notation can be applied to any programming language and environment, it was widely adopted by [[Microsoft]] for use with the C language, in particular for [[Microsoft Windows]], and its use remains largely confined to that area. In particular, use of Hungarian notation was widely [[Technology evangelist|evangelized]] by [[Charles Petzold]]'s ''"Programming Windows"'', the original (and for many readers, the definitive) book on [[Windows API]] programming. Thus, many commonly seen constructs of Hungarian notation are specific to Windows:
* For programmers who learned Windows programming in C, probably the most memorable examples are the <code>wParam</code> (word-size parameter) and <code>lParam</code> (long-integer parameter) for the [[WindowProc]]() function.
* <code>hwndFoo</code> : handle to a window
* <code>lpszBar</code> : long pointer to a zero-terminated string

The notation is sometimes extended in [[C++]] to include the [[scope (programming)|scope]] of a variable, optionally separated by an underscore.<ref>{{cite web|title=Mozilla Coding Style|url=https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style#Prefixes|website=Developer.mozilla.org|access-date=17 March 2015|archive-date=2 December 2019|archive-url=https://web.archive.org/web/20191202222313/https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style#Prefixes|url-status=dead}}</ref><ref>{{cite web|title=Webkit Coding Style Guidelines|url=http://www.webkit.org/coding/coding-style.html#names-data-members|website=Webkit.org|access-date=17 March 2015}}</ref> This extension is often also used without the Hungarian type-specification:
* <code>g_nWheels</code> : member of a global namespace, integer
* <code>m_nWheels</code> : member of a structure/class, integer
* <code>m_wheels</code>, <code>_wheels</code> : member of a structure/class
* <code>s_wheels</code> : static member of a class
* <code>c_wheels</code> : static member of a function

==Advantages==
(Some of these apply to Systems Hungarian only.)

Supporters argue that the benefits of Hungarian notation include:<ref name="simonyi" />
* The symbol type can be seen from its name. This is useful when looking at the code outside an integrated development environment, such as in a code review or on a printout, or when the symbol declaration is in another file from the point of use, such as a function.
* In a language that uses [[dynamic typing]] or that is untyped, the decorations that refer to types cease to be redundant. In such languages variables are typically not declared as holding a particular type of data, so the only clues as to what operations can be done on a variable are hints given by the programmer, such as a variable naming scheme, documentation and comments. As mentioned above, Hungarian notation first saw major use in such a language ([[BCPL]]).
* The formatting of variable names may simplify some aspects of [[code refactoring]] (while making other aspects more error-prone).
* Multiple variables with similar semantics can be used in a block of code: dwWidth, iWidth, fWidth, dWidth.
* Variable names can be easy to remember from knowing just their types.
* It leads to more consistent variable names.
* Inappropriate type casting and operations using incompatible types can be detected easily while reading code.
* In complex programs with many global objects (VB/Delphi Forms), having a basic prefix notation can ease the work of finding the component inside of the editor. For example, searching for the string <code>btn</code> might find all the Button objects.
* Applying Hungarian notation in a narrower way, such as applying it only to [[Member variable|member variables]], helps avoid [[naming collision]]s.
* Printed code is clearer to the reader with regard to data types, type conversions, assignments, truncations, etc.

==Disadvantages==
Most arguments against Hungarian notation are against ''Systems'' Hungarian notation, not ''Apps'' Hungarian notation.{{fact|date=August 2024}} Some potential issues are:
* The Hungarian notation is redundant when type-checking is done by the compiler. Compilers for languages providing strict type-checking, such as [[Pascal (programming language)|Pascal]], ensure the usage of a variable is consistent with its type automatically; checks by eye are redundant and subject to human error.
* Most modern [[integrated development environment]]s display variable types on demand, and automatically flag operations which use incompatible types, making the notation largely obsolete.
* Hungarian notation becomes confusing when it is used to represent several properties, as in <!--[http://mindprod.com/jgloss/unmainnaming.html]/--> <code>a_crszkvc30LastNameCol</code>: an [[parameter (computer science)|argument]] that is [[constant (computer science)|constant]] and is a [[reference (computer science)|reference]], holding the contents of a [[database]] column <code>LastName</code> of type [[varchar]](30) which is part of the table's [[primary key]].
* It may lead to inconsistency when code is modified or ported. If a variable's type is changed, either the decoration on the name of the variable will be inconsistent with the new type, or the variable's name must be changed. A particularly well known example is the standard WPARAM type, and the accompanying wParam [[formal parameter]] in many Windows system function declarations. The 'w' stands for 'word', where 'word' is the native word size of the platform's hardware architecture. It was originally a 16-bit type on 16-bit word architectures, but was changed to a 32-bit type on 32-bit word architectures, or a 64-bit type on 64-bit word architectures, in later versions of the operating system while retaining its original name (its true underlying type is UINT_PTR, that is, an unsigned integer large enough to hold a pointer). The semantic mismatch, and hence programmer confusion and inconsistency from platform to platform, arises from the assumption that 'w' stands for a two-byte, 16-bit word in those different environments.
* Most of the time, knowing the use of a variable implies knowing its type. Furthermore, if the usage of a variable is not known, it cannot be deduced from its type.
* Hungarian notation reduces the benefits of using code editors that support completion on variable names, for the programmer has to input the type specifier first, which is more likely to collide with other variables than when using other naming schemes.
* It makes code less readable, by obfuscating the purpose of the variable with type and scoping prefixes.<ref>{{cite book | last=Jones | first=Derek M. | title=The New C Standard: A Cultural and Economic Commentary | url=http://www.coding-guidelines.com/cbook/cbook1_2.pdf |archive-url=https://web.archive.org/web/20110501142254/http://www.coding-guidelines.com/cbook/cbook1_2.pdf |archive-date=2011-05-01 |url-status=live | page=727 | year=2009 | publisher=Addison-Wesley | isbn=978-0-201-70917-9}}</ref>
* The additional type information can be a poor substitute for more descriptive names. For example, sDatabase does not tell the reader what it is; databaseName might be a more descriptive name.
* When names are sufficiently descriptive, the additional type information can be redundant. For example, firstName is most likely a string, so naming it sFirstName only adds clutter to the code.
* The names are harder to remember.
* Multiple variables with '''different''' semantics can be used in a block of code with similar names: ''dwTmp, iTmp, fTmp, dTmp''.

==Notable opinions==
* [[Robert Cecil Martin]] (against Hungarian notation and all other forms of encoding): <blockquote>... nowadays HN and other forms of type encoding are simply impediments. They make it harder to change the name or type of a variable, function, member or class. They make it harder to read the code. And they create the possibility that the encoding system will mislead the reader.<ref>{{cite book | last = Martin |first=Robert Cecil | date = 2008 | title = Clean Code: A Handbook of Agile Software Craftsmanship | location = Upper Saddle River, NJ | publisher = Prentice Hall PTR | isbn = 978-0-13-235088-4 }}</ref></blockquote>
* [[Linus Torvalds]] (against Systems Hungarian): <blockquote>Encoding the type of a function into the name (so-called Hungarian notation) is brain damaged—the compiler knows the types anyway and can check those, and it only confuses the programmer.<ref>{{cite web | title = Linux kernel coding style | work = [[Linux kernel]] documentation | url = https://www.kernel.org/doc/html/v4.10/process/coding-style.html | access-date = 9 March 2018 }}</ref></blockquote>
* [[Steve McConnell]] (for Apps Hungarian): <blockquote>Although the Hungarian naming convention is no longer in widespread use, the basic idea of standardizing on terse, precise abbreviations continues to have value. Standardized prefixes allow you to check types accurately when you're using abstract data types that your compiler can't necessarily check.<ref>{{cite book | last = McConnell |first=Steve |author-link=Steve McConnell | date = 2004 | title = [[Code Complete]] | edition = 2nd | location = Redmond, WA | publisher = [[Microsoft Press]] | isbn = 0-7356-1967-0 }}</ref></blockquote>
* [[Bjarne Stroustrup]] (against Systems Hungarian for C++): <blockquote>No I don't recommend 'Hungarian'. I regard 'Hungarian' (embedding an abbreviated version of a type in a variable name) as a technique that can be useful in untyped languages, but is completely unsuitable for a language that supports generic programming and object-oriented programming — both of which emphasize selection of operations based on the type and arguments (known to the language or to the run-time support). In this case, 'building the type of an object into names' simply complicates and minimizes abstraction.<ref>{{cite web | last = Stroustrup |first=Bjarne |author-link = Bjarne Stroustrup | date = 2007 | title = Bjarne Stroustrup's C++ Style and Technique FAQ | url = http://www.stroustrup.com/bs_faq2.html#Hungarian | access-date = 15 February 2015 }}</ref></blockquote>
* [[Joel Spolsky]] (for Apps Hungarian): <blockquote>If you read Simonyi's paper closely, what he was getting at was the same kind of naming convention as I used in my example above where we decided that <code>'''us'''</code> meant unsafe string and <code>'''s'''</code> meant safe string. They're both of type <code>'''string'''</code>. The compiler won't help you if you assign one to the other and Intellisense [an [[intelligent code completion]] system] won't tell you [[:wikt:bupkis#English|bupkis]]. But they are semantically different. They need to be interpreted differently and treated differently and some kind of conversion function will need to be called if you assign one to the other or you will have a runtime bug. If you're lucky. There's still a tremendous amount of value to Apps Hungarian, in that it increases collocation in code, which makes the code easier to read, write, debug and maintain, and, most importantly, it makes wrong code look wrong.... (Systems Hungarian) was a subtle but complete misunderstanding of Simonyi’s intention and practice.<ref name="Spolsky">{{cite web | last = Spolsky |first=Joel |author-link = Joel Spolsky | date = 2005-05-11 | title = Making Wrong Code Look Wrong | work = Joel on Software | url = http://www.joelonsoftware.com/articles/Wrong.html | access-date = 2005-12-13 }}</ref></blockquote>
* [[Microsoft]]'s Design Guidelines<ref name=MSDotNetDeveloperGuide>{{cite web | title = Design Guidelines for Developing Class Libraries: General Naming Conventions | url = http://msdn2.microsoft.com/en-us/library/ms229045.aspx | access-date = 2008-01-03 }}</ref> discourage developers from using Systems Hungarian notation when they choose names for the elements in .NET class libraries, although it was common on prior Microsoft development platforms like [[Visual Basic (classic)|Visual Basic 6]] and earlier. These Design Guidelines are silent on the naming conventions for local variables inside functions.

==See also==
* [[Leszynski naming convention]], a variant of Hungarian for database development
* [[Camel case]], another widespread naming convention
* [[Polish notation]], an unrelated concept with a similar name

==References==
{{reflist}}

==External links==
*[https://web.archive.org/web/20180519042122/http://www.parc.com/publication/1940/meta-programming.html Meta-Programming: A Software Production Method] Charles Simonyi, December 1976 (PhD Thesis)
*[https://blogs.msdn.microsoft.com/larryosterman/2004/06/22/hugarian-notation-its-my-turn-now/ Hugarian{{sic|hide=y}} notation - it's my turn now :)] – Larry Osterman's WebLog
*[http://msdn.microsoft.com/en-us/library/aa260976%28VS.60%29.aspx Hungarian Notation] (MSDN)
*[https://idleloop.com/hungarian/ HTML version of Doug Klunder's paper], Idle Loop Software Design, [https://web.archive.org/web/20230509101456/https://idleloop.com/hungarian/ archived May 9, 2023]
*[http://www.xoc.net/standards/rvbanc.asp RVBA Naming Conventions]
*[http://msdn.microsoft.com/en-us/library/aa378932%28VS.85%29.aspx Coding Style Conventions] (MSDN)

{{DEFAULTSORT:Hungarian Notation}}
[[Category:Source code]]
[[Category:Naming conventions]]