Hungarian notation
Encyclopedia
Hungarian notation is an identifier naming convention in computer programming
Computer programming
Computer programming is the process of designing, writing, testing, debugging, and maintaining the source code of computer programs. This source code is written in one or more programming languages. The purpose of programming is to create a program that performs specific operations or exhibits a...

, in which the name of a variable or function indicates its type
Data type
In computer programming, a data type is a classification identifying one of various types of data, such as floating-point, integer, or Boolean, that determines the possible values for that type; the operations that can be done on values of that type; the meaning of the data; and the way values of...

 or intended use. There are two types of Hungarian notation: Systems Hungarian notation and Apps Hungarian notation.

Hungarian notation was designed to be language-independent, and found its first major use with the BCPL
BCPL
BCPL is a procedural, imperative, and structured computer programming language designed by Martin Richards of the University of Cambridge in 1966.- Design :...

 programming language. Because BCPL has no data types other than the machine word, nothing in the language itself helps a programmer remember variables' types. Hungarian notation aims to remedy this by providing the programmer with explicit knowledge of each variable's data type.

In Hungarian notation, a variable name starts with a group of lower-case letters which are mnemonic
Mnemonic
A mnemonic , or mnemonic device, is any learning technique that aids memory. To improve long term memory, mnemonic systems are used to make memorization easier. Commonly encountered mnemonics are often verbal, such as a very short poem or a special word used to help a person remember something,...

s for the type or purpose of that variable, followed by whatever name the programmer has chosen; this last part is sometimes distinguished as the given name. The first character of the given name can be capitalized to separate it from the type indicators (see also CamelCase
CamelCase
CamelCase , also known as medial capitals, is the practice of writing compound words or phrases in which the elements are joined without spaces, with each element's initial letter capitalized within the compound and the first letter either upper or lower case—as in "LaBelle", "BackColor",...

). Otherwise the case of this character denotes scope.

History

The original Hungarian notation, which would now be called Apps Hungarian, was invented by Charles Simonyi
Charles Simonyi
Charles Simonyi is a Hungarian-American computer software executive who, as head of Microsoft's application software group, oversaw the creation of Microsoft's flagship Office suite of applications. He now heads his own company, Intentional Software, with the aim of developing and marketing his...

, a programmer who worked at Xerox PARC
Xerox PARC
PARC , formerly Xerox PARC, is a research and co-development company in Palo Alto, California, with a distinguished reputation for its contributions to information technology and hardware systems....

 circa 1972-1981, and who later became Chief Architect at Microsoft
Microsoft
Microsoft Corporation is an American public multinational corporation headquartered in Redmond, Washington, USA that develops, manufactures, licenses, and supports a wide range of products and services predominantly related to computing through its various product divisions...

. It may have been derived from the earlier principle of using the first letter of a variable name to set its type — for example, variables whose names started with letters I through N in FORTRAN
Fortran
Fortran is a general-purpose, procedural, imperative programming language that is especially suited to numeric computation and scientific computing...

 were integers by default.

The notation is a reference to Simonyi's nation of origin; Hungarian people's names
Hungarian name
Hungarian names invariably use the "Eastern name order", or family name followed by given name, except in foreign language text. Hungary is the only European and Western country to do so....

 are "reversed" compared to most other European names; the family name precedes the given name. For example, the anglicized name "Charles Simonyi" in Hungarian
Hungarian language
Hungarian is a Uralic language, part of the Ugric group. With some 14 million speakers, it is one of the most widely spoken non-Indo-European languages in Europe....

 was originally "Simonyi Charles" (Simonyi Károly in Hungarian). In the same way the type name precedes the "given name" in Hungarian notation rather than the more natural, to most Europeans, Smalltalk
Smalltalk
Smalltalk is an object-oriented, dynamically typed, reflective programming language. Smalltalk was created as the language to underpin the "new world" of computing exemplified by "human–computer symbiosis." It was designed and created in part for educational use, more so for constructionist...

 "type last" naming style e.g. aPoint and lastPoint. This latter naming style was most common at Xerox PARC during Simonyi's tenure there. It may also be inspired by play on the name of an almost entirely unrelated concept, Polish notation
Polish notation
Polish notation, also known as prefix notation, is a form of notation for logic, arithmetic, and algebra. Its distinguishing feature is that it places operators to the left of their operands. If the arity of the operators is fixed, the result is a syntax lacking parentheses or other brackets that...

.

The name Apps Hungarian was coined since the convention was used in the applications
Application software
Application software, also known as an application or an "app", is computer software designed to help the user to perform specific tasks. Examples include enterprise software, accounting software, office suites, graphics software and media players. Many application programs deal principally with...

 division of Microsoft. Systems Hungarian developed later in the Microsoft Windows
Microsoft Windows
Microsoft Windows is a series of operating systems produced by Microsoft.Microsoft introduced an operating environment named Windows on November 20, 1985 as an add-on to MS-DOS in response to the growing interest in graphical user interfaces . Microsoft Windows came to dominate the world's personal...

 development team. Simonyi's paper referred to prefixes used to indicate the "type" of information being stored. His proposal was largely concerned with decorating identifier names based upon the semantic information of what they store (in other words, the variable's purpose), consistent with Apps Hungarian. However, his suggestions were not entirely distinct from what became known as Systems Hungarian, as some of his suggested prefixes contain little or no semantic information (see below for examples).

The term Hungarian notation is memorable for many people because the strings of unpronounceable consonants vaguely resemble the consonant-rich orthography of some Eastern Europe
Eastern Europe
Eastern Europe is the eastern part of Europe. The term has widely disparate geopolitical, geographical, cultural and socioeconomic readings, which makes it highly context-dependent and even volatile, and there are "almost as many definitions of Eastern Europe as there are scholars of the region"...

an languages despite the fact that Hungarian
Hungarian language
Hungarian is a Uralic language, part of the Ugric group. With some 14 million speakers, it is one of the most widely spoken non-Indo-European languages in Europe....

 is a Uralic language, and unlike Slavic languages is rather rich in vowels. For example the zero-terminated string prefix "sz" is also a letter
Hungarian sz
Sz is a digraph of the Latin alphabet, used in Hungarian, Polish, Kashubian, and formerly in German.-Polish:In Polish orthography, sz represents a voiceless retroflex fricative , similar to English "sh"...

 in the Hungarian alphabet
Hungarian alphabet
The Hungarian alphabet is an extension of the Latin alphabet used for writing the Hungarian language.One sometimes speaks of the smaller and greater Hungarian alphabets, depending on whether or not the letters Q, W, X, Y are listed, which can only be found in foreign words and traditional...

.

Systems vs. Apps Hungarian

Where Systems notation and Apps notation differ is in the purpose of the prefixes.

In Systems Hungarian notation, the prefix encodes the actual data type of the variable. For example:
  • lAccountNum : variable is a long integer ("l");
  • arru8NumberList : variable is an array of unsigned 8-bit integers ("arru8");
  • szName : variable is a zero-terminated string ("sz"); this was one of Simonyi's original suggested prefixes.
  • bReadLine(bPort,&arru8NumberList) : function with a byte-value return code.


Apps Hungarian notation strives to encode the logical data type rather than the physical data type; in this way, it gives a hint as to what the variable's purpose is, or what it represents.
  • rwPosition : variable represents a row ("rw");
  • usName : variable represents an unsafe string ("us"), which needs to be "sanitized" before it is used (e.g. see code injection
    Code injection
    Code injection is the exploitation of a computer bug that is caused by processing invalid data. Code injection can be used by an attacker to introduce code into a computer program to change the course of execution. The results of a code injection attack can be disastrous...

     and cross-site scripting
    Cross-site scripting
    Cross-site scripting is a type of computer security vulnerability typically found in Web applications that enables attackers to inject client-side script into Web pages viewed by other users. A cross-site scripting vulnerability may be used by attackers to bypass access controls such as the same...

     for examples of attacks that can be caused by using raw user input)
  • strName : Variable represents a string ("str") containing the name, but does not specify how that string is implemented.


Most, but not all, of the prefixes Simonyi suggested are semantic in nature. The following are examples from the original paper:
  • pX is a pointer to another type X; this contains very little semantic information.
  • d is a prefix meaning difference between two values; for instance, dY might represent a distance along the Y-axis of a graph, while a variable just called y might be an absolute position. This is entirely semantic in nature.
  • sz is a null- or zero-terminated string. In C, this contains some semantic information because it's not clear whether a variable of type char* is a pointer to a single character, an array of characters or a zero-terminated string.
  • w marks a variable that is a word. This contains essentially no semantic information at all, and would probably be considered Systems Hungarian.
  • b marks a byte, which in contrast to w might have semantic information, because in C the only byte-sized data type is the char, so these are sometimes used to hold numeric values. This prefix might clear ambiguity between whether the variable is holding a value that should be treated as a letter (or more generally a character) or a number.


While the notation always uses initial lower-case letters as mnemonics, it does not prescribe the mnemonics themselves. There are several widely used conventions (see examples below), but any set of letters can be used, as long as they are consistent within a given body of code.

It is possible for code using Apps Hungarian notation to sometimes contain Systems Hungarian when describing variables that are defined solely in terms of their type.

Relation to sigils

In some programming languages, a similar notation now called sigil
Sigil (computer programming)
In computer programming, a sigil is a symbol attached to a variable name, showing the variable's datatype or scope. In 1999 Philip Gwyn adopted the term "to mean the funny character at the front of a Perl variable".- Historical context:...

s is built into the language and enforced by the compiler. For example, in BASIC, name$ names a string
String (computer science)
In formal languages, which are used in mathematical logic and theoretical computer science, a string is a finite sequence of symbols that are chosen from a set or alphabet....

 and count% names an integer
Integer
The integers are formed by the natural numbers together with the negatives of the non-zero natural numbers .They are known as Positive and Negative Integers respectively...

. Sigils have the notable advantages over Hungarian notation that they implicitly define the type of the variable without need for redundant declaration, and are also checked by the compiler, preventing omission and misuse.

On the other hand, such systems are in practice less flexible than Hungarian notation, typically defining only a few different types — the lack of adequate easy-to-remember symbols obstructs more extensive use.

Examples

  • bBusy : boolean
    Boolean datatype
    In computer science, the Boolean or logical data type is a data type, having two values , intended to represent the truth values of logic and Boolean algebra...

  • chInitial : char
    Character (computing)
    In computer and machine-based telecommunications terminology, a character is a unit of information that roughly corresponds to a grapheme, grapheme-like unit, or symbol, such as in an alphabet or syllabary in the written form of a natural language....

  • cApples : count of items
  • dwLightYears : double word (systems)
  • fBusy : boolean
    Boolean datatype
    In computer science, the Boolean or logical data type is a data type, having two values , intended to represent the truth values of logic and Boolean algebra...

     (flag)
  • nSize : integer
    Integer (computer science)
    In computer science, an integer is a datum of integral data type, a data type which represents some finite subset of the mathematical integers. Integral data types may be of different sizes and may or may not be allowed to contain negative values....

     (systems) or count (application)
  • iSize : integer
    Integer (computer science)
    In computer science, an integer is a datum of integral data type, a data type which represents some finite subset of the mathematical integers. Integral data types may be of different sizes and may or may not be allowed to contain negative values....

     (systems) or index (application)
  • fpPrice: floating-point
  • dbPi
    Pi
    ' is a mathematical constant that is the ratio of any circle's circumference to its diameter. is approximately equal to 3.14. Many formulae in mathematics, science, and engineering involve , which makes it one of the most important mathematical constants...

    : double
    Double precision
    In computing, double precision is a computer number format that occupies two adjacent storage locations in computer memory. A double-precision number, sometimes simply called a double, may be defined to be an integer, fixed point, or floating point .Modern computers with 32-bit storage locations...

     (systems)
  • pFoo : pointer
  • rgStudents : array, or range
  • szLastName : zero-terminated string
  • u32Identifier : unsigned 32-bit integer
    Integer (computer science)
    In computer science, an integer is a datum of integral data type, a data type which represents some finite subset of the mathematical integers. Integral data types may be of different sizes and may or may not be allowed to contain negative values....

     (systems)
  • stTime : clock time structure
  • fnFunction : function name


The mnemonics for pointers and arrays, which are not actual data types, are usually followed by the type of the data element itself:
  • pszOwner : pointer to zero-terminated string
  • rgfpBalances : array of floating-point values
  • aulColors : array of unsigned long (systems)


While Hungarian notation can be applied to any programming language and environment, it was widely adopted by Microsoft
Microsoft
Microsoft Corporation is an American public multinational corporation headquartered in Redmond, Washington, USA that develops, manufactures, licenses, and supports a wide range of products and services predominantly related to computing through its various product divisions...

 for use with the C language, in particular for Microsoft Windows
Microsoft Windows
Microsoft Windows is a series of operating systems produced by Microsoft.Microsoft introduced an operating environment named Windows on November 20, 1985 as an add-on to MS-DOS in response to the growing interest in graphical user interfaces . Microsoft Windows came to dominate the world's personal...

, and its use remains largely confined to that area. In particular, use of Hungarian notation was widely evangelized
Technology evangelist
A technical or technology evangelist is a person who attempts to build a critical mass of support for a given technology in order to establish it as a technical standard in a market that is subject to network effects...

 by Charles Petzold
Charles Petzold
Charles Petzold is an American programmer and technical author on Microsoft Windows applications. He is also a Microsoft Most Valuable Professional....

's "Programming Windows", the original (and for many readers, the definitive) book on Windows API
Windows API
The Windows API, informally WinAPI, is Microsoft's core set of application programming interfaces available in the Microsoft Windows operating systems. It was formerly called the Win32 API; however, the name "Windows API" more accurately reflects its roots in 16-bit Windows and its support on...

 programming. Thus, many commonly-seen constructs of Hungarian notation are specific to Windows:
  • For programmers who learned Windows programming in C, probably the most memorable examples are the wParam (word-size parameter) and lParam (long-integer parameter) for the WindowProc
    WindowProc
    In Win32 application programming, WindowProc is a user-defined callback function that processes messages sent to a window. This function is specified when an application registers its window class, and can be named anything...

    function.
  • hwndFoo : handle to a window
  • lpszBar : long pointer to a zero-terminated string


The notation is sometimes extended in C++
C++
C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language. It is regarded as an intermediate-level language, as it comprises a combination of both high-level and low-level language features. It was developed by Bjarne Stroustrup starting in 1979 at Bell...

 to include the scope
Scope (programming)
In computer programming, scope is an enclosing context where values and expressions are associated. Various programming languages have various types of scopes. The type of scope determines what kind of entities it can contain and how it affects them—or semantics...

 of a variable, separated by an underscore. This extension is often also used without the Hungarian type-specification:
  • g_nWheels : member of a global namespace, integer
  • m_nWheels : member of a structure/class, integer
  • m_wheels, _wheels : member of a structure/class
  • s_wheels : static member of a class

Advantages

(Some of these apply to Systems Hungarian only.)

Supporters argue that the benefits of Hungarian Notation include:
  • The variable type can be seen from its name
  • The type of value returned by a function is determined without lookup (i.e. searching for function definitions in other files, e.g. ".h" header files, etc.)
  • The formatting of variable names may simplify some aspects of code refactoring (while making some aspects more error-prone).
  • Multiple variables with similar semantics can be used in a block of code: dwWidth, iWidth, fWidth, dWidth
  • Variable names can be easy to remember from knowing just their types.
  • It leads to more consistent variable names
  • Inappropriate type casting and operations using incompatible types can be detected easily while reading code
  • Useful with string-based languages where numerics are strings (Tcl
    Tcl
    Tcl is a scripting language created by John Ousterhout. Originally "born out of frustration", according to the author, with programmers devising their own languages intended to be embedded into applications, Tcl gained acceptance on its own...

     for example)
  • In Apps Hungarian, the variable name guards against using it in an improper operation with the same data type by making the error obvious as in:
heightWindow = window.getWidth
  • When programming in a language that uses dynamic typing or that is completely untyped, the decorations that refer to types cease to be redundant. Such languages typically do not include declarations of types (or make them optional), so the only sources of what types are allowed are the names themselves, documentation such as comments, and by reading the code to understand what it does. In these languages, including an indication of the type of a variable may aid the programmer. As mentioned above, Hungarian Notation expanded in such a language (BCPL
    BCPL
    BCPL is a procedural, imperative, and structured computer programming language designed by Martin Richards of the University of Cambridge in 1966.- Design :...

    ).
  • In complex programs with lots of global objects (VB/Delphi Forms), having a basic prefix notation can ease the work of finding the component inside of the editor. Typing btn and pressing <Ctrl-Space> causes the editor to pop up a list of Button objects.
  • Applying Hungarian notation in a narrower way, such as applying only for member variables helps avoiding naming collision.

Disadvantages

Most arguments against Hungarian notation are against System Hungarian notation, not Apps Hungarian notation. Some potential issues are:
  • The Hungarian notation is redundant when type-checking is done by the compiler. Compilers for languages providing type-checking ensure the usage of a variable is consistent with its type automatically; checks by eye are redundant and subject to human error.
  • All modern Integrated development environment
    Integrated development environment
    An integrated development environment is a software application that provides comprehensive facilities to computer programmers for software development...

    s display variable types on demand, and automatically flag operations which use incompatible types, making the notation largely obsolete.
  • Hungarian Notation becomes confusing when it is used to represent several properties, as in a_crszkvc30LastNameCol: a constant reference
    Reference (computer science)
    In computer science, a reference is a value that enables a program to indirectly access a particular data item, such as a variable or a record, in the computer's memory or in some other storage device. The reference is said to refer to the data item, and accessing those data is called...

     argument
    Parameter (computer science)
    In computer programming, a parameter is a special kind of variable, used in a subroutine to refer to one of the pieces of data provided as input to the subroutine. These pieces of data are called arguments...

    , holding the contents of a database
    Database
    A database is an organized collection of data for one or more purposes, usually in digital form. The data are typically organized to model relevant aspects of reality , in a way that supports processes requiring this information...

     column LastName of type varchar
    Varchar
    A varchar or Variable Character Field is a set of character data of indeterminate length. The term varchar refers to a data type of a field in a database management system...

    (30) which is part of the table's primary key.
  • It may lead to inconsistency when code is modified or ported. If a variable's type is changed, either the decoration on the name of the variable will be inconsistent with the new type, or the variable's name must be changed. A particularly well known example is the standard WPARAM type, and the accompanying wParam formal parameter in many Windows system function declarations. The 'w' stands for 'word', where 'word' is the native word size of the platform's hardware architecture. It was originally a 16 bit type on 16-bit word architectures, but was changed to a 32-bit on 32-bit word architectures, or 64-bit type on 64-bit word architectures in later versions of the operating system while retaining its original name (its true underlying type is UINT_PTR, that is, an unsigned integer large enough to hold a pointer). The semantic impedance, and hence programmer confusion and inconsistency from platform-to-platform, is on the assumption that 'w' stands for 16-bit in those different environments.
  • Most of the time, knowing the use of a variable implies knowing its type. Furthermore, if the usage of a variable is not known, it can't be deduced from its type.
  • Hungarian notation strongly reduces the benefits of using feature-rich code editors that support completion on variable names, for the programmer has to input the whole type specifier first.
  • It makes code less readable, by obfuscating the purpose of the variable with needless type and scoping prefixes.
  • The additional type information can insufficiently replace more descriptive names. E.g. sDatabase doesn't tell the reader what it is. databaseName might be a more descriptive name.
  • When names are sufficiently descriptive, the additional type information can be redundant. E.g. firstName is most likely a string. So naming it sFirstName only adds clutter to the code.
  • It's harder to remember the names.

Notable opinions

  • Linus Torvalds
    Linus Torvalds
    Linus Benedict Torvalds is a Finnish software engineer and hacker, best known for having initiated the development of the open source Linux kernel. He later became the chief architect of the Linux kernel, and now acts as the project's coordinator...

     (against systems Hungarian):
    "Encoding the type of a function into the name (so-called Hungarian notation) is brain damaged—the compiler knows the types anyway and can check those, and it only confuses the programmer."
  • Steve McConnell
    Steve McConnell
    Steven C. McConnell is an author of many software engineering textbooks including Code Complete, Rapid Development, and Software Estimation...

     (for Hungarian):
    "Although the Hungarian naming convention is no longer in widespread use, the basic idea of standardizing on terse, precise abbreviations continues to have value. Standardized prefixes allow you to check types accurately when you're using abstract data types that your compiler can't necessarily check."
  • Bjarne Stroustrup
    Bjarne Stroustrup
    Bjarne Stroustrup ; born December 30, 1950 in Århus, Denmark) is a Danish computer scientist, most notable for the creation and the development of the widely used C++ programming language...

     (against systems Hungarian for C++):
    "No I don't recommend 'Hungarian'. I regard 'Hungarian' (embedding an abbreviated version of a type in a variable name) a technique that can be useful in untyped languages, but is completely unsuitable for a language that supports generic programming and object-oriented programming—both of which emphasize selection of operations based on the type an arguments (known to the language or to the run-time support). In this case, 'building the type of an object into names' simply complicates and minimizes abstraction."
  • Joel Spolsky
    Joel Spolsky
    Avram Joel Spolsky is a software engineer and writer. He is the author of Joel on Software, a blog on software development. He was a Program Manager on the Microsoft Excel team between 1991 and 1994. He later founded Fog Creek Software in 2000 and launched the Joel on Software blog...

     (for apps Hungarian):
    "If you read Simonyi's paper closely, what he was getting at was the same kind of naming convention as I used in my example above where we decided that us meant "unsafe string" and s meant "safe string." They're both of type string. The compiler won't help you if you assign one to the other and Intellisense won't tell you bupkis. But they are semantically different. They need to be interpreted differently and treated differently and some kind of conversion function will need to be called if you assign one to the other or you will have a runtime bug. If you're lucky. There's still a tremendous amount of value to Apps Hungarian, in that it increases collocation in code, which makes the code easier to read, write, debug, and maintain, and, most importantly, it makes wrong code look wrong."
  • Tim Ottinger (against apps Hungarian):
    "nowadays HN and other forms of type encoding are simply impediments. They make it harder to change the name or type of a variable, function, or class. They make it harder to read the code. And they create the possibility that the encoding system will mislead the reader."
  • Microsoft
    Microsoft
    Microsoft Corporation is an American public multinational corporation headquartered in Redmond, Washington, USA that develops, manufactures, licenses, and supports a wide range of products and services predominantly related to computing through its various product divisions...

    's Design Guidelines discourage developers from using Hungarian notation when they choose names for the elements in .NET Class Libraries, although it was common on prior Microsoft development platforms like Visual Basic 6 and earlier. These Design Guidelines are silent on the naming conventions for local variables inside functions.

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK