===Basic elements===
Authors of assemblers vary widely in how they categorize statements and in the nomenclature they use. In particular, some describe anything other than a machine mnemonic or extended mnemonic as a pseudo-operation (pseudo-op). A typical assembly language consists of three types of instruction statements that are used to define program operations:

* [[Opcode]] mnemonics
* Data definitions
* Assembly directives

===={{anchor|Mnemonics}}Opcode mnemonics and extended mnemonics====
Instructions (statements) in assembly language are generally very simple, unlike those in [[high-level programming language|high-level languages]]. Generally, a mnemonic is a symbolic name for a single executable machine language instruction (an [[opcode]]), and there is at least one opcode mnemonic defined for each machine language instruction. Each instruction typically consists of an ''operation'' or ''opcode'' plus zero or more ''[[operand]]s''. Most instructions refer to a single value or a pair of values. Operands can be immediate values (coded in the instruction itself), registers (named in the instruction or implied by it), or the addresses of data located elsewhere in storage. Which operand forms are available is determined by the underlying processor architecture; the assembler merely reflects how that architecture works. ''Extended mnemonics'' are often used to specify a combination of an opcode with a specific operand. For example, the System/360 assemblers use {{code|B}} as an extended mnemonic for {{code|BC}} with a mask of 15, and {{code|NOP}} ("NO OPeration", do nothing for one step) for {{code|BC}} with a mask of 0.
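
The rewriting of an extended mnemonic into its base instruction can be illustrated with a minimal sketch in Python. It is a toy model rather than the behaviour of any real assembler: the table <code>EXTENDED</code> and the function <code>expand</code> are hypothetical names, and only the two System/360 cases mentioned above are included.

```python
# Hypothetical lookup table: extended mnemonic -> (base mnemonic, condition mask).
# Only the two cases named in the text are modelled here.
EXTENDED = {
    "B": ("BC", 15),   # unconditional branch: every mask bit set
    "NOP": ("BC", 0),  # no operation: no mask bit set, so the branch is never taken
}


def expand(mnemonic, operands):
    """Rewrite an extended mnemonic as its base instruction.

    Mnemonics that are not in the table are passed through unchanged.
    """
    if mnemonic in EXTENDED:
        base, mask = EXTENDED[mnemonic]
        # The mask becomes the first operand of the base instruction.
        return base, [str(mask)] + list(operands)
    return mnemonic, list(operands)
```

Under these assumptions, <code>expand("B", ["LOOP"])</code> yields the base mnemonic <code>BC</code> with operands <code>15</code> and <code>LOOP</code>, while an ordinary mnemonic such as <code>LR</code> is returned as written.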

''Extended mnemonics'' are also often used to support specialized uses of instructions, often for purposes not obvious from the instruction name. For example, many CPUs do not have an explicit NOP instruction, but do have instructions that can be used for the purpose. On 8086 CPUs, {{code|nop}} is a pseudo-opcode that assembles to the instruction {{code|2=asm|xchg ax,ax}}. Some disassemblers recognize this and decode the {{code|2=asm|xchg ax,ax}} instruction as {{code|nop}}. Similarly, IBM assemblers for [[IBM System/360|System/360]] and [[IBM System/370|System/370]] use the extended mnemonics {{code|NOP}} and {{code|NOPR}} for {{code|BC}} and {{code|BCR}} with zero masks. For the SPARC architecture, such mnemonics are known as ''synthetic instructions''.<ref name="SPARC_1992"/>
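
The 8086 case can be sketched as follows. The one-byte form of <code>xchg</code> with the accumulator is encoded as <code>0x90</code> plus the number of the other register, so exchanging <code>ax</code> with itself gives <code>0x90</code>, the byte that <code>nop</code> assembles to. The function names below are hypothetical, and the sketch covers only this one instruction form.

```python
# 16-bit register numbers as used in 8086 instruction encodings.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}


def encode_xchg_ax(reg):
    """Encode the one-byte form 'xchg ax, reg' as 0x90 + register number."""
    return bytes([0x90 + REG16[reg]])


def encode(mnemonic, *operands):
    """Toy encoder (illustrative only): handles 'nop' and 'xchg ax, reg'."""
    if mnemonic == "nop" and not operands:
        # nop is treated as another spelling of xchg ax,ax.
        return encode_xchg_ax("ax")
    if mnemonic == "xchg" and len(operands) == 2 and operands[0] == "ax":
        return encode_xchg_ax(operands[1])
    raise ValueError("instruction form not covered by this sketch")
```

Both <code>encode("nop")</code> and <code>encode("xchg", "ax", "ax")</code> return the same single byte, which is why a disassembler cannot tell the two spellings apart and may choose to print either.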

Some assemblers also support simple built-in macro-instructions that generate two or more machine instructions. For instance, with some Z80 assemblers the instruction {{code|ld hl,bc}} is recognized to generate {{code|ld l,c}} followed by {{code|ld h,b}}.<ref name="Moxham_1996"/> These are sometimes known as ''pseudo-opcodes''.
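
A built-in macro-instruction of this kind amounts to a small expansion step inside the assembler. The following Python sketch is a simplified illustration, assuming a hypothetical function <code>expand_ld</code> that handles only the register pairs <code>bc</code>, <code>de</code> and <code>hl</code>; it does not describe how any particular Z80 assembler is implemented.

```python
# Each 16-bit register pair and its (high, low) 8-bit halves.
PAIRS = {"bc": ("b", "c"), "de": ("d", "e"), "hl": ("h", "l")}


def expand_ld(dst, src):
    """Expand a pair-to-pair 'ld' into two 8-bit loads.

    Any other operand combination is returned as a single, unchanged instruction.
    """
    if dst in PAIRS and src in PAIRS:
        dst_high, dst_low = PAIRS[dst]
        src_high, src_low = PAIRS[src]
        # Low halves first, then high halves, matching the order given in the text.
        return [f"ld {dst_low},{src_low}", f"ld {dst_high},{src_high}"]
    return [f"ld {dst},{src}"]
```

With this sketch, <code>expand_ld("hl", "bc")</code> produces the two instructions <code>ld l,c</code> and <code>ld h,b</code>, whereas <code>expand_ld("a", "b")</code> is left as a single load.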

Mnemonics are arbitrary symbols; in 1985 the [[Institute of Electrical and Electronics Engineers|IEEE]] published Standard 694 for a uniform set of mnemonics to be used by all assemblers.<ref>{{cite book |title=IEEE Std 694-1985: IEEE Standard for Microprocessor Assembly Language |publisher=IEEE Computer Society |date=1985 |isbn=0-7381-2752-3 |oclc=1415906564 }}</ref> The standard has since been withdrawn.

====Data directives====
Data directives are statements that define data elements to hold data and variables. They specify the type, the length and the [[data structure alignment|alignment]] of the data. They can also specify whether the data is available to outside programs (programs assembled separately) or only to the program in which the data section is defined. Some assemblers classify these as pseudo-ops.

====Assembly directives====
Assembly directives, also called pseudo-opcodes, pseudo-operations or pseudo-ops, are commands given to an assembler "directing it to perform operations other than assembling instructions".<ref name="Salomon_1992"/> Directives affect how the assembler operates and "may affect the object code, the symbol table, the listing file, and the values of internal assembler parameters". Sometimes the term ''pseudo-opcode'' is reserved for directives that generate object code, such as those that generate data.<ref name="Hyde_MASM"/>

The names of pseudo-ops often start with a dot to distinguish them from machine instructions. Pseudo-ops can make the assembly of the program dependent on parameters supplied by the programmer, so that one program can be assembled in different ways, perhaps for different applications. A pseudo-op can also be used to manipulate the presentation of a program to make it easier to read and maintain. Another common use of pseudo-ops is to reserve storage areas for run-time data and optionally initialize their contents to known values.
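
Reserving and initializing storage can be modelled with a short Python sketch. The directive names <code>.byte</code> (emit the listed values) and <code>.space</code> (reserve a number of zero-filled bytes) are assumptions chosen for illustration; real assemblers differ in both names and semantics, and the function <code>assemble_data</code> is hypothetical.

```python
def assemble_data(lines):
    """Build a memory image from '.byte' and '.space' directives (toy model)."""
    image = bytearray()
    for line in lines:
        parts = line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        directive = parts[0]
        args = parts[1] if len(parts) > 1 else ""
        if directive == ".byte":
            # Initialized storage: one byte per comma-separated value.
            # Base 0 lets int() accept decimal as well as 0x-prefixed hexadecimal.
            image.extend(int(token.strip(), 0) for token in args.split(","))
        elif directive == ".space":
            # Reserved storage: the requested number of zero bytes.
            image.extend(bytes(int(args.strip(), 0)))
        else:
            raise ValueError(f"unknown directive: {directive}")
    return bytes(image)
```

The directives generate no machine instructions; they only advance the assembler's position in the output and decide what bytes, if any, are placed there.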

Symbolic assemblers let programmers associate arbitrary names (''[[label (computer science)|label]]s'' or ''symbols'') with memory locations and various constants. Usually, every constant and variable is given a name so instructions can reference those locations by name, thus promoting [[self-documenting code]]. In executable code, the name of each subroutine is associated with its entry point, so any calls to a subroutine can use its name. Inside subroutines, [[GOTO]] destinations are given labels. Some assemblers support ''local symbols'' which are often lexically distinct from normal symbols (e.g., the use of "10$" as a GOTO destination).
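
How an assembler might turn labels into addresses can be sketched as a two-pass process. The Python example below is a deliberately simplified model: it assumes that a label is written on its own line and ends with a colon, that every instruction occupies exactly one address, and that the function name <code>resolve_labels</code> and the mnemonics in the usage note are hypothetical.

```python
def resolve_labels(lines):
    """Resolve label operands to addresses in two passes (toy model)."""
    symbols = {}
    instructions = []

    # Pass 1: record the address of each label and collect the instructions.
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            # The label names the address of the next instruction.
            symbols[line[:-1]] = len(instructions)
        else:
            instructions.append(line.split())

    # Pass 2: replace operands that name a label with that label's address.
    resolved = []
    for tokens in instructions:
        mnemonic, operands = tokens[0], tokens[1:]
        operands = [str(symbols.get(operand, operand)) for operand in operands]
        resolved.append(" ".join([mnemonic] + operands))
    return symbols, resolved
```

Given the lines <code>start:</code>, <code>load x</code>, <code>loop:</code>, <code>dec</code>, <code>jnz loop</code> and <code>jmp start</code>, the first pass records <code>start</code> at address 0 and <code>loop</code> at address 1, and the second pass rewrites the jumps as <code>jnz 1</code> and <code>jmp 0</code>. Two passes are used so that an instruction can refer to a label defined later in the source.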

Some assemblers, such as [[Netwide Assembler|NASM]], provide flexible symbol management, letting programmers manage different [[namespace]]s, automatically calculate offsets within [[data structure]]s, and assign labels that refer to literal values or the result of simple computations performed by the assembler. Labels can also be used to initialize constants and variables with relocatable addresses.
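
The automatic calculation of offsets within a data structure can be illustrated in Python. This is a minimal sketch, not NASM's actual mechanism: it assumes fields are laid out in declaration order with no alignment padding, and the function name <code>layout</code> and the field names are hypothetical.

```python
def layout(fields):
    """Compute the byte offset of each field and the total structure size.

    'fields' is a list of (name, size_in_bytes) pairs in declaration order.
    No alignment padding is inserted (an assumption of this sketch).
    """
    offsets = {}
    offset = 0
    for name, size in fields:
        offsets[name] = offset  # the field starts where the previous one ended
        offset += size
    return offsets, offset
```

For a structure with a 4-byte <code>id</code>, a 2-byte <code>flags</code> field and a 16-byte <code>name</code>, the sketch assigns offsets 0, 4 and 6 and a total size of 22 bytes. An assembler that does this work lets the programmer refer to fields by name instead of maintaining the numeric offsets by hand.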

Assembly languages, like most other computer languages, allow comments to be added to program [[source code]] that will be ignored during assembly. Judicious commenting is essential in assembly language programs, as the meaning and purpose of a sequence of binary machine instructions can be difficult to determine. The "raw" (uncommented) assembly language generated by compilers or disassemblers is quite difficult to read when changes must be made.