Editing Compiler-compiler (section)

==Schorre metalanguages==
The earliest Schorre metacompilers, META I and META II, were developed by D. Val Schorre at UCLA. Other Schorre based metacompilers followed. Each adding improvements to language analysis and/or code generation.

In programming it is common to use the programming language name to refer to both the compiler and the programming language, the context distinguishing the meaning. A C++ program is compiled using a C++ compiler. That also applies in the following. For example, META II is both the compiler and the language.
 
The metalanguages in the Schorre line of metacompilers are functional programming languages that use top down grammar analyzing syntax equations having embedded output transformation constructs.

A syntax equation:
<pre><name> = <body>;</pre>
is a compiled ''test'' function returning ''success'' or ''failure''. <name> is the function name. <body> is a form of logical expression consisting of tests that may be grouped, have alternates, and output productions. A ''test'' is like a ''bool'' in other languages, ''success'' being ''true'' and ''failure'' being ''false''.

Defining a programming language analytically top down is natural. For example, a program could be defined as:
<pre>
 program = $declaration;
</pre>
Defining a program as a sequence of zero or more declaration(s).

In the Schorre META '''''X''''' languages there is a driving rule. The program rule above is an example of a driving rule. The program rule is a ''test'' function that calls declaration, a ''test'' rule, that returns ''success'' or ''failure''. The $ loop operator repeatedly calling declaration until ''failure'' is returned. The $ operator is always successful, even when there are zero declaration. Above program would always return success. (In CWIC a long fail can bypass declaration. A long-fail is part of the backtracking system of CWIC)

The character sets of these early compilers were limited. The character '''/''' was used for the alternant (or) operator. "A or B" is written as A / B. Parentheses ( ) are used for grouping.
 A (B / C)
Describes a construct of A followed by B or C. As a boolean expression it would be
 A ''and'' (B ''or'' C)
A sequence X Y has an implied X '''and''' Y meaning. '''( )''' are grouping and '''/''' the '''or''' operator. The order of evaluation is always left to right as an input character sequence is being specified by the ordering of the tests.

Special operator words whose first character is a "." are used for clarity. .EMPTY is used as the last alternate when no previous alternant need be present.
 X (A / B / .EMPTY)
Indicates that X is optionally followed by A '''or''' B. This is a specific characteristic of these metalanguages being programming languages. Backtracking is avoided by the above. Other compiler constructor systems may have declared the three possible sequences and left it up to the parser to figure it out.

The characteristics of the metaprogramming metalanguages above are common to all Schorre metacompilers and those derived from them.

===META I===
META I was a hand compiled metacompiler used to compile META II. Little else is known of META I except that the initial compilation of META II produced nearly identical code to that of the hand coded META I compiler.

===META II===
{{main|META II}}
Each rule consists optionally of tests, operators, and output productions. A rule attempts to match some part of the input program source character stream returning success or failure. On success the input is advanced over matched characters. On failure the input is not advanced.

Output productions produced a form of assembly code directly from a syntax rule.

===TREE-META===
{{main|TREE-META}}

TREE-META introduced tree building operators ''':'''<''node_name''> and '''['''<''number''>''']''' moving the output production transforms to unparsed rules. The tree building operators were used in the grammar rules directly transforming the input into an [[abstract syntax tree]]. Unparse rules are also test functions that matched tree patterns. Unparse rules are called from a grammar rule when an abstract syntax tree is to be transformed into output code. The building of an abstract syntax tree and unparse rules allowed local optimizations to be performed by analyzing the parse tree.

Moving of output productions to the unparse rules made a clear separation of grammar analysis and code production. This made the programming easier to read and understand.

===CWIC===
In 1968–1970, Erwin Book, Dewey Val Schorre, and Steven J. Sherman developed CWIC.<ref name="CWIC" /> (Compiler for Writing and Implementing Compilers) at [[System Development Corporation]] [http://special.lib.umn.edu/findaid/xml/cbi00090-098.xml#series6 Charles Babbage Institute Center for the History of Information Technology (Box 12, folder 21)],

CWIC is a compiler development system composed of three special-purpose, domain specific, languages, each intended to permit the description of certain aspects of translation in a straight forward manner. The syntax language is used to describe the recognition of source text and the construction from it to an intermediate [[Tree (data structure)|tree]] structure. The generator language is used to describe the transformation of the tree into appropriate object language.

The syntax language follows Dewey Val Schorre's previous line of metacompilers. It most resembles TREE-META having [[Tree (data structure)|tree]] building operators in the syntax language. The unparse rules of TREE-META are extended to work with the object based generator language based on [[LISP 2]].

CWIC includes three languages:
* '''Syntax''': Transforms the source program input, into list structures using grammar transformation formula. A parsed expression structure is passed to a generator by placement of a generator call in a rule. A [[Tree (data structure)|tree]] is represented by a list whose first element is a node object. The language has operators, '''<''' and '''>''', specifically for making lists. The colon ''':''' operator is used to create node objects. ''':ADD''' creates an ADD node. The exclamation '''!''' operator combines a number of parsed entries with a node to make a tree. Trees created by syntax rules are passed to generator functions, returning success or failure. The syntax language is very close to TREE-META. Both use a colon to create a node. CWIC's tree building exclamation !<number> functions the same as TREE-META's [<number>]. 
* '''Generator''': a named series of transforming rules, each consisting of an unparse, pattern matching, rule. and an output production written in a LISP 2 like language. the translation was to IBM 360 binary machine code. Other facilities of the generator language generalized output.<ref name="CWIC" />
* '''[[MOL-360]]''': an independent [[system programming language|mid level implementation language]] for the IBM System/360 family of computers developed in 1968 and used for writing the underlying support library.

==== Generators language ====
Generators Language had semantics similar to [[Lisp (programming language)|Lisp]]. The parse [[Tree (data structure)|tree]] was thought of as a recursive list. The general form of a Generator Language function is:
<pre>  function-name(first-unparse_rule) =&gt; first-production_code_generator
           (second-unparse_rule) =&gt; second-production_code_generator
           (third-unparse_rule) =&gt; third-production_code_generator
               ...</pre>
The code to process a given [[Tree (data structure)|tree]] included the features of a general purpose programming language, plus a form: &lt;stuff&gt;, which would emit (stuff) onto the output file. 
A generator call may be used in the unparse_rule. The generator is passed the element of unparse_rule pattern in which it is placed and its return values are listed in (). For example:
<pre>  expr_gen(ADD[expr_gen(x),expr_gen(y)]) =&gt;
                                &lt;AR + (x*16)+y;&gt;
                                releasereg(y);
                                return x;
               (SUB[expr_gen(x),expr_gen(y)])=&gt;
                                &lt;SR + (x*16)+y;&gt;
                                releasereg(y);
                                return x;
               (MUL[expr_gen(x),expr_gen(y)])=&gt;
                              .
                              .
                              .
               (x)=&gt;     r1 = getreg();
                            load(r1, x);
                            return r1;
...</pre>
That is, if the parse [[Tree (data structure)|tree]] looks like (ADD[<something1>,<something2>]), expr_gen(x) would be called with <something1> and return x. A variable in the unparse rule is a local variable that can be used in the production_code_generator. expr_gen(y) is called with <something2> and returns y. Here is a generator call in an unparse rule is passed the element in the position it occupies. Hopefully in the above x and y will be registers on return. The last transforms is intended to load an atomic into a register and return the register. The first production would be used to generate the 360 "AR" (Add Register) instruction with the appropriate values in general registers. The above example is only a part of a generator. Every generator expression evaluates to a value that con then be further processed. The last transform could just as well have been written as:
<pre>
               (x)=&gt;     return load(getreg(), x);
</pre>
In this case load returns its first parameter, the register returned by getreg(). the functions load and getreg are other CWIC generators.

==== CWIC addressed [[domain-specific language]]s before the term [[domain-specific language]] existed ====
From the authors of CWIC:
 
"A metacompiler assists the task of compiler-building by automating its non creative aspects, those aspects that are the same regardless of the language which the produced compiler is to translate. This makes possible the design of languages which are appropriate to the specification of a particular problem. It reduces the cost of producing processors for such languages to a point where it becomes economically feasible to begin the solution of a problem with language design."<ref name="CWIC" />