Date of Award
Master of Science (MS)
Jeremy G Siek
Bor-Yuh Evan Chang
Programmers in many fields are interested in building new languages. In the same way that high-level languages increase productivity by allowing programmers to accomplish more with a given amount of code, a special-purpose language can reduce repetitive code and hide implementation detail, making a program's structure and function more evident. Sometimes an entire new language is called for, and other times an existing language serves most needs but could benefit from a few additional elements. However, current tools support this kind of extension poorly.
Language tools typically represent programs internally as trees, which are easily extended with new types of nodes. However, in today's languages the programmer manipulates this tree only indirectly, through a parser that analyzes free-form text (that is, concrete syntax) and builds an abstract syntax tree (AST). The limitations of parsers make languages difficult to extend and severely constrain the choice of notation. In other words, what one writes and reads is dictated by what the parser is able to handle.
I explore an alternative approach: represent source code directly as an AST and derive both an executable program and a readable presentation from it. I present a flexible representation for ASTs, a general mechanism for transforming these trees, and a language for grammars which allows concrete syntax and semantics to be defined via these transformations. I show that this approach is modular, easy to understand, and expressive enough to define novel syntax and semantics.
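As a rough illustration of the idea, and not the representation or reduction language used in the thesis, the following Python sketch stores a program directly as a tree and derives both a value and a textual presentation from it through per-node rules. The names `Node`, `EVAL_RULES`, `SHOW_RULES`, `evaluate`, and `present` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    """A generic AST node: a kind tag, child nodes, and an optional literal value."""
    kind: str
    children: List["Node"] = field(default_factory=list)
    value: Optional[float] = None


# Hypothetical rule tables: one for execution, one for presentation.
# Each maps a node kind to a function that reduces a node of that kind.
EVAL_RULES: Dict[str, Callable[[Node], float]] = {}
SHOW_RULES: Dict[str, Callable[[Node], str]] = {}


def evaluate(node: Node) -> float:
    """Derive a value from the tree by applying the rule for the node's kind."""
    return EVAL_RULES[node.kind](node)


def present(node: Node) -> str:
    """Derive a readable presentation from the same tree."""
    return SHOW_RULES[node.kind](node)


# A tiny core language: numeric literals and addition.
EVAL_RULES["num"] = lambda n: n.value
SHOW_RULES["num"] = lambda n: format(n.value, "g")

EVAL_RULES["add"] = lambda n: evaluate(n.children[0]) + evaluate(n.children[1])
SHOW_RULES["add"] = lambda n: f"({present(n.children[0])} + {present(n.children[1])})"

# Extending the language means registering rules for a new node kind;
# no parser or grammar has to change because no text is ever parsed.
EVAL_RULES["square"] = lambda n: evaluate(n.children[0]) ** 2
SHOW_RULES["square"] = lambda n: f"{present(n.children[0])}^2"

# The program is the tree itself.
tree = Node("square", [Node("add", [Node("num", value=1), Node("num", value=2)])])

print(present(tree), "=", evaluate(tree))
```

In this sketch the `square` node is added without touching the rules for `num` or `add`, which is the kind of modularity the approach aims for; a richer presentation rule could emit layout boxes instead of a flat string.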
My prototype system, Lorax, demonstrates the new approach. Reductions for presentation and execution are written in a functional language with meta-programming features. Syntax is not limited to simple text but may include richer notation for easier reading and understanding. A structure editor renders this rich syntax, using algorithms from TeX. In Lorax, the barrier to entry for the creation of languages is lowered, making it practical for programmers to express solutions in the terms and the notation which are closest to the problem domain.
Prescott, Moss, "Speaking for the Trees: a new (old) approach to languages and syntax" (2010). Computer Science Graduate Theses & Dissertations. 7.