XLR: Extensible Language and Runtime

The art of turning ideas into code

Abstract syntax tree

The XL abstract syntax tree (AST) is the internal representation of XL0. It plays the same role for XL that S-expressions play for Lisp (XL0 itself being the counterpart of M-expressions, which were never really used in Lisp).

The XL AST format defines 7 types of nodes (4 leaf nodes and 3 complex nodes), plus "wildcard" nodes:

  • Name leaf nodes hold a name such as Foo or a symbol such as %%. A name node that wasn't generated by the parser could contain a sequence of characters that would not be valid according to the XL0 syntax, such as 0a-- N__.

  • Integer leaf nodes contain an integral value such as 22, including based numbers such as 16#3AE (a hexadecimal number). The size of the largest integer that can be represented is implementation-dependent. The parser should detect overflows, but it doesn't at this stage.

  • Real leaf nodes contain a floating-point value such as 3.14E-15, including based numbers such as 2#1.01#E3. The precision of the representation is implementation-dependent.

  • Text leaf nodes contain textual data such as "Hello World" or 'Quoted'. The quote being used is recorded, and can be used by the translation process to give different semantics to different kinds of text. By convention, "" are used for text and '' are used for characters. Other text delimiters for multi-line text can be defined in the xl.syntax input file (currently, double back-quote, as in ``Hello``, but I'm considering using what C considers shift operators, as in <<Hello>>. Opinions?)

  • Block is a complex node with one child which, in XL0, is surrounded by delimiters, such as (A), [A] or {A} (in all cases, the child being A). Special delimiters I+ and I- are used internally to represent the delimiters of indentation blocks.

  • Prefix is a complex node with two children, one prefixing the other in XL0. It is used for not A, where the name node not is the left child and A is the right child. Somewhat confusingly, this type of node is also used for postfix operators, which are postfix only from a parsing point of view. For instance, N! is a prefix node with N as its left child and ! as its right child, even though the parser knows ! as a postfix operator.

  • Infix is a complex node with two children and an operator name, the children surrounding the operator in XL0. It is used for A+B, where the operator is + and the children are two name nodes A and B.

  • Wildcard nodes are named nodes used for matching trees. For instance, if A and B are wildcards, then you can build a tree for not A + B and it will match trees like not (Z-5) + 2*4 or not 0 + 'A'.
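
The node types above, and the way wildcards match trees, can be sketched in a few lines of Python. This is only an illustration, not the compiler's actual data structure: the class and field names, the match function, and the rule that a wildcard appearing twice must bind to equal trees are all assumptions made for the example. It also assumes that not A + B groups as (not A) + B.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union


@dataclass(frozen=True)
class Name:
    value: str              # a name such as Foo or a symbol such as %%

@dataclass(frozen=True)
class Integer:
    value: int

@dataclass(frozen=True)
class Real:
    value: float

@dataclass(frozen=True)
class Text:
    value: str
    quote: str = '"'        # the delimiter is recorded with the text

@dataclass(frozen=True)
class Block:
    child: "Tree"
    opening: str = "("      # could also be [ { or the internal I+
    closing: str = ")"

@dataclass(frozen=True)
class Prefix:
    left: "Tree"            # 'not' in not A, but N in N!
    right: "Tree"

@dataclass(frozen=True)
class Infix:
    name: str               # operator name, e.g. +
    left: "Tree"
    right: "Tree"

@dataclass(frozen=True)
class Wildcard:
    name: str

Tree = Union[Name, Integer, Real, Text, Block, Prefix, Infix, Wildcard]


def match(pattern: Tree, tree: Tree,
          bindings: Optional[Dict[str, Tree]] = None) -> Optional[Dict[str, Tree]]:
    """Return the wildcard bindings if tree matches pattern, else None."""
    if bindings is None:
        bindings = {}
    if isinstance(pattern, Wildcard):
        bound = bindings.get(pattern.name)
        if bound is not None and bound != tree:
            return None     # assumption: a repeated wildcard must bind equal trees
        bindings[pattern.name] = tree
        return bindings
    if type(pattern) is not type(tree):
        return None
    if isinstance(pattern, Block):
        if (pattern.opening, pattern.closing) != (tree.opening, tree.closing):
            return None
        return match(pattern.child, tree.child, bindings)
    if isinstance(pattern, Prefix):
        if match(pattern.left, tree.left, bindings) is None:
            return None
        return match(pattern.right, tree.right, bindings)
    if isinstance(pattern, Infix):
        if pattern.name != tree.name:
            return None
        if match(pattern.left, tree.left, bindings) is None:
            return None
        return match(pattern.right, tree.right, bindings)
    # Leaf nodes match only when they are equal
    return bindings if pattern == tree else None


# Pattern for: not A + B, with A and B as wildcards
pattern = Infix("+", Prefix(Name("not"), Wildcard("A")), Wildcard("B"))

# Tree for: not (Z-5) + 2*4
tree = Infix("+",
             Prefix(Name("not"), Block(Infix("-", Name("Z"), Integer(5)))),
             Infix("*", Integer(2), Integer(4)))

bindings = match(pattern, tree)
```

After this runs, bindings maps A to the block (Z-5) and B to the infix 2*4. A tree whose top-level operator is not + makes match return None.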

The AST for the C++ version of the XL compiler was called Coda and was a much more complicated object-oriented structure. The new AST format makes the translation process much simpler.

See also the XL0 syntax.
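
As an illustration of the based-number notation mentioned for Integer nodes, here is a minimal Python sketch that reads 22 or 16#3AE as base#digits. The helper name parse_based_integer is hypothetical, and the sketch assumes nothing beyond that base#digits form; anything else the real XL0 scanner may accept is not modeled here.

```python
def parse_based_integer(text: str) -> int:
    """Parse a plain or based integer literal such as 22 or 16#3AE.

    Hypothetical helper for illustration only, not the XL scanner.
    """
    if "#" in text:
        base_text, digits = text.split("#", 1)
        base = int(base_text)       # the base itself is written in decimal
        return int(digits, base)    # Python's int() accepts bases 2 to 36
    return int(text)


value = parse_based_integer("16#3AE")   # 3*256 + 10*16 + 14 = 942
```

Python integers are unbounded, so this sketch never overflows. An implementation with fixed-size integers would need the overflow check that the Integer node description says the parser does not yet perform.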

To see the XL0 representation of a given input, you can use the following command line:

./nxl -parse input.xl -style debug -show

Copyright 2008 Christophe de Dinechin (Blog)
E-mail: XL Mailing List (polluted by spam, unfortunately)