XLR: Extensible Language and Runtime
The XL scanner takes a sequence of characters from a file and turns it into a sequence of tokens. It is implemented in the module xl.scanner. XL scanning is quite simple: there are only five types of tokens, described below along with the handling of comments:
NUMBERS: Numbers can be written in any base, using the '#' notation: 16#FF. They can contain a decimal dot to specify real numbers: 5.21. They can contain single underscores to group digits: 1_980_000. They can contain an exponent introduced with the letter E: 1.31E6. The exponent can be negative, in which case the number is real: 1.31E-6, 1E-3. Another '#' sign can be used before 'E', in particular when 'E' is a digit of the base: 16#FF#E20. The exponent represents a power of the base: 16#FF#E2 is 16#FF00. Combinations of the above are valid: 16#FF_00.00_FF#E-5.

NAMES: Names begin with a letter and are made of letters or digits: R19, Hello. Names can contain single underscores to group words: Big_Number. Names are neither case-sensitive nor underscore-sensitive: Joe_Dalton = JOEDALTON.

STRINGS: Strings begin with a single or double quote and end with the same quote that began them. They cannot contain a line termination. A quote character can be embedded in a string by doubling it. "ABC" and 'def ghi' are examples of valid strings. Note that the type associated with strings of characters is called text, not string.

SYMBOLS: Symbols are sequences of punctuation characters, other than quotes, that are not separated by spaces. In symbols, the underscore is a significant character. Examples of valid symbols include ++, ---> and %-%. Symbols are normally made of the longest possible sequence of punctuation characters (being terminated by any space, digit, letter or quote). However, the six "parenthesis" characters ( ) [ ] { } always represent a complete symbol by themselves. Examples:
    ---X is the token --- followed by the token X
    --((X)) is the token -- followed by two tokens ( followed by the token X followed by two tokens )

BLANKS: In XL, indentation is significant, and is represented internally by two special forms of parentheses, denoted as 'indent' and 'end'. Indentation can use spaces or tabs, but not both in the same source file.

COMMENTS: The scanner doesn't decide what is a comment. This decision is made by the caller (normally the parser). The caller can invoke the Comment function, which skips input until an 'end of comment' token is found. For XL, this facility is under-utilized, since an end-of-comment is always an end of line.
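The rules above can be illustrated with a small Python sketch. This is not the code of xl.scanner: the names number_value, normalize_name and scan are hypothetical, and the sketch assumes that exponents are written in decimal and that 'E' only introduces an exponent without '#' when it is not a digit of the base. The scan function covers only names, strings and symbols, so numbers, indentation and the ban on line terminations inside strings are left out.

```python
PARENS = "()[]{}"   # each of these is always a complete symbol by itself
QUOTES = "\"'"


def number_value(literal):
    """Evaluate a number literal such as 16#FF#E2 or 1_980_000 (sketch)."""
    text = literal.replace("_", "").upper()   # underscores only group digits
    base = 10
    if "#" in text:                           # base#digits
        base_text, text = text.split("#", 1)
        base = int(base_text)
    if "#" in text:                           # digits#Eexponent
        mantissa, _, exponent = text.partition("#")
        exponent = exponent[1:]               # drop the letter E
    elif base < 15:                           # E is not a digit of this base
        mantissa, _, exponent = text.partition("E")
    else:                                     # E is a digit: no exponent here
        mantissa, exponent = text, ""
    whole, _, frac = mantissa.partition(".")
    value = int(whole, base) if whole else 0
    if frac:
        value += int(frac, base) / base ** len(frac)
    if exponent:
        # Assumption: the exponent is decimal and scales by a power of the base.
        value *= base ** int(exponent)
    return value


def normalize_name(name):
    """Names compare equal regardless of case and underscores."""
    return name.replace("_", "").lower()


def is_symbol_char(ch):
    """Punctuation other than quotes and parentheses; '_' is significant."""
    return not (ch.isspace() or ch.isalnum() or ch in QUOTES or ch in PARENS)


def scan(text):
    """Split text into name, string and symbol tokens (sketch)."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in PARENS:
            tokens.append(ch)
            i += 1
        elif ch in QUOTES:
            j = i + 1
            while j < len(text):
                if text[j] == ch:
                    if j + 1 < len(text) and text[j + 1] == ch:
                        j += 2                # doubled quote stays in the string
                        continue
                    break                     # closing quote
                j += 1
            tokens.append(text[i:j + 1])
            i = j + 1
        elif ch.isalnum():
            j = i
            while j < len(text) and (text[j].isalnum() or text[j] == "_"):
                j += 1
            tokens.append(text[i:j])
            i = j
        else:
            j = i
            while j < len(text) and is_symbol_char(text[j]):
                j += 1                        # longest run of punctuation
            tokens.append(text[i:j])
            i = j
    return tokens
```

With these definitions, number_value("16#FF#E2") equals 0xFF00, normalize_name maps Joe_Dalton and JOEDALTON to the same key, and scan("--((X))") yields the tokens --, (, (, X, ), ).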
Copyright 2006 Christophe de Dinechin (Blog)
E-mail: XL Mailing List (polluted by spam, unfortunately)