XLR: Extensible Language and Runtime

The art of turning ideas into code

XL, an extensible programming language, implements concept programming.
If you want to know more, you should start here.

Scope Injection / Context-oriented programming Daily
Friday, October 31, 2008

We were discussing how to simplify the implementation and extension of Write. Currently, two modules implement slightly different versions of Write:

- XL.UI.CONSOLE.Write writes to the console

- XL.TEXT_IO.Write writes to a file

The first one is implemented using the second one, writing to standard output. More are to come, such as writing to a string. As in C++, we could decide to use a stream class that tells where to write, whether it's a file, a string, etc. We also need either that stream or something else to store formatting preferences, e.g. the number of digits.

One problem with this approach is that it makes extending Write rather complicated. For example, if you want to write a complex number, you need something like:

to Write (output : stream; Z : complex) is
    Write output, "(", Z.re, ";", Z.im, ")"

This is not too complicated, but if all I actually use is writing to the console, the code above has a rather unnecessary built-in notion that "Write to stdout" is really a write to a stream. Also, when you write Z.re or Z.im, you will use the formatting for real numbers (e.g. number of digits), but how would you specify additional formatting info for complex numbers, e.g. to choose between (1;3), [1, 3] or 1 + 3i?

Ideally, I would like this:

to Write(Z : complex) is
    Write ComplexFormat.OpenSeparator, Z.re, ComplexFormat.InnerSeparator, Z.im, ComplexFormat.ClosingSeparator

But then, that variant would not allow you to write to a file... or would it? This is where the idea of scope injection comes in.

It seems like we could actually make it work with a file, if we add the idea that the Write we call could have an implicit generic dependency on a StandardOutput variable. Consider that the final step in writing, writing a character, is currently written as something like:

to Write(F : stream; C : character) is C.putc(C, F)

What if we used any-lookup with an extension, and replaced that tail with one that doesn't take the standard output argument, but introduces it:

to Write(C : character) is Write any.StandardOutput, C

The idea is the following: if you do not redefine StandardOutput in the instantiation context, then any.StandardOutput evaluates as the StandardOutput in the current context, which is a stream, so you end up calling the Write-to-stream function. But you can override StandardOutput and instantiate complex to write into some text like this:

to Write (in out target : text; C : character) is
    target += C

to Write (in out StandardOutput : text; ...) is
    Write ...

Z : complex
Write Z

Now, here is what happens. The call to Write Z gets decomposed to its components, until the final Write C for individual characters. But at that stage, the any.StandardOutput finds a StandardOutput in the instantiation context, specifically the local parameter to the second Write we just defined.

A few things are a bit complicated to make this work:

- We need to detect implicit dependencies to generic items, e.g. the any.StandardOutput in the character Write.

- This may now resolve to a local variable, not just some global. When it does, we need to pass an implicit parameter down the instantiation stack, here the StandardOutput : text parameter.

It is also not clear if it is a good idea to create this implicit dependency based solely on the implementation, i.e. based on the use of any.StandardOutput inside a function that, otherwise, would not even be generic. An alternative is to add some new syntax indicating that generic dependency in the interface, something like:

generic [name StandardOutput]
to Write (C : character) is
    Write StandardOutput, C

This would make it much clearer which part is generic and which part is not.
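For readers who think in C++ terms, the following minimal sketch shows by hand what scope injection would do implicitly: the output target is threaded through every level of the call chain as a template parameter, down to the character-level write. The names emit_char, emit, to_text and via_stream, and the integer-only Complex struct, are illustrative assumptions, not part of XL or of any library.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Innermost step: emit one character. One overload per kind of target.
inline void emit_char(std::ostream& target, char c) { target.put(c); }
inline void emit_char(std::string& target, char c) { target += c; }

// Text is decomposed into characters, whatever the target is.
template <typename Target>
void emit(Target& target, const std::string& text) {
    for (char c : text) emit_char(target, c);
}

// Integers are converted to text first, then emitted character by character.
template <typename Target>
void emit(Target& target, int value) {
    emit(target, std::to_string(value));
}

// Hypothetical complex type with integer fields, kept small for the example.
struct Complex { int re; int im; };

// The complex writer never names a concrete target: it only forwards it.
template <typename Target>
void emit(Target& target, const Complex& z) {
    emit(target, std::string("("));
    emit(target, z.re);
    emit(target, std::string(";"));
    emit(target, z.im);
    emit(target, std::string(")"));
}

// Same call chain, writing into a text.
inline std::string to_text(const Complex& z) {
    std::string text;
    emit(text, z);
    return text;
}

// Same call chain, writing into a stream.
inline std::string via_stream(const Complex& z) {
    std::ostringstream stream;
    emit(stream, z);
    return stream.str();
}
```

The cost of this explicit form is exactly the one discussed above: every intermediate function has to carry a target parameter, even when the caller only ever writes to the console.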

Any opinion is welcome...

C++ Considered Harmful Daily
Tuesday, October 14, 2008

There is a blog entry on Eric Raymond's web page that is worth pointing at, in particular because of the discussion in the comments.

What strikes me in these comments is that people still pit dynamic or scripting languages like Lisp or Python against system languages like C or C++. One commenter indicates that he has to use C++ because no other advanced language can do real-time. That is one of the reasons I chose to design XL (I started my professional life writing hard real-time code, think sub-millisecond).

But as XL proves, even in its present unfinished state, it is possible to design a language that offers a much higher level of abstraction than C++ and is also much more efficient. For example, I remember at some point timing the code for native/TESTS/12.Library/julia.xl on Itanium, and getting almost a 70% speed boost relative to C++. Why? Because the XL construction does not require any kind of memory touch to perform complex operations, whereas C++ mandates them.

Specifically, the C++ complex class has an implicit this pointer, so that an operator like operator+ implicitly performs four loads (real and imaginary part of both arguments) and two stores (storing real and imaginary part of the result). These loads and stores can be eliminated with really advanced back-end optimizations such as register field promotion, of a kind that the HP C++ compiler only performs at O3 or above. The XL complex arithmetic code in native/library/xl.math.complex.xl, by contrast, generates code that can be done entirely in registers, even at the lowest optimization levels. And this, despite having a level of abstraction that is at least comparable to the C++ implementation of the same.
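To make the two calling styles concrete, here is a small C++ sketch. The type and function names (ComplexByRef, ComplexByValue, add) are made up for the illustration. Whether a given compiler actually emits loads and stores for either variant depends on the ABI, inlining and optimization level, so this shows the shape of the code only, not a measurement.

```cpp
// Variant 1: member operator. The left operand arrives through the implicit
// 'this' pointer and the right operand through a reference, so both are
// nominally accessed in memory.
struct ComplexByRef {
    double re, im;
    ComplexByRef operator+(const ComplexByRef& other) const {
        return ComplexByRef{re + other.re, im + other.im};
    }
};

// Variant 2: plain aggregate passed and returned by value. No pointer to the
// operands is implied by the source code, which leaves an ABI free to keep
// the fields in registers.
struct ComplexByValue {
    double re, im;
};

inline ComplexByValue add(ComplexByValue a, ComplexByValue b) {
    return ComplexByValue{a.re + b.re, a.im + b.im};
}
```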

What this tells me is that it is possible to design a language that is very high level, yet at the same time retains and even enhances one of the good attributes of C++: its ability to generate high-performance code.

Comment to ESR's announcement about "C++ considered harmful" Daily
Monday, October 13, 2008

This is a comment to the C++ considered harmful blog entry Eric Raymond posted last month.

Eric,

I agree with Jeff that writing about the problems of C++ will only be effective if you can offer something that is better. Consider your own critique of the UNIX-HATERS Handbook: the best chapters were those where the authors had something better in mind that they could use as a reference. I believe you suggest that Python or similar languages are the "better" solution in the language space, but as many have pointed out, there are many things C++ can do that Python can't. Let me respectfully propose my own biased answer.

Designing a modern language is a problem I have thought about long and hard for a very long time (I'd say 15 years). The result is the XL programming language, and a programming paradigm I called concept programming. Highly idiosyncratic, certainly, and even more highly confidential at this point. But I think it's worth sharing with you. Your criticism of C++ will be stronger if you know of other ways to achieve the same goals as C++ than if you suggest that we should change the goals.

And to make it clear, the goal in my opinion is to be able to develop very large and complex programs that still take advantage of the machine to the maximum possible extent. In other words, I am not ready to sacrifice performance or memory usage for convenience, and I don't think we should have to. Many commenters here expressed the same feeling.


So how exactly does XL improve on C++? First, XL stands for "Extensible Language". The initial intuition is that the problem set is not bounded. So I wanted to create a language that made the code representation of arbitrary concepts possible, not just "functions" or "objects". Actually, it should not just be possible, it should be easy. To illustrate, it should make it just as easy to add symbolic differentiation to the language as it would be to add a class in C++. I wouldn't comment here if I didn't think I had achieved that objective.

But it goes further. When you start thinking in terms of concepts and representations, you start seeing flaws in other languages and designs that are not obvious, and you start designing things in a different way. Highly idiosyncratic, I told you :-) In short, instead of a monolithic compiler like practically everybody else has done so far, XL has a tiny compiler core implementing a standard syntax, and a number of compiler plug-ins that cooperate to implement the language.

Let me illustrate this with things that you or your readers seem to care about: efficiency, code density, high abstraction levels, operator overloading, preprocessing, garbage collection, rapid prototyping, quick build cycles, interaction with other languages such as C, simple syntax. Obviously, that won't be a short post...

  • Efficiency: XL was designed to have machine code as one of the primary targets, and to take advantage of modern architectures. Modern architectures differ from the dominant architectures when C or C++ were designed, in at least two important ways. First, memory accesses are no longer cheap. Second, accelerators and "strange" computational models are everywhere.

    1. Memory accesses are extremely expensive, so mandating pointers for things like parameters (references or const refs are the default for most template code in C++ for a reason) or objects (the infamous 'this pointer') is something we should consider fixing. XL has explicit output parameters which it can pass by copy if more efficient (the Itanium ABI defines four 64-bit output registers), and can define objects without any implicit pointer reference. In one micro-benchmark involving complex number computations (for fractals), XL was 70% faster than C++ on Itanium, because it replaced loads and stores with register accesses. And I could easily have done more computations between loads and stores to tilt the balance even further.
    2. Machines are no longer uniform: there are co-processors for a number of key tasks, and even mainstream CPUs keep receiving new instructions that don't fit in the limited C++ computing model, such as vector arithmetic or multimedia operations. In XL, all operators are defined in the library, so adding new built-in types and a default implementation is not a problem at all. Similarly, XL can easily represent GPU programs in a way that hopefully will soon make ugly hacks like OpenCL unnecessary.

  • Code density: XL tries to eliminate what I call syntactic and semantic noise. I invite you to compare the C++ implementation of complex arithmetic with the XL implementation (which, it should be noted, only implements a small number of operators at this point, so you should look at the code and not count lines). In general, XL generic code compares very favorably to C++ code, in particular because generic types behave like real types, e.g. you can define functions with them without having to repeat all the "template" declarations.
  • High abstraction levels: Thanks to the extensibility mechanisms, XL makes it possible to add your own abstraction levels. Like C++, you can create your own pointer types. Unlike C++, this is how the language's most fundamental types are defined: integer, real, pointer, array are all library types. Being able to define such abstractions is a testimony to the power of the approach. But of course, abstractions at a much higher level are what make things really interesting. Again, what makes it possible to compile the differentiation notation in the code is this plug-in, a mere 191 lines of code including comments. Contrast this to the C++ implementation (I can't find a full working implementation on the web anymore, but from memory it required something like 4000+ lines of really hairy C++ code). The XL templates system is also pretty good, allowing one to define Maximum the way it should be, something that will only become possible (barely) with the upcoming C++ 0x standard.
  • Operator overloading: XL features something I called "expression reduction", which is operator overloading on steroids. Instead of overloading operators, you overload complete expressions. So you can optimize the memory access patterns for matrix linear algebra (A*B+C), or define specific shortcut notations (X=0 without allowing X=1). This works for templates as well (XL calls them generics), so you can for example write "pointer to integer" or "array [1..5] of real" and have expression reduction reduce that to the right instantiation.
  • Preprocessing: The XL pre-processor is powerful enough to compute factorials or implement an assert macro that fails at compile time if it can determine that the condition is false statically.
  • Garbage collection: This is not implemented yet, so I won't boast about it, but my current design should allow you to garbage collect data structures in a file or network connections once it's complete.
  • Rapid prototyping: In my opinion, so-called "scripting languages" stand out not just by their interactive nature, but also because they offer a wealth of everyday features that often include the kitchen sink. I expect XL to mature in that direction, but at the moment, its library is pretty much empty. Code submissions are welcome, though: it's open source.
  • Quick build cycles: One reason C++ is bad in that respect is that it still relies on the very ugly include-file based modularity inherited from C, which doesn't scale well for a language that is complicated to analyze. XL has include files, but they are definitely not used much for standard modules. The XL syntax is also really simple and parses very quickly relative to C++. Now, does this result in faster builds? Not necessarily, because other things get in the way: extensibility and flexibility of notations have a cost. Also, I realized quickly that some concepts such as aspects apply to the program as a whole. The XL compiler itself uses such a whole-program extension to implement plug-ins. So between very fast parsing on one side and occasional whole-program passes, we'll see how this plays out in the end...
  • Interaction with other languages such as C: XL is designed to interface with a number of incompatible runtimes. So the notion of "runtime" was built into the language. A large fraction of its tests can be compiled to C or to Java runtimes, for example. Interfacing with C is pretty easy too:

    function Allocate_Memory(size : unsigned) return pointer is C.malloc

  • Simple syntax: The syntax of XL consists of a very small number of rules. The whole parser is implemented in 611 lines of XL, including comments, and the scanner is another 469 lines... There isn't a single line of Bison or Lex in the whole thing, another idiosyncrasy. You can always parse an XL source without having to analyze it; this is a fundamental requirement if you want compiler plug-ins to cooperate. To most people, XL code looks like pseudo-code, except that it isn't.
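The expression-reduction bullet above mentions matching a whole expression such as A*B+C. C++ has no direct equivalent, but a rough approximation is to let operator* return a deferred product and to overload operator+ on that deferred product. The sketch below does this for 2x2 matrices. The Matrix2 and Product types are hypothetical, and this is only an analogue of the idea, not how XL implements it.

```cpp
// Row-major 2x2 matrix: [a b; c d].
struct Matrix2 {
    double a, b, c, d;
};

// Deferred product: remembers the operands of A*B without multiplying yet.
struct Product {
    Matrix2 lhs, rhs;

    // Evaluate the product only when a plain Matrix2 is required.
    operator Matrix2() const {
        return Matrix2{lhs.a * rhs.a + lhs.b * rhs.c,
                       lhs.a * rhs.b + lhs.b * rhs.d,
                       lhs.c * rhs.a + lhs.d * rhs.c,
                       lhs.c * rhs.b + lhs.d * rhs.d};
    }
};

// A*B builds the deferred product instead of a result matrix.
inline Product operator*(const Matrix2& l, const Matrix2& r) {
    return Product{l, r};
}

// A*B+C is matched as a whole: each result element is computed in one pass,
// without materializing the intermediate product matrix.
inline Matrix2 operator+(const Product& p, const Matrix2& c) {
    return Matrix2{p.lhs.a * p.rhs.a + p.lhs.b * p.rhs.c + c.a,
                   p.lhs.a * p.rhs.b + p.lhs.b * p.rhs.d + c.b,
                   p.lhs.c * p.rhs.a + p.lhs.d * p.rhs.c + c.c,
                   p.lhs.c * p.rhs.b + p.lhs.d * p.rhs.d + c.d};
}
```

Each additional expression shape needs its own proxy type and overload in this style, which is the boilerplate that overloading complete expressions is meant to avoid.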

There are other things I did not even talk about, like how XL will deal with real-time or parallelism. Can you guess I'm proud of my language? OK, it doesn't exactly have a cult following yet, and a lot of work remains to be done. But hey, I hear you are good at evangelizing stuff, so if you like it, why don't you talk about it and send good programmers my way? ;-)

The new C++ standard Daily
Friday, September 26, 2008

Bjarne Stroustrup, inventor of C++

DevX.com has a special report on the new C++ standard, currently referred to as C++ 0x because it was due sometime between 2000 and 2009. Well, the standard was quite late compared to early expectations, so an easy joke was that we might end up with 0xA or 0xB, the C++ hexadecimal notation for 10 or 11. Ultimately, chances are that the standard will make it for 2009, so we will probably refer to it as C++ 09...

This new iteration of the language is of interest to all programmers, because it brings a number of major changes to one of the most popular programming languages today, and one that is already very complex (and therefore hard to extend). But for me in particular, it is all the more interesting to consider how various "innovations" in that new standard compare to features that play the same role in XL.

Concepts?

One of the major features in C++ 0x is concepts. DevX has a dedicated article about concepts. In short, concepts in C++ are a way to describe categories of templates, and to help the compiler figure out what the programmer intended for a given template. This new aspect of the language makes it easier to define a real contract between the users of a template and its implementers.

Concepts vs. code

C++ concepts, however, are somewhat annoying to me. One reason is that XL has been for a long time based on an approach that I dubbed concept programming. Concept programming, in the XL sense, is about the relationship between concepts that exist only in our head, and concept representations that exist in the computer. The key idea is to make sure that implementations look and feel like the concepts they represent.

One key consequence of that idea is that a programming language should comfortably support arbitrary concepts, not some finite set (e.g. functions or objects), because the set of concepts we manipulate is not a-priori limited. This is the key reason so much effort was put into making XL extensible.

To summarize, "concepts" in XL are only very remotely related to "concepts" in C++, although, arguably, the XL usage of the word is closer to the standard meaning.

XL generic validation = C++ concepts

Does that template or generic apply?

Many aspects of XL are a direct consequence of the concept programming design philosophy. For example, XL has implemented, since at least 2002, the idea that one can describe how a generic type can be used. This feature is called generic validation in XL terminology. I invite the reader to compare the XL implementation of a minimum function with the C++-with-concepts implementation of the same. This should convince you that the two ideas are almost identical.

So where are the differences between C++ concepts and XL generic validation? One of them is how the contract is being specified. In C++, you specify the kind of operators and functions that define the concept. For example, you would write something like the following to indicate that a min function requires a less-than operator:

concept LessThanComparable<typename T> {
    bool operator<(const T& x, const T& y);
}

template<typename T> requires LessThanComparable<T>
const T& min(const T& x, const T& y) {
    return x < y ? x : y;
}

In XL, by contrast, you give an example of code that has to compile with the generic type you want to validate. For example, in XL, you would write something like:

generic type ordered where
    A, B : ordered
    Test : boolean := A < B

function Min (X, Y : ordered) return ordered is
    if X < Y then return X else return Y

Now, as you can see from this simple example, a significant difference is that XL considers the validation to be tied to a generic type, which can then be used to declare a function like Min directly. In other words, since you declared that ordered is generic, Min becomes implicitly generic. By contrast, in C++, LessThanComparable is a kind of predicate that applies to template classes, so you need one additional "connection" using the requires clause, to let the compiler relate the T in the definition of min with the T in LessThanComparable. As a result, the C++ code for that example is more verbose and more convoluted. This becomes more visible as the code becomes more complex.

Another drawback is that the C++ concept specification as written doesn't work for, say, int because the less-than operator in that case doesn't have the right signature. So you need an additional concept_map in that case, making the code even more verbose, as shown below:

concept_map LessThanComparable<int> { }

One benefit of the C++ approach, however, is that the specification of the concept makes it easier to validate early that the implementation actually doesn't require anything besides what is declared in the concept. For example, if the body of min attempts to refer to an operator that is not present in the concept specification, the compiler may detect this. Doing this with the kind of specification given in XL is much more complicated. I am considering various ways to fix this problem, which is much easier in XL since practically nobody uses it yet.
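The XL style of validation, "this example expression has to compile", can be approximated in C++ without the proposed concept syntax. The sketch below uses only decltype and std::enable_if: the trait is_ordered is true exactly when A < B compiles and converts to bool, mirroring Test : boolean := A < B. The names is_ordered, min_of and Unordered are assumptions made for this example.

```cpp
#include <type_traits>
#include <utility>

// Primary template: by default a type is not considered ordered.
template <typename T, typename = void>
struct is_ordered : std::false_type {};

// Selected only when "A < B" compiles for two const T values and the result
// can be converted to bool, which is the validation expression.
template <typename T>
struct is_ordered<T, decltype(void(static_cast<bool>(
                         std::declval<const T&>() < std::declval<const T&>())))>
    : std::true_type {};

// min_of takes part in overload resolution only for validated types.
template <typename T>
typename std::enable_if<is_ordered<T>::value, const T&>::type
min_of(const T& x, const T& y) {
    return x < y ? x : y;
}

// A type with no less-than operator, used to show that validation fails.
struct Unordered {};
```

With these definitions, min_of(3, 5) compiles directly for int, while a call with two Unordered values is rejected at the call site. As with the XL form, nothing here checks that the body of min_of uses only what the validation expression promises.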

Multitasking and Threads

C++ 0x also adds standard support for threads. In my opinion, it is ironic that they managed to shoehorn in support for a thread model that is so "last century". Today, the difficult problem is not threading on an SMP system, but threading on non-uniform architectures, for example threading between a CPU and a GPU, or between the components of a Cell microprocessor, or threads that cooperate on machines with different architectures across the Internet.

This kind of problem is much more complicated, and is already, to some extent, solved by other languages such as Java or Erlang.

At this point, XL has little to offer in that space, because what is needed is not coded yet. However, I am confident that XL's extensibility will make it easy to implement not one, but a multitude of tasking models. Among the top candidates are rendez-vous based mechanisms similar to Ada, message-passing protocols similar to Erlang, or data-driven parallelism similar to several functional languages. Stay tuned.

Variadic templates

C++ 0x will, at long last, implement variadic templates. This feature will make it possible to write functions that take a variable number of arguments, yet are type-safe.

This is, again, something that has existed in XL since 2001 or earlier. You can see that the XL implementation of the Max function takes advantage of this feature.

The C++ implementation is more complete, however, as it makes it possible to create not just variadic functions, but also variadic classes. This is something that is planned, but not currently implemented in XL.
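For comparison, a type-safe maximum over any number of arguments can be sketched with variadic templates as follows. The name max_of is chosen for this example, and the sketch assumes all arguments have the same type.

```cpp
// Base case: the maximum of a single value is that value.
template <typename T>
T max_of(T only) {
    return only;
}

// Recursive case: compare the first argument with the maximum of the rest.
// Assumes every argument has type T; mixed types would be converted to T.
template <typename T, typename... Rest>
T max_of(T first, Rest... rest) {
    T tail = max_of(rest...);
    return first < tail ? tail : first;
}
```

A call such as max_of(1, 7, 3) peels off one argument per instantiation until only the base case remains.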

Range-based iterations

A new range-based iteration mechanism was also added to C++ 0x. XL has a more general form of iteration, that already covers this specific case. Here is for example how for loops are declared in XL:

iterator IntegerIterator(var It : integer; Low, High : integer) written It in Low..High is
    It := Low
    while It <= High loop
        yield
        It := It + 1

The notation It in Low..High is how you will invoke the iterator, and the yield statement in the iterator is where the body of the loop will go. The usage of the iterator is very natural:

for I in 1..5 loop
    for J in 1..I loop
        WriteLn "I=", I, " and J=", J

The benefit of this more general approach is that you can for example define two-variable iterators:

iterator MatrixIterator (
    var I : integer; LI, HI : integer;
    var J : integer; LJ, HJ : integer) written I,J in [LI..HI, LJ..HJ] is
    I := LI
    while I <= HI loop
        J := LJ
        while J <= HJ loop
            yield
            J := J + 1
        I := I + 1

for A, B in [3..5, 7..9] loop
    WriteLn "A=", A, " and B=", B

You can also define iterators over any kind of data structure, using any syntax you need for this particular data structure.
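On the C++ side, the range-based for loop works with any type that provides begin() and end(). The sketch below defines a hypothetical IntRange type with inclusive bounds, similar in spirit to Low..High, and uses it to reproduce the nested I/J loop shown earlier. It collects the visited pairs instead of printing them so that the result is easy to inspect; the names IntRange and triangle are assumptions for this example.

```cpp
#include <utility>
#include <vector>

// Inclusive integer range, usable in a range-based for loop.
// Assumes high is below the largest int, since end() computes high + 1.
struct IntRange {
    int low, high;

    struct Iterator {
        int value;
        int operator*() const { return value; }
        Iterator& operator++() { ++value; return *this; }
        bool operator!=(const Iterator& other) const {
            return value != other.value;
        }
    };

    Iterator begin() const { return Iterator{low}; }
    // One past the last value; an empty range when high < low.
    Iterator end() const { return Iterator{high < low ? low : high + 1}; }
};

// Equivalent of: for I in 1..n loop for J in 1..I loop ...
inline std::vector<std::pair<int, int>> triangle(int n) {
    std::vector<std::pair<int, int>> visited;
    for (int i : IntRange{1, n})
        for (int j : IntRange{1, i})
            visited.emplace_back(i, j);
    return visited;
}
```

A two-variable iteration like the MatrixIterator example would need a second type whose iterator yields pairs, since a range-based for loop binds a single loop variable.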

Constant Expressions

C++ 0x introduces the notion of generalized constant expression. This makes it possible to declare functions that the compiler will be able to evaluate at compile time.

Once again, the XL approach is very different. The XL compiler has various phases, implemented as "plug-ins" for the compiler. One of them deals with constant folding (i.e. evaluation of constant expressions). Here is an example showing how to compute factorials at compile-time using that technique.

The XL pre-processor also makes it easy to implement compile-time assertions, something that is also a new feature of C++ 0x. The XL implementation, however, will automatically optimize a static assertion if it can evaluate the argument at compile time, instead of requiring a specific keyword.
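For reference, the C++ 0x side of this comparison looks roughly like the sketch below: a constexpr function that the compiler can evaluate during compilation, checked with static_assert. The function name factorial is simply chosen for the example.

```cpp
// Single-expression constexpr function, evaluated at compile time when
// called with a constant argument.
constexpr unsigned long long factorial(unsigned n) {
    return n <= 1 ? 1ULL : n * factorial(n - 1);
}

// These checks are performed by the compiler; a false condition is a
// compilation error rather than a run-time failure.
static_assert(factorial(0) == 1, "0! should be 1");
static_assert(factorial(5) == 120, "5! should be 120");

// The result can be used wherever a constant expression is required.
static int table[factorial(3)];  // array of 6 elements
```

Note the two explicit keywords, constexpr and static_assert, that the programmer has to spell out.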

Conclusion

C++ is an extremely complex language, and extending it took a lot of effort. Many of the new features have already existed in XL for a while, and are much easier to implement. However, the implementation in C++ points out some weaknesses in the way things are currently done in XL, something that is fortunately still easy to change this early in the language's life.

Note: This article was originally published here.

Bringing Java tests in line Daily
Saturday, July 26, 2008

In addition to the constructors work and various other cleanup, I've also been attempting to bring the Java tests in line with the C tests. This has proven relatively difficult, because Java is not a very practical language as a target of code generation: no pointers, no unsigned types, ...

Another problem I'm having now is how to bring the work I did in git back into Subversion. It turns out that git-svn is unable to follow the majority of branch merges done in git. This makes it unable to follow complex branch histories, and you have to "squash" the work you did on a branch into the branch you will use for Subversion commits. That is annoying, as it loses a lot of the history, which was the primary benefit I was seeing in this approach. I hope that I will not run into the problem if I avoid cross-branch merges, but this is not entirely clear.

In any event, I now have a relatively large set of git history that is tricky to merge back. I'm losing a lot of history data in the process, which is unfortunate.

Vacations, Git, Constructors... Daily
Tuesday, July 22, 2008

I had a long week of vacation, which we spent mostly around home. Living in the French Riviera has its perks, like beautiful places to visit during vacation... The only bad news is that I hurt my knee during a hike, and I've been stuck at home for a week, with another week at least of restricted movement.

Anyway, vacation was a good time for doing some XL development, and I started attacking a couple of rather tough nuts, bugs that had annoyed me for a while.

Two of them made it difficult to implement complex numbers the way I wanted. The problem was, put simply, how to initialize a field to zero. How hard can that be? Well, consider that the complex type is generic and depends on a number type. That number type can be integer or real for example (although you could consider intervals and any other kind of fancy possibility).

Now, if I'm writing the constructor for complex taking a single value, it's going to look something like:

function Complex (re : complex.value) return complex written re is
    result.re := re
    // result.im := 0.0

If I write result.im := 0.0, then instantiation will fail when instantiating complex with integer. Conversely, result.im := 0 will not work for real, because XL has no implicit conversion from integer to real (although you can add one easily).

So there are two solutions, neither of which worked correctly. The first one was to leave the code as above, without an initializer, and rely on some kind of default initialization semantics inside constructors. The problem with this solution is that such initializers did not exist.

The second solution was to add some new notation that would allow me to explicitly call a constructor, something like:

function Complex (re : complex.value) return complex written re is
    result.re := re
    result.im := complex.value(0)

Now, it's an explicit call to a constructor, and there is no problem with real(0). There are cases where this second solution is the only one (e.g. the default constructor doesn't work for you). The test for this is here.
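The same issue and the same fix can be shown with a C++ template, where the explicit constructor call is written Value(0). This is only an analogy with a made-up Complex template, not the XL library code.

```cpp
// Generic complex number parameterized by its component type.
template <typename Value>
struct Complex {
    Value re, im;

    // Value(0) explicitly constructs a zero of the component type, so the
    // same source works for int, double, or a user-defined number type
    // that can be constructed from 0.
    explicit Complex(Value real_part) : re(real_part), im(Value(0)) {}
};
```

Complex<int>(3) and Complex<double>(2.5) both instantiate from this one definition, each with an imaginary part of the right type.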

That was actually a pretty large change set. I embarked on it, but decided that I would do the work in Git to avoid corrupting the SVN database. That way, I could keep multiple work branches, etc. For an introduction on how to use Git with SVN, see http://tsunanet.blogspot.com/2007/07/learning-git-svn-in-5min.html. Git works pretty well for that, and I was really happy...

... Until the point where I wanted to commit to Subversion. I used git svn dcommit as the blog says. Things started rather well, see revisions after Revision 381... The problem is that I had made a mistake initially, and added some LLVM code that was not ready for prime time. Git somehow tried to check a .svn directory into SVN, wreaking havoc. git svn dcommit stopped with an error, the SVN state was bad (would not even compile), yuck yuck yuck...

It took me a good 45 minutes to sort out the mess on both sides, taking more than I had planned to use during my lunch break, but I think that I'm back on track. I lost a lot of the history on the SVN side, though, having one big massive check-in at the end with the end result instead of the individual steps.

The constructor work is not entirely there, but at least it's in a stable state. There are further fixes and other GIT branches that I want to commit at some point, but I will first revisit everything to check that it's safe...

Function pointers Daily
Tuesday, April 8, 2008

Added support for function pointers in Revision 370. The real problem was support for overloading, as illustrated in this test, reproduced below:

procedure Foo (X : integer) is
    WriteLn "Foo ", X

procedure Foo (X : real) is
    WriteLn "Foo (real) ", X

procedure Bar (X : integer) is
    WriteLn "Bar ", X

type proc_ptr is procedure (X : integer)

to Invoke (callback : proc_ptr) is
    callback 3

Invoke Foo
Invoke Bar

In that example, the problem is with Invoke Foo, which needs to be able to decide which Foo is to be selected (the first one in that case).
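C++ resolves the same situation in the same way: when the name of an overloaded function is used where a function pointer is expected, the pointer type selects the overload. The sketch below mirrors the test above with hypothetical lower-case names, returning strings instead of printing so that the selected overload is easy to check.

```cpp
#include <string>

// Two overloads of foo, plus an unrelated bar.
inline std::string foo(int x)    { return "Foo " + std::to_string(x); }
inline std::string foo(double x) { return "Foo (real) " + std::to_string(x); }
inline std::string bar(int x)    { return "Bar " + std::to_string(x); }

// Pointer to a function taking an int, like proc_ptr in the XL test.
using ProcPtr = std::string (*)(int);

// The callback is always invoked with the integer 3.
inline std::string invoke(ProcPtr callback) {
    return callback(3);
}
```

In invoke(foo), only foo(int) matches ProcPtr, so that overload is the one passed.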


Copyright 2008 Christophe de Dinechin (Blog)
E-mail: XL Mailing List (polluted by spam, unfortunately)