Tuesday, January 1, 2013

Code Generators, Rewrite Rules, Aliasing and the Coq

DDC 0.3.1 was pushed onto Hackage just before Christmas.

The main features in this new release are:
  • Back end code generation via C and LLVM that accepts the new core language. There is enough implemented to compile first order programs, but we'll need to implement a lambda lifter for the new core language before we can compile higher order programs directly.
  • Lots more program transformations. You can apply program transformations to core programs on the command line, and check that the code is being optimised the way it should be. There is a tutorial describing how to do this. 
  • A rewrite rule framework that understands the region and effect system. Rewrite rules work on effectful expressions, and can be given predicates so they only apply when particular effects are non-interfering. This lets us perform build-fold fusion when the builder and folder have (non-interferring) side effects. (more info in Amos Robinson's honours thesis)
  • Propagation of no-aliasing meta-data to the LLVM code generator. We have extended the core language with witnesses of distinctness, that prove particular regions of the store are distinct (do not alias). When LLVM has this meta-data it can perform more optimisations on the generated code that it would otherwise be able to. (more info in Tran Ma's honours thesis)
The main missing feature is:
  • No type inference, so the core programs contain lots of type annotations. We'll need to get the type inferencer back online before the language will be useful to end-users.
Our immediate plans are to get a paper together about the new core language and how it supports the rewrite rules and LLVM meta-data. There are interesting stories to tell about all of these things. After that, we'll work on the type inferencer.

For the paper, we'll need a proof that the core language does what it's supposed to. To get this, I've started adding the region and effect system to the Coq formalisation of SystemF2+MutableStore I did about a year ago. I hadn't touched Coq since then, so eased my way back into it by cleaning up the proofs I had already done, then packaged them up into the Iron Lambda collection.

Iron Lambda is a collection of Coq formalisations for functional core languages of increasing complexity, with the SystemF2+Regions+Effects language I'm working on now being the latest one. I made a wiki page for Iron Lambda because I think it will be useful in its own right, to help intermediate Coq users bridge the gap between the end of the Software Foundations course and what appears in current research papers. The biggest language in Software Foundations is Simply Typed Lambda Calculus (STLC)+Records+SubTyping, but it doesn't include polymorphism or indirect mutual recursion. To deal with SystemF style polymorphism we need a better approach to names than unique identifiers (ie, something like deBruijn indices). To deal with indirectly mutually recursive language definitions we need to know how to define custom own induction schemes. Iron Lambda shows how to do these things.

For the impatient:

DDC 0.3.1 
 $ cabal update
 $ cabal install ddc-tools

Iron Lambda
 $ darcs get http://code.ouroborus.net/iron/iron-head/