- Demo time!
- Show me the code!
The nanopass framework is a scheme framework for writing compilers using tiny passes transforming the AST. The framework helps you immensely by generating transformers for you, in particular it handles recursively calling the appropriate transformers for the different kinds of terms in your AST.
Its implementations support Chez scheme and Racket. The Racket implementation also has some accompanying documentation.
However I found more answers to my hands-on questions by looking at the scheme-to-c compiler source. Although it's very well commented, it's quite a difficult read when not used to the framework.
So in order to showcase some of the framework's features and perhaps help new users, I decided to write down a small example of transforming a silly arithmetic language into Church encoded lambda calculus, which is trivial to then transform back into scheme.
For this example I've used Chez scheme for no other reason than that I saw Friedman and Byrd use it when showing of miniKanren.
The example code has a REPL, where we can evaluate our highly advanced scientifically important arithmetic calculations:
silly-church> (+ (- 3 1) (* 2 4)) 10
For a moment one might have the desire to know the resulting Church encoding, that urge can be satisfied as well:
> (encode '(+ (- 3 1) (* 2 4))) (((lambda (m) (lambda (n) (lambda (f) (lambda (x) ((m f) ((n f) x)))))) ((((lambda (pred) (lambda (m) (lambda (n) ((n pred) m)))) (lambda (n) (lambda (f) (lambda (x) (((n (lambda (g) (lambda (h) (h (g f))))) (lambda (u) x)) (lambda (u) u)))))) (lambda (f) (lambda (x) (f (f (f x)))))) (lambda (f) (lambda (x) (f x))))) (((lambda (m) (lambda (n) (lambda (f) (m (n f))))) (lambda (f) (lambda (x) (f (f x))))) (lambda (f) (lambda (x) (f (f (f (f x))))))))
but without further stalling for time...
Show me the code!
The code is kept here and as an overview, the small "compiler" goes through these passes:
The passes are primarily designed to showcase different feature of the nanopass framework, or secondarily perhaps to keep your black box hot.
Lsrc and the pass
(define-language Lsrc (terminals (number (n)) (operator (op))) (Expr (e) n (op e0 e1)))
The terminals all need to have corresponding predicates, and since
is already defined we only need to define:
(define (operator? x) (memq x '(+ - *)))
The first and last pass of a compiler written in the nanopass framework are quite different from the passes in between. Or to be precise, any pass to or from anything other than a language defined in the framework's syntax.
The first pass has to transform a regular scheme list, represented
in the nanopass framework as the
* language, into our
Here is how that is done:
(define-pass ast-to-Lsrc : * (ast) -> Lsrc () (parse : * (e) -> Expr () (cond [(number? e) e] [(and (list? e) (= 3 (length e))) (let ([op (car e)] [e0 (parse (cadr e))] [e1 (parse (caddr e))]) `(,op ,e0 ,e1))])) (parse ast))
Noteworthy things are:
- The defined pass
ast-to-Lsrctransforms the untyped language
*into the language
parsedefines a transformer from anything
*(i.e. a list) into the
Expr:s of the target language
Lsrc. The first pair of parenthesis name the arguments for the transformer and the last pair tells you that it will not return any additional values.
- Because the input language is
*the framework can't automatically figure out how to kick off the transformation, hence the need of the last explicit call to our transformer
(parse ast). The
astterm in the call is given its name when defining the type of the pass in the first line. (Note that, just like transformers, passes can receive multiple arguments as well as return multiple values.)
L1 and the pass
Here we encode the numbers into lambdas and applications, so the next language reflects exactly that:
(define-language L1 (extends Lsrc) (terminals (- (number (n))) (+ (variable (v)))) (Expr (e) (- n) (+ v) (+ (lambda (v) e)) (+ (apply e0 e1))))
To look at the full definition of a language do:
> (language->s-expression L1) (define-language L1 (entry Expr) (terminals (variable (v)) (operator (op))) (Expr (e) (apply e0 e1) (lambda (v) e) v (op e0 e1)))
encode-numbers pass encodes numbers using a small recursive function:
(define-pass encode-numbers : Lsrc (ast) -> L1 () (Expr : Expr (e) -> Expr () [,n (letrec ([go (lambda (n) (if (= n 0) `x `(apply f ,[go (- n 1)])))]) `(lambda (f) (lambda (x) ,[go n])))]))
Note that since the
(op e0 e1) expression is not affected by this transformation
it's not explicitly stated in the transformer. Here's the automagic, the
nanopass framework generates cases for these with the
appropriate recursive calls. That is, even deeply nested numbers get
encoded by this pass without us having to bother to take care of the recursion
over the AST.
L2 and the pass
Since our target lambda calculus only support lambda abstractions
with arity one we need to expand the dyadic
(op e0 e1) into two
L2 language reflects this change:
(define-language L2 (extends L1) (Expr (e) (+ op) (- (op e0 e1))))
Note that the
op terminal became an
Expr on its own, since it
will take the place of one the
(apply e0 e1).
curry-operators pass does the actual work:
(define-pass curry-operators : L1 (ast) -> L2 () (Expr : Expr (e) -> Expr () [(,op ,e0 ,e1) `(apply (apply ,op ,[Expr e0]) ,[Expr e1])]))
Note that we need to explicitly invoke our
to transform the two sub-expressions
e1. That is, the framework
does not automagically transform the input expression
e0 of type
L2 Expr, but that's exactly the type of the defined
L3 and the pass
Now we're quite ready to encode the operators:
(define-language L3 (extends L2) (terminals (- (operator (op)))) (Expr (e) (- op)))
and what we're left with is just:
> (language->s-expression L3) (define-language L3 (entry Expr) (terminals (variable (v))) (Expr (e) (apply e0 e1) (lambda (v) e) v))
The actual pass transforms the operators following
Since this in itself is not a nanopass related exercise, I've taken
the opportunity to showcase the
with-output-language syntax, which interprets
quasiquoted lists as the chosen output language's non-terminal (here the
(define-pass encode-operators : L2 (ast) -> L3 () (definitions (with-output-language (L3 Expr) (define plus `(lambda (m) (lambda (n) (lambda (f) (lambda (x) (apply (apply m f) (apply (apply n f) x))))))) (define pred `(lambda (n) (lambda (f) (lambda (x) (apply (apply (apply n (lambda (g) (lambda (h) (apply h (apply g f))))) (lambda (u) x)) (lambda (u) u)))))) (define minus `(apply (lambda (pred) (lambda (m) (lambda (n) (apply (apply n pred) m)))) ,pred)) (define multiply `(lambda (m) (lambda (n) (lambda (f) (apply m (apply n f)))))))) (Expr : Expr (e) -> Expr () [,op (case op [+ plus] [- minus] [* multiply] [else (error 'encode-operators "unsupported operator" op)])]))
The final pass simply transforms the
L3 language into scheme,
(define-pass output-scheme : L3 (ast) -> * () (Expr : Expr (e) -> * () [,v v] [(apply ,e0 ,e1) `(,[Expr e0] ,[Expr e1])] [(lambda (,v) ,e) `(lambda (,v) ,[Expr e])]))
and this pass completes the journey from
*! Happy passing!