# Write you an interpreter

Posted on December 30, 2016 by Kwang Yul Seo
Tags: interpreter

Writing an interpreter for a functional language is a good exercise in Haskell. There are several tutorials on this topic.

Implementation techniques used in these tutorials are similar even though their source languages are distinct. They all compile the source language into a small core language based on lambda calculus, and evaluate the program with a context (or an environment).

In this post, I am not going to revisit this common technique. Instead, I will show you how to compile a program to a finite, fixed set of combinators (SKI), and then evaluate these combinators as normal Haskell function. This technique was introduced in Matthew Naylor’s Evaluating Haskell in Haskell.

The source code is available here.

# Poly

We are going to borrow the parser and type checker from Stephen Diehls’s Poly, a simple ML dialect with definitions, let polymorphism and a fixpoint operator.

An example of Poly:

``let rec factorial n = if (n == 0) then 1 else (n * (factorial (n-1)));``

The core language of Poly is a variant of lambda calculus. `Let`, `If`, `Fix` and `Op` are added as additional constructs.

``````type Name = String

data Expr
= Var Name
| App Expr Expr
| Lam Name Expr
| Let Name Expr Expr
| Lit Lit
| If Expr Expr Expr
| Fix Expr
| Op Binop Expr Expr
deriving (Show, Eq, Ord)

data Lit
= LInt Integer
| LBool Bool
deriving (Show, Eq, Ord)

data Binop = Add | Sub | Mul | Eql
deriving (Eq, Ord, Show)``````

# Desugar

Our first task is to desugar `Let`, `If`, `Fix` and `Op` to simplify the later stage of compilation.

``````desugar :: Expr -> Expr
desugar (App fun arg) = App (desugar fun) (desugar arg)
desugar (Lam x body) = Lam x (desugar body)
desugar (Let x e body) = App (Lam x (desugar body)) (desugar e)
desugar (If cond tr fl) = foldl App (Var "\$IF") args
where args = map desugar [cond, tr, fl]
desugar (Fix e) = App (Var "\$FIX") (desugar e)
desugar (Op op a b) = foldl App (Var n) args
where
args = map desugar [a, b]
n = case op of
Sub -> "\$SUB"
Mul -> "\$MUL"
Eql -> "\$EQL"
desugar e = e``````

`desugar` function converts `let x = e in body` into `(\x -> body) e`. `If`, `Fix` are `Op` are desugared into function applications. `\$IF`, `\$FIX`, `\$ADD`, `\$SUB`, `\$MUL`, `\$EQL` will be provided as primitive functions. (Note that `\$IF` can be a function because we piggy back on the lazy evaluation of the host language, Haskell.)

# Compilation to SKI combinators

The next step is to compile expressions into a fixed, finite combinators. The key idea is to replace `Lam` and `Ap` constructors with Haskell’s built-in lambda and application constructs. The original interpreter of Poly is slow because it emulates beta reduction on top of Haskell, but our implementation avoids this overhead by utilizing the host system’s support for beta-reduction.

For example,

``Lam "f" (Lam "a" (Lam "b" (App (App (Var "f") (Var "b") (Var "a")))``

is compiled to

``CLam (\f -> CLam (\a -> CLam (\b -> ap (ap f b) a)))``

Here’s the definition of `CExpr`. You can see that `CLam` contains a Haskell function `CExpr -> CExpr`. No variable in the lambda abstraction is necessary.

``````data CExpr
= CVar Name
| CApp CExpr CExpr
| CLam (CExpr -> CExpr)
| CBool Bool
| CInt Integer``````

`compile` transforms a lambda calculus expression into an expression involving only `S`, `K`, `I` and constants. The SK compilation algorithm is well described in Simon Peyton Jones’s The Implementation of Functional Programming Languages.

``````compile :: Expr -> CExpr
compile (Var n) = CVar n
compile (App fun arg) = CApp (compile fun) (compile arg)
compile (Lam x body) = abstract x (compile body)
compile (Lit (LInt k)) = CInt k
compile (Lit (LBool k)) = CBool k

abstract :: Name -> CExpr -> CExpr
abstract x (CApp fun arg) = combS (abstract x fun) (abstract x arg)
abstract x (CVar n) | x == n = combI
abstract _ k = combK k

combS :: CExpr -> CExpr -> CExpr
combS f = CApp (CApp (CVar "\$S") f)

combK :: CExpr -> CExpr
combK = CApp (CVar "\$K")

combI :: CExpr
combI = CVar "\$I"``````

For example, `(\x -> + x x) 5` is transformed as follows:

``````S --> S (\x -> + x) (\x -> x) 5
S --> S (S (\x -> +) (\x -> x)) (\x -> x) 5
I --> S (S (\x -> +) I) (\x -> x) 5
I --> S (S (\x -> +) I) I 5
K --> S (S (K +) I) I 5``````

# Primitives

Here’s the definition of our primitive functions:

``````infixl 0 !
(!) :: CExpr -> CExpr -> CExpr
(CLam f) ! x = f x

primitives :: [(String, CExpr)]
primitives =
[ ("\$I", CLam \$ \x -> x)
, ("\$K", CLam \$ \x -> CLam \$ \_ -> x)
, ("\$S", CLam \$ \f -> CLam \$ \g -> CLam \$ \x -> f!x!(g!x))
, ("\$IF", CLam \$ \(CBool cond) -> CLam \$ \tr -> CLam \$ \fl -> if cond then tr else fl)
, ("\$FIX", CLam \$ \(CLam f) -> fix f)
, ("\$ADD", arith (+))
, ("\$SUB", arith (-))
, ("\$MUL", arith (*))
, ("\$EQL", logical (==))
]

arith :: (Integer -> Integer -> Integer) -> CExpr
arith op = CLam \$ \(CInt a) -> CLam \$ \(CInt b) -> CInt (op a b)

logical :: (Integer -> Integer -> Bool) -> CExpr
logical op = CLam \$ \(CInt a) -> CLam \$ \(CInt b) -> if op a b then true else false

true, false :: CExpr
true = CBool True
false = CBool False``````

The final step is link our compiled program with other functions and primitives in the environment. `link` traverses the structure of `CExpr` and replaces `CVar` node with the actual function definition.

``````type TermEnv = Map.Map String CExpr

emptyTmenv :: TermEnv
emptyTmenv = Map.fromList primitives

link :: TermEnv -> CExpr -> CExpr
link bs (CApp fun arg) = link bs fun ! link bs arg
link bs (CVar n) = fromJust (Map.lookup n bs)
link _ e = e``````

# Eval

Finally, `eval` is just a composition of `desugar`, `compile` and `link env`.

``````eval :: TermEnv -> Expr -> CExpr
eval env = link env . compile . desugar

runEval :: TermEnv -> String -> Expr -> (CExpr, TermEnv)
runEval env nm ex =
let res = eval env ex in
(res, Map.insert nm res env)``````

# Optimization

The basic compilation algorithm shown above tends to produce large combinator expressions. New combinators such as `B`, `C`, `S'`, `B'` and `C'` can optimize both execution speed and program size.