Kwang's Haskell Blog - Organize code using monads

Organize code using monads

Posted on December 23, 2016 by Kwang Yul Seo

Tags: monad, monad transformer, lambda lifting

Novice Haskell programmers think that monads are only for IO and stateful computations. But experienced Haskell programmers use monads to better structure their programs.

In this blog post, I am going to show you how we can better organize our code using monads.

Motivating example: Lambda lifting

Lambda lifting is a compiler transformation which eliminates all free variables from function definitions. It is an important step in a lazy functional language because it greatly simplifies evaluation on the graph reduction machine.

In his paper, Simon Peyton Jones describes how to perform lambda lifting in a modular fashion. The lambda lifter works in three steps:

-- | freeVars: Annotate every node in the expression with its free variables.
freeVars :: Expression -> AnnExpr Name (Set Name)

-- | Abstract the free variables from each lambda abstraction, replacing the lambda abstraction with the application of the new abstraction.
abstract :: AnnExpr Name (Set Name) -> Expression

-- | Give a unique name to each supercombinator and collect all the supercombinator definitions.
collectSCs :: Expression -> [SCDefn]

lambdaLift is the composition of these three functions.

lambdaLift :: Expression -> [SCDefn]
lambdaLift = collectSCs . abstract . freeVars

I am not going to explain the details of these steps in this post. Interested readers are referred to SPJ’s paper and book.

Instead, let’s dive into the last step and see how collectSCs is actually implemented.

Collecting supercombinators

collectSCs is defined in terms of a helper function named collecSC' which returns both the collection of supercombinators it has found and the transformed expression. It also carries around a name supply as an argument and returns the depleted supply as a result because it needs to generate fresh names for supercombinators.

-- | Gives a unique name to each supercombinator, collects all the
-- supercombinator definitions into a single list, and introduce the
-- $main supercombinator definition.
collectSCs :: Expression -> [SCDefn]
collectSCs e = ("$main", [], e') : scs
  where
  (_, scs, e') = collectSCs' initialNameSupply e

collectSCs' :: NameSupply -> Expression -> (NameSupply, [SCDefn], Expression)
collectSCs' ns (EConst k) = (ns, [], EConst k)
collectSCs' ns (EVar v) = (ns, [], EVar v)
collectSCs' ns (EAp e1 e2) =
  (ns2, scs1 ++ scs2, EAp e1' e2')
  where
  (ns1, scs1, e1') = collectSCs' ns e1
  (ns2, scs2, e2') = collectSCs' ns1 e2
collectSCs' ns (ELam args body) =
  (ns2, (name, args, body') : bodySCs, EConst (CFun name))
  where
  (ns1, bodySCs, body') = collectSCs' ns body
  (ns2, name) = newName ns1 "SC"
collectSCs' ns (ELet isRec defns body) =
  (ns2, scs, ELet isRec defns' body')
  where
  (ns1, bodySCs, body') = collectSCs' ns body
  ((ns2, scs), defns') = mapAccumL collectSCs'' (ns1, bodySCs) defns

  collectSCs'' (ns, scs) (name, rhs) =
    ((ns1, scs ++ scs'), (name, rhs'))
    where
    (ns1, scs', rhs') = collectSCs' ns rhs

The code is rather complex compared to what it actually does. The only place where interest things happen is lambda abstractions. It replaces lambda abstractions by names and return supercombinators.

The code is complex because it violates the most important software engineering principle: separation of concerns. collectSCs contains at least three orthogonal concerns:

Generation of fresh names
Accumulation of supercombinatros
Transformation of expressions

Organize code using monads

Monads are great tools to separate concerns. For example, Reader monad helps us get rid of an extra argument used to pass a context. Writer monad frees us from the agony of returning the accumulated results in every function.

So let’s separate our concerns in collectSCs.

Supply monad for fresh name generation (it’s a State monad in disguise)
Writer monad for accumulation of supercombinators

Because we need to compose two different monads, we use the monad transformer SupplyT.

type Collector a = SupplyT (Writer [SCDefn]) a

collectSCs :: Expression -> [SCDefn]
collectSCs e = ("$main", [], e') : scs
  where
  (e', scs) = runWriter $ evalSupplyT 0 (collectSCs' e)

collectSCs' :: Expression -> Collector Expression
collectSCs' (EConst k) = return $ EConst k
collectSCs' (EVar v) = return $ EVar v
collectSCs' (EAp e1 e2) = do
  e1' <- collectSCs' e1
  e2' <- collectSCs' e2
  return $ EAp e1' e2'
collectSCs' (ELam args body) = do
  name <- freshName
  body' <- collectSCs' body
  collect (name, args, body')
  return $ EConst (CFun name)
collectSCs' (ELet isRec defns body) = do
  body' <- collectSCs' body
  defns' <- traverse (\(name, defn) -> (name,) <$> collectSCs' defn) defns
  return $ ELet isRec defns' body'

collect :: SCDefn -> Collector ()
collect defn = tell [defn]

We can see that the resulting code is much more readable by removing all the clutters that not not essential to the core logic.