Writer monad

- January 21, 2017
Kwang's Haskell Blog - Writer monad

Writer monad

Posted on January 21, 2017 by Kwang Yul Seo

The Writer monad represents computations which produce a stream of data in addition to the computed values. It is commonly used by code generators to emit code.

transformers provides both the strict and lazy versions of WriterT monad transformer. The definition of bind operator >>= reveals how the Writer monad works.

instance (Monoid w, Monad m) => Monad (WriterT w m) where
    return a = writer (a, mempty)
    m >>= k  = WriterT $ do
        (a, w)  <- runWriterT m
        (b, w') <- runWriterT (k a)
        return (b, w `mappend` w')

runWriterT returns a pair whose second element is the output to accumulate. Because the output value is a Monoid instance, we can merge two outputs w and w' using mappend and return the combined output.

Here is a simple example of the Writer monad. It accumulates LogEntrys in a list. (CAUTION: Do not use WriterT for plain logging in real world applications. It unnecessarily keeps the entire logs in memory. I recommend fast-logger for logging.)

import Control.Monad
import Control.Monad.Trans.Writer.Strict

data LogEntry = LogEntry { msg::String }
  deriving (Eq, Show)

calc :: Writer [LogEntry] Integer
calc = do
  output "start"
  let x = sum [1..10000000]
  output (show x)
  output "done"
  return x

output :: String -> Writer [LogEntry] ()
output x = tell [LogEntry x]

test = mapM_ print $ execWriter calc

The code looks innocuous, but its performance deteriorates when the accumulated log gets bigger because the Monoid instance of [] uses (++) to append two lists and the concatenations are left-nested.

do { tell [1]; tell [2]; tell [3]; tell[4]; tell [5] }
=>
(((([1] ++ [2]) ++ [3]) ++ [4]) ++ [5])

(++) is known to perform poorly when applications of (++) are left-nested.

Difference List

One well-known solution is to use the difference list instead of an ordinary list. DList provides O(1) append and snoc operations on lists. Demystifying DList explains how DList works in details.

The code is almost the same except we replaced [LogEntry] with DList LogEntry, but it scales well as the accumulated log gets bigger.

import Data.DList

calc :: Writer (DList LogEntry) Integer
calc = ...

output :: String -> Writer (DList LogEntry) ()
output x = tell (singleton (LogEntry x))

test = mapM_ print $ toList (execWriter calc)

Endo

Another option is to use Endo wrapper from Data.Monoid. It is an endomorphism from type a to a.

newtype Endo a = Endo { appEndo :: a -> a }
               deriving (Generic)

Surprisingly, it is an instance of Monoid. mempty is the identity function and mappend is the composition of two functions.

instance Monoid (Endo a) where
        mempty = Endo id
        Endo f `mappend` Endo g = Endo (f . g)

But how can I output a log? We need a function of type [LogEntry] -> [LogEntry] to make an Endo value. The trick is to create a section ([LogEntry x]<>) which prepends a log entry to the list.

calc :: Writer (Endo [LogEntry]) Integer
calc = ...

output :: String -> Writer (Endo [LogEntry]) ()
output x = tell $ Endo ([LogEntry x]<>)

test = mapM_ print $ appEndo (execWriter calc) []

But why does this use of Endo perform well? To see why, we need to see how the following code is actually evaluated.

do { tell [1]; tell [2]; tell [3]; tell[4]; tell [5] }

is translated to

([1]++) . ([2]++) . ([3]++) . ([4]++) . ([5]++)

This is a composition of functions whose type is [Int] -> [Int]. We can obtain the final result by applying [].

([1]++) . ([2]++) . ([3]++) . ([4]++) . ([5]++) $ []
=>
[1] ++ ([2] ++ ([3] ++ ([4] ++ ([5] ++ []))))

We can see that (++) operators are right-nested.

This also explains why DList in the previous section performs well because DList is just Endo specialized to lists.

newtype DList a = DL { unDL :: [a] -> [a] }

instance Monoid (DList a) where
    mempty  = DL id
    mappend xs ys = DL (unDL xs . unDL ys)

State Monad

It is possible to implement the Writer monad in terms of the State monad. We can store the accumulated logs in the state and update it by appending a new log.

import Control.Monad.Trans.State
import Data.Monoid ((<>))

calc :: State [LogEntry] Integer
calc = ...

output :: String -> State [LogEntry] ()
output x = modify (<> [LogEntry x])

test = mapM_ print $ execState calc []

Unfortunately, this version has the same performance issue with the initial version because applications of (++) are left-nested.

But there is a magical trick that can change this situation.

Backward State Monad

The section “2.8 Variation six: Backwards state” of Philip Wadler’s The essence of functional programming briefly mentions the Backwards state monad (also known as reverse state monad). This is a strange variant of the State monad where the state is propagated backward.

newtype RState s a = RState { runRState :: s -> (a,s) }

instance Monad (RState s) where
    return x = RState $ (,) x
    RState sf >>= f = RState $ \s ->
        let (a,s'') = sf s'
            (b,s') = runRState (f a) s
        in (b,s'')

rget = RState $ \s -> (s,s)
rmodify f = RState $ \s -> ((),f s)
rput = rmodify . const

execRState f s = snd (runRState f s)

In the definition of >>=, the state s is passed to the second expression and its result s' is passed back to the first expression. This seems impossible because two expressions are mutually recursive, but Haskell’s lazy evaluation makes it possible. In the backward state monad, rget reads the state from the future!

With this in mind, we can implement the Writer monad by prepending the log to the state. Because the state contains all the future logs, we can simply prepend our log to it.

calc :: RState [LogEntry] Integer
calc = ...

output :: String -> RState [LogEntry] ()
output x = rmodify ([LogEntry x]<>)

test = mapM_ print $ execRState calc []

Applications of (++) are right-nested because logs are accumulated backward from the end.

Readers who would like to know more about the backward state monads are referred to:

Read more

Difference List for Writer Monad

- January 14, 2014
Kwang's Haskell Blog - Difference List for Writer Monad

Difference List for Writer Monad

Posted on January 14, 2014 by Kwang Yul Seo

In writing a compiler in Haskell, it is conventional to use the Writer monad to accumulate emitted code.

The class declaration of MonadWriter is

class (Monoid w, Monad m) => MonadWriter w m | m -> w where

Here w must be a Monoid instance, so that MonadWriter can combine the outputs of the subcomputations using mappend.

A naive implementation of a compiler can use MonadWriter [String] to accumulate emitted code. Haskell list is an instance of Monoid and its mappend function is implemented with (++) which appends two lists.

This looks okay at first, but it is not an efficient way to use the Writer monad because the time complexity of (++) is O(n) on the length of the first operand. Because the first operand gets bigger as MonadWriter accumulates more code, the time complexity of MonadWriter [a] becomes quadratic.

A Difference list is the rescue because it is a Monoid instance which supports O(1) append and snoc operations on lists. So we can use MonadWriter (DList Instruction) instead of MonadWriter [Instruction] to accumulate emitted code and convert it to a list using DList.toList.

Wikipedia explains the term difference list as in the following:

In the second approach, difference lists are implemented as single-argument functions, which take a list as argument and prepend to that list. As a consequence, concatenation of difference lists of the second type is implemented essentially as function composition, which is O(1). However, of course the list still has to be constructed eventually (assuming all of its elements are needed), which is plainly at least O(n).

There is a chapter on Real World Haskell about the difference list: Chapter 13. Data Structures. Learn you Haskell for a Great Good also provides a chapter on the difference list: For a Few Monads More.

Read more