Haskell for Pragmatic Programmers

Reified dictionaries

2017-02-06T00:00:00Z

GADTs allow us to reify a constraint as an explicit dictionary. With ConstraintKinds, we can generalize this trick further. In this post, I will explain how the trick works.

In his article, Constraint Kinds for GHC, Max Bolingbroke showed a trick of reifying a constraint as an explicit dictionary using a GADT:

{-# LANGUAGE GADTs #-}

data ShowDict a where
  ShowDict :: Show a => ShowDict a

showish :: ShowDict a -> a -> String
showish ShowDict x = show x

use_showish :: String
use_showish = showish ShowDict 10

How does this trick work? The GADTs extension plays an essential role here. When GADTs is enabled, a type-class context given in a constructor’s signature becomes available by pattern matching. In the example above, pattern matching on ShowDict makes the Show a context available in the body of the showish function.

Operationally, the ShowDict constructor has a hidden field that stores the (Show a) dictionary passed to ShowDict, so pattern matching makes that dictionary available on the right-hand side of the match. Section 9.4.7 of the GHC user guide explains this behavior in detail.

We can observe the (Show a) dictionary instance hidden in the constructor by dumping the GHC simplifier output. Pattern matching on the constructor reveals the hidden dictionary $dShow_aKG as follows.

showish_roY :: forall a_ayV. ShowDict a_ayV -> a_ayV -> String
[GblId, Arity=2, Caf=NoCafRefs, Str=DmdType]
showish_roY =
  \ (@ a_aKE) (ds_d10M :: ShowDict a_aKE) (x_ayW :: a_aKE) ->
    case ds_d10M of _ [Occ=Dead] { ShowDict $dShow_aKG ->
    show @ a_aKE $dShow_aKG x_ayW
    }

With the ConstraintKinds extension, we can further generalize this idea by passing an arbitrary context to the constructor.

{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE GADTs #-}

data Dict ctxt where
  Dict :: ctxt => Dict ctxt

showish' :: Dict (Show a) -> a -> String
showish' Dict x = show x

use_showish' :: String
use_showish' = showish' Dict 10
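
Because Dict takes an arbitrary context, it can also carry several constraints at once. The following is a minimal sketch under that assumption; showDouble and example are hypothetical names that do not appear in Bolingbroke’s article, and the explicit Constraint kind annotation is only there for readability.

```haskell
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

import Data.Kind (Constraint)

-- The same Dict as above, with the kind of its parameter written out.
data Dict (c :: Constraint) where
  Dict :: c => Dict c

-- Hypothetical helper: matching on Dict brings both Show a and Num a
-- into scope, so (+) and show can be used on the argument.
showDouble :: Dict (Show a, Num a) -> a -> String
showDouble Dict x = show (x + x)

-- The annotation fixes a to Int, so Dict captures the Show Int and
-- Num Int dictionaries.
example :: String
example = showDouble Dict (21 :: Int)
```

Here example evaluates to "42".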

Avoid overlapping instances with closed type families

2017-02-05T00:00:00Z

Overlapping instances are one of the most controversial features in Haskell. Fortunately, there are many tricks that let us avoid overlapping instances. In this post, I will introduce one such trick which uses closed type families.

Why overlapping instances are bad

In Haskell, we expect that adding an instance to one module does not cause modules that depend on it to fail to compile or to behave differently, as long as the dependent modules use explicit import lists.

Unfortunately, OverlappingInstances breaks this expectation.

Module A

{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FunctionalDependencies #-}

module A where

class C a b c | a b -> c where
  f :: a -> b -> c

instance C String a String where
  f s _ = s

Module B

module B where

import A(C(..))

func :: String -> Int -> String
func = f

func "foo" 3 evaluates to "foo".

Let’s add a new instance declaration in A.

instance {-# OVERLAPPING #-} C String Int String where
  f s i = concat $ replicate i s

Module B still compiles, but func "foo" 3 now evaluates to "foofoofoo" because C String Int String is more specific than C String a String.

We can see that adding an extra instance silently broke backward compatibility. To make matters worse, there is no way to go back to the old behavior. GHC automatically chooses the more specific instance. In this case, C String Int String is chosen because it is more specific than C String a String.

Use cases of overlapping instances

Overlapping instances are controversial because they are too useful to remove. Overlapping instances are appealing because they express the common pattern of adding a special case to an existing set of overloaded functions.

Let’s check how show method from Prelude handles a list.

λ> show [1,2,3]
"[1,2,3]"
λ> show [False, True, False]
"[False,True,False]"

It converts a given list to a string by putting a comma between elements. By that rule, it should show "foo" as ['f','o','o']. But show handles a string (a list of characters) in a different manner.

λ> show "abc"
"\"abc\""

This requires overlapping instances because [a] overlaps with [Char].

instance Show a => Show [a] where
  ...

instance {-# OVERLAPPING #-} Show [Char] where
  ...

Haskell 98 solution

The Haskell Prelude avoids overlapping instances by using the extra-method trick. The trick does not require any GHC extensions, but class definitions become more complicated. Interested readers are referred to Brandon Simmons’s How the Haskell Prelude Avoids Overlapping Instances in Show for the details.
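
The idea of the extra-method trick can be sketched in a few lines. This is a simplified illustration, not the Prelude’s actual code: MyShow, myShow and myShowList are hypothetical names (the real class is Show and its extra method is showList), and the string case below does not escape special characters.

```haskell
import Data.List (intercalate)

class MyShow a where
  myShow :: a -> String

  -- The extra method: how to render a list of a.
  -- The default implementation is the generic bracketed format.
  myShowList :: [a] -> String
  myShowList xs = "[" ++ intercalate "," (map myShow xs) ++ "]"

instance MyShow Bool where
  myShow True  = "True"
  myShow False = "False"

instance MyShow Char where
  myShow c = ['\'', c, '\'']
  -- The special case lives in the element's instance,
  -- so no separate instance for [Char] is needed.
  myShowList s = "\"" ++ s ++ "\""

-- The single list instance delegates to the element type.
instance MyShow a => MyShow [a] where
  myShow = myShowList
```

With these definitions, myShow [False, True] gives "[False,True]" while myShow "abc" gives "\"abc\"", and only one instance for lists exists.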

Another solution with closed type families

This solution is a variation of the solution introduced in the Overcoming Overlapping section of Oleg Kiselyov’s Type equality predicates: from OverlappingInstances to overcoming them.

Here’s the list of GHC extensions and imports we need.

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE UndecidableInstances #-}

import Data.Proxy

F is a type-level function which returns 'True for Char and 'False for any other type. This does not require overlapping instances because the set of special cases is closed.

type family (F a) :: Bool where
  F Char  = 'True
  F a     = 'False

The ShowList class defines the showl method.

class ShowList a where
  showl :: [a] -> String

The type checker computes the type of flag by evaluating F a and dispatches the method based on the type of flag. If it is 'True, it searches the special case instances. Otherwise, it searches the generic case instance.

instance (F a ~ flag, ShowList' flag a) => ShowList a where
  showl = showl' (Proxy :: Proxy flag)

class ShowList' (flag :: Bool) a where
  showl' :: Proxy flag -> [a] -> String

instance ShowList' 'True Char where
  showl' _ x = x

instance (Show a) => ShowList' 'False a where
  showl' _ x = show x

We can add another special case for Bool as follows:

type family (F a) :: Bool where
  F Char  = 'True
  F Bool  = 'True
  F a     = 'False

instance ShowList' 'True Bool where
  showl' _ x = map toBinaryDigit x
    where toBinaryDigit False = '0'
          toBinaryDigit True  = '1'

Now showl [True,False,True] evaluates to "101" instead of "[True,False,True]".

Context reduction

2017-02-02T00:00:00Z

Hello, Haskellers! Today I am going to explain what context reduction is and why it is necessary.

Quiz

Let’s start with a quick quiz. What’s the type of f?

f xs y  =  xs == [y]

The return type of f must be Bool because the type of == is Eq a => a -> a -> Bool. If we assume the type of y is t, the type of xs must be [t] because the two operands of == must have the same type. The type constraint must be Eq [t] because two lists are compared for equality. So we expect the type of f to be Eq [t] => [t] -> t -> Bool.

Let’s check the type in GHCi.

λ> f xs y  =  xs == [y]
f :: Eq t => [t] -> t -> Bool

Surprisingly, the context is Eq t instead of Eq [t]. Even though the equality is taken at the list type, the context must be simplified. This is called context reduction and is specified in Haskell 2010 Language Report (also in Haskell 98).

Context reduction

Type Classes and Constraint Handling Rules mentions two reasons why context reduction in Haskell is important.

Syntactically, context reduction allows the type checker to present type class constraints to the programmer in a more readable form.
Operationally, context reduction allows the type checker to put type class constraints into a more efficient form. Type class constraints are translated into dictionaries. Hence, simplifying type class constraints may allow a more efficient translation.

Let’s visit each reason with concrete examples.

Readability

What’s the type of g?

g a b = [show (a,a), show (a,b), show (b,a), show(b,b)]

If the type checker infers the type without simplification, it will be

g :: (Show (a,a), Show(b,b), Show (a,b), Show (b, a)) => a -> b -> [String]

But Haskell simplifies the context to

g :: (Show b, Show a) => a -> b -> [String]

The inferred type looks simpler to programmers.

Surprisingly, GHCi reports the simplified type even though I explicitly annotate the type with the former.

λ> :type g
g :: (Show b, Show a) => a -> b -> [String]

Efficient translation

GHC implements type classes as dictionary passing. Readers are referred to Section 4 of How to make ad-hoc polymorphism less ad hoc for the details.
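
To make the idea concrete, here is a hand-written sketch of dictionary passing for the quiz function f xs y = xs == [y]. It only approximates what GHC generates: EqDict, eqInt, eqList and f' are hypothetical names chosen for illustration.

```haskell
-- A hypothetical stand-in for the dictionary GHC builds for Eq.
newtype EqDict a = EqDict { eq :: a -> a -> Bool }

-- Plays the role of `instance Eq Int`.
eqInt :: EqDict Int
eqInt = EqDict (==)

-- Plays the role of `instance Eq a => Eq [a]`:
-- a function from an element dictionary to a list dictionary.
eqList :: EqDict a -> EqDict [a]
eqList d = EqDict go
  where
    go []       []       = True
    go (x : xs) (y : ys) = eq d x y && go xs ys
    go _        _        = False

-- f after context reduction: the caller passes only the Eq t dictionary,
-- and the Eq [t] dictionary is built from it inside the function.
f' :: EqDict t -> [t] -> t -> Bool
f' d xs y = eq (eqList d) xs [y]
```

For example, f' eqInt [3] 3 is True and f' eqInt [1, 2] 1 is False.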

Let’s see how type classes are actually translated by dumping the GHC simplifier output.

ghc -ddump-simpl -ddump-to-file -c a.hs

{-# NOINLINE f #-}
f :: (Eq a, Ord a) => a -> a -> Bool
f x y = x > y

main = print $ f 1 2

f_rn6 takes two dictionary arguments though the first one is never used (marked as Dead).

f_rn6
  :: forall a_aoY. (Eq a_aoY, Ord a_aoY) => a_aoY -> a_aoY -> Bool
[GblId, Arity=4, Caf=NoCafRefs, Str=DmdType]
f_rn6 =
  \ (@ a_a1vN)
    _ [Occ=Dead]
    ($dOrd_a1vP :: Ord a_a1vN)
    (x_a1rc :: a_a1vN)
    (y_a1rd :: a_a1vN) ->
    > @ a_a1vN $dOrd_a1vP x_a1rc y_a1rd

Call sites of f must create and pass these dictionary arguments when they call f.

main :: IO ()
[GblId, Str=DmdType]
main =
  print
    @ Bool
    GHC.Show.$fShowBool
    (f_rn6
       @ Integer
       integer-gmp-1.0.0.1:GHC.Integer.Type.$fEqInteger
       integer-gmp-1.0.0.1:GHC.Integer.Type.$fOrdInteger
       1
       2)

Simplifying type class constraints allows a more efficient translation because it removes redundant dictionary arguments.

{-# NOINLINE f #-}
f x y = x > y

main = print $ f 1 2

is translated to

f_rn6 :: forall a_a1vz. Ord a_a1vz => a_a1vz -> a_a1vz -> Bool
[GblId, Arity=3, Caf=NoCafRefs, Str=DmdType]
f_rn6 =
  \ (@ a_a1vz)
    ($dOrd_a1zP :: Ord a_a1vz)
    (x_aoY :: a_a1vz)
    (y_aoZ :: a_a1vz) ->
    > @ a_a1vz $dOrd_a1zP x_aoY y_aoZ

f_rn6 takes only one dictionary argument $dOrd_a1zP because context reduction merged (Eq a, Ord a) into Ord a. This is a valid simplification because Ord a implies Eq a.
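
The fact that Ord a implies Eq a can be checked directly in source code. In this small sketch, atLeast is a hypothetical function that requests only Ord a yet still uses ==.

```haskell
-- Only Ord a appears in the context, but (==) from Eq is usable
-- because Eq is a superclass of Ord.
atLeast :: Ord a => a -> a -> Bool
atLeast x y = x == y || x > y
```

The Ord dictionary gives access to the Eq methods, which is why a separate Eq dictionary argument would be redundant.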

Formal semantics

The Haskell report provides only informal hints about context reduction.

Fortunately, Section 7.4 of Mark P. Jones’ Typing Haskell in Haskell gives us the formal semantics of context reduction in Haskell. Section 3.2 of Type classes: exploring the design space also discusses context reduction. Interested readers are referred to both papers.

Simple benchmarking with GHCi

2017-02-01T00:00:00Z

GHCi has a lesser known option :set +s. When turned on, GHCi displays some stats for each expression evaluated.

Let’s experiment with the option.

λ> :set +s

:set +s displays the elapsed time and number of bytes allocated after evaluating each expression.

λ> fibs = 0 : scanl (+) 1 fibs
(0.00 secs, 0 bytes)

The number of bytes allocated is zero for fibs because no GC has occurred.

NOTE: the allocation figure is only accurate to the size of the storage manager’s allocation area, because it is calculated at every GC. Hence, you might see values of zero if no GC has occurred.

λ> fibs !! 100
354224848179261915075
(0.01 secs, 110,440 bytes)

fibs !! 100 took 0.01 seconds and allocated 110,440 bytes of memory.

This is a quick-and-dirty way to get a feel for the performance of a function. If you need a serious benchmark, please use criterion instead.
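
For a similarly rough measurement outside GHCi, the same idea can be sketched with base alone. This is only an illustration and not a substitute for criterion: timed is a hypothetical helper, it measures CPU time rather than elapsed time, and it evaluates its argument only to weak head normal form.

```haskell
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

fibs :: [Integer]
fibs = 0 : scanl (+) 1 fibs

-- Hypothetical helper: force a value to weak head normal form and
-- report the CPU time spent, in seconds.
timed :: a -> IO (a, Double)
timed x = do
  start  <- getCPUTime               -- CPU time in picoseconds
  result <- evaluate x
  end    <- getCPUTime
  return (result, fromIntegral (end - start) / 1e12)
```

Calling timed (fibs !! 100) returns the number together with the measured time. Because fibs is a top-level value, a second call reuses the already computed list and reports less time.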

Type-level insertion sort

2017-01-30T00:00:00Z

Multi-parameter type classes and functional dependencies made type-level programming possible. Back in 2000, Thomas Hallgren showed an implementation of insertion sort as an example of static computation using functional dependencies. The code has a strong resemblance to logic programming which looks bizarre to most functional programmers. In this post, I will show you a more “functional-style” implementation of insertion sort using closed type families.

Term-level insertion sort

Here’s an implementation of insertion sort we all know.

sort :: (Ord a) => [a] -> [a]
sort [] = []
sort (x : xs) = insert x (sort xs)

insert :: (Ord a) => a -> [a] -> [a]
insert x [] = x : []
insert x (y : ys) = insert' (compare x y) x y ys

insert' :: (Ord a) => Ordering -> a -> a -> [a] -> [a]
insert' LT  x y ys = x : (y : ys)
insert' _   x y ys = y : insert x ys

l = [1, 3, 2, 4, 7, 9, 5]

sort l sorts the given list.

λ> sort l
[1,2,3,4,5,7,9]

To implement insertion sort at the type level, we must be able to define the following at the type level:

naturals, booleans and lists
functions

For the basics of type-level programming, readers are referred to Type-level functions using closed type families.

Insertion sort

Here’s an implementation of type-level insertion sort. One can see the strong similarity with the term-level insertion sort.

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}

import GHC.TypeLits (CmpNat)

type family Sort xs where
  Sort '[] = '[]
  Sort (x ': xs) = Insert x (Sort xs)

type family Insert x xs where
  Insert x '[] = x ': '[]
  Insert x (y ': ys) = Insert' (CmpNat x y) x y ys

type family Insert' b x y ys where
  Insert' 'LT  x y ys = x ': (y ': ys)
  Insert' _    x y ys = y ': Insert x ys

type L = [1, 3, 2, 4, 7, 9, 5]

In this simple scenario, converting a term-level function into a type-level function is almost mechanical. Just a few rules suffice.

sort -> type family Sort
[] -> '[]
(x : xs) -> (x ': xs)
compare -> CmpNat

We can evaluate Sort L using GHCi’s :kind! command.

λ> :kind! Sort L
Sort L :: [Nat]
= '[1, 2, 3, 4, 5, 7, 9]
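
The result can also be checked by the compiler instead of by hand in GHCi. The sketch below repeats the type families so that it stands alone and adds explicit kind annotations, which are my addition. sortedProof is a hypothetical name; it type-checks only if Sort reduces the left-hand list to the right-hand one.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}

import Data.Type.Equality ((:~:) (Refl))
import GHC.TypeLits (CmpNat, Nat)

type family Sort (xs :: [Nat]) :: [Nat] where
  Sort '[]       = '[]
  Sort (x ': xs) = Insert x (Sort xs)

type family Insert (x :: Nat) (xs :: [Nat]) :: [Nat] where
  Insert x '[]       = x ': '[]
  Insert x (y ': ys) = Insert' (CmpNat x y) x y ys

type family Insert' (o :: Ordering) (x :: Nat) (y :: Nat) (ys :: [Nat]) :: [Nat] where
  Insert' 'LT x y ys = x ': (y ': ys)
  Insert' _   x y ys = y ': Insert x ys

-- Refl has this type only when both sides are the same type,
-- so this definition is a compile-time test of Sort.
sortedProof :: Sort '[3, 1, 2] :~: '[1, 2, 3]
sortedProof = Refl
```

Changing the right-hand side to a list in a different order makes the module fail to compile.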

Build your Haskell project continuously

2017-01-28T00:00:00Z

Today I am going to introduce handy tools which help you build your Haskell project continuously so that you can see the list of errors and warnings quickly as you program.

Stack

The stack build command has a --file-watch option. When it is turned on, stack watches for changes in local files and rebuilds automatically.

stack build --file-watch

Use the --fast option if you want a fast build with optimizations turned off (-O0). Also use the --pedantic flag if you want to fix all warnings (-Wall and -Werror).

stack build --file-watch --fast --pedantic

ghcid

Neil Mitchell’s ghcid provides similar functionality in a different way. It runs GHCi as a daemon and runs :reload whenever your source code changes.

ghcid executes stack ghci by default if you have a stack.yaml file and a .stack-work directory.

ghcid

If you would like to give a custom command, use --command option.

ghcid "--command=ghci Main.hs"

ghcid is much faster than stack build because it uses GHCi.

Steel Overseer

If you want to run arbitrary commands when arbitrary files change, use Steel Overseer instead. You can specify the pattern and commands in .sosrc file using YAML syntax. The following example has two rules.

Watch *.hs files under src directory and run stack build.
Watch *.hs files under test directory and run stack test.

- pattern: src/(.*)\.hs
  commands:
  - stack build
- pattern: test/(.*)\.hs
  commands:
  - stack test

sos command watches the specified files and runs the corresponding commands.

sos

Wrap-up

These small tools greatly increase your productivity. Please choose one and enjoy instant feedback!

How I learned Haskell

2017-01-27T00:00:00Z

Happy lunar new year! Today I would like to share my experience of learning Haskell. It’s been a really long journey but a really wonderful one.

First Encounter

Back in 2000, I was an undergraduate student majoring in computer science. At that time, I was into system programming and enjoyed learning low-level implementation details of operating systems and system applications. My primary language was C and I wrote all my programming assignments in C. I was proud that I understood how my code was compiled and executed on the machine.

One day my friend told me about Haskell. I wasn’t interested in functional programming at that time, but I became curious because he enthusiastically persuaded me to learn Haskell. I ended up reading two books on Haskell.

Both books were really nice and I learned I could program in an entirely different way. However, I wasn’t sure if Haskell could solve the real-world problems I wanted to solve. So my interest in Haskell stopped there.

Dark age

My first job was mainly about porting and optimizing Java Virtual Machine for embedded systems. My company licensed Sun’s CDC JVM and I was responsible for maintaining it.

It was 2002 and Linux was still a luxury for embedded systems. RTOSes such as pSOS and VxWorks were popular on STBs and I ported JVM to these OSes. These RTOSes didn’t have distinctions between kernel and user space and an application was linked statically with the kernel and ran as a single (kernel) process application on the device.

The implication was profound. I had no safety guarantee provided by modern operating systems. A bug in an application could corrupt the kernel data and crash the entire system. Moreover, because there were dozens of threads competing for shared resources, race conditions and deadlocks were commonplace. It took hours or even days to find and fix a trivial bug.

The situation was much better when debugging an application written in Java. Thanks to the safety guarantee of Java, certain types of bugs were impossible. A Java program can’t corrupt memory and crash the system. Deadlocks are reported systematically by the JVM. It was relatively fun to fix a bug in Java applications.

These experiences motivated me to find systematic ways of preventing bugs and led me to read Benjamin C. Pierce’s Types and Programming Languages. It was the best computer science textbook I’ve ever read. I understood why types were important in statically typed functional languages like Haskell! If universities had used this book as an undergraduate PL textbook, many of the confusions and misunderstandings about dynamic vs static typing would have disappeared.

Stuck again

Having understood the merits of type systems, I started to learn Haskell from a different perspective. I read A Gentle Introduction To Haskell and many tutorials on monads. It wasn’t very hard to understand specific instances of monads such as Reader, Writer, State, List and Maybe. But I couldn’t figure out how they were related. I managed to write simple applications in Haskell, but wasn’t confident that I could use Haskell productively because I couldn’t fully understand one of the core ideas of Haskell.

The Challenges of Multi-core Programming

In the meantime, I changed my career and founded a tech start-up in 2008. I built mobile web browsers for the embedded systems. I created a port of WebKit and hacked various components of WebKit to speed up the performance. The primary means for optimization was to leverage the multi-core CPU and GPU.

WebKit performs lots of tasks concurrently but it is mostly single-threaded. Loading a page does not benefit much from having a multi-core CPU. So I offloaded some tasks to separate threads but I only gained marginal performance benefits in exchange for greatly increased complexity. I learned that I must pay a very high cost in complexity to get a small performance boost. Considering the already complex nature of WebKit, I ended up abandoning most of the performance optimizations to keep the complexity under control.

While struggling to squeeze performance out of WebKit, I learned Haskell again to get some insights on parallel programming because Haskell was the only programming language which natively supported STM (Software Transactional Memory). Simon Marlow’s Parallel and Concurrent Programming in Haskell helped me understand how Haskell supported parallel and concurrent programming. Though I learned many valuable lessons from the book, I also felt that the lazy nature of Haskell didn’t go well with parallel programming.

Reunion

I have spent more than 10 years of my career on embedded systems and increasingly got frustrated with the tools available. So I decided to teach myself Haskell again and use it at work. This time I started to read classic papers on functional programming and Haskell.

Philip Wadler’s Monads for functional programming made everything click and I finally became enlightened. The paper is really well written, but I don’t think I came to understand monads just because I read the paper. Years of trial and error were necessary to understand abstract concepts like monads. It was the most exciting moment of my long journey to Haskell.

Once I understood how I could learn abstractions, the rest was easy. Now I don’t get discouraged just because I don’t understand abstractions at first glance. It takes time and practice to understand abstract things. I also realized that monad was just the beginning. There exist many Haskell idioms that require other abstract concepts such as applicative functor, arrow, profunctor and so on.

Here is the list of papers I found most enlightening when learning Haskell. I also recommend you read any paper with “Functional Pearl” attached to it.

Back to Real-World

I was confident that Haskell was a good programming language and I was looking for opportunities to use Haskell in production. Bryan O’Sullivan, Don Stewart, and John Goerzen’s Real World Haskell was a good reference in this direction. It showed how I could use Haskell to do my daily work such as networking, system programming, databases and web programming.

Finally, I started to read real-world Haskell code available on Hackage and realized that the Haskell I knew was different from the Haskell that is actually used. Real-world Haskell uses so many GHC extensions that it feels like an entirely different language. A typical Haskell module starts with:

{-# LANGUAGE CPP                        #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE ConstraintKinds            #-}
{-# LANGUAGE FlexibleInstances          #-}
{-# LANGUAGE FunctionalDependencies     #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE RecordWildCards            #-}
{-# LANGUAGE TupleSections              #-}
{-# LANGUAGE TypeFamilies               #-}
{-# LANGUAGE RankNTypes                 #-}
{-# LANGUAGE DeriveDataTypeable         #-}

It seems that sticking to Haskell 98 or Haskell 2010 does not make much practical sense because many Haskell packages already use many GHC extensions. So I learned them too. 24 Days of GHC Extensions was a really great way of learning this topic.

I like the approach of Yesod Web Framework Book which explains the GHC extensions used in the library before explaining how to use the library. This is often the first step to learn a new library for many Haskell programmers. For example, you can’t use Servant unless you understand DataKinds and TypeOperators. So I encourage Haskell library authors to write more about the GHC extensions they use.

I also found that some packages are essential to use Haskell in practice.

String type has problems. You need either text or bytestring for efficient string data processing.
Lazy IO looks nice, but does not work well in practice. To process streaming data properly you need either pipes or conduit.
You will need a custom monad or monad transformer for your application sooner or later. Either mtl or transformers is required.
JSON is a really universal data exchange format these days. aeson will help you here.
QuickCheck is a bonus you get from using Haskell!

Haskell in production

I founded a small Haskell shop this year and started to use Haskell in production. I realized that using Haskell in production was, surprisingly, easier than learning Haskell. It took me more than 10 years to learn Haskell, but I felt confident that I could use Haskell in production only after a few months of experiments.

There is one thing I would like to emphasize. Using Haskell does not mean that you must understand all the dependencies you use. Haskell programmers tend to care much about the implementation details of their dependencies because Haskell makes it so easy to understand the meaning of programs with types and equational reasoning. But in my opinion, this is a blessed curse.

That’s not how civilization works. You can drive a car without understanding how engines work. Python or JavaScript programmers do not care about the implementation details of their dependencies because it is simply impossible. Haskell is no exception. Time and money are limited. Please don’t spend too much time on understanding things. Spend more time on building things. Be practical.

Fortunately, some library authors provide a high-level overview of their library. Type-level Web APIs with Servant is a great example. It explains the core concepts and the implementation techniques of the library without involving accidental complexities of implementation details. I would love to see more papers like this.

Tools and Libraries

Stackage and Stack are indispensable tools for using Haskell in production. All the hard work of FP Complete gave me confidence that Haskell was production-ready. The Haskell ecosystem is not small anymore. There are multiple competing web frameworks such as Yesod, Scotty, Snap, Happstack and Servant. The quality of these packages is good.

If you write web servers in Haskell, all the packages you need such as web servers, web frameworks, logging packages, database drivers are already available. I use Servant, persistent and esqueleto for my server. So far, everything works fine.

Haskell Community

The Haskell community is relatively small compared to those of other major languages, but I am often surprised by the quality of feedback I get from the community. Haskell is a great language, but the community is even greater. That’s the biggest reason why I love programming in Haskell.

My journey to Haskell is still going on.

Write your own stream processing library Part 1

2017-01-25T00:00:00Z

pipes and conduit are two competing libraries for handling stream data processing in Haskell. Though both libraries provide excellent tutorials on the usage of the libraries, the implementation details are impenetrable to most Haskell programmers.

The best way to understand how these streaming libraries work is to write a minimalistic version ourselves. In this post, I will show you how we can write a small streaming data library with coroutines. Our implementation is based on Mario Blazevic’s excellent article Coroutine Pipelines.

Generator

Generator is a monad transformer which allows the base monad to pause its computation and yield a value. This corresponds to Producer of pipes or Source of conduit.

{-# LANGUAGE LambdaCase #-}

import Control.Monad
import Control.Monad.Trans.Class

newtype Generator a m x =
  Generator { bounceGen :: m (Either (a, Generator a m x) x) }

Generator a m x represents a computation which yields values of type a on top of the base monad m and returns a value of type x.

Either indicates that Generator has two cases:

(a, Generator a m x): A pair of a yielded value and a suspension to be resumed.
x: A return value x.

The enclosing m allows us to perform monadic actions while running the generator.

The definition of Monad instance for Generator is as follows:

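-- Since GHC 7.10, every Monad instance also needs Functor and Applicative instances.
instance Monad m => Functor (Generator a m) where
  fmap = liftM

instance Monad m => Applicative (Generator a m) where
  pure  = Generator . return . Right
  (<*>) = ap

instance Monad m => Monad (Generator a m) where
  return  = pure
  t >>= f = Generator $ bounceGen t
                      >>= \case Left (a, cont) -> return $ Left (a, cont >>= f)
                                Right x -> bounceGen (f x)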

instance MonadTrans (Generator a) where
  lift = Generator . liftM Right

yield :: Monad m => a -> Generator a m ()
yield a = Generator (return $ Left (a, return ()))

>>= operator has two cases to consider. If t is a suspension (Left case), it yields a and combines the remaining computation cont with f. If t is a value x (Right case), it continues the computation by passing the value to f. Once we define >>= this way, the definition of yield is straightforward. It yields a value and does nothing more.

To run a Generator, we need a runGenerator function which collects the yielded values while executing the generator. run' uses a difference list to collect yielded values and converts it to a normal list by applying it to [] at the end.

runGenerator :: Monad m => Generator a m x -> m ([a], x)
runGenerator = run' id where
  run' f g = bounceGen g
             >>= \case Left (a, cont) -> run' (f.(a:)) cont
                       Right x -> return (f [], x)
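
The difference list used by run' can be looked at in isolation. In this minimal sketch, DList, emptyD, snocD and toList are hypothetical names introduced only to label the three operations that run' performs inline.

```haskell
-- A difference list represents a list as a function that prepends it.
type DList a = [a] -> [a]

-- The empty difference list, as in `run' id`.
emptyD :: DList a
emptyD = id

-- Add one element at the end in constant time, as in `f . (a:)`.
snocD :: DList a -> a -> DList a
snocD f a = f . (a :)

-- Convert back to an ordinary list, as in `f []`.
toList :: DList a -> [a]
toList f = f []
```

For example, toList (emptyD `snocD` 1 `snocD` 2 `snocD` 3) gives [1,2,3]. Appending to the end of an ordinary list with ++ would instead copy the accumulated list on every yield.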

Now we are ready to create generators. triple is a generator which yields the given value three times.

triple :: Monad m => a -> Generator a m ()
triple x = do
    yield x
    yield x
    yield x

Running triple 3 returns ([3, 3, 3], ()) as expected.

λ> runGenerator $ triple 3
([3,3,3],())

When the base monad is IO, we can interleave IO actions. For example, loop yields each line read from stdin until an empty string is read.

loop :: Generator String IO ()
loop = do
    str <- lift getLine
    when (str /= "") $ do
      yield str
      loop

λ> runGenerator loop
Hello
world!

(["Hello","world!"],())

It is even possible to mix two generators by alternating each generator.

alternate :: Monad m => Generator a m () -> Generator a m () -> Generator a m ()
alternate g1 g2 = Generator $ liftM2 go (bounceGen g1) (bounceGen g2)
  where
    go (Left (a, cont)) (Left (b, cont')) = Left  (a, Generator $ return $ Left (b, alternate cont cont'))
    go (Left (a, cont)) (Right _)         = Left  (a, cont)
    go (Right _)        (Left (b, cont))  = Left  (b, cont)
    go (Right _)        (Right _)         = Right ()

We can see that the outputs of triple 1 and triple 2 are intermingled.

λ> runGenerator $ alternate (triple 1) (triple 2)
([1,2,1,2,1,2],())
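
Since Generator works over any base monad, it can also run without IO. The following self-contained sketch restates the definitions from this post (with the Functor and Applicative instances that recent GHC versions require) and runs a generator in Identity. fromList and pureRun are hypothetical names added for illustration.

```haskell
{-# LANGUAGE LambdaCase #-}

import Control.Monad (ap, liftM)
import Data.Functor.Identity (runIdentity)

newtype Generator a m x =
  Generator { bounceGen :: m (Either (a, Generator a m x) x) }

instance Monad m => Functor (Generator a m) where
  fmap = liftM

instance Monad m => Applicative (Generator a m) where
  pure  = Generator . return . Right
  (<*>) = ap

instance Monad m => Monad (Generator a m) where
  t >>= f = Generator $ bounceGen t >>= \case
    Left (a, cont) -> return (Left (a, cont >>= f))
    Right x        -> bounceGen (f x)

yield :: Monad m => a -> Generator a m ()
yield a = Generator (return $ Left (a, return ()))

runGenerator :: Monad m => Generator a m x -> m ([a], x)
runGenerator = run' id
  where
    run' f g = bounceGen g >>= \case
      Left (a, cont) -> run' (f . (a :)) cont
      Right x        -> return (f [], x)

-- Hypothetical generator: yields every element of a list,
-- then returns how many elements were yielded.
fromList :: Monad m => [a] -> Generator a m Int
fromList xs = mapM_ yield xs >> return (length xs)

-- With Identity as the base monad the whole run is a pure value.
pureRun :: (String, Int)
pureRun = runIdentity (runGenerator (fromList "abc"))
```

Here pureRun evaluates to ("abc",3): the yielded characters are collected in order and the generator’s return value is kept alongside them.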

Part 2 of this post will continue the discussion with Iteratees.

Generating the Docker client with servant-client

2017-01-24T00:00:00Z

Servant provides a type-level DSL for declaring web APIs. Once we write the specification with the DSL, we can do various things including:

Write servers (this part of servant can be considered a web framework),
Obtain client functions (in Haskell),
Generate client functions for other programming languages,
Generate documentation for your web applications

The primary use case of Servant is to write servers, but we can use servant-client to generate client functions for pre-existing web servers too! In this post, I will show you how we can generate client functions for the Docker remote API automatically with servant-client.

API specification

To make the exposition simple, we will specify only three APIs: ping, version and containerList.

The simplest API is Ping which tests if the server is accessible. Its path is /v1.25/_ping and it returns OK as a plain text with status code 200. We can succinctly describe this endpoint with Servant’s type-level DSL.

type Ping = "_ping" :> Get '[PlainText] Text

Version is a slightly more complex API which returns the version information as JSON. Version data type has the required fields and it declares an instance of FromJSON for unmarshalling JSON data into Version. fieldLabelModifier is used to bridge JSON field names to Version field names.

type Version = "version" :> Get '[JSON] Version

data Version = Version
  { versionVersion       :: Text
  , versionApiVersion    :: Text
  , versionMinAPIVersion :: Text
  , versionGitCommit     :: Text
  , versionGoVersion     :: Text
  , versionOs            :: Text
  , versionArch          :: Text
  , versionKernelVersion :: Text
  , versionExperimental  :: Bool
  , versionBuildTime     :: Text
  } deriving (Show, Eq, Generic)

instance FromJSON Version where
  parseJSON = genericParseJSON opts
    where opts = defaultOptions { fieldLabelModifier = stripPrefix "version" }

stripPrefix :: String -> String -> String
stripPrefix prefix = fromJust . DL.stripPrefix prefix

Finally, ContainerList returns the list of containers. The API takes optional query parameters such as all, limit, size and filters as specified below. We created a newtype wrapper for ContainerID and declared FromJSON instances for ContainerID and Container. Some fields are omitted for brevity.

type ContainerList = "containers" :> "json" :> QueryParam "all" Bool
                                            :> QueryParam "limit" Int
                                            :> QueryParam "size" Bool
                                            :> QueryParam "filters" Text
                                            :> Get '[JSON] [Container]

newtype ContainerID = ContainerID Text
  deriving (Eq, Show, Generic)

instance FromJSON ContainerID

data Container = Container
  { containerId               :: ContainerID
  , containerNames            :: [Text]
  , containerImage            :: Text
  , containerImageID          :: ImageID
  , containerCommand          :: Text
  , containerCreated          :: Int
  -- FIXME: Add Ports
  , containerSizeRw           :: Maybe Int
  , containerSizeRootFs       :: Maybe Int
  -- FIXME: Add Labels
  , containerState            :: Text
  , containerStatus           :: Text
  -- FIXME: Add HostConfig
  -- FIXME: Add NetworkSettings
  -- FIXME: Add Mounts
  } deriving (Show, Eq, Generic)

Our API is just the combination of these endpoints.

type Api = Ping :<|> Version :<|> ContainerList

API Versioning

Because the Docker remote API has many versions, it adds a version prefix to the path. Servant allows us to express this versioning scheme by declaring a new API type with the version prefix.

type ApiV1_25 = "v1.25" :> Api

We can also mix and match endpoints as the Docker remote API changes. Let’s assume that Docker API version v1.26 changed the specification of the Version endpoint. We can reuse the unchanged endpoints and replace only the changed ones.

type Version1_26 = ...
type ApiV1_26 = "v1.26" :> (Ping :<|> Version1_26 :<|> ContainerList)

Generating Client Functions

Now it’s time to generate client functions from the specification. It’s super easy! We simply pass a Proxy for our API type to the client function.

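-- A Proxy carries the API type to the term level (Proxy is from Data.Proxy).
apiV1_25 :: Proxy ApiV1_25
apiV1_25 = Proxy

ping :: ClientM Text
version :: ClientM Version
containerList' :: Maybe Bool -> Maybe Int -> Maybe Bool -> Maybe Text -> ClientM [Container]

ping
  :<|> version
  :<|> containerList' = client apiV1_25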

The ping and version functions are okay, but the signature of containerList' is a bit confusing. We have to pass four Maybe values, two of which have the type Maybe Bool, so it is not easy to remember the order of the arguments. We can improve the function by declaring a wrapper function, containerList. It takes a ContainerListOptions record, and users of the function can pass defaultContainerListOptions as the default value.

data ContainerListOptions = ContainerListOptions
  { containerListOptionAll     :: Maybe Bool
  , containerListOptionLimit   :: Maybe Int
  , containerListOptionSize    :: Maybe Bool
  , containerListOptionFilters :: Maybe Text
  } deriving (Eq, Show)

defaultContainerListOptions :: ContainerListOptions
defaultContainerListOptions = ContainerListOptions
  { containerListOptionAll     = Just False
  , containerListOptionLimit   = Nothing
  , containerListOptionSize    = Just False
  , containerListOptionFilters = Nothing
  }

containerList :: ContainerListOptions -> ClientM [Container]
containerList opt = containerList' (containerListOptionAll opt)
                                   (containerListOptionLimit opt)
                                   (containerListOptionSize opt)
                                   (containerListOptionFilters opt)

Because Haskell is much more expressive than a REST API specification, some wrapping like this is unavoidable if we want our client functions to be Haskell-friendly.
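The benefit of the options record shows up at the call site: callers start from the defaults and override only the fields they care about, by name. The following is a minimal sketch using only base. The record is repeated so that the snippet stands alone, String stands in for Text, and allContainersLimit5 is a hypothetical example value that does not appear in the code above.

```haskell
-- Repeated from the article so the snippet compiles on its own.
-- Assumption: String replaces Text to avoid a dependency on the text package.
data ContainerListOptions = ContainerListOptions
  { containerListOptionAll     :: Maybe Bool
  , containerListOptionLimit   :: Maybe Int
  , containerListOptionSize    :: Maybe Bool
  , containerListOptionFilters :: Maybe String
  } deriving (Eq, Show)

defaultContainerListOptions :: ContainerListOptions
defaultContainerListOptions = ContainerListOptions
  { containerListOptionAll     = Just False
  , containerListOptionLimit   = Nothing
  , containerListOptionSize    = Just False
  , containerListOptionFilters = Nothing
  }

-- Hypothetical options: include stopped containers, return at most five.
-- Record-update syntax names each overridden field, so argument order
-- no longer matters and the untouched fields keep their defaults.
allContainersLimit5 :: ContainerListOptions
allContainersLimit5 = defaultContainerListOptions
  { containerListOptionAll   = Just True
  , containerListOptionLimit = Just 5
  }

main :: IO ()
main = print allContainersLimit5
```

In the real client, a value like this would be passed directly to containerList.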

Using Client Functions

Now our client functions for the Docker API are ready. We need to prepare a ClientEnv by passing a connection manager and a BaseUrl built from the scheme, host, port and URL prefix of the server. We also create a custom connection manager which uses a Unix domain socket for communication, because the Docker server listens on a domain socket by default. Interested readers are referred to my previous article, Custom connection manager for http-client, for the implementation details of newUnixSocketManager (it is named newUnixDomainSocketManager there).

query :: ClientM [Container]
query = do
  ok <- ping
  liftIO $ print ok
  -- Bind to a fresh name so the result does not shadow the version client.
  v <- version
  liftIO $ print (versionVersion v)
  containerList defaultContainerListOptions

app :: String -> Int -> IO ()
app host port = do
  manager <- newUnixSocketManager "/var/run/docker.sock"
  res <- runClientM query (ClientEnv manager (BaseUrl Http host port ""))
  case res of
    Left err          -> putStrLn $ "Error: " ++ show err
    Right containers  -> mapM_ print containers

Because ClientM is a monad, we can combine multiple monadic actions into one. The query function pings the server, queries the version information and then requests the list of containers.

Swagger

So far I have specified the API manually with Servant’s DSL, but if the server has a Swagger specification we can even generate the Servant DSL from it. swagger-codegen has the HaskellServantCodegen, so we can use it! (I haven’t tried it yet.)

Wrap-up

Writing client functions for existing servers is boring and repetitive. With servant-client, we no longer need to write these functions by hand. We just specify the API and Servant writes the client functions for us. Have fun with Servant!

Custom connection manager for http-client

2017-01-23T00:00:00Z

http-client provides a low-level API for HTTP clients. In this post, I will explain how to create custom connection managers. If you want to know the basics of the library, read Making HTTP requests first.

Every HTTP request is made via a Manager. It handles the details of creating connections to the server, such as managing a connection pool. It also allows us to configure various settings and set up secure connections (HTTPS).

The easiest way to create one is to use newManager defaultManagerSettings as follows.

import Network.HTTP.Client
import Network.HTTP.Types.Status (statusCode)

main :: IO ()
main = do
  manager <- newManager defaultManagerSettings

  request <- parseRequest "http://httpbin.org/post"
  response <- httpLbs request manager

  putStrLn $ "The status code was: " ++ (show $ statusCode $ responseStatus response)
  print $ responseBody response

But the default connection manager is not enough for some cases. One such case is the Docker remote API. Because Docker listens on a Unix domain socket by default for security reasons, we can’t access the API with the default connection manager, which uses TCP.

But we are not out of luck. We can configure the connection manager to create a Unix domain socket instead of a TCP socket by setting a custom managerRawConnection field.

managerRawConnection :: ManagerSettings -> IO (Maybe HostAddress -> String -> Int -> IO Connection)

Manager uses it to create a new Connection from the host and port. So we can make the connection manager open a Unix socket by replacing the default implementation with openUnixSocket, which ignores the host and port arguments and connects to the given socket path instead.

import qualified Network.Socket as S
import qualified Network.Socket.ByteString as SBS
import Network.HTTP.Client
import Network.HTTP.Client.Internal (makeConnection)
import Network.HTTP.Types.Status (statusCode)

newUnixDomainSocketManager :: FilePath -> IO Manager
newUnixDomainSocketManager path = do
  let mSettings = defaultManagerSettings { managerRawConnection = return $ openUnixSocket path }
  newManager mSettings
  where
    openUnixSocket filePath _ _ _ = do
      s <- S.socket S.AF_UNIX S.Stream S.defaultProtocol
      S.connect s (S.SockAddrUnix filePath)
      makeConnection (SBS.recv s 8096)
                     (SBS.sendAll s)
                     (S.close s)

By creating a connection manager with “/var/run/docker.sock”, we can make a request to Docker. The code below returns the Docker version.

main :: IO ()
main = do
  manager <- newUnixDomainSocketManager "/var/run/docker.sock"
  request <- parseRequest "http://192.168.99.100:2376/v1.25/version"
  response <- httpLbs request manager

  putStrLn $ "The status code was: " ++ (show $ statusCode $ responseStatus response)
  print $ responseBody response
