Skip to content

Generic Programming

Generic programming is a way of automatically deriving useful typeclass instances for custom datatypes. This is often a pleasant alternative to using macros, which are not very Haskelly.

This means Haskell can automatically work out how to do things like:

  • serialize data of your custom type
  • parse it (from JSON, from command line args)
  • memoize functions that take it as input
  • sort and group it in O(n) time (possible for many types using variations of radix sort)
  • write lenses for it
  • generate examples of it for property tests

When writing a large codebase with many custom types, generic programming take remove a huge amount of "boilerplate" code. It also makes refactoring vastly easier, because substantial changes to a type's definition do not require any rewriting of manual instance definitions.

Warning

Errors messages that arise from a mistake involving generic programming constructs can often be very hard to interpret without knowledge of implementation details. This is the main downside to their use.

The following gives an example of all these abilities. Here are the imports and extensions that are needed:

Imports

-- some extensions that are needed
{-# LANGUAGE BlockArguments, DerivingVia, DerivingStrategies, TypeFamilies, LambdaCase, DeriveAnyClass, DataKinds, OverloadedStrings #-}

import Data.Set (Set)
import GHC.Generics (Generic)
import Data.Generics.Sum.Any
import Data.Generics.Product.Any
import Data.Discrimination
import Data.MemoTrie 
import Test.QuickCheck.Arbitrary.Generic
import Test.QuickCheck.Arbitrary
import Test.QuickCheck 
import Generic.Data 
import Data.Serialize (Serialize, encode, decode)
import Control.Lens ((^.), (^?), (^..))
import Data.ByteString (ByteString)
import Data.Constraint
import Data.Data (Proxy(..))
import Data.Functor.Compose (Compose)
import Options.Generic (ParseRecord, getRecord)
import qualified Data.List as List

An example using a custom type

Suppose we have some custom types like Position and UserInput:

data UserInput = Keys [Char] | Mouse {down :: Bool, moving :: Bool} 

We might declare a type like this to represent user input. The idea of generic programming is to avoid having to write "boilerplate" code to

Standard deriving

A familiar feature of Haskell in general is that we can automatically derive instances for certain special classes like Eq, Ord and Show (see more here).

deriving instance Eq UserInput
deriving instance Ord UserInput
deriving instance Show UserInput

Note

Here, the deriving expressions are standalone statements. One can also place them as part of a definition, as in:

data ExampleType = ExampleValue deriving Show

Generic serialization

Generic programming extends deriving to many more typeclasses. For example, we can derive functions to (un)serialize our type:

deriving instance Generic UserInput -- (1)!
deriving instance Serialize UserInput

serializedData :: ByteString
serializedData = encode (Keys ['a', 'b'])

unserializedData :: Either String UserInput
unserializedData = decode serializedData
  1. This is the instance which allows other instances, like Serialize to be defined in an automatic way.

This is like pickle in Python, but for arbitrary types, not just special pre-specified ones. We can also serialize to readable JSON, using the aeson library.

Generic parsing

Next, we can automatically create a command line parser for our type:

instance ParseRecord UserInput

commandLineParser :: IO ()
commandLineParser = do
    inp <- getRecord "Test program"
    print (inp :: UserInput)

When commandLineParser is called in the main of the executable, you can now run with command line arguments:

> cabal run executableName -- mouse --down
Mouse {down = True, moving = False}

> cabal run executableName -- keys "ab"
Keys "ab"

Generic memoization

Next, we can memoize (using entirely pure code!) costly functions that take this type as input. For example, suppose we have a function:

costlyFunction :: UserInput -> Int
costlyFunction = \case 
    Keys ['a','b'] -> head $ reverse [1..100000000] 
    _ -> 0 

instance HasTrie UserInput where
    newtype (UserInput :->: b) = UserInputTrie { unUserInputTrie :: Reg UserInput :->: b } 
    trie = trieGeneric UserInputTrie 
    untrie = untrieGeneric unUserInputTrie
    enumerate = enumerateGeneric unUserInputTrie




example = do 
    let memoizedCostlyFunction = memo costlyFunction

    print (memoizedCostlyFunction (Keys ['a','b']))
    print (memoizedCostlyFunction (Keys []))

    -- no need to recompute
    print (memoizedCostlyFunction (Keys ['a','b']))

Here, we did write an explicit instance for HasTrie (the class required in order to use the memoization function memo), but the definitions in the instance, like untrieGeneric, are all given via the Generic instance.

Generic sorting and grouping

Next, we can efficiently sort or group a list of UserInputs. For many types, the normal bound of \(O(n \log n)\) can be reduced to \(O(n)\) by clever variations of radix sort, and remarkably, Haskell can automatically derive these for you.

deriving instance Grouping UserInput
deriving instance Sorting UserInput

unique :: [UserInput]
unique = nub [Mouse True True, Keys ['a','b'], Mouse True True]

Generic lenses

The Generic instance also makes it possible to automatically derive lenses:

value1 = Keys ['a', 'b']
value2 = Mouse True False

keys :: [[Char]]
keys = value1 ^.. _As @"Keys" -- (1)!
-- > [['a','b']] (3)

isMoving :: [Bool]
isMoving = value1 ^.. _As @"Mouse" . the @2 
-- > []

isMoving2 :: [Bool]
isMoving2 = value2 ^.. _As @"Mouse" . the @2 -- (2)!
-- > False
  1. You need to use ^.. instead of ^. because there might be no subpart of a UserInput value of the form Keys _, as in the value Mouse True True.
  2. the @2 selects the second value in Mouse True False.
  3. The repl will print this as ["ab"], because "ab" is a way of writing ['a','b'] in Haskell.

Generic property testing

Finally, we can generate examples of UserInput to use for property tests. For instance, we could generate a random list of UserInputs, and check that no matter which one we generate, both the \(O(n)\) sort and the standard \(O(n\log n)\) sort agree:

deriving via GenericArbitrary UserInput instance Arbitrary UserInput -- (1)!

main :: IO ()
main = do 


    quickCheck (\ls -> (sort @[UserInput] ls == List.sort ls))


    -- define a property of "idempotency", which means: doing an operation twice is the same as doing it once
    let idempotent f x = f x == f (f x)

    quickCheck (idempotent (sort @[UserInput]))
    quickCheck (idempotent (nub @[UserInput]))
  1. This via syntax uses the DerivingVia extension. See more here.