Most languages have some form of printf function that lets you print formatted strings. In Python it looks like this
In C it looks like this
What they have in common is that they take a format string as their first argument, followed by a variable number of arguments, of varying type. The arguments are formatted, interpolated into the string and then printed to the screen.
The immediate problem you face if you want to write printf in Haskell is the type system. All functions have a type, and the type determines what arguments it can accept. You can write polymorphic functions that accept a range of input types, but writing a function that accepts a variable number of arguments? That seems tough.
I recently read through the source of the Text.Printf module, and was impressed by how concise and self-contained it is, and especially by the neat trick that allows
printf to take a variable number of arguments. In this post I’ll walk through creating a simplified, cut-down version of Text.Printf to give a flavor of the module.
I’ll need the following uncontroversial language extensions:
First define a data type to hold things that are possible arguments to
printf (in our cut-down module, we will only care about interpolating strings and integers into the format string).
With this in hand, we could write the universal printf function,
uprintf. This takes a format string and a list of arguments to be interpolated into the string, and returns the formatted string. Its type signature is therefore:
All printf functionality will eventually be implemented in terms of this function. We’ll hold off on giving its implementation until later in the module, and focus on how we can write the polyvariadic, polymorphic
We get polymorphism using type classes. Define a type class for types that can be converted into a
UPrintf, along with a couple of useful instances:
1 2 3 4 5 6 7 8
There are multiple valid return types for the
printf function. For example, we might want to return a
String, or an
IO action that outputs to a file or to stdout, or more generally any instance of
MonadIO. So introduce a type class for valid return types:
spr function takes a format string and a list of suitable arguments to be interpolated into the string, and returns something. The key is in what the valid return types are. Certainly, it makes sense to return a string by calling
uprintf (we reverse the argument list, for reasons that will become clear in a bit)
It also makes sense to return an I/O action that does some printing
Here’s the cool part. If
a is an instance of
t is an instance of
PrintfType, then a function
a -> t is also a valid instance of
PrintfType – you should picture it as consuming the
a, and outputting another valid return type (which might be another function).
This is one way to implement polyvariadic functions in Haskell. Functions are curried by default so it’s easy to incrementally consume arguments, and return a function that consumes the rest of the arguments.
This is why we need to reverse the argument list when we make the final call to
uprintf – the arg list acts like a stack, so that the last arguments pushed onto it are the first ones out. We need to reverse the stack at the end so that we format the arguments in the right order.
Now to write
printf, just call
spr with an empty argument list – the supplied arguments will be consed onto the list by
spr until there are none remaining, and
uprintf is called:
The rest is just details. The
uprintf function incrementally consumes the format string, building the output up as it goes. If it exhausts either the format string or the argument string while the other still has elements remaining, then it throws an error. Otherwise it watches for the
% signs that signal a value to be interpolated, and calls the correct formatting function:
1 2 3 4 5 6 7 8 9
fmt function looks at what type of value is to be interpolated, and makes a call to the appropriate function for converting the argument into a string:
1 2 3 4 5 6 7 8 9 10 11
Our simple module only allows for strings and integers to be interpolated into the format string, and doesn’t allow for complicated format strings like
"%6.2f". These features can be implemented by adding more cases to
fmt – the fundamental architecture that allows for polymorphic, polyvariadic functions is in place.