For Laurence Rouesnel, whose profound interview question about folds I failed so spectacularly that it took me 3 years to finally understand that it was about fold composition. I am grateful I got the job anyway, and got to work with you. You were a skilled and generous teacher!
In the previous post, we introduced
Applicative Functors, and observed a symmetry between operating with unaugmented functions on unaugmented values, and augmented functions on augmented values.
In this post, we’ll bring all our concepts together into an application, and see how functional programming simplifies things in the real world.
A simple real-world problem
Let’s consider the following hypothetical problem:
A space-craft is chasing down a comet, and sending back a sequence of telemetry data messages (let’s call this type
TMsg). We are tasked with writing a software library that processes the sequence of these message and derives aggregates. There are complex rules governing cases like duplicate and corrupted messages, so we use a library provided from the science division where the scientists write up the following functions whose implementations we don’t get to see or change.
Now, using the provided
inc, can we compute the total and count of an arbitrary sequence of values?
A simple solution
This seems pretty straightforward, and we knock out something like this in C#:
A little wrinkle
Now here’s the first doozy: Can we compute the average of this sequence (defined as the ratio of the
sum to the
count - of course the boffins in the science division have given us such a function!) without iterating over the sequence twice?
Hmm…this is a little ugly, but we’re made of stern stuff, so we dig up the editor and source code, and come up with:
OK - that was close, but we made it work. Most people won’t find a sense of unease looking at this code because this is quite in line with how we write and extend code, but some things should start becoming obvious, even you don’t consider them problems:
- We needed to change the compile-time signature to account for the type of the notional
- We can conceive that there will be a proliferation of these functions going forward.
- We needed to muck with the inside of the for-loop, injecting the second computation of
countadjacent to the first computation of
Let’s solve the first two concerns first, using traditional abstraction techniques.
We now have a single function which we can use to build aggregates, and a single return-type returning all the values we computed. So at least we have limited the number of functions as requirements change. Definitely a little less ugly, but it still leaves us with the task of having to mess with the innards of this function (and the structure of the result type, but perhaps that’s not a terrible problem) every time we want to extend this function.
The end of the road
There’s a rumour going around that the scientists are going to give us some kind of comparison operators, expecting us to detect a notional
min, and then we may also have to compute some kind of standard deviation on the sequence. It dawns on us that we have no idea what the scientists will dream up next, so unless we want to keep opening up the
BuildAggregates function and rewriting it for every new combinator, we need to think of a better solution.
Perhaps we want a function that looks something like this
But now the types really start getting in our way.
- What kind of type arguments does this function take?
- How even can we pass in an array of arbitrarily typed combinators?
- What is the result type of our function?
It almost looks like we have tied ourselves up in knots going with a typed approach and maybe our only noninvasive solution will use dynamic types - sacrificing type safety for conciseness.
Is there a better way?
A New Hope :)
This is where Functional Programming concepts start to shine!
Each of the operations
average is technically called a
catamorphism. A catamorphism is a kind of
folding operation that breaks down a structure into a single element. While this might be the most common
folding operation you encounter,
folds are actually super-powerful, and we can implement many operations that you might not associate with folding with folds.
Let’s look at folds in an abstract way.
fold over a structure containing
items takes a
seed value, an operation to combine the
seed with an
item, and returns the
seed value with all the
items subsumed in it.
sum over a set of integers might be written as :
This structure describes an operation which will use on other structures, like lists.
For example, we could use this structure to operate on
IEnumerable<int> as follows, and we could use
Fold instead of
We can now conceivably compose two folds together to give a single fold:
It is important to note that each of the constituent folds has a different type of
'seed - but they all have to fold over the same kind of
What is cool about this is that we are composing folds by tupling the seeds and creating a composite accumulator that uses the constituent accumulators and tuples the result. We can now get the sense that we can compose an arbitrary number of folds together into a single fold using this technique.
OK, so let’s turn this into an applicative functor, and see how that turns out for us!
Map & Apply
One way to map over a fold is to map over the folded result. Basically, this gives us a way to process a resultant value after the fold is run. Let’s put that in:
Now, it’s important to note that the
Fold structure is actually already a representation of functions in context, so
Applying a fold over another fold is simply a matter of combining the folds exactly as we have in the
Go ahead and see if you can work out the type of the
(+) function. This practice is called equational reasoning and is a useful skill to develop when you are working with pure, statically typed functions.
Let’s put in some inline wrappers for good measure:
And find a way to use this over something foldable - like a
This is actually an annoyance in F# because have to create one of these for each type we consider “Foldable”. This means you will have to write a separate one for
Option - instead of having a concise way of specifying and abstracting over all types that are Foldable - as you can with languages such as Scala and Haskell.
An elegant and extensible solution
Let’s put the whole solution in a single block so we get the beauty of the solution at one shot. I’ll put some comments in the block where necessary.
You can also cut/paste the whole block into FSI to play with it…
So what have we achieved?
- We represented an abstract
Foldby a structure
- We gave that structure the characteristics of an Applicative Functor
- We used the mechanics afforded by the applicative functor to compose arbitrary folds directly from the provider combinators
- We had no need to rewrite structures or wrangle type signatures
- We computed many aggregates with a single pass over the input foldable
- We maintained complete compile-time type safety
- If we had more expressive types in F#, we would never write this structure more than once - and indeed, this abstraction would’ve been provided by the language!
In other words, we used FP and some mathematical foundations to come up with a totally general way to compose combinator functions - something we couldn’t actually do with the mechanics afforded by the traditional OO programming models.
How does it work?
While the rigorous mathematical underpinning of this approach is outside the scope of this blog post, we can try and get a handle, in loose terms, on why this approach works.
Consider what happens to each of the three members of the
Fold type when we compose two folds together:
When we are given two folds, the seed of the composite fold needs to reflect each of the constituent folds.
In a traditional approach, we would define the composite type statically much like we did with the
Aggregates type, and this is where things come unstuck.
What we actually need is a way to generically compose types together without losing type safety and without having to statically name the composed type, then we can compose types safely at runtime. Leaning on a formal type algebra which sits on a mathematical foundation (aka Category Theory) gives us the concept of product types or tuples. Modern C# does have tuples, and so the approach we took can be implemented with the functional aspects of modern C#.
The key to figuring out how this is extensible is to note that each time we compose a fold we increase the arity of the tuple. Strictly speaking, we could do this an arbitrary number of times, resulting a super-complicated type at runtime - but the compile-time simplicity and elegance of the structure doesn’t change; and the mathematical foundation ensures that the type safety is preserved the whole time.
Since we are smashing two simpler folds together into a composite one, the composite
Accumulate function needs to work with the tupled value as its input. F# (and modern C#) allow us to destructure the tuple argument and operate on the individual components. So we create new lambda that closes over the constituent
Accumulate functions and generates a result tuple!
The reasoning for the
Finish function is analogous to the
The important thing to see is that we can use
Type algebra to do the composition, and
Applicative Functor mechanics to do the composition within the
This has been a long post, with a lot of foundational material preceding. But I think it showcases a fairly plausible real-world situation, with real-world constraints, where the FP solution is far more elegant, safe and scalable than the alternative.
I hope this series has kindled a spark of curiosity within you, gentle reader, to dip your toes into the areas of FP and mathematically-sound reasoning.