In part I and part II we played around with different constructs inside C++ templates. Now we’re ready to tackling something a bit meatier - sorting a list at compile time!
Recall from the previous sections that when working with template meta-programming that we’re dealing with programs whose functions take the form of templates and values take the form of types.
As a result, remember that that our lists are of the form:
Now to sort this list, we’re going to use a merge sort. The variant of this that we’re going to implement works as follows:
Divide the unsorted list into n sublists, each containing 1 element (a list of 1 element is considered sorted).
Repeatedly merge sublists to produce new sorted sublists until there is only 1 sublist remaining. This will be the sorted list.
To write the first part, splitting the list into smaller parts, we’ll need a PAIR data type so we can split a list in to two parts. It’s tempting to create a struct with values for this, but remember, we want to allow the data type to hold arbitrary values from meta-programming, so it needs contain types (i.e. typedefs).
Now’re we’re able to write the first part of the mergesort algorithm, the split routine.
Note how we’ve used private typedefs to hold temporary values that are used to generate the public result type.
To write the second part, we’ll need to be able to merge two sorted lists together to make a single sorted list. This will require a conditional, an IF.
Here, we now have a IF template that we can use to generate conditionals. We’ve defined the general template at the top, and then in two
template specialisations we’ve defined the ELSE and THEN cases. To use it, we call it like so:
We can now write our MERGE template.
And finally, putting it all together, we can write a SORT that recurisively splits a list into smaller and smaller parts, and then merges them back together into a single sorted list.
And there we go - we’ve defined a list and sorted it, all at compile time.
Code for the examples in this post (and more!) are in the git repository at tmp-play.
Libevent is a library to allow for callbacks to be run when specific events occur on file descriptors, after a timeout has occured, when a POSIX-style signal is raised, or at user defined points. I’ve been using it to write a single threaded program to start and control network processes. It works well, but I ran into a problem when combining libevent with fork, exec and children processes.
When you trap a POSIX-style signal, libevent documentation mentions that:
Note that signal callbacks are run in the event loop after the signal occurs,
so it is safe for them to call functions that you are not supposed to call
from a regular POSIX signal handler.
This is excellent! This means we can call printf, malloc and all the other asynchronous-unsafe functions that you can’t call from an ordinary signal handler.
However, the caveat I discovered, arose in libevent’s implementation of how this is achieved.
My program would start other programs after timeouts, catch their SIGCHILD and potentially restart them. These child programs are arbitrary executables that other parts of our code base could configure and some of these children would need to catch signals of their own. I was finding that any signals that I installed handlers for (using libevent) in the parent program were being ignored in my children process.
After some head scratching, I looked at my parent process’s signal statuses at /proc/<pid>/status.
All the signals that I was catching in my parent process were being blocked in my child process! No wonder they were ignoring signals!
So, what was going on? Well, the way libevent manages to run it’s signal callbacks in the main event loop, sometime after the actual signal was caught, is by not running the callback directly from the signal handler. All the signal handler needs to do is to mark the callback event as active, and then exit. Then when the main libevent loop next runs it will see that the signal callback is active and execute it from the main context, not from an asynchronous signal handler context.
When the main libevent loop is running a callback, it uses sigprocmask(SIG_BLOCK... to mask out the signals so the asynchronous handlers can’t interrupt the callbacks. To start my children, I was calling fork and exec from libevent callbacks and so my child processes were inheriting their parent’s block mask (signal block masks are inherited across a fork and aren’t cleared by an exec). And so my child processes had their block masks set to be the mask of what was being caught by my parent process.
To fix this, I added the following code after my call to fork but before my exec:
This could also be run from a child atfork handler registered with pthread_atfork.
This ensured that my child processes started life with a clean signal block mask, letting my arbitrary processes continue to catch and handle signals without needed modification.
In part I we discussed some basic constructs that could be built using C++ template metaprogramming. This time, we’re going to build up to doing more complicated things - building lists and operating on them.
How can we build a list in a C++ template? I decided to borrow a common idea from Lisp, a cons list
The idea is to essentially build a recursive structure so that each element of the list contains two values - the value at that point in the list, and the remaining list. Each list node holds a head and a tail, (or in Lisp jargon, car and cdr). In pseudo-code:
list a (list b (list c (list d ...)))
But then the question arises of how to terminate the list. The best way to do this is with some terminal marker or sentinel to indicate the last element, or the empty list. In Lisp, this could be nil, or in Haskell a special constructor for the empty list []. Again, in pseudo code, this is something like:
list a (list b (list c (list d (emptylist))))
Then to transate this to a C++ template, we can do the following:
So here we have a type that contains a list of int. When we build the list, myList, we’re building a type that represents the list [a, b, c, d]. Not a runtime series of structs, but a type that we can operate on at compile. For example, pulling the HEAD of the list out can be achieved with:
Remember in part I, we built functions out of templates whose input were types. Now that we have a list in a type, we can write functions that operate on it, at compile.
To run a function over the list, say to calculate the sum:
Note again how I’ve used the pattern matching in template specialisation to define the different implementations for the recursive case and the base case in the SUM definitions.
For something more interesting, how about map-reduce!
Here, MAP_REDUCE is a function (template) that takes a LIST, a unary function F (again, a template) to map over the list and a binary function R (also a template) to reduce the list to a single value (e.g. addition or minimum).
There are toy examples showing map reduce in action (such as summing the squares of a list, finding the minimum) in the git repository at tmp-play.
In the next section, I’ll be talking about more advanced operations, building up to what I originally set out to do - sort a list at compile time!