simoneb's blog

about design, development, profession and avoidance of fluff

Labeling Trees - Introduction

A few years ago I came across a video interview on Channel 9 starring Brian Beckman on a topic which I had never heard of before. The interview is still available online (part 1 and part 2) but you don’t need to watch it to follow me as I’ll start from the beginning and guide you gently through the whole topic.

Although I then considered myself an already proficient C# developer I was quite confused, I just couldn’t wrap my head around the crazy concept of passing state around invisibly, although it was done in my favorite language. The interview comes with full source code and an assignment containing 10 exercises, which I was quite disappointed I could not even tackle. The main topic of the interview is labeling a binary tree, and Brian showed how to do it in several ways, which we’ll see and try to understand before moving on to the exercises.

Time has passed and I can finally go back to that interesting topic and hopefully help someone else out there. The topic, needless to say, is monads, and not even the simplest of the monads, the state monad. This post is the first of a series where I’ll introduce the concept and then go through the exercises, hopefully solving them correctly.

Without further ado, here’s the class which will accompany us throughout the journey, during which it will undergo only minor changes: a binary tree in C#.

A binary tree (binary-tree.cs) download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
public abstract class Tree<T>
{
    public abstract void Show(int indent = 0);
}

public class Leaf<T> : Tree<T>
{
    public T Value { get; private set; }

    public Leaf(T value)
    {
        Value = value;
    }

    public override void Show(int indent)
    {
        Console.Write(new string(' ', indent * 2) + "Leaf: ");
        Console.WriteLine(Value.ToString());
    }
}

public class Branch<T> : Tree<T>
{
    public Tree<T> Left { get; private set; }
    public Tree<T> Right { get; private set; }

    public Branch(Tree<T> left, Tree<T> right)
    {
        Left = left;
        Right = right;
    }

    public override void Show(int indent)
    {
        Console.WriteLine(new string(' ', indent * 2) + "Branch:");
        Left.Show(indent + 1);
        Right.Show(indent + 1);
    }
}

As you can see a tree can either be a leaf, containing a Value, or a branch, whose left and right sides will be other trees (and thus either branches or leaves). The entire class hierarchy is generic for no special purpose except that it will allow us to give a meaningful representation to the leaves via their T Value property. I’ve also included a Show(int) method to easily ask the tree to dump itself to the output stream. A couple of helper methods to create branches and leaves will also come useful:

Helper methods to create trees (helper-methods.cs) download
1
2
3
4
5
6
7
8
9
public Tree<T> leaf<T>(T value)
{
    return new Leaf<T>(value);
}

public Tree<T> branch<T>(Tree<T> left, Tree<T> right)
{
    return new Branch<T>(left, right);
}

Now creating and displaying a tree of type string is straightforward:

A simple tree (a-simple-tree.cs) download
1
2
3
4
5
6
7
8
9
10
                                //  tree.Show() outputs:
var tree = branch(              //  Branch:
            leaf("a"),          //    Leaf: a
            branch(             //    Branch:
              branch(           //      Branch:
                leaf("b"),      //        Leaf: b
                leaf("c")),     //        Leaf: c
              leaf("d")));      //      Leaf: d

tree.Show();

Manual labeling

First of all we need to know what labeling a tree means. In our case it means to navigate the tree and assign a integer label to every leaf. Tree navigation can usually be done in two ways, depth-first (DFS) or breadth-first (BFS), we’ll use the former here. In order to store the label in the leaves we need to enrich its structure. We’ll do that by making the generic parameter of the tree a Labeled<T>, defined as follows.

Label (labeled-class.cs) download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
public class Labeled<T>
{
    public int Label { get; private set; }
    public T Value { get; private set; }

    public Labeled(int label, T value)
    {
        Label = label;
        Value = value;
    }

    public override string ToString()
    {
        return new { Label, Value }.ToString();
    }
}

public Labeled<T> labeled<T>(int label, T value)
{
    return new Labeled<T>(label, value);
}

In practice, labeling a structure of type Tree<T> means transforming it into a Tree<Labeled<T>>. With this new knowledge labeling our tree manually is as simple as recreating the same tree structure and replacing the value of the leaves with a Labeled<string> instance, incrementing the value of the label manually every time we visit a leaf.
The value, or state, of the current label at each point in time lives in our brains: we’ll have to remember the previous value and increment it every time we visit a leaf. Not a big deal as long as the tree is small.

Manually labeled tree (manually-labeled-tree.cs) download
1
2
3
4
5
6
7
8
9
10
                                                 // labeledTree.Show() outputs:
var labeledTree = branch(                        // Branch:
                    leaf(labeled(0, "a")),       //   Leaf: { Label = 0, Value = a }
                    branch(                      //   Branch:
                      branch(                    //     Branch:
                        leaf(labeled(1, "b")),   //       Leaf: { Label = 1, Value = b }
                        leaf(labeled(2, "c"))),  //       Leaf: { Label = 2, Value = c }
                      leaf(labeled(3, "d"))));   //     Leaf: { Label = 3, Value = d }

labeledTree.Show();

Functional labeling

As you can see labeling a tree manually is not very useful, imagine to have a tree with more than a bunch branches and leaves, it becomes quickly inconvenient. What if we could label it programmatically? We indeed can, and we’ll have to come up with a function, which given our initial unlabeled Tree<T>, gives us back a Tree<Labeled<T>>.
Let’s call Tree<Labeled<T>> Label<T>(Tree<T>) such a function.

Label function (label-function-empty.cs) download
1
2
3
4
5
public Tree<Labeled<T>> Label<T>(Tree<T> tree)
{
    // what to write here?
    return null;
}

One constraint that we want to have is that we don’t want to store the current label in shared memory and update it whenever we visit a leaf. We would like to do it in a purely functional way. This means that the state needs to be passed around explicitly, with a starting value of 0. This suggests that we’ll need an additional support function that will thread the state via its arguments and return the updated state via its return value.
Given the structure of the data we’re handling this function will also most likely call itself recursively.

Label and Label1 functions (functional-label.cs) download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
public Tree<Labeled<T>> Label<T>(Tree<T> tree)
{
    return Label1(0, tree).Item2;
}

public Tuple<int, Tree<Labeled<T>>> Label1<T>(int label, Tree<T> tree)
{
    if(tree is Leaf<T>)
    {
        var lf = tree as Leaf<T>;

        return Tuple.Create(label + 1, leaf(labeled(label, lf.Value)));
    }
    else
    {
        var br = tree as Branch<T>;

        var left = Label1(label, br.Left);
        var right = Label1(left.Item1, br.Right);

        return Tuple.Create(right.Item1, branch(left.Item2, right.Item2));
    }
}

Label1 accepts an integer value representing the current label and a tree to be labeled. Its return value is a tuple containing the new label value and the labeled tree.

Let’s go through the code to see what it does exactly. You can see that it does two different things according to whether the tree to be labeled is a leaf or a branch.
In the first case it labels the leaf with the current value of the label, bumps it and returns a tuple containing the new label value and a leaf labeled with the value of the label before bumping it.
In the second case it first calls itself recursively to label the left branch of the tree, which returns again a tuple with the current label value and the labeled sub-tree. Then it does the same with the right branch by passing the just-returned label value as the current label value. Finally it returns a tuple containing the final state of the label and a newly constructed labeled branch containing the previously labeled left and right branches.
You might notice that it is this way of recursing that enables the depth-first approach.

Finally, Label calls Label1 passing 0 as the initial state and returning only the second item in the tuple, which is the labeled tree.

The interesting thing about this approach is that it does not rely on any shared state to do the labeling. The current value of the label is passed as a function argument and the new value is returned as part of the result, versus storing the label value in a variable accessible from the function and incremented every time.

In the next post we’ll see how to make this state-passing less explicit, while still avoiding shared memory to store it.

Async Support in NUnit

.NET’s version 5 of the C# compiler introduced support for an interesting feature related to multithreaded programming, available via the new async and await keywords.

This post is not much about the feature itself as plenty of information is available online, but rather the support that was introduced for it in NUnit. Here is a simple NUnit test:

A simple test (nunit-simple-test.cs) download
1
2
3
4
5
6
7
8
[Test]
public void OneSimpleTest()
{
    var eightBall = new EightBall();
    var answer = eightBall.ShouldIChangeJob();

    Assert.That(answer, Is.True);
}

So far so good, but let’s assume that invoking that method on the 8 ball is really expensive in terms of execution time and you would like your production code invoking it to not just sit and wait until it completes. That’s where the async feature fits, and you could change the method to be an async, Task-returning one instead. How to make that specific method asynchronous is definitely out of scope here though.

Testing asynchronous code

In order to call the new method asynchronously the test code would need to be adapted as shown here:

A simple async test (nunit-simple-async-test.cs) download
1
2
3
4
5
6
7
8
9
10
[Test]
public async void OneSimpleTest()
{
    var eightBall = new EightBall();
    var answer = await eightBall.ShouldIChangeJob();

    Assert.That(answer, Is.True);

    // why am I still here?
}

Awaiting the result of the method implies that the test method has to be marked as async, too. Now, assuming that you’re not planning to change job you would really expect the test to fail, which it instead doesn’t if run with a version of NUnit older than 2.6.2.

The reason behind this incorrect behavior is that asynchronous methods are rewritten by the C# compiler in a way that they return control to the caller as soon as they await on an operation which has not completed yet, as is the case with the invocation of the ShouldIChangeJob method. In this case therefore the control is returned to the code which called the OneSimpleTest method, which turns out to be NUnit itself.

Up to version 2.6.2 of NUnit the test runner always assumed that as soon as a test method returned then its execution was complete, and thus if no assertion failures were reported the test was successful. This is no longer the case with asynchronous methods, as we just found out, because the now asynchronous test returns immediately after invoking ShouldIChangeJob on line 5.

Without any support from NUnit there was a simple fix available already, which was to Wait on the Task returned by the asynchronous method rather than await it.

A simple test - workaround for async (nunit-simple-test-workaround.cs) download
1
2
3
4
5
6
7
8
9
[Test]
public void OneSimpleTest()
{
    var eightBall = new EightBall();
    Task<bool> answer = eightBall.ShouldIChangeJob();
    answer.Wait();

    Assert.That(answer.Result, Is.True);
}

This would work perfectly, although it’s not as nice as being able to write the test code in the same way you would write the production code, that is, by using the await keyword. This has the additional disadvantage of exception handling behaving differently between await and Wait. When using await the code looks like everything is happening sequentially, and you can catch exceptions in the same way you do with sequential code. If ShouldIChangeJob threw an exception of the fictitious type TimeoutException you would see exactly that exception bubbling up from your code. When using Wait instead a System.AggregateException would bubble up, which is a sort of container for multiple exceptions which might have happened during an asynchronous operation, and would wrap the real TimeoutException. This of course makes await more intuitive and preferable over Wait.

Just a matter of Waiting

Introducing this support in NUnit doesn’t necessarily mean that NUnit has now become asynchronous or that it allows to run tests in parallel. This is something which is being worked on for the next 3rd major release of NUnit though.

What we did in NUnit 2 was to allow users to write async tests without worrying about tests completing before their assertions were even evaluated, in addition to supporting thorough use of asynchronous methods in some specific framework use cases. In other words, when invoking asynchronous test methods NUnit will “sit and wait” on your behalf, until the test method, along with all the asynchronous operations it invokes, has finally completed.

Behind the scenes

As should be clear by now NUnit’s support for async methods is mainly a matter of detecting async methods, calling Wait on the Task returned by them and handling exceptions accordingly. This is true in most cases, although NUnit’s need to be compatible with .NET 2.0 means that all of this logic needs to be implemented without referencing any .NET > 2.0 assemblies.

Yet this was still not as straightforward, but we really strived to open up as many intuitive usages of await/async as possible so to relieve the users from having to even think about it.

Although you might not have noticed it you’ve already encountered one example of why this is not trivial. In the asynchronous sample above you can see that the test method is void. What does it mean exactly? Well, it means that there is no Task on which to call Wait, and that really waiting for it to complete means setting up a new SynchronizationContext to hook into the .NET implementation of the async/await feature.

Although asynchronous void methods were not designed for this purpose and Microsoft itself discourages using them besides in event handlers, we really wanted to support them for a couple of reasons:

  1. Not supporting them would mean that NUnit would throw an exception at runtime every time a test is written with an async void signature
  2. Most users would probably repeatedly commit the same mistake of writing asynchronous test methods as void async because no one really cares about their return value

We thought that the two reasons above would lead to a frustrating user experience and decided to support async void tests as well as Task-returning ones. This choice came with some drawbacks too:

  1. The ways async void methods interact with the SynchronizationContext in which they run is undocumented and might change in the future, which opens up the possibility of breaking NUnit’s implementation
  2. Switching the SynchronizationContext in which the test method runs (the way NUnit supports async void tests, that is) might affect even up to a great extent the flow of the code wrapped by the new SynchronizationContext

We though that these two issues alone did not justify the poor user experience and we opted to fully support async void tests.

If not void, what?

Now, although it seems quite reasonable to write the above asynchronous test as one returning void it is equally valid to write it so that it returns a Task instead.

A simple async test returning Task (nunit-simple-async-task-test.cs) download
1
2
3
4
5
6
7
8
[Test]
public async Task OneSimpleTest()
{
    var eightBall = new EightBall();
    var answer = await eightBall.ShouldIChangeJob();

    Assert.That(answer, Is.True);
}

It is actually more advisable to do so if you are especially interested about forwards compatibility of your code for the reasons explained above. If you do this NUnit will not have to jump through hoops to run it correctly and will simply fall back to waiting on the returned task.

async, where else

As briefly mentioned earlier support for await/async in NUnit 2 goes beyond test methods. The good thing about it is that in most cases you won’t have to think about whether it is supported or not, because it most likely is.

In any case here’s a by-no-means complete overview of places where we had to do some relevant work.

Test cases checking results via Task return values

Methods marked with TestCaseAttribute can return values, which are then checked against the Result property of the test case. They now support returning result asynchronously:

An async test case returning Task<int> (nunit-async-testcase.cs) download
1
2
3
4
5
6
7
8
9
10
[TestCase(1, 2, Result = 3)]
public async Task<int> TestAddAsync(int a, int b)
{
    return await SumAsync(a, b);
}	

public async Task<int> SumAsync(int a, int b)
{
	return await Task.FromResult(a) + await Task.FromResult(b);
}

Async lambdas

The async support went slightly beyond test methods, and extended to framework features which accept lambda expression as their arguments:

Async lambda support in NUnit framework (async-lambda-support.cs) download
1
2
3
4
5
6
7
8
9
10
11
12
[Test]
public async Task AsyncLambaSupport()
{
    // throwing asynchronously
    Assert.That(async () => await ThrowAsync(), Throws.TypeOf<InvalidOperationException>());

    // returning values asynchronously
    Assert.That(async () => await ReturnOneAsync(), Is.EqualTo(1));

    // "After" works with async methods too
    Assert.That(async () => await ResutnOneAsync(), Is.EqualTo(1).After(100));
}

Framework support is not yet available in NUnit 2.6.2, it will be in the next build.

Test context availability

If you don’t know about TestContext I suggest you check it out as it might come handy in a bunch of scenarios. If you’re already using it, just be aware that it is now accessible anywhere inside the body of asynchronous tests, which is how you would expect it to be.

What else?

This feature has been ported to the upcoming major release 3.0 of NUnit as well as to NUnitLite to make it available consistently throughout the whole testing platform.

If there’s anything we’ve missed feel free to let us know on the NUnit’s mailing list.

NUnit via LINQPad

I’ve been a frequent user of LINQPad for several years now for I enjoy the simplicity with which you can fire it up, start writing code and see the results immediately. I appreciate that other programming languages provide REPLs for you to try and even modify the currently running application, and I enjoy them when I have a chance to work with node.js or Ruby, but this is not something that people writing C# code are used to, and in any case outside the scope of this post.

Recently my involvement with NUnit has increased to some extent and on several occasions I found myself needing to try running one-off tests and quickly see their outcome in order to validate bug reports or in general for experimenting.
As you can guess this would usually require launching a full-featured IDE like Visual Studio, create a class library project, reference NUnit, build and run some NUnit runner against the resulting assembly. That’s a lot of work for just getting a few lines of code to run.

The idea was therefore to be able to run NUnit tests on LINQPad, which is definitely possible and quite simple, as it turns out.

Now, a bit of background about NUnit is in order. The current release, which at the time of this writing is 2.6.2 and has been available for a few days, along with the whole 2.x series, has always kept the framework and the core separate.

The framework is contained in the nunit.framework.dll assembly, which client code usually references in order to write tests. Just to mention a few it contains the TestAttribute and the Assert classes, which everyone writing unit tests should be familiar with.
The core, on the other hand, contains the code needed to run the tests, which includes test discovery, filtering, the logic for handling application domains and processes, and finally exposes this functionality by means of test runners. NUnit currently ships with a console and a graphical GUI runner.
This net separation of the core from the framework basically means that running tests in LINQPad via NUnit requires referencing both the core and the framework assemblies, and then invoking a runner pointing it at the current assembly. This is possible but also not the fastest and cleanest way to accomplish it.

Enter NUnitLite. NUnitLite has been around for some time now and the main difference with NUnit v2 that is of some relevance here is that there is no distinction between the core and the framework. Everything is contained within a single assembly that you can reference from your tests, and the very same project containing the tests can be used to run them with a single one-liner.

Although NUnitLite does not provide all of the features of NUnit it has quite enough of them for most needs, and above all simplifies our life a lot here.
On the other side, we’re going to leverage a new feature available in LINQPad, the ability to reference NuGet packages, which right now is provided in the beta version downloadable from here

Now that the ground is set here are the steps to get started writing unit tests in LINQPad:

  1. Create a new query and set its Language to C# Program, which will create a stub main method
  2. Add the NUnitLite NuGet package, which is easily done by hitting F4, then Add NuGet… and then looking up and selecting NUnitLite. Also add the namespaces NUnit.Framework and NUnitLite.Runner
  3. Fill the main method with the single line which will allow to run the tests in the current assembly and finally start writing unit tests.

    NUnitLite in LINQPad (linqpad-nunitlite.cs) download
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    
    void Main()
    {
      new NUnitLite.Runner.TextUI().Execute(new[]{"-noheader"});
    }
    
    // Define other methods and classes here
    [Test]
    public void SomeTest()
    {
      Assert.Pass();
    }
    
  4. Hit F5 and see the results of the test run

The steps outlined above are quick and simple, but still require some time if you have to do it over and over, therefore a better option would be to save the raw template as a LINQPad query somewhere on your file system, or even better, although this chance is limited to the owners of a professional license, using a snippet which is available for download from the NUnit wiki here.

The snippet takes care of the whole plumbing so you just need to download and copy it to the snippets folder, usually in %userprofile%\Documents\LINQPad Snippets, then just type the snippet shortcut, nunit, in any query window and you’re ready to write your tests.