This is the second post of a series about labeling trees. In the previous post we saw how to mark leaves of a binary tree with integer, incrementing labels, first manually and then functionally without relying on mutable state. We achieved it by writing a recursive Label1
function which threaded the value of the label, the state, via function arguments and return value, incrementing it every time it came across a leaf of the tree.
Let’s review the signature of the Label1
function and then move one step forward towards a different, less explicit way to pass the state around.
1
|
|
As the signature clearly shows the state passing is visible both in the arguments and the return value of the function. There is certainly a good reason for that: if we need to increment the label and use the new value whenever we come across the next leaf would it be possible to do otherwise? Let’s as well review the part of the function which took care of applying labels to the leaves.
1 2 3 |
|
This simple code mixes together two concerns: creating a labeled leaf and incrementing the value of the label, squeezing them into a tuple. Let’s try to abstract and extract these two responsibilities and then apply a series of simple transformations over them, while keeping the behavior unchanged.
First transformation: from integer to tuple
1 2 3 4 5 |
|
Here we extracted the increment concern as a standalone operation. newAndOldLabel
is a tuple whose first item contains the new value of the label and the second item contains the previous, unchanged value. The second statement simply uses it to create the final tuple.
Second transformation: from tuple to functions returning tuples
1 2 3 4 5 6 |
|
Here we wrap the last statement from the previous step into a function which accesses newAndOldLabel
via its arguments (rather than via local variables) and returns the result of invoking the function itself with newAndOldLabel
.
Third transformation: from anything to functions
1 2 3 4 5 6 7 8 9 10 11 12 |
|
Here we apply a transformation to both the label increment and the leaf-labeling logic. In the first case we wrap the creation of the newAndOldLabel
tuple into a function which rather than grabbing the value of the label from the integer label
argument of the containing function accesses it via its own argument.
In the second case we curry the already existing function so that instead of accepting a tuple argument it accepts a single integer argument and returns a function which accepts another integer argument and finally returns our output tuple. The external function takes care of fixing the current value used to apply a label to the leaf, while the inner function fixes the new, incremented label value as the first item of the resulting tuple.
The rest of the code binds together these two functions in a way which preserves the overall behavior.
Fourth transformation: Bind
As we just mentioned binding, let’s finally abstract this concept into its own Bind
generic function. This function basically encapsulates the logic of composing bumpLabel
and labelLeafCurried
from the previous transformation.
1
|
|
Before looking at its definition let’s observe some of the characteristics of this function. It takes two functions as its arguments and returns a third function, which is then executed passing label
as its only argument. Knowing both the signatures of the input functions and the return type of the containing Label1
function implies that we also know the signature of the Bind
function. Furthermore, being a replacement for the last few lines of code from the previous step means that the body will encapsulate the same logic in some form.
1 2 3 4 5 6 7 8 9 10 11 12 13 |
|
The important thing to note is that we’ve come here by applying some simple and logical transformations to the initial code, and if you try to follow what this code is doing you can realize that the behavior is unchanged. One final step to abstract this even more is to make it generic over the second item of the tuples involved, thus leading to a new signature (the body is unchanged).
1 2 3 |
|
Fifth transformation: S type
The signature of our Bind
function is a mouthful, but we can see that there’s a recurring pattern in there. Specifically, we can see occurrences of Func<int, Tuple<int, X>>
again and again. Let’s encapsulate this into a generic type S<T>
which will have a single member of that function type, arbitrarily named RunWithState
.
1 2 3 4 |
|
Let’s as well rewrite Bind
in terms of S
.
1 2 3 4 5 6 7 8 9 10 11 12 |
|
Now we can apply one additional transformation to our Label1
function by rewriting everything in terms of S
.
1 2 3 4 5 6 7 8 9 10 11 |
|
Sixth transformation: Return
We can finally observe one more pattern in how the labelLeafS
function creates an instance of the S
type. You can see that the only value this instance depends on is the labeled leaf created inside its RunWithState
function, which is of exactly the same type as the generic argument of the instance S<Tree<Labeled<T>>>
. Not that we benefit much from it now, but let’s extract this pattern into a Return
method, which given a T
, returns an instance of S<T>
.
1 2 3 4 5 6 7 |
|
Let’s also rewrite the relevant part of the Label1
function in terms of Return
, additionally inlining the labelLeafS
function.
1 2 3 |
|
Hiding the state
Before we move on and formalize the outcome of this long sequence of transformations in a new, named pattern let’s observe one important thing in the final version of our code. The final statement returns a value of type Tuple<int, Tree<Labeled<T>>>
simply because we execute the RunWithState
function of the S<Tree<Labeled<T>>>
instance returned by Bind
. Executing this function is also the only occasion in which we use the integer label
argument of the containing function Label1
.
The observation is therefore: if rather than returning a Tuple<int, Tree<Labeled<T>>>
we returned S<Tree<Labeled<T>>>
then we wouldn’t need the label
argument in Label1
and shift to the caller the responsibility of executing RunWithState
with the initial label value. In fact Label
, the caller, is already doing that, just more explicitly via function arguments, but we can see that we have an opportunity to change the pattern of providing the initial state by moving from a function with this signature:
1
|
|
To a function with this signature:
1
|
|
Those familiar with functional programming will find that this is vaguely similar to currying, where a function of n arguments is transformed into n functions of 1 argument each.
We now have all the building blocks for rewriting our whole labeling functions Label
and Label1
from a completely new perspective, where the final outcome is the same but the means by which we get there are completely different. This also justifies creating two new functions called MLabel
and MLabel1
, respectively.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
|
Together with Bind
, Return
and the type S<T>
this is the bulk of the code required to label trees in this new way.
Now, although we have gone to great lengths talking about leaves, you might wonder why branches did not deserve their own considerations as I’ve included them in the code above already.
The reason is that as soon as you understand what we did with leaves it should be easy, even easier, to figure out what is happening with branches, as there is no state change involved. Before we try to understand the similarities I suggest you lookup the final version of the code from the previous post.
The analogy should be more visible than it is for leaves, let’s explain in words. We bind a recursive call to MLabel1
fed with the left tree of the branch to a function which takes the left labeled tree and binds another recursive call, this time fed with the right tree, to a function which takes the right labeled tree and returns an instance of the state monad created from a branch whose left tree is accessed via a closure and the right is the argument of the function itself.
Besides the mouthful, did I just say state monad? Ah right, the state monad is the triplet represented by S<T>
, Bind
and Return
, while MLabel
and MLabel1
are just our logic to label a tree in a third way besides manually and functionally: monadically.
As for the state monad, we just invented it, but more on this in the next post.