Trees, like lists, are powerful, ubiquitous data structures. Also, like lists, they are recursively defined.

## A Naive Solution

At first blush, a reasonable way of defining a tree might be:

 1: 2: 3: 4:  type Tree<'a> = { Tag : 'a Children : Tree<'a> list } 

This looks clean and elegant, but we immediately find it restrictive when we want to actually create a nested and branched structure. Unless we always build the tree up from the leaves towards the root, we need more sophisticated ways of walking the structure, and this leads to an interesting problem.

Specifically, if we need to introduce a reference to the parent of a given node, things get out of hand very quickly. This is the equivalent of trying to building a doubly-linked list with immutable data structures - which turns out to be a very difficult problem to solve.

## An Object Lesson

F# is a multi-paradigm language, and we can easily sacrifice immutability to get bi-directional links.

  1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: 30: 31: 32: 33: 34: 35: 36: 37: 38: 39: 40: 41: 42: 43: 44: 45: 46: 47: 48: 49: 50: 51: 52:  module Tree = type NodeId = | Id of string [] module internal Node = [] type NodeBase<'a> () = class member val private _children : NodeBase<'a> list = [] with get, set member internal this.Children with get () = this._children and set ns = this._children <- ns abstract member Level : int end type Root<'a> () = class inherit NodeBase<'a> () override this.Level = 0 end type Node<'a> (id: NodeId, value : 'a option, parent : NodeBase<'a>) = class inherit NodeBase<'a> () member this.Id = id member this.Parent = parent member val Value = value with get, set override this.Level = parent.Level + 1 end type Tree<'a> () = class let root = Root<'a> () member private this.Root = root member val private this.Current = root // Modify the value of the current node member this.ModifyValue f = ... // Push a child on to the current node and make it the current node member this.PushChild name = ... // Add a sibling to the current node and make it the current node member this.AddSibling name = ... // Pop to the parent of this node and make it the current node member this.Pop l = ... // Other modification operations elided... // Visit this tree in pre-order starting at the root member this.VisitPreOrder f = ... // Visit the path to the root from the current node member this.VisitToRoot f = ... end 

We note the following immediately:

1. This is a familiar coding pattern. It's entirely conceivable that you would see similar code in C# or Java.
2. We are taking advantage of F#'s module system to encapsulate the Node and Tree data structures, hiding implementation details, and exposing just the operations we wish to provide
3. We are full-blown object-oriented and mutable at this point, so we are obliged to address several concerns that immutable data structures obviate.
4. The Current member is modified only whilst tree-building, and serves as the starting point for the VisitToRoot operation.
5. VisitPreOrder and VisitToRoot must each have their own way of traversing the tree without modifying either Root or Current. Traversing the tree should necessarily be a read-only operation.

(Pre-Order Traversal is one way to walk a tree from its root - other traversals are also possible.)

This solution may suffice for some cases, but we're going to consider a situation where immutablity is actually something we need for the purposes of the domain. For example, let's say we're building the tree as part of an operation, and we want to ensure that the tree returns to its original state if that operation fails. Keeping track of the tree as it grows, and being able to roll-back to a given state, is not something that is pleasant to do correctly when mutability is in the picture - and doubly so when concurrency and mutability meet as part of the problem.

So we are faced with an interesting quandary - having the ability to VisitToRoot or Pop requires bi-directional linking - which is hard to do with immutable data structures; and having the ability to check-point and roll-back tree-modification operations is difficult to do correctly without immutable data structures! What do we do?

## Painting By Numbers

What if, instead of actually creating and modifying a tree like we were taught in CS 101, we simply keep track of the list of tree-modification instructions as a kind of program? This list would have to support a limited form of mutability in that the only way to modify the list would be to append to it, but the existing contents of the list could never change.

When the tree needs to be visited, we take the list of instructions and interpret them to build a tree using the mutable approach, but since the contents of the list at this point is fixed, the tree that we create from it, is, in some sense, constant even though it contains mutable parts. Indeed, the only operations that the tree needs to support from that point on are (possibly repeated) traversals.

This approach is quite a powerful one, and can be applied to a variety of problems. We could, in fact, generalize the pattern completely in other languages that allow abstraction over types, and this forms the general principle behind what is known as the 'Free Monad'. However, since the concept is quite powerful, we are going to explore the concept concretely, and leave the abstraction of the pattern to Haskell and Scala programmers!

  1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21:  module Tree = // other members elided ... type internal ConstructOperation<'a> = | PushChild of NodeId * 'a option | AddSibling of NodeId * 'a option | ModifyValue of ('a option -> 'a option) | Pop of int option type Tree<'a> () = class member val private ops : ConstructOperation<'a> list = [] with get, set member this.PushChild x = this.ops <- PushChild x :: this.ops; this member this.AddSibling x = this.ops <- AddSibling x :: this.ops; this member this.ModifyValue x = this.ops <- ModifyValue x :: this.ops; this member this.Pop ?x = this.ops <- ConstructOperation.Pop x :: this.ops; this end 

Of course, this is all well and good to build up a list of operations, but this doesn't actually build a tree - and we aren't really able to traverse the tree in any meaningful way.

One sneaky thing we have done is to build the list in reverse. This ensures that each operation is processed in constant-time.

In order to build the tree, we start with a single node, and fold over the list processing each node in turn. We want the result of the fold to be the tree with bi-directional links.

  1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: 30: 31: 32: 33: 34: 35: 36: 37: 38:  let rec applyOp op (node : NodeBase<'a>) : NodeBase<'a> = match op with // Push a child on to the given node and return it | PushChild (x, v) -> let child = Node (x, v, node) node.Children <- upcast child :: node.Children upcast child // Add a sibling to the given node and return it | AddSibling (x, v) -> match node with | :? Node<'a> as n -> let sibling = Node(x, v, n.Parent) n.Parent.Children <- upcast sibling :: n.Parent.Children upcast sibling | _ -> failwith "Cannot add sibling to root" // Modify the value of the given node and return it | ModifyValue f -> match node with | :? Node<'a> as n -> n.Value <- f n.Value upcast n | _ -> failwith "Cannot modify value of root" // Pop (recursively) to an ancestor of this node and return it | ConstructOperation.Pop l -> match node with | :? Node<'a> as n -> let level = l |> Option.defaultValue (n.Level - 1) if (n.Parent.Level = level) then n.Parent elif (n.Level > level) then applyOp (ConstructOperation.Pop (Some level)) (n.Parent) else failwith "How did we get here?" | _ -> failwith "Cannot pop root" 

This function takes an operation op and applies it to a given node, returning a result. The signature of the function has been chosen to align with one of the folding functions, so if we start with a list of operations and a root node, we should be able to build up a full tree from the list, and end up pointing to the current node.

 1:  let current = List.foldBack applyOp ops (upcast (Root())) 

Of course, we will want to also have a handle to the root of the tree, so we can do traversals like a pre-order walk. We can get that by recursively walking up from the current position until we hit a root node.

  1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13:  let visitRoot start = let rec visit (node : NodeBase<'a>) = match node with | :? Node<'a> as n -> seq { yield node yield! visit n.Parent } | :? Root<'a> as r -> seq { yield node } | _ -> Seq.empty visit start let last = visitRoot current |> Seq.last let root = last :?> Root<'a> 

Now, since we have started with a fixed list of operations, the current and root values represent a fixed tree. We can keep this pair in a structure that represents the "tree" version of the operations/

  1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14:  module Tree = // other members elided... type Tree<'a> () = class // other members elided... member this.Build () = TreeCursor<'a> (this.ops) end and TreeCursor<'a> internal (ops) = class let rec applyOp op (node : NodeBase<'a>) : NodeBase<'a> = ... let current = List.foldBack applyOp ops (upcast (Root())) let visitRoot start = ... let root = visitRoot current |> Seq.last :?> Root<'a> end 

While it might seem like a good idea to use a record for this, it might be better to use a class instead, because we don't want to expose the actual current and root members.

In fact, by using appropriate privacy modifiers on the constructor, we can make both the Tree<'a> and TreeCursor<'a> classes totally opaque - hiding the entire data structures within and only providing a clean programmatic interface to them.

Also, since a TreeCursor instance represents a Tree fixed at a given point, the only meaningful thing we can do to a TreeCursor is to traverse it, which leads to a very interesting observation. Since the tree is fixed, its traversals are also fixed. Which means we only have to traverse it once and build up a list of things we saw in the traversal, and then we can play back the traversal operations and process the tree in any way we choose.

  1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: 30: 31: 32: 33: 34: 35: 36: 37: 38:  module Tree = // other members elided... type VisitOperation<'a> = | VisitRoot | VisitChild of NodeId | ReadValue of NodeId * 'a option | Pop type TreeCursor<'a> internal (ops) = class // other members elided... member this.PathToRoot = let readValue (nb : NodeBase<'a>) = match nb with | :? Node<'a> as n -> ReadValue (n.Id, n.Value) | _ -> VisitRoot visitRoot current |> Seq.map readValue member this.PreOrderPath = let rec visit (node : NodeBase<'a>) = seq { match node with | :? Node<'a> as n -> yield ReadValue (n.Id, n.Value) | _ -> yield! Seq.empty for child in node.Children |> List.rev do match child with | :? Node<'a> as c -> yield! seq { yield VisitChild c.Id yield! visit child yield VisitOperation.Pop } | _ -> yield! Seq.empty } visit root end 

In the code snippet above, we have defined two interesting traversals - one starts at the current node and walks back to the root, and the other starts at the root and traverses the whole tree "pre-order".

Each traversal results in a fixed sequence of VisitOperation<'a> for future use.

Tree traversals are best represented as folds. This is actually a much broader topic of discussion, but folding over trees can build all kinds of other data structures - including other trees, and allow for tree-rewriting.

In our case, we can traverse the tree, and then fold over it, as follows:

  1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15:  module Tree = // other members elided... type TreeCursor<'a> internal (ops) = class // other members elided... member this.VisitRoot<'o> (processor : 'o -> VisitOperation<'a> -> 'o) (seed : 'o) = Seq.fold processor seed this.PathToRoot member this.VisitPreOrder<'o> (processor : 'o -> VisitOperation<'a> -> 'o) (seed : 'o) = Seq.fold processor seed this.PreOrderPath end 

And there we have it.

We have implemented a traditional tree which affords the benefits of immutable data structures (like check-pointing), whilst allowing for efficient tree traversals using parent-pointers, and functionally separating out the traversal concerns from the tree-node processing concerns.

And in less than 130 lines of code!

## Soup's Up!

Let's build an example to see how this can be used:

  1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13:   let t = Tree () let t = t.PushChild (Id "a", None) let t = t.PushChild (Id "b", None) let t = t.PushChild (Id "b1", None) let t = t.AddSibling (Id "b2", None) let t = t.Pop () let t = t.AddSibling (Id "c", None) let t = t.PushChild (Id "c1", None) let t = t.AddSibling (Id "c2", None) let t = t.Pop () let t = t.Pop () let t = t.PushChild (Id "d", None) 

At this point, we have represented the building of a nested structure in an idiomatic manner, but the internal representation is simply a list of operations describing the building of the structure, rather than the structure itself.

We can then create the tree structure - with bi-directional links - at a fixed point in time, allowing us to traverse the tree.

 1:   let tc = t.Build () 

Now let's write a function to process each node as we encounter it in the traversal.

The signature of this function matches the signature used by a folding function, which allows us to fold over the list of visit operations and build up a composite value.

In our case, we want to build up a string containing a textual representation of the path in the traversal.

 1: 2: 3: 4: 5: 6: 7: 8:   let printNode res curr = let c = match curr with | ReadValue (id, vo) -> sprintf "%s%s" id.unapply (vo |> Option.map (sprintf " (%A)") |> Option.defaultValue "") | VisitRoot -> "|" | VisitChild id -> "↓" | VisitOperation.Pop -> "↑" sprintf "%s %s" res c 

For a given visit operation, we compute a glyph describing the traversal ('up' and 'down' for Pop and Push), or the node's 'id' and 'value'.

We tack this value at the end of the string which represents the path taken so far.

Finally, we pass the printing function to the visitor methods instance

 1: 2:   printfn "Path to root : %s" <| tc.VisitRoot printNode "" printfn "Pre-Order walk: %s" <| tc.VisitPreOrder printNode "" 

## Conclusion

This method of description and deferred interpretation is a very powerful technique in functional programming. In our case, it allowed us to separate out concerns between tree creation and tree traversal, and appropriate the benefits of immutability (for tree creation) and mutability (for traversals) without sacrificing cleanliness or readability. In fact, we have hoisted all the mechanics of traversal away from the user, and visiting the tree is reduced to simply providing a folding function.

The concept is well worth learning, as in other languages with higher-kinded types, a lot of mechanical work is lifted by these abstractions. For example, the IO monad in Haskell, and the Free Monad in Scala and Haskell both use and amplify this concept.

Keep typing!

type Tree<'a> =
{Tag: 'a;
Children: Tree<'a> list;}

Full name: funwithtrees.Tree<_>
Tree.Tag: 'a
Tree.Children: Tree<'a> list
type 'T list = List<'T>

Full name: Microsoft.FSharp.Collections.list<_>
type NodeId = | Id of string

Full name: funwithtrees.Tree.NodeId
union case NodeId.Id: string -> NodeId
Multiple items
val string : value:'T -> string

Full name: Microsoft.FSharp.Core.Operators.string

--------------------
type string = System.String

Full name: Microsoft.FSharp.Core.string
Multiple items
type AutoOpenAttribute =
inherit Attribute
new : unit -> AutoOpenAttribute
new : path:string -> AutoOpenAttribute
member Path : string

Full name: Microsoft.FSharp.Core.AutoOpenAttribute

--------------------
new : unit -> AutoOpenAttribute
new : path:string -> AutoOpenAttribute
Multiple items
type AbstractClassAttribute =
inherit Attribute
new : unit -> AbstractClassAttribute

Full name: Microsoft.FSharp.Core.AbstractClassAttribute

--------------------
new : unit -> AbstractClassAttribute
Multiple items
type internal NodeBase<'a> =
new : unit -> NodeBase<'a>
abstract member Level : int
member Children : NodeBase<'a> list
member private _children : NodeBase<'a> list
member Children : NodeBase<'a> list with set
member private _children : NodeBase<'a> list with set

Full name: funwithtrees.Tree.Node.NodeBase<_>

--------------------
internal new : unit -> NodeBase<'a>
val set : elements:seq<'T> -> Set<'T> (requires comparison)

Full name: Microsoft.FSharp.Core.ExtraTopLevelOperators.set
val this : NodeBase<'a>
member internal NodeBase.Children : NodeBase<'a> list with set

Full name: funwithtrees.Tree.Node.NodeBase1.Children
property NodeBase._children: NodeBase<'a> list
val ns : NodeBase<'a> list
abstract member internal NodeBase.Level : int

Full name: funwithtrees.Tree.Node.NodeBase1.Level
Multiple items
val int : value:'T -> int (requires member op_Explicit)

Full name: Microsoft.FSharp.Core.Operators.int

--------------------
type int = int32

Full name: Microsoft.FSharp.Core.int

--------------------
type int<'Measure> = int

Full name: Microsoft.FSharp.Core.int<_>
Multiple items
type internal Root<'a> =
inherit NodeBase<'a>
new : unit -> Root<'a>
override Level : int

Full name: funwithtrees.Tree.Node.Root<_>

--------------------
internal new : unit -> Root<'a>
val this : Root<'a>
override internal Root.Level : int

Full name: funwithtrees.Tree.Node.Root1.Level
Multiple items
type internal Node<'a> =
inherit NodeBase<'a>
new : id:NodeId * value:'a option * parent:NodeBase<'a> -> Node<'a>
member Id : NodeId
override Level : int
member Parent : NodeBase<'a>
member Value : 'a option
member Value : 'a option with set

Full name: funwithtrees.Tree.Node.Node<_>

--------------------
internal new : id:NodeId * value:'a option * parent:NodeBase<'a> -> Node<'a>
val id : NodeId
val value : 'a option
type 'T option = Option<'T>

Full name: Microsoft.FSharp.Core.option<_>
val parent : NodeBase<'a>
val this : Node<'a>
member internal Node.Id : NodeId

Full name: funwithtrees.Tree.Node.Node1.Id
member internal Node.Parent : NodeBase<'a>

Full name: funwithtrees.Tree.Node.Node1.Parent
override internal Node.Level : int

Full name: funwithtrees.Tree.Node.Node1.Level
property NodeBase.Level: int
Multiple items
type Tree<'a> =
new : unit -> Tree<'a>
member AddSibling : x:'a0 -> Tree<'a>
member ModifyValue : x:'a0 -> Tree<'a>
member Pop : ?x:'a0 -> Tree<'a>
member PushChild : x:'a0 -> Tree<'a>
member private Root : Root<'a>
member private ops : obj list
member private ops : obj list with set

Full name: funwithtrees.Tree.Tree<_>

--------------------
new : unit -> Tree<'a>
val root : Root<'a>
val this : Tree<'a>
Multiple items
member private Tree.Root : Root<'a>

Full name: funwithtrees.Tree.Tree1.Root

--------------------
type internal Root<'a> =
inherit NodeBase<'a>
new : unit -> Root<'a>
override Level : int

Full name: funwithtrees.Tree.Node.Root<_>

--------------------
internal new : unit -> Root<'a>
member Tree.ModifyValue : x:'a0 -> Tree<'a>
member Tree.PushChild : x:'a0 -> Tree<'a>
member Tree.AddSibling : x:'a0 -> Tree<'a>
member Tree.Pop : ?x:'a0 -> Tree<'a>
member Tree.PushChild : x:'a0 -> Tree<'a>

Full name: funwithtrees.Tree.Tree1.PushChild
val x : 'a
property Tree.ops: obj list
member Tree.AddSibling : x:'a0 -> Tree<'a>

Full name: funwithtrees.Tree.Tree1.AddSibling
member Tree.ModifyValue : x:'a0 -> Tree<'a>

Full name: funwithtrees.Tree.Tree1.ModifyValue
member Tree.Pop : ?x:'a0 -> Tree<'a>

Full name: funwithtrees.Tree.Tree`1.Pop
val x : 'a option
val applyOp : op:'a -> node:'b -> 'c

Full name: funwithtrees.applyOp
val op : 'a
val node : 'b
val failwith : message:string -> 'T

Full name: Microsoft.FSharp.Core.Operators.failwith
module Option

from Microsoft.FSharp.Core
union case Option.Some: Value: 'T -> Option<'T>
val current : obj

Full name: funwithtrees.current
Multiple items
module List

from Microsoft.FSharp.Collections

--------------------
type List<'T> =
| ( [] )
| ( :: ) of Head: 'T * Tail: 'T list
interface IEnumerable
interface IEnumerable<'T>
member GetSlice : startIndex:int option * endIndex:int option -> 'T list
member IsEmpty : bool
member Item : index:int -> 'T with get
member Length : int
member Tail : 'T list
static member Cons : head:'T * tail:'T list -> 'T list
static member Empty : 'T list

Full name: Microsoft.FSharp.Collections.List<_>
val foldBack : folder:('T -> 'State -> 'State) -> list:'T list -> state:'State -> 'State

Full name: Microsoft.FSharp.Collections.List.foldBack
val visitRoot : start:'a -> 'b

Full name: funwithtrees.visitRoot
val start : 'a
val visit : ('c -> 'd)
val node : 'c
Multiple items
val seq : sequence:seq<'T> -> seq<'T>

Full name: Microsoft.FSharp.Core.Operators.seq

--------------------
type seq<'T> = System.Collections.Generic.IEnumerable<'T>

Full name: Microsoft.FSharp.Collections.seq<_>
module Seq

from Microsoft.FSharp.Collections
val empty<'T> : seq<'T>

Full name: Microsoft.FSharp.Collections.Seq.empty
val last : obj

Full name: funwithtrees.last
val last : source:seq<'T> -> 'T

Full name: Microsoft.FSharp.Collections.Seq.last
val root : obj

Full name: funwithtrees.root
Multiple items
module Tree

from funwithtrees

--------------------
type Tree<'a> =
{Tag: 'a;
Children: Tree<'a> list;}

Full name: funwithtrees.Tree<_>
val map : mapping:('T -> 'U) -> source:seq<'T> -> seq<'U>

Full name: Microsoft.FSharp.Collections.Seq.map
val rev : list:'T list -> 'T list

Full name: Microsoft.FSharp.Collections.List.rev
val fold : folder:('State -> 'T -> 'State) -> state:'State -> source:seq<'T> -> 'State

Full name: Microsoft.FSharp.Collections.Seq.fold
union case Option.None: Option<'T>
val id : x:'T -> 'T

Full name: Microsoft.FSharp.Core.Operators.id
val sprintf : format:Printf.StringFormat<'T> -> 'T

Full name: Microsoft.FSharp.Core.ExtraTopLevelOperators.sprintf
val map : mapping:('T -> 'U) -> option:'T option -> 'U option

Full name: Microsoft.FSharp.Core.Option.map
val printfn : format:Printf.TextWriterFormat<'T> -> 'T

Full name: Microsoft.FSharp.Core.ExtraTopLevelOperators.printfn