i-think Twenty-Two

Now with more coherency.

Boosting productivity with F# discriminated unions and pattern matching

| Comments

So I’ve been working on some on some Project Euler problems to make sure I’m as ready as possible for some upcoming job interviews. If you aren’t already aware of Project Euler go and check it out.

I have been using a mixture of C# and F# to solve the problems. I have been wanting to expand my F# knowledge for some time now, so my strategy so far is to re-implement re-usable components that I had written in C# into F#. Because I’m a huge fan of recursion many of these re-implementations were extremely straightforward as I was already following a common functional pattern.

Eventually I came across Problem 57. The problem involves evalutating an expanding formula into the appropriate fractional result. Nb: Code found in this post provides a partial solution to this problem.

These are the first four iterations as described in the problem:

1 + 1/2 = 3/2 = 1.5
1 + 1/(2 + 1/2) = 7/5 = 1.4
1 + 1/(2 + 1/(2 + 1/2)) = 17/12 = 1.41666...
1 + 1/(2 + 1/(2 + 1/(2 + 1/2))) = 41/29 = 1.41379...

So I am essentially working with a tree of operations. Each iteration can be based on the one preceding it. The problem requires that each of these is able to be reduced to a single fraction. Therefore the pieces that I need are:

  • A data structure to hold the expression
  • A function to reduce the expression to a fraction
  • A function to determine the next iteration of the expression

I won’t be looking at the last of these three pieces here (I’ll leave that as an exercise for you).

Also, it would be kind of nice if I had a function to show me a string representation of my expression so that I could easily check I’m on the right track (because that is generally easier than trying to understand an object graph plus I already had a string representation to target).

So when it came to picking the data structure I was reminded of discriminated unions in F#. These allowed you to define different but related structures. The relationship could then be used to enable pattern matching to deal with a specific case.

The expressions that I needed to support were quite simple, add, divide and of course a constant value. In this case I’ll throw caution to the wind and use an integer.

Problem57.fs (excerpt)
type Expression =
  | Constant of int
  | Add of Expression * Expression
  | Divide of Expression * Expression

Here my constant values are stored in a Constant object which is itself an expression. Add and Divide are both comprised of a pair of expressions. Expression is a recursive type and can either have constant leaf nodes of branches which perform an operation.

I could have made a simplification as in the values I was constructing the left operand of Add and Divide would always be a constant, but this more flexible structure generally felt better.

I could now easily construct the first iteration like so:

Problem57.fs (excerpt)
let root = Add(Constant(1), Divide(Constant(1), Constant(2)))

This is rather verbose and difficult to read and is only going to become more difficult as we increment the value. So the next thing I’m going to do is create a function which will create a string version of the expression so that I can review it by hand.

Now when I am thinking about the string function (I’m going to call it stringify) I can think of the various types:

  • Constant will just be a ToString() call on the value.
  • Add and Divide will be recursive and call the stringify function of both expressions and separate the results with the appropriate operator.
  • Also we want to surround the denominator in brackets if the denominator is not just a constant.

Putting these into practice becomes extremely straightforward (ignoring that I haven’t used a StringBuilder).

Problem57.fs (excerpt)
let rec stringify v =
  match v with
  | Constant(x) -> x.ToString()
  | Add(x, y) -> (stringify x) + "+" + (stringify y)
  | Divide(x, Constant(y)) -> (stringify x) + "/" + y.ToString()
  | Divide(x, y) -> (stringify x) + "/(" + (stringify y) + ")"

What struck me as awesome was the amazing simplicity of this code. Using pattern matching I was able to easily address my special case for including brackets and give the other cases in a straightforward manner as well. The F# compiler even helped by ensuring that I matched all possible patterns (it uses wizardry known as type inference to determine that v is an Expression type).

I could then execute my stringify function on root to check that everything worked as intended. The delightful part at this point was that it did. The next step was to reduce one of these expressions down to its smallest possible form. This too would prove to be trivial. Here’s the code I came up with:

Problem57.fs (excerpt)
let rec reduce v =
  match v with
  | Constant(c) -> Constant(c)
  | Add(Constant(a), Constant(b)) -> Constant(a + b)
  | Divide(Constant(a), Divide(Constant(b), Constant(c))) -> Divide(Constant(a * c), Constant(b))
  | Add(Constant(a), Divide(Constant(b), Constant(c))) -> Divide(Constant((a * c) + b), Constant(c))
  | Add(a, b) -> Add(reduce a, reduce b) |> reduce
  | Divide(Constant(_), Constant(_)) -> v
  | Divide(a, b) -> Divide(reduce a, reduce b) |> reduce

Here I looked at the sort of smaller patterns I was dealing with and repeatedly called reduce until I achieved a desirable pattern. Again, the simplicity of each pattern and its resulting reduction really drove home the incredible benefit provided by pattern matching.

Pattern matching is a technique that I really miss when I use languages like C#. For this reason alone I would really like to see F# become more widely adopted so it may be finally possible to use it in projects where we are currently confined to a single language.

I have written a number of code generating tools over the years, especially focussing on the generation of SQL from some sort of structure. Looking at how easily I was able to solve the problem above using F# I wish that I had some of these techniques at my disposal then.

Finally, I made a slight modification in the code above which will prevent it from being used verbatim to solve the Euler problem.