RX Framework for manipulating real-life data

As I said in my previous blog post, one of my previous roles involved performing real-time analysis of incoming sensor and telemetry data: combining streams of data arriving at different rates, allowing the user to write their own calculation operations on the data, and displaying the results to the user. I wondered what this would have been like to implement using RX.

As an aside, it may be that the new TPL Dataflow library would be more suitable for this – I am a bit confused about the overlap between so many different technologies, such as RX and the forthcoming C# 5.0 async, or whether TPL Dataflow replaces the Actor model in Microsoft Axum or F#.

Anyway, back to the plot…

Imagine that you have to handle incoming streams of sensor data. This doesn’t have to be hardware sensor data – it could be a phone’s GPS or accelerometer data, or incoming streams of trade data. In the old style of programming there would be a callback or event handler that would be triggered when the data arrived, and performing aggregate calculations over multiple data streams would involve convoluted code with locks. With RX there’s none of that, just a simple LINQ query to combine the data.

Imagine that you had two streams of position data (here the observables are generated, but they could just as easily be observables created from events – the usage is identical, which is beautiful!):

var positions = Observable.Generate(0.0, i => i < 10.0, i => i + 1.0, i => i);

var positions2 = Observable.Generate(10.0, i => i >= 0.0, i => i - 1.0, i => i);

An operator to calculate the separation is as simple as:

var separation = positions.Zip(positions2, (i, j) => new { i, j }).Select(pos => Math.Abs(pos.j - pos.i));

That’s really nice and succinct, and the logic is all in one place (instead of being spread out amongst different callbacks).

As well as calculating the difference between two positions, you might want to find the rate of change of some input data:

static void Main(string[] args)
{
    const double accelerationGravity = 9.81;

    var positions = Observable.Generate(0.0, i => i < 10.0, i => i + 1.0, i => accelerationGravity * i * i / 2.0);

    var velocity = positions.DifferentiateWithTime();
    Console.WriteLine("-- velocities --");
    velocity.Subscribe(i => { Console.WriteLine(i); });

    var acceleration = velocity.DifferentiateWithTime();
    Console.WriteLine("-- accelerations --");
    acceleration.Subscribe(i => { Console.WriteLine(i); });
}

Again, this is nice. The acceleration is correctly written out for every value as 9.81 (well, within rounding errors).

DifferentiateWithTime is implemented as:

public static partial class ObservableEx2
{
    public static IObservable<double> DifferentiateWithTime(this IObservable<double> obs)
    {
        return (from j in obs.WindowWithCount(2) select j[1] - j[0]);
    }

    // ...
}

In this case we’re taking the current and previous values and calculating the difference between them – this of course assumes that we are receiving values once per second. We could include the time of each sample in a Tuple, and use the Where operator to ignore values which occur at the same time (to avoid divide-by-zero errors).
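As a rough sketch of that idea (my own code, not from the original post), an overload taking a timestamped stream (which source.Timestamp() produces) could divide by the actual elapsed time, using Where to skip pairs of samples with identical timestamps:

public static IObservable<double> DifferentiateWithTime(this IObservable<Timestamped<double>> obs)
{
    return from pair in obs.WindowWithCount(2)
           let dt = (pair[1].Timestamp - pair[0].Timestamp).TotalSeconds
           where dt > 0.0 // ignore samples at the same instant to avoid divide-by-zero
           select (pair[1].Value - pair[0].Value) / dt;
}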

WindowWithCount I found on StackOverflow. It produces a rolling window over an observable’s values, which can be used for operations such as a moving average.

public static IObservable<IList<TSource>> WindowWithCount<TSource>(this IObservable<TSource> source, int count)
{
    Contract.Requires(source != null);
    Contract.Requires(count >= 0);

    return source.Publish(published =>
        from x in published
        from buffer in published.StartWith(x).BufferWithCount(count).Take(1)
        where buffer.Count == count
        select buffer
    );
}
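As a quick illustration of that (my own example, not from the StackOverflow answer), a five-sample moving average over the positions stream could be written as:

var movingAverage = positions.WindowWithCount(5).Select(window => window.Average());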

This is cool, but I am concerned that we’re returning a new list for each sample, even though we’re only interested in two values – this could hammer the garbage collector at high data rates. Section 6.1 of the RX Design Guidelines does say that new operators should be composed of existing operators, unless performance is a concern – this is something I may get around to investigating in a future blog post.
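For what it’s worth, here’s a rough sketch of what a dedicated pairwise operator might look like (my own untested code, not from the guidelines), avoiding the per-sample list allocation:

public static IObservable<TResult> Pairwise<TSource, TResult>(
    this IObservable<TSource> source, Func<TSource, TSource, TResult> selector)
{
    return Observable.Create<TResult>(observer =>
    {
        bool hasPrevious = false;
        TSource previous = default(TSource);
        return source.Subscribe(
            current =>
            {
                // only emit once we have a previous value to pair with
                if (hasPrevious)
                {
                    observer.OnNext(selector(previous, current));
                }
                previous = current;
                hasPrevious = true;
            },
            observer.OnError,
            observer.OnCompleted);
    });
}

The differentiation above could then be written as obs.Pairwise((previous, current) => current - previous).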

Reactive Framework thoughts

I have been very slack with blogging recently, as I was too busy with work and life. I’ve recently moved jobs, and even though I’m just as busy as I was in my previous job, I have a long train commute so I thought that I may as well do something constructive with my time. I figured that blogging is as good as anything else.

In my new role, I’m working on a project that makes heavy use of the RX Framework, and I’m getting up to speed as fast as I can. It’s actually really fun – the way of approaching a problem is different to the old method I’m used to of having async callbacks everywhere, it’s much more functional, and at the start I felt a little bit like my brain had been turned inside out.

The project is using RX for handling the streaming of trading data, and for orchestrating asynchronous operations. The code base is a lot more succinct, understandable and elegant than it would be without RX (though I’m finding it takes more time to think through how to write the RX operations). There are no .NET events throughout the codebase; all events are exposed as IObservables, allowing for composition with all sorts of async operations.
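To illustrate what I mean (this is my own sketch rather than code from the project, and Trade is just a made-up type), something that would previously have raised a .NET event instead exposes an IObservable, here via a Subject<T>:

private readonly Subject<Trade> trades = new Subject<Trade>();

// exposed instead of a public .NET event
public IObservable<Trade> Trades
{
    get { return trades.AsObservable(); }
}

// producers push into the subject rather than raising an event:
// trades.OnNext(incomingTrade);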

I came across the following blog post, where the author was performing calculations on trade data, and it reminded me how in a previous job I was manipulating streams of data from hardware devices, so I thought I’d go ahead and have a bit of a play with this.

http://eprystupa.wordpress.com/2009/12/20/running-vwap-calculation-using-reactive-extensions-for-net/

I’ll play with this in my next blog post(s).

Klapa Solin 10th anniversary

I was privileged to be able to go to Klapa Solin’s 10th anniversary. Klapa is traditional a cappella singing, and the tradition is alive and well in Dalmatia.

The event was set in the striking Roman ruins of Solin (Salona), and it couldn’t have made a more perfect backdrop to the event.

What’s more – even though tourists would pay handsomely for the night, it was put on especially just for friends of the singing group, with a feast of seafood (black squid risotto, fried anchovies, salata od hobotnice (octopus salad), cod soup) with, of course, plenty of wine to wash it down. The night was totally authentic, everyone was there to enjoy the music, ambience and the food, and life in general.

In between each musical performance were recitals where they spoke of tradition and love of the land. Especially interesting was Pepe Kalafot, and his funny Dlaka recital (I followed enough of this to get the gist 😉 – he has a strong dialect, I think from Komiža on the island of Vis, and from his intonation it sounds almost as if he’s speaking Italian!)

Even more special than the live performance were the impromptu performances afterwards in the beautiful terrace gardens of Salona: full of food and wine, groups would break into song. All in all, it was a magical evening.

These videos don’t do the night justice, but they might give an idea:
Video 1
Video 2
Video 3

Theory driven development using Microsoft Pex Explorer

EDIT: NOTE that some of the contents of this post have recently been edited thanks to great feedback from Peli de Halleux (http://blog.dotnetwiki.org/). He pointed me at a great post on Test Driven Development with Parameterized Unit Tests that he’d written here: http://blog.dotnetwiki.org/TestDrivenDevelopmentWithParameterizedUnitTests.aspx

I’ve been playing with Microsoft Pex to try out some theory-driven development. I’ve found it a great tool – actually much better than I was hoping, it’s found more bugs, and test cases, than using just unit testing or TDD (Test-Driven Development). It’s test-driven development on steroids!

If you’ve been reading this blog, you’ll remember I blogged a while ago about TDD using theories, and what theories are – with the following posts:

http://taumuon-jabuka.blogspot.com/2008/03/comparing-theories-to-more-traditional.html

http://taumuon-jabuka.blogspot.com/2008/03/theories.html

I mentioned that theories could have data points generated by some sort of explorer, and Jeff Brown (http://blog.bits-in-motion.com/) replied in one of the comments that maybe I could use Microsoft Pex to do so, and I’ve found it to be perfect! A lot more powerful than I’d hoped.

This blog post will try to detail my experiences, but by far the best way to get to grips with this is to download Pex and try writing some theories!

In the example below, I’ll create a queue with the following operations:

Enqueue()

Dequeue()

Capacity

Count

It will have an initial capacity, and it will double in size once its capacity is exceeded (similar to the interface on .NET lists, for instance).

I’ll quickly run through a few tests written via TDD (the red-bar, green-bar, refactor cycle) before I get onto writing some theories.

First, in MSTest, we write a test.

        [TestMethod]
public void Test_NewlyConstructedQueue_CountIsZero()
{
Queue<int> queue = new Queue<int>();
Assert.AreEqual<int>(0, queue.Count);
}

Then we get the test to compile by implementing our queue with a Count property, and get a red bar by having the Count property return -1.

We then fix the test to get a green bar by letting Count return zero:

    public class Queue<T>
{
public int Count
{
get { return 0; }
}
}

We then introduce a test that the default constructor creates a queue with a default capacity of 8.

        [TestMethod]
public void Test_ConstructDefaultConstructor_InitialCapacityIsEight()
{
Queue<int> queue = new Queue<int>();
Assert.AreEqual<int>(8, queue.Capacity);
}

The implementation is pretty simple:

        private const int initialCapacity = 8;
public int Capacity
{
get { return initialCapacity; }
}

Now, we want to add a Dequeue method, and test that calling it on an empty queue throws an InvalidOperationException.

        [TestMethod]
[ExpectedException(typeof(InvalidOperationException))]
public void Test_NewlyConstructedQueue_DequeueThrowsException()
{
Queue<int> queue = new Queue<int>();
int i = queue.Dequeue();
}

public T Dequeue()
{
throw new InvalidOperationException("Should not call Dequeue on an empty queue");
}

We’re going to come back to look at this test shortly.

We add a test for Enqueue, to check its count is 1.

        [TestMethod]
public void Test_NewlyConstructedQueue_Enqueue_CountIsOne()
{
Queue<int> queue = new Queue<int>();
queue.Enqueue(1);
Assert.AreEqual(1, queue.Count);
}

Add an Enqueue method that does nothing, to get this to compile and red-bar. We actually have to add some implementation now to get this to pass.

public class Queue<T>
{
    private const int initialCapacity = 8;
    private T[] items = null;

    public int Count
    {
        get { return items == null ? 0 : 1; }
    }

    public int Capacity
    {
        get { return initialCapacity; }
    }

    public T Dequeue()
    {
        throw new InvalidOperationException("Should not call Dequeue on an empty queue");
    }

    public void Enqueue(T item)
    {
        items = new T[initialCapacity];
        items[0] = item;
    }
}

There are lots of things wrong with this implementation – there are some obvious bugs: we’re recreating the array on each Enqueue, the array is a fixed size with no capability to grow, and the Count implementation is weird. This obviously needs implementing and refactoring, but it passes all our tests, and that’s TDD – we don’t go adding code until we drive it by adding more tests. So, let’s add another couple of tests to drive the implementation:

        [TestMethod]
public void Test_NewlyConstructedQueue_Enqueue_DequeueReturnsItem()
{
Queue<int> queue = new Queue<int>();
queue.Enqueue(1);
int item = queue.Dequeue();
Assert.IsTrue(1 == item);
}

And we can get the tests to pass with the following:

        public T Dequeue()
{
if (items == null)
{
throw new InvalidOperationException("Should not call Dequeue on an empty queue");
}
return items[0];
}

NOTE: We’re really taking baby steps here, to ensure that we’re not writing any untested implementation.

Now let’s add a test for adding and removing more than one item:

        [TestMethod]
public void Test_NewlyConstructedQueue_EnqueueTwoItems_DequeueReturnsFirstItem()
{
Queue<int> queue = new Queue<int>();
queue.Enqueue(1);
queue.Enqueue(2);
int item = queue.Dequeue();
Assert.IsTrue(1 == item);
}

To get this to pass we can more properly implement our queue.

public class Queue<T>
{
    private int tailIndex = 0; // insertion
    private int headIndex = 0; // removal
    private const int initialCapacity = 8;
    private T[] items = new T[initialCapacity];

    public int Count
    {
        get { return tailIndex - headIndex; }
    }

    public int Capacity
    {
        get { return initialCapacity; }
    }

    public T Dequeue()
    {
        if (Count == 0)
        {
            throw new InvalidOperationException("Should not call Dequeue on an empty queue");
        }
        return items[headIndex--];
    }

    public void Enqueue(T item)
    {
        items[tailIndex++] = item;
    }
}

To get this to work we had to add more and more example tests – we were saying quite general statements, but expressing this in more and more specific example cases, adding functionality as we went. This process is called triangulation (as I mentioned in an earlier blog post).

There are still a number of limitations here: if we exceed 8 items in the queue, we’ll get an IndexOutOfRange exception.

We could quite easily write a test for adding more values to drive the auto-growth – something like the sketch below.
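For example (my own sketch, not from the original post):

[TestMethod]
public void Test_EnqueueMoreItemsThanInitialCapacity_CountIsNine()
{
    Queue<int> queue = new Queue<int>();
    for (int i = 0; i < 9; i++)
    {
        queue.Enqueue(i);
    }
    Assert.AreEqual<int>(9, queue.Count);
}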

Also, a more insidious bug is that our values can migrate along the array – we’re not moving items back towards the start of the array, so we could run out of space even with fewer than 8 values – a much harder bug to find. This is a subtle bug, and finding it using a unit test would be quite tricky if you didn’t see it up front. Using Pex to automatically generate the theory validation is great for these types of bugs!

Let’s see if we could have written our test in such a way that this problem may have been detected.

        [PexMethod]
public void Theory_CanAddRemoveAnyNumberOfItemsToNewlyConstructedQueue_IsEmptyAfterwards(int num)
{
Queue<int> queue = new Queue<int>();
for (int i = 0; i < num; i++)
{
queue.Enqueue(i);
}
for (int i = 0; i < num; i++)
{
queue.Dequeue();
}
Assert.AreEqual<int>(0, queue.Count);
}

Use the menu option “Run Pex Explorations”.

4 explorations (unit tests) are generated for this, passing in num values of 0, 1073741824, 2 and 1.

0 passes.

1073741824 throws an IndexOutOfRangeException

2 throws an IndexOutOfRangeException

1 throws an AssertFailedException

This straight away shows that the code is more bugged than I thought. This would probably have been spotted on the introduction of further tests (for instance, testing adding and removing three items from the queue), but it already highlights the power of theories.

Note: This could have been written as a unit test, but would have had to iterate over all possible integers, instead of just the 4 values that Pex provided.

Passing in a value of 1: we had a similar test to this, enqueueing and dequeueing a single item, but there we tested that the item dequeued was the same as that enqueued – we didn’t test the Count property.

This bug is due to the fact that headIndex should have been incremented, instead of decremented, on dequeuing.
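In code, the change in Dequeue is just:

return items[headIndex++]; // was: return items[headIndex--];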

Fixing that and running our tests (including the previously generated ones – without re-exploring) results in 9 of the 10 tests passing. The failing case is where we’re passing in a large int value.

This is obviously due to the fact that we’ve got a fixed size array, and it doesn’t auto-grow.

Let’s add an assumption to get our test to pass, before we go on to add the auto-grow functionality.

PexAssume.IsTrue(num <= queue.Capacity, "num is less than capacity");

In theories, assumptions allow us to state the input data for which the theory is valid.

We need to re-run the explorations so we’re not calling our test with data points that fail the assumption.

Before we get our queue to expand its capacity, let’s change our theory to see if it can detect the problem of running out of space after adding and removing items, even before the capacity is exceeded.

I originally wrote my theory in the format below, but see the second format and the comments that follow it, based on feedback from Peli de Halleux.

        [PexMethod]
public void Theory_CanAddRemoveAnyNumberOfItemsAnyEmptyQueue_IsEmptyAfterwards(
[PexAssumeNotNull]Queue<int> queue, int num)
{
PexAssume.IsTrue(num <= queue.Capacity, "num is less than capacity");
PexAssume.IsTrue(queue.Count == 0);
for (int i = 0; i <= num; i++)
{
queue.Enqueue(i);
}
for (int i = 0; i <= num; i++)
{
queue.Dequeue();
}
Assert.AreEqual<int>(0, queue.Count);
}

This was rewritten to:

        [PexMethod]
[PexGenericArguments(typeof(int))]
public void Theory_CanAddRemoveAnyNumberOfItemsAnyEmptyQueue_IsEmptyAfterwards<T>(
[PexAssumeUnderTest]Queue<T> queue, [PexAssumeNotNull]T[] values)
{
PexAssume.IsTrue(values.Length <= queue.Capacity, "num is less than capacity");
PexAssume.IsTrue(queue.Count == 0);
for (int i = 0; i < values.Length; i++)
{
queue.Enqueue(values[i]);
}
for (int i = 0; i < values.Length; i++)
{
queue.Dequeue();
}
Assert.AreEqual<int>(0, queue.Count);
}

The PexAssumeNotNullAttribute states that the theory is not valid for null values of that parameter, but the PexAssumeUnderTest, as well as stating that the parameter cannot be null, gives Pex more hints for tuning its search strategies.

This is a much more general theory – we’re now saying given any empty queue, it will be empty after enqueueing and dequeueing.

We can instruct Pex how to construct a queue, that has had items added and removed:

        [PexFactoryMethod(typeof(Queue<int>))]
public static Queue<int> Create(int numberInitialAdds,
int numberInitialRemoves)
{
Queue<int> queue = new Queue<int>();
if (numberInitialAdds <= 0 || numberInitialRemoves < 0)
{
return queue;
}
for (int i = 0; i < numberInitialAdds; ++i)
{
queue.Enqueue(i);
}
int numberToRemove = numberInitialAdds > numberInitialRemoves ?
numberInitialRemoves : numberInitialAdds;
for (int i = 0; i < numberToRemove; ++i)
{
queue.Dequeue();
}
return queue;
}

Now, running the theory gives an IndexOutOfRangeException after adding and removing eight items to the queue in the factory, and then subsequently adding and removing one item. This should work, as we’re not exceeding the capacity. The failure is because headIndex and tailIndex only ever move forward along the array, so that ninth Enqueue writes past the end of the array even though the queue never holds more than eight items at once.

This is powerful – if we were unaware of the existence of this bug it’s difficult to imagine how we would have come across it in TDD.

Let’s fix the failing theory before we remove our assumptions.

        public T Dequeue()
{
if (Count == 0)
{
throw new InvalidOperationException("Should not call Dequeue on an empty queue");
}
T item = items[headIndex++];
if (headIndex == tailIndex)
{
headIndex = tailIndex = 0;
}
return item;
}

Now, the tests run. Let’s go on to address the auto-grow issue.

Now, remove the assumption that values.Length <= queue.Capacity and re-explore this theory.

We get an IndexOutOfRangeException when attempting to add an array of 15 items to the queue.

We can fix this by implementing auto-grow in the queue:

        public void Enqueue(T item)
{
if (tailIndex == items.Length - 1)
{
items = new T[items.Length * 2];
}
items[tailIndex++] = item;
}

We can now re-run our tests and they all pass.

Now, deleting all the generated unit tests in the class and re-running the Pex Explorations gives “Path bounds exceeded” messages – the time is being spent in the factory adding and removing items to the queue, and in exploring the paths of auto-growing the array.

Instead of getting our queue into the required states by manipulating it via its public interface, we can more-directly construct it into any state.

Remove the factory, and add an overloaded constructor. 

        public Queue(T[] items, int headIndex, int tailIndex)
{
if ((uint)tailIndex >= (uint)(items.Length))
throw new ArgumentException("(uint)tailIndex >= (uint)(items.Length)");

this.items = items;
this.headIndex = headIndex;
this.tailIndex = tailIndex;
}

And instruct Pex to use this constructor with an assembly attribute:

[assembly: PexExplorableFromConstructor(typeof(Queue<int>),
typeof(int[]), typeof(int), typeof(int))]

Now, Pex is able to explore our theory, and finds no failing cases.

NOTE: In my original version of the theory, the theory took an int, the number of values to add to the queue, instead of the array of values, and in that case Pex would pass in a number of items to add that would cause OutOfMemoryExceptions to occur. Passing in the array of values instead means that Pex passes in more meaningful data.

The implementation of the queue is progressing; we’re auto-growing the backing array, but we’re not copying the values across, and we’re not updating the Capacity. We also need to check that we can still enqueue and dequeue items after growing, and we’d obviously write more theories to do this.
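As a rough sketch of where that might head (my own code, not what’s in the post), the growth path would copy the existing items into the new array and keep the capacity in step, with the Capacity property returning the tracked value rather than the constant:

private int capacity = initialCapacity;

public int Capacity
{
    get { return capacity; }
}

public void Enqueue(T item)
{
    if (tailIndex == items.Length)
    {
        // grow by doubling, copying the existing items into the new array
        capacity = items.Length * 2;
        T[] grown = new T[capacity];
        Array.Copy(items, grown, items.Length);
        items = grown;
    }
    items[tailIndex++] = item;
}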

This does show that the thought process behind writing theories is similar to TDD – the tests/theories are only as good as you write them, and it’s still easy to miss obvious implementation details (you still need to think about what you’re writing), but I feel that writing theories, with an explorer, allows many more bugs to be caught during the process. Hopefully this has given a good taste of theory-driven development.

Faster boids

I used the built in performance profiler in VSTS to look at speeding the boids up (and there seems to be a bug in the profiler – when memory profiling was turned on, it seemed unable to resolve symbols, so I switched to the CLRProfiler to look at the memory being used).

Most of the time in the application was spent in calculating the boids’ steering behaviours.
The optimisations here were to try to reduce the time spent calculating the forces – one optimisation was that instead of each behaviour (cohesion, separation, alignment) scanning through the entire set of boids, the results of one behaviour’s scan could be reused by the others.

Secondly, the cohesion force is not updated on each frame – the value is cached for 10 iterations. This shouldn’t be a fixed value – there should be some heuristic to determine how it varies, once work is in place to measure the “quality” of the flock.
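The caching itself is nothing clever – conceptually it’s something like this (the names here are my own, not the actual Jabuka code):

// recalculate the cohesion force only every tenth frame, otherwise reuse the cached value
if (frameCount % 10 == 0)
{
    cachedCohesionForce = CalculateCohesionForce(boid, flockmates);
}
return cachedCohesionForce;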

This boosted the frame rate from approx 3fps to 10fps (dropping to 5 fps after some time).

Next came some micro-optimisations.

Changing the Vector to be a struct instead of a class (to avoid over-stressing the garbage collector) resulted in 14fps, dropping to 8fps after running for a while.

Other micro-optimisations that I tried, but didn’t result in any real performance difference included:

Making the trail generator to use a fixed size array (instead of adding/removing from a list).

Removing IEulerRigidBody, calling onto objects directly instead of through an interface.

Changing all doubles to be floats (didn’t actually keep this change in the code).

For Scene8, the drawing code was the real bottleneck; changing the gluSphere to be drawn using vertex buffer objects resulted in double the frame rate (and it isn’t particularly optimised – it could use glDrawElements).

For Scene7, it’s limited by the O(n²) collision detection (this is the next thing to tackle).

Boids

An improvement of the flocking behaviour in Jabuka is available.

There are now very simplistic birds drawn, to give an idea of the boid’s orientation.

The separation behaviour has been fixed. It was garbage before: the force away from another boid was proportional to the distance away from it. Now the force increases the nearer another boid is.
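Conceptually the fixed behaviour looks something like this (my own illustration, not the actual Jabuka code – the real Vector type’s API may differ):

// repulsion away from the other boid, weighted by the inverse square of the distance,
// so nearer boids push harder
Vector offset = boid.Position - other.Position;
double distance = offset.Length;
Vector separationForce = offset * (1.0 / (distance * distance));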

The boid’s orientation is taken as the smallest angle between the World’s x axis and the boid’s velocity. The roll doesn’t behave realistically (see scene 13), but then the boid’s behaviour isn’t realistic anyway – a real bird would change its velocity by adjusting its orientation (changing its wing shape to apply a torque to allow it to rotate…)

This paper describes how the steering behaviours should change the intention of the boid, instead of providing actual forces onto the boid.

A simple particle system has been introduced to allow the path of a few individual boids to be more easily seen.

Performance.
Running under the Visual Studio profiler, most of the time was seen to be spent finding the flockmates in proximity of the boid.

The same calculation was being done twice, for both the cohesion and alignment behaviours. Adding a ProximityFinderCache increased the frame rate by 50%, with no change in the calculations. Adjusting the code so that the boids in proximity are only calculated on every third iteration added another 20% to the framerate (giving 15fps with only two TrailGenerators attached).

Seized Quill Stem




I thought I’d blog about my experiences freeing a seized quill stem, as while researching on the internet on how to free it there were a few potential solutions listed, but no-one seemed to say definitively which of the kookier methods worked for them.

I bought a second hand bike off a work colleague a few years ago, and found out later that the stem was frozen (an aluminium stem in a steel steerer tube). I’ve put up with it for a while, even though the bike position didn’t suit me, but I finally got around to sorting it as I want to take more advantage of the good weather – it was 13 miles each way to commute to my last job and I was managing to do it roughly once a week last year, but my new job’s looking further away, and I want to try to do it more often so I want to get my position sorted. More importantly, I was prompted by the headset needing servicing.

I’ve tried off and on for the last couple of years to free the stem, without much success, saturating the stem in WD40 etc. A couple of weeks ago I managed to free the bottom quill wedge by backing out the stem bolt and hammering it down via an allen key (using a big wrench as the hammer – if you’ve got a hammer then all you see is nails, and if you don’t have a hammer, anything that comes to hand will do). This freed the wedge, but not the stem. I didn’t feel confident cycling without the wedge being fastened, even though it was jammed, as I didn’t trust Sod’s or Murphy’s law not to free it at a dangerous moment, so I had to spend an hour with an old spoke, trying to hold the wedge in position while fiddling to get the bolt to attach again.

There were a number of solutions listed to free a quill stem, including soaking it in penetrating oil, freezing it, using household ammonia, or coke to free the stem. I didn’t really try twisting the stem out via grabbing the front wheel whilst securing the handlebars, as I read that it was more likely to twist the forks out of alignment than actually free the stem.

I got some “Shock and Unlock” from Halfords, which doubled as trying both the penetrating fluid solution and the freezing solution – I applied it to either side of the stem (i.e. from below and above) for a week, then filled the stem with a large quantity of the stuff to ensure that it would have had the cooling effect, and then leathered the living hell out of the stem with a hammer, from above, below, and every other angle, to no effect.

I didn’t fancy using household ammonia, as messing with chemicals doesn’t really appeal to me, and I haven’t got anywhere safe I could do it without poisoning the neighbour’s cats. The coke method seemed not worth bothering with – apparently the aluminium oxide is dissolved by alkalis (hence the ammonia), and as coke is acidic this wouldn’t apply. I’m not a chemist though (as you can tell ;-)) but I really didn’t see it working, and I didn’t fancy paying the Coca Cola corporation just to turn my bike into a sugary sticky mess (if I was interested in this, apparently lime juice is more acidic than coca cola, but the claims are that it’s the electrolytic effect of coke that works – it’d be great if anyone lets me know if that actually worked for them).

I followed the advice in this thread and decided to cut it out. I hacksawed the stem off about 5mm above the steerer, which was the easy job. Then I had to cut a slot down the length of the stem. This was initially tricky, as the hacksaw blade only just fit into the inner diameter of the stem, and I had to take it easy from the top to get a good angle to get the slot cut. Some threads say how delicately they took it, but I just wrapped the top of a hacksaw blade in duct tape, and went at it.

The aluminium is a lot softer than steel, and it’s obvious when you’re through, so I didn’t pansy around. It took around 4 hours to cut through though, solidly going at it (spread over two evenings). I did badly blister and cut up my nancy-boy office worker hands, so if you’re concerned about that kind of thing it might be worth getting some gloves or taking it easier…

I had to cut a slot totally through one side of the stem, and almost entirely through the other side, to allow it to flex enough. I gradually bent the stem in on itself by pinching the top with a pair of mole grips, spraying the “Shock and Unlock” down as I went. Finally, I grabbed the top of the stem in the jaws of the mole grips, put the front wheel back on to give myself something to turn against, and bugger me, was I surprised when it turned slightly! I gritted my teeth and managed to twist the old stem out 🙂

I got myself a quill stem to ahead adapter but it refused to go down the steerer tube. Before I could fit it I needed to polish the remaining crap out of the steerer tube using some wet and dry.

Maslina and Rakija updates for NUnit 2.4.7

The above NUnit extensions have been updated, Rakija (the data-driven tests extension) has simply been rebuilt against NUnit 2.4.7.

Maslina (the theory extension) has had some changes.
Firstly, inline assumptions have been added (i.e. you can call Assume.IsTrue() in the code).

The most important change, however, is that a failure of the pre-called assumptions now results in the body of the theory method not being called, i.e. theories are no longer validated (and similarly, Assume.IsTrue() exits the theory method immediately).

There were some very interesting discussions on the NUnit developer list recently which persuaded me of the error of my ways, most specifically, it can be too ‘dangerous’ to continue execution of a theory if the user has signalled that the data is invalid.

You should validate that your assumptions are valid when writing theories, but this should be done in either another theory or a plain vanilla unit test.
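For reference, the inline style looks something like this (my own sketch rather than code shipped with the extension):

[Theory]
[InlineData("Typical rate", new object[] { 1.3 })]
[InlineData("Zero rate - assumption fails, body skipped", new object[] { 0.0 })]
public void RateCanBeInverted(double exchangeRate)
{
    Assume.IsTrue(exchangeRate != 0.0); // a failure here exits the theory method immediately
    Assert.AreEqual(1.0, exchangeRate * (1.0 / exchangeRate), 0.0001);
}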

Comments welcome as always 🙂

Comparing Theories to more traditional testing

My old work colleague Tim has recently blogged about using NSpec to specify a stack.

NSpec has the same sort of functionality as a unit testing framework such as NUnit. The terminology has been changed to get over the roadblock that some people have in adopting tests.

Theories actually give something over and above normal unit testing, and that’s what I’m going to look at in this blog post. I’m going to take Tim’s example and show how using theories differs from his more traditional approach.

The stack interface for which the implementation was arrived at via speccing is as follows:


public class Stack<T>
{
    public Stack();
    public void Clear();
    public bool Contains(T item);
    public T Peek();
    public T Pop();
    public void Push(T item);

    // Properties
    public int Count { get; }
}

The following tests were arrived at:


namespace Stack.Specs
{
[Context]
public class WhenTheStackIsEmpty
{
Stack<int> _stack = new Stack<int>();

[Specification]
public void CountShouldBeZero()
{
Specify.That(_stack.Count).ShouldEqual(0);
}

[Specification]
public void PeekShouldThrowException()
{
MethodThatThrows mtt = delegate()
{
_stack.Peek();
};

Specify.ThrownBy(mtt).ShouldBeOfType(typeof(InvalidOperationException));
}
}
}

That’s ample for us to discuss the difference between theories and more normal testing.

For the PeekShouldThrowException test/specification, we can see from the naming of the context that the developer intends to show that for an empty stack, the Peek operation throws an exception. However, what the developer has actually shown is that calling Peek on a newly-created stack throws an exception.

Developers tend to think in fairly general terms, and express this generality by using more specific cases. However, some of this generality can get lost. Theories aim to keep more of that generality.

We can demonstrate this in a theory (don’t take much note of the syntax, just the concepts)


[Theory]
public void PeekOnEmptyStackShouldThrow(Stack<int> stack)
{
try
{
stack.Peek();
Assert.Fail(ExpectedExceptionNotThrown);
}
catch (InvalidOperationException) { }
}

This states that calling Peek() on ANY stack should fail; we need to show that this is only true for an empty stack. We could do this by simply checking for it:


[Theory]
public void PeekOnEmptyStackShouldThrow(Stack<int> stack)
{
try
{
if (stack.Count == 0)
{
stack.Peek();
Assert.Fail(ExpectedExceptionNotThrown);
}
}
catch (InvalidOperationException) { }
}

But as we’ll see in a bit, using assumptions gives us some extra feedback (again, don’t focus on the syntax).


[Theory]
[Assumption("AssumeStackIsEmpty")]
public void PeekOnEmptyStackShouldThrow(Stack<int> stack)
{
try
{

stack.Peek();
Assert.Fail(ExpectedExceptionNotThrown);
}
catch (InvalidOperationException) { }
}

public bool AssumeStackIsEmpty(Stack<int> stack)
{
return stack.Count == 0;
}

This is a much more general statement than the original specification/test, we’re saying that the stack should fail if we try to Peek on it for ANY empty stack.

We don’t care whether this is a newly-created stack, or a stack which has been manipulated via its public interface. Also, the Liskov Substitution Principle states that we should be able to use any class derived from Stack, and the theories should still hold true.

We validate this theory with example data, in much the same way as when we’re doing test-driven development. The extra power comes from the generality in the way that the theory is written – we can imagine a tool that performs static code analysis on the Stack class to confirm that it obeys this.

However, the literature mentions that the most likely way to validate a theory is via an exploration phase, via a plug-in tool that will try various combinations of input data to look for anything that fails the theory.

It is prohibitively expensive to explore every possible combination of inputs – imagine all the possible values of a double, or, in our example, the infinite number of sequences of operations that could have happened to the stack that gets passed in.

This fits in nicely with the name theory with parallels with science – it’s not feasible to prove it, but we look for data to disprove it.

The example data is important for the red-green-refactor cycle. The exploration phase sits outside that – it finds which input data doesn’t fit the theory, allowing the theory to be modified. There are exploration tools in Java, and I haven’t looked too much into it, but it may be possible to use Microsoft’s Pex as an exploration tool?

Before I forget, this is a possible way to specify the example data for our stack:


[Theory]
[Assumption("AssumeStackIsEmpty")]
[InlineData("EmptyStack", new Stack())]
[PropertyData("EmptiedStack")]
public void PeekOnEmptyStackShouldThrow(Stack<int> stack)
{
try
{
stack.Peek();
Assert.Fail(ExpectedExceptionNotThrown);
}
catch (InvalidOperationException) { }
}

public List<ExampleData> EmptiedStack
{
    get
    {
        List<ExampleData> data = new List<ExampleData>();
        Stack<int> stack = new Stack<int>();
        stack.Push(2);
        stack.Push(3);
        stack.Pop();
        stack.Pop();
        data.Add(stack);
        return data;
    }
}

In my prototype extension, the assumptions are important and are validated, as they tell us something vital about the code. I think that all the information about the behaviour of the system is vital, and should be documented and validated, but there are varied opinions on the list. That’s why I’m blogging – give me your feedback 🙂

If the user changed the behaviour of Peek() such that it was valid on an empty stack (it might return a Null Object for certain generic types), then our assumption would not detect this if it was simply filtering the data – the assumption would still say “Peek() fails, but only on empty stacks”, whereas Peek() would no longer fail on empty stacks. See my previous post for the behaviours I have implemented.

Notice in Tim’s implementation how his stack is hardcoded to have at most 10 items. When TDDing we may make slightly less obviously limited implementations to get our tests to pass, but forget to add the extra test cases to show this limitation (the process of progressively adding more and more general test cases is called triangulation).

When writing theories, the same process happens, but writing the theories as more general statements means that a code reviewer or automated tool can see that the developer intended that we can push a new item onto ANY stack, not just a stack that contained 9 or fewer items.

Any thoughts? Have I got the wrong end of the stick? If anyone found this post useful, I might fully flesh out the equivalent of Tim’s example.

Sample Theory Implementation as NUnit Extension.

There have been lots of comments bouncing around on the NUnit mailing list about what exactly constitutes a Theory, and what the desired features are, so I’ve created an NUnit extension with a sample Theory implementation – you can get it, Maslina version 1.0.0.0, from www.taumuon.co.uk/rakija

xUnit.Net implements theories but does not have any in-built Assumption mechanism (you can effectively filter out bad data, which is the same as a filtering assumption). JUnit 4.4, I think, only filters out data – it doesn’t tell us anything about the state of an assumption.

Anyway, from reading the literature on theories (see my previous blog posting), I quite like the idea of having assumptions tell us something about the code, that those assumptions are validated.

The syntax of my addin is quite poor, and there’s not really enough validation of user input, but I’m aiming to try to do some theory-driven development (theorizing?) using it, to see what feels good and what grates.

Any feedback gratefully received (especially – is it valid to say that this is an implementation of a Theory, are validation of assumptions useful or unnecessary fluff?)

Here is the syntax of my extension.


[TestFixture]
public class TheorySampleFixture
{
[Theory]
[PropertyData("MyTestMethodData")]
[InlineData("Parity", new object[] { 1.0, 1.0, 1.0 })]
[InlineData("Parity 2", new object[] { 2.0, 2.0, 1.0 })]
[InlineData("Double Euros", new object[] { 2.0, 1.0, 2.0 })]
// This does not match the assumption, and will cause this
//specific theory Assert to fail, in which case we will get a pass overall.
// If the unit under test were changed to somehow handle zero exchange rate,
// the body of the theory method would pass, but the
// assumption would still not be met and overall we will register a failure.
[InlineData("ExchangeRate Assumption Check", new object[] { 2.0, 1.0, 0.0 })]
// This case will fail, there is an assumption that the dollar value is not three,
// but passing in a value of 3 doesn't cause a failure in the code, demonstrating
// that the assumption serves no purpose
[InlineData("This should fail, assumption met but no failure in method", new object[] { 3.0, 1.0, 3.0 })]
[Assumption("ConvertToEurosAndBackExchangeRateIsNotZero")]
[Assumption("DollarsNotThree")]
public void MyTheoryCanConvertToFromEuros(double amountDollars, double amountEuros, double exchangeRateDollarsPerEuro)
{
// Should check are equivalent within a tolerance
// Calls static method on Convert method
Assert.AreEqual(amountDollars, Converter.ConvertEurosToDollars(Converter.ConvertDollarsToEuros(amountDollars,
exchangeRateDollarsPerEuro), exchangeRateDollarsPerEuro));
}

// Assumption is that the exchange rate is not zero
public bool ConvertToEurosAndBackExchangeRateIsNotZero(double amountDollars, double amountEuros, double exchangeRateDollarsPerEuro)
{
// Should have a tolerance on this
return exchangeRateDollarsPerEuro != 0.0;
}

// Assume that dollar value not equal to three
// This is just to demonstrate that an invalid assumption results in a failure.
public bool DollarsNotThree(double amountDollars, double amountEuros, double exchangeRateDollarsPerEuro)
{
return amountDollars != 3.0;
}

/// <summary>
/// Returns the data for MyTestMethod
/// </summary>
public IList MyTestMethodData
{
get
{
List<TheoryExampleDataDetail> details = new List<TheoryExampleDataDetail>();
details.Add(new TheoryExampleDataDetail("Some other case should pass", new object[] { 2.0, 20.0, 5.0}));
return details;
}
}
}

public static class Converter
{
public static double ConvertEurosToDollars(double amountEuros,
double dollarsPerEuro)
{
return amountEuros * dollarsPerEuro;
}

public static double ConvertDollarsToEuros(double amountDollars,
double dollarsPerEuro)
{
return amountDollars / dollarsPerEuro;
}
}

A nicer syntax/API would be to have the assumptions inline:


public void CanConvertToEurosAndBack(double amountDollars, double amountEuros, double exchangeRateDollarsPerEuro)
{
Assume.That(exchangeRateDollarsPerEuro != 0.0);
Assume.That(amountDollars != 0.0);

// Checks are equivalent within a tolerance
// Calls static method on Convert method
Assert.AreEqual(amountDollars, Converter.ConvertEurosToDollars(Converter.ConvertDollarsToEuros(amountDollars,
exchangeRateDollarsPerEuro),exchangeRateDollarsPerEuro));
}

Here are the rules of my Theory implementation:

If there is no example data, the theory passes (we may want to change this in the future).
If there are no assumptions for a theory, then each set of example data is executed against the theory each producing its own pass or fail.

If assumptions exist, then each set of data is first validated against the assumptions – if it meets them, then the test proceeds and any test failure is flagged as an error.
If the example data does not meet the assumptions, then if the test passes it indicates that the assumption is invalid, and that case is marked as a failure, with a specific message “AssumptionFailed”. Any assertion failures or exceptions in the actual theory code are treated as passes (in the future, would we want to mark the specific exception expected in the test method if an assumption is not met?).

NOTE: we may want to mark as a failure any theory for which ALL example data fails the assumptions, as a check that the actual body of the theory is actually being executed. I’ve not done this for now as it would be trickier with the current NUnit implementation.

Similarly, I was thinking of failing if any of the assumptions weren’t actually executed, but again, this is tricky in the current NUnit implementation (and may not give us much).

Automated exploration would not follow the last two suggested rules. The automation API would need to generate its data and execute it as if it were inline data. It may be helpful for the automated tool to be able to retrieve the user-supplied example data, so it doesn’t report a failure for any known case, but this is probably not necessary.

Feedback on these rules would be most welcome. If you want to change the behaviour of the assumptions (i.e. have assumptions only filter and nothing more), then the behaviour can be changed in TheoryMethod.RunTestMethod()

Here’s the output of the above theory: