Faster boids

I used the built in performance profiler in VSTS to look at speeding the boids up (and there seems to be a bug in the profiler – when memory profiling was turned on, it seemed unable to resolve symbols, so I switched to the CLRProfiler to look at the memory being used).

Most of the time in the application was spent in calculating the boids’ steering behaviours.
The optimisations here were to try to reduce the time spent in calculating the forces – one optimisation was that instead of each behaviour (cohesion, separation, alignment) each scanning through the entire set of boids, the results of one of the behaviours could be used by both others.

Secondly, the cohesion force was not updated on each frame – the value is cached for 10 iterations. This should not be a fixed value – there should be some heuristic to determine how it should be varied once work is in to measure the “quality” of the flock.

This boosted the frame rate from approx 3fps to 10fps (dropping to 5 fps after some time).

Next came some micro-optimisations.

Changing the Vector to be a struct instead of a class (to avoid over-stressing the garbage collector) resulted in 14fps dropping to 8 fps after running.

Other micro-optimisations that I tried, but didn’t result in any real performance difference included:

Making the trail generator to use a fixed size array (instead of adding/removing from a list).

Removing IEulerRigidBody, calling onto objects directly instead of through an interface.

Changing all doubles to be floats (didn’t actually keep this change in the code).

For Scene8, the drawing code was the real bottleneck, changing the gluSphere to be drawn using VertexBufferObjects resulted in double the frame rate (and it isn’t particularly optimised – could use glDrawElements).

For Scene7, it’s limited by the O(n2) collision detection (this is the next thing to tackle).

Boids

An improvement of the flocking behaviour in Jabuka is available.

There are now very simplistic birds drawn, to give an idea of the boid’s orientation.

The separation behaviour has been fixed. The behaviour was garbage before, the force away from another boid was proportional to the distance away from it. Now the force increases the nearer another boid is.

The boid’s orientation is taken as the smallest angle between being aligned with the World’s x axis, and the boid’s velocity. The roll doesn’t behave realistically (see scene 13), but the boid’s behaviour isn’t realistic. A boid would really change its velocity by adjusting its orientation (change the wing shape to apply a torque to allow it to rotate…)

This paper describes how the steering behaviours should change the intention of the boid, instead of providing actual forces onto the boid.

A simple particle system has been introduced to allow the path of a few individual boids to be more easily seen.

Performance.
Running under the Visual Studio profiler, most of the time was seen to be spent finding the flockmates in proximity of the boid.

The same calculation was being done twice, for both the cohesion and alignment behaviours. Adding a ProximityFinderCache increased the frame rate by 50%, with no change in the calculations. Adjusting the code so that the boids in proximity are only calculated on every third iteration added another 20% to the framerate (giving 15fps with only two TrailGenerators attached).

A change of approach/jabuka 0.6

As mentioned on the Jabuka website, I’m moving away from a tutorial-based approach, mainly because of lack of time. It’s much quicker to implement new code and then blog retrospectively, highlighting new features and interesting code/design decisions. In the tutorial based approach, making a change/bug fix in an earlier tutorial forces all later tutorials to be updated.

That being said, it’s worth discussing changes are in the latest version, 0.6, of Jabuka.
This code introduces a couple of simple steering behaviours, and flocking boids. Given the roadmap, why did I suddenly go off on a total tangent and implement boids? Well, apart from flocking behaviour being especially cool (look on youtube for the Carling Belong commercial, or starling flocks), I wanted to stress the engine (I use that term very loosely given the current functionality of Jabuka) as I’m itching to demonstrate multithreading and performance enhancements. This version of Jabuka has shown that it is still too early to look at performance, however, as the application still has some fundamental flaws to be addressed.

The flocking application implements a flock (based on the algorithms on Chris Reynolds website, and roughly based on the algorithms in the Killer Game Programming in Java book). The changes to Jabuka were that the force for a rigid body is obtained in the Update() method from introducing an IForceProvider interface, implemented by various behaviours.
If you look at the implementation of the steering behaviours, you can see that there is a large number of classes implementing this interface, each performing a simple operation such as aggregation/limitation. This does lead to a proliferation of classes, but I’m happy to live with it for now as it makes unit testing easier, and may lead to more easily changing behaviour dynamically at runtime.

The Update() method calling on the IForceProvider is not perfect in this case – the objects are iterated over, and each object’s Update() method calls on its IForceProvider. For the steering behaviours, the behaviour compares the body’s location to its neighbours to calculate its force and update its position. The problem is that the resulting locations are dependent on the order in which the bodies are updated. This has problems in measuring the flocking quality (more on this later), and it may be desired to update the steering forces independent of Update(), so we don’t do this in every time step (as it’s an expensive operation – O(2)?). Also, having the state of an object dependent on every other object on every timestep will get really messy when we introduce multithreading. Having each body’s Update() method independent of any other means that it’s much easier to parallelise/vectorise/split into multiple threads – it’s likely we’d want to avoid making every IEulerRigidBody thread safe, and would want to implement the thread safety at a higher level (though I won’t jump the gun, we’ll get there soon enough).

The sort of changes above hint that it may be better to have a Physics Manager class, that owns the Collision object, the rigid bodies, and also has knowledge of the IForceProviders and manages the overall sequence of operation.

The larger problem with having the currently flocking behaviour, including collision detection, is that I haven’t yet implemented resting contacts (well, any contact forces) apart from collisions, and as objects are forced towards each other, the simulation can halt. I’ve temporarily fixed this by introducing a scene where the boids don’t have any collision detection. I’ve also introduced a scene which demonstrates the halting, with only two spheres.

As to steering behaviours and flocking itself… I’ve introduced a scene for the seek behaviour, demonstrating that having the steering force directed towards the target is insufficient, the velocity tangential to the targed needs to be taken into account. The idea is to find the optimal trajectory towards the target. Reynolds copes with this by providing a force such that the body’s velocity is adjusted to be the maximum velocity towards the target. This is better, but does mean that the body needs a concept of its maximum velocity (this works OK on Earth, as air friction gives a terminal velocity, but in space this is arbitrary). Also, the maximum velocity may vary – the maximum velocity into the wind is less than having a following wind, and may depend on a body’s shape (or its orientation with respect to its velocity).

Looking at the arrive behaviour, where the speed is arbitrarily slowed down, the first thought that comes to mind is that a PID control towards the target would be desired. However, the problem is that the velocity tangential/perpendicular and distance from the target all need to be taken into account – a more complicated control scheme would need to be used, and that may be too computationally expensive.

The current boid implementation currently uses the same proximity distance for all individual behaviours (separation, alignment, cohesion). It may be that for separation, a smaller proximity should be used. Also, currently, I’m using a simple sphere without taking into account the blind spot behind the boid.

There are lots of tweaks to be made (and a Genetic Algorithm could be used to change these dynamically), but many of them are pointless without a method of measuring the quality of the flock. There is an interesting paper on that here. Some background on the flock theory / the underlying physics of flocking, can be found here, and here.