# .NET SIMD to solve Burgers’ equation

I’m enrolled in the MOOC Practical Numerical Methods in Python, and module 2 has covered Burgers’ equation. This is a great course by the way, and it’s not too late to join up!

I am impressed with the productivity of Python. IPython notebooks, SymPy, plotting and so on are all great, and I feel that .NET could benefit from a similar ecosystem.

That said, I’ve been itching to play with the new .NET SIMD support and this seemed to be a suitable example.

The Python notebook demonstrated that solving Burgers’ equation by finite differences with Python’s array notation, rather than an explicit loop, gave a speed increase of around ninefold; on my machine the run took 25ms instead of 200ms.
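For reference, here is a sketch of the scheme being timed. It assumes the notebook uses the usual discretisation for this problem (forward difference in time, backward difference for the convection term, central difference for the diffusion term); that is an assumption, not something taken from the gist. Here $u_i^n$ is the solution at grid point $i$ and timestep $n$, $\nu$ is the viscosity, and $\Delta t$, $\Delta x$ are the time and space steps.

$$
\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = \nu \frac{\partial^2 u}{\partial x^2}
$$

$$
u_i^{n+1} = u_i^n - u_i^n \frac{\Delta t}{\Delta x}\left(u_i^n - u_{i-1}^n\right) + \nu \frac{\Delta t}{\Delta x^2}\left(u_{i+1}^n - 2u_i^n + u_{i-1}^n\right)
$$

Every grid point is updated from its own value and its two neighbours at the previous timestep, which is why the update can be written either as a loop over $i$ or as one whole-array expression.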

I translated the code into C#, and it completed in 2ms, so I was immediately impressed (though not surprised: an interpreted versus a (JIT) compiled language…). This wasn’t going to tell me much about SSE improvements though, so I increased the number of spatial samples a hundredfold and the number of timesteps tenfold.

The running time then increased to 2580ms.

Installing .NET 4.5.2 and using RyuJIT CTP4 to compile and run, the running time reduced to 1880ms. RyuJIT can’t arrive fast enough!

I then implemented array swapping, instead of continually creating and copying arrays, to see if it would reduce pressure on the GC (Calculate2 in the gist), but it made no significant difference.
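The swapping idea can be sketched in a few lines. This is a minimal Python illustration, not the C# from the gist: `step` and `solve_with_swap` are hypothetical names, and the update formula and the periodic boundary are assumptions about the scheme. Two buffers are allocated once and their roles are exchanged after every timestep, so no new array is created inside the time loop.

```python
def step(u, out, nu, dt, dx):
    """Write one explicit Burgers' timestep of u into out (periodic boundary assumed)."""
    n = len(u)
    for i in range(n):
        left = u[i - 1]              # index -1 wraps round to the last point
        right = u[(i + 1) % n]       # wrap round to the first point
        out[i] = (u[i]
                  - u[i] * dt / dx * (u[i] - left)
                  + nu * dt / dx**2 * (right - 2.0 * u[i] + left))


def solve_with_swap(u0, steps, nu, dt, dx):
    """Advance u0 by `steps` timesteps using two preallocated buffers."""
    current = list(u0)               # copy, so the caller's data is untouched
    scratch = [0.0] * len(u0)        # allocated once, reused every step
    for _ in range(steps):
        step(current, scratch, nu, dt, dx)
        current, scratch = scratch, current   # swap references, nothing is copied
    return current
```

Calling `solve_with_swap(u0, 100, 0.07, 0.001, 0.1)` would run 100 steps while only ever holding two arrays.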

I then pulled the calculation of (nu * dt / dx) into a variable (Calculation3()).

This resulted in a calculation time of 230ms.
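The change amounts to hoisting loop-invariant arithmetic out of the inner loop. Here is a minimal Python sketch of the pattern, again with a hypothetical function name and an assumed update formula (the exact factor hoisted in the gist may differ): the coefficients depend only on `nu`, `dt` and `dx`, so they are computed once per call instead of once per grid point.

```python
def step_hoisted(u, out, nu, dt, dx):
    """One explicit Burgers' timestep with the constant factors computed once."""
    conv = dt / dx              # convection coefficient, loop-invariant
    diff = nu * dt / dx**2      # diffusion coefficient, loop-invariant
    n = len(u)
    for i in range(n):
        left = u[i - 1]             # index -1 wraps round (periodic boundary assumed)
        right = u[(i + 1) % n]
        out[i] = (u[i]
                  - u[i] * conv * (u[i] - left)
                  + diff * (right - 2.0 * u[i] + left))
```

The results can differ from the unhoisted form in the last floating-point digit, because the order of the multiplications and divisions changes.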

Then, I felt it was time to implement SIMD. The documentation is a bit scarce, but it wasn’t too tricky to figure out. My machine has 128-bit SSE registers, so it can add two doubles (64 bits each) in a single operation.

Running this resulted in a calculation time of 90ms, a speedup of slightly more than 2x. I’m not sure whether the JIT was able to make further optimisations based on the changed structure of the code, but this is a great win. This is implemented in Calculation4() in the gist.
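The lane arithmetic (128 bits divided by 64 bits per double gives 2 values per operation) dictates the shape of a SIMD loop: a main pass that handles `lanes` elements at a time, followed by a scalar pass for any leftover tail. The sketch below only mimics that structure in plain Python; it is not the code from the gist, it gains no speed, and `add_in_lanes` is an illustrative name.

```python
REGISTER_BITS = 128                      # SSE register width mentioned above
DOUBLE_BITS = 64                         # size of one double
LANES = REGISTER_BITS // DOUBLE_BITS     # 2 doubles fit in one register


def add_in_lanes(a, b, lanes=LANES):
    """Element-wise a + b, processed in chunks of `lanes` with a scalar tail."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    n = len(a)
    out = [0.0] * n
    vector_end = n - n % lanes           # last index covered by whole chunks
    for i in range(0, vector_end, lanes):
        # Each chunk stands in for a single vector add over `lanes` doubles.
        out[i:i + lanes] = [x + y for x, y in zip(a[i:i + lanes], b[i:i + lanes])]
    for i in range(vector_end, n):
        out[i] = a[i] + b[i]             # leftover elements, one at a time
    return out
```

In .NET the lane count would presumably come from `Vector<double>.Count` rather than a hard-coded 2, so the same loop adapts to wider registers.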

The gist is available here: https://gist.github.com/taumuon/5d4db567c57f9846971d

## 3 thoughts on “.NET SIMD to solve Burgers’ equation”

1. Nice work. I've been thinking of dusting off my very old ray tracer and seeing what I can do with it using the new SSE capabilities. I had initially thought RyuJit was limited to Windows 8 but I can see CTP4 supports 7, so I have one less excuse not to do it…

2. Nice post, it’s cool to see the speed gains that RyuJIT provides

BTW if you want some help when benchmarking C# code, you should take a look at BenchmarkDotNet, http://benchmarkdotnet.org (disclaimer, I’m one of the authors)

1. gary_m_evans says:

Thanks Matt,

I’ve been aware of BenchmarkDotNet for a while, it looks great – I keep meaning to find chance to use it in anger! Hopefully one of the things I’m aiming to play with soon…