The Sequencer (part 2.2)

In my previous post, I mentioned that I had received four implementation proposals, and I commented on two of them. Both were within the drafted requirements but had a fairness issue: in both cases, one could bypass the thread pool queue altogether.

Let’s turn our attention to the other two, which are interesting because they share a common issue while having radically different approaches.

  • C#: a very short implementation
  • Java: a longer implementation

Clearly, the .NET version benefits from the TPL features. That being said, the two are actually very close, as they share the same algorithm: ordering is secured by creating a private queue, and non-concurrency is ensured by making sure at most one task is enqueued or executing in the .NET thread pool at any given time.
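
To make the shared algorithm concrete, here is a minimal sketch of that approach, assuming .NET’s TPL. It is my own illustration of the algorithm, not either of the actual proposals, and all names are mine.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public class QueuedSequencer
{
    // The private queue that secures ordering.
    private readonly ConcurrentQueue<Action> _queue = new ConcurrentQueue<Action>();
    private readonly object _lock = new object();
    // True when a task is enqueued or executing in the thread pool.
    private bool _scheduled;

    public void Dispatch(Action action)
    {
        // Submission order is captured here, before any thread pool hop.
        _queue.Enqueue(action);
        lock (_lock)
        {
            if (_scheduled) return; // the running pump will pick it up
            _scheduled = true;
        }
        Task.Run(() => ProcessNext());
    }

    private void ProcessNext()
    {
        Action next;
        if (_queue.TryDequeue(out next))
        {
            next();
        }
        lock (_lock)
        {
            if (_queue.IsEmpty)
            {
                _scheduled = false; // nothing pending: 'free' the sequencer
                return;
            }
        }
        // The next task is only posted to the pool after the previous one
        // has executed, hence the extra latency discussed below.
        Task.Run(() => ProcessNext());
    }
}
```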

The contract is fulfilled, albeit at a steep price in terms of performance, or more precisely latency. As a task is actually dispatched only after the previous one has been executed, there is a minimum delay between two tasks. So, once again, there is a fairness issue: non-sequenced tasks would obviously be executed faster than sequenced ones.

The Sequencer (part 2.1)

Update

I received two more proposals, which I will comment on soon.

Original

I closed the last episode with a little exercise for my readers: suggest how to complete my requirements, namely by ensuring guaranteed ordering for the sequencer.

Alas, only two readers were skilled or brave enough to face the challenge and attempt an answer, and I thank them for that.

The two proposals were similar, but freeman provided a gist, so let’s discuss it.

This will definitely capture and secure the order of execution. And I like the smart use of TryEnter, which allows us to get rid of the boolean used to store the state.
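
For reference, here is a minimal sketch of a TryEnter-based sequencer along those lines; this is my own reconstruction, not freeman’s actual code, and all names are illustrative.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public class TryEnterSequencer
{
    private readonly ConcurrentQueue<Action> _tasks = new ConcurrentQueue<Action>();
    private readonly object _lock = new object();

    public void Dispatch(Action action)
    {
        _tasks.Enqueue(action);
        // TryEnter plays the role of the 'running' boolean: if the lock is
        // already held, another thread is pumping and will run our task.
        while (Monitor.TryEnter(_lock))
        {
            try
            {
                Action next;
                while (_tasks.TryDequeue(out next))
                {
                    next(); // note: executed on the dispatching thread
                }
            }
            finally
            {
                Monitor.Exit(_lock);
            }
            // Re-check: a task may have been enqueued after the last
            // TryDequeue but before the lock was released.
            if (_tasks.IsEmpty) return;
        }
    }
}
```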

But, and this is a big but, this solution violates another (implicit) requirement. I have to apologize for this one, as I failed to state it earlier.

But you know customers: they understand what they need only when the development is over :-).

That requirement is fairness: fairness between Sequencer instances, as well as fairness between Sequencers and other tasks. Fairness is to be understood as the guarantee that all submitted tasks will eventually be executed, that they have equivalent access to execution units (i.e. cores), and that they are executed with similar delays.
It means that no component can gain exclusive access to execution unit(s) and that tasks are executed roughly in the order they are submitted.

This being defined, this solution is not fair. Can you tell me why?

Note: here is a gist for the first proposal.

The Sequencer (part 2)

In my last post, I made a promise: to show the implementation of the Sequencer.

So please, bear with me while I review the requirements. The sequencer:

  1. Is a facade: wraps an execution context, such as a thread or a pool of threads
  2. Prevents race conditions: enforces sequential execution of submitted tasks
  3. Guarantees ordering: submitted tasks are processed in the order they are submitted
  4. Avoids useless consumption of resources

The first one is simply a design choice, no need to elaborate. Next is how to prevent race conditions: as the execution basis is a lambda/delegate/runnable of some sort, we need to prevent concurrent execution of those.

A simplistic state machine can help us. The possible states are running and waiting; transitions are triggered by incoming tasks.
If the machine is in the ‘waiting‘ state, it transitions to ‘running‘ and executes the task; when the task is done, it transitions back to ‘waiting‘.
But if the machine is in the ‘running‘ state, execution cannot start: it needs to wait for the appropriate state.
It can be implemented with a lock, the ‘running‘ state being embodied by the lock being held. What interests me is how we deal with conflicts, i.e. when a task comes in while the state is still ‘running‘.
If we respect the lock semantics, the incoming request is blocked until we reach the desired state. Blocking implies waste and potential deadlock. So, no go there.
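
For illustration, here is a sketch of this rejected, blocking approach (the class and names are mine):

```csharp
using System;

public class BlockingSequencer
{
    private readonly object _lock = new object();

    public void Dispatch(Action action)
    {
        // The 'running' state is embodied by the lock being held: any thread
        // submitting a task while another one runs is blocked right here.
        lock (_lock)
        {
            action();
        }
    }
}
```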

We need to make our logic a bit more complex. When a task is submitted for execution and the state is ‘running‘, we can store it for deferred execution. There is no need to keep the current thread blocked.
As a consequence, the transition out of ‘running‘ is now: when the task is done, if there is at least one other task pending execution, we execute it (the state remains ‘running‘). When there is no pending task, we can safely transition to ‘waiting‘.

Ok, time to look at some code. Disclaimer: this implementation is for educational purposes; I know it will look ugly to some, slow to others… you name it. But I do not care, so bear with me.
For non-C# readers: an Action is a parameter-less procedure (i.e. no return value).

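Here is a minimal sketch of this implementation, assuming the .NET thread pool as the execution context (a reconstruction for illustration; the original code may differ):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class Sequencer
{
    private readonly Queue<Action> _pending = new Queue<Action>();
    private readonly object _lock = new object();
    private bool _running; // embodies the 'running'/'waiting' state

    public void Dispatch(Action action)
    {
        // Hand the task over to the wrapped execution context (.NET thread pool).
        ThreadPool.QueueUserWorkItem(_ => Sequence(action));
    }

    private void Sequence(Action action)
    {
        // Atomic transition #1: start right away or store for deferred execution.
        lock (_lock)
        {
            if (_running)
            {
                _pending.Enqueue(action);
                return;
            }
            _running = true;
        }
        Run(action);
    }

    private void Run(Action action)
    {
        while (true)
        {
            action();
            // Atomic transition #2: fetch a pending task or 'free' the sequencer.
            lock (_lock)
            {
                if (_pending.Count == 0)
                {
                    _running = false;
                    return;
                }
                action = _pending.Dequeue();
            }
        }
    }
}
```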

As you can see, the actual implementation is slightly more complex than described earlier. Indeed, we need to cater for two atomic transitions:

  1. On a new task: starting it right away or queuing it
  2. When the task is done: fetching a new one or ‘freeing’ the sequencer

Let’s look at the requirements:

  • Requirement #1: we wrap an existing execution mechanism (the .NET thread pool). Nothing fancy here; it will be improved in a future revision
  • Requirement #2: done as well, as the implementation has been designed specifically for that purpose
  • Requirement #3: we did nothing specific here, but on the other hand it looks ok
  • Requirement #4: by using a deferred execution mechanism, we avoided the need to block threads, something we know is expensive

So, are we done now?

Nope, requirement #3 is definitely not covered. On a multicore machine (basically any machine nowadays), multiple threads (from the thread pool) can start to work on tasks at the same time, and there will be a race to queue them in the sequencer.
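
As a concrete illustration, using the sketch above (names are still hypothetical):

```csharp
var sequencer = new Sequencer();
sequencer.Dispatch(() => Console.WriteLine("first"));
sequencer.Dispatch(() => Console.WriteLine("second"));
// Both actions are posted to the thread pool before being sequenced:
// two pool threads can race into Sequence(), so "second" may run first.
```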

We need to refine this implementation.
What would be your proposal to guarantee ordering?

The Sequencer (part 1)

Several years ago, I was the lead on a new mass market making system. One of the objectives of the project was to achieve low latency and controlled throughput: the target was < 1 ms isolated-event latency, with a 99th percentile below 10 ms.

Throughput had to be controlled because the downstream bandwidth was reduced to a trickle, a mere 50 updates/sec, while the inputs were in the order of 2,000 updates/sec. So the system required an arbitration algorithm for its output decisions.

At that time, I still had a burning memory of my time as a lead C++ dev on an in-house multithreaded fat-client app: I basically spent the best part of two years rebuilding the threading model from scratch, a long and painful process. It definitely was an apprentice’s journey towards multithreading craftsmanship.

Sadly, I ultimately came to the conclusion that, at the time, building and maintaining a multithreaded app was a daunting task.

Therefore I decided to push for a radically different approach: managing concurrency without any locks.

When you think about it, what is the best way to manage concurrency?

First of all, by realizing there should be no concurrency per se: 99.99% of the time, you’ll have locks that enforce mutual exclusion anyway. Managing concurrency means preventing actual concurrency from happening in the first place.

And then was born the ‘Sequencer’.

The ‘Sequencer’ is a design pattern that has been created in C# and replicated in Java and C++. The contract is simple:

The Sequencer:

  • Is a facade: wraps an execution context, such as a thread or a pool of threads
  • Prevents race conditions: enforces sequential execution of submitted tasks
  • Guarantees sequentiality: submitted tasks are processed in the order they are submitted

And that’s it. The Sequencer was coupled with another, simpler pattern: the ‘Dispatcher’, in charge of providing actual execution capacity. There are several variant implementations of IDispatcher, but I will present them in a future post.

Actually, the Sequencer is a specific implementation of an IDispatcher.
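
To fix ideas, the contract could look like this minimal sketch (the actual interface will be presented later; this shape is an assumption of mine):

```csharp
using System;

// Hypothetical shape of the dispatching contract.
public interface IDispatcher
{
    // Submits a task for execution on the wrapped execution context.
    void Dispatch(Action action);
}
```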

It looks a lot like an actor without the messaging infrastructure.

Let’s talk implementation.

Behind its apparent simplicity, the Sequencer hides moderate complexity. But I’ll talk further about it in an upcoming post.

The secret for 100% test coverage: remove code

Update note

Based on the interesting feedback I got (which can be seen on Tom’s ramblings), I realized this post probably needed some tweaking and scope precision. I put those at the end.

What is the adequate objective for test coverage?

60%?

80%?

99%?

100%?

I have often pondered this question, like many before me and many after me, I suppose. Why aim for 100%? The 80/20 rule clearly applies to test coverage: trying to cover every corner case that lies in the code requires a significant investment in time and brain cells. Plus, integration points can never really be properly covered.

On the other hand, having 100% coverage provides huge benefits:

  1. Every single line of code is constantly tested
  2. Trust in code is high
  3. Any uncovered line is a regression

What happens if the target is, say, 80%?

  1. A significant part of the code is never tested
  2. Trust in code is moderate and can degrade
  3. 20% of uncovered lines is significant: 2,000 lines for a 10K-line code base. That means full namespaces can hide in there.

For me, there is no question: 100% coverage is the only worthy objective. Do not settle for less.

Yes, there are exceptions, usually at integration points. Mocks are not a real solution either: they can help you increase your coverage, but not by that much. The pragmatic solution is to wrap those integration points into isolated modules (jars/assemblies); think hexagonal architecture here. You will have specific coverage targets for those modules, you need to make sure that no other code creeps in, and finally, you must understand that those are weak points in your design.

While working on nFluent, I constantly make sure unit tests exercise every single line of code of the library. It also means that I help contributors reach that target. It is not that difficult, especially if you do TDD!

There is one simple golden rule: to reach and maintain 100% coverage, you do not need to add tests; you have to remove the lines that are not covered!

This is important, so let me restate it: a line of code that is not covered is not maintainable, must be seen as not working, and must be removed!

Think about it:

  1. The fact that no automated test exists means that the behavior can be silently changed, not just the implementation!
  2. Any newcomer, including your proverbial future self, will have to guess the expected behavior!
  3. What will happen if the code gets executed some day in production?
  4. If you are doing TDD, you can safely assume the code is useless!

So, when you discover uncovered lines, refactor to remove them or to make sure they are exercised. But do not add tests for the sake of coverage.

Having 100% coverage does not mean the code is bug free

Tom’s comments implied that I was somehow trying to promote this idea. 100% coverage is no proof of bug-free code at all, and I did not mean to imply it is. The quality and relevance of your tests are essential attributes; that is exactly why I promote removing untested lines. Any test specially crafted for coverage would not be driven by an actual need and would be artificial. The net result would be a less agile code base.

On the other hand, if you have 100% coverage and you discover some reproducible bug, either by manual testing or in production, you should be able to add an automated test to prevent any recurrence.
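
For example, here is a minimal sketch of such a non-regression test, with NUnit and NFluent; the bug, the code under test, and all names are made up for illustration.

```csharp
using NUnit.Framework;
using NFluent;

// Hypothetical code under test, for illustration only.
public static class PriceFormatter
{
    public static string Format(decimal price)
    {
        return price.ToString("0.00", System.Globalization.CultureInfo.InvariantCulture);
    }
}

[TestFixture]
public class PriceFormatterTests
{
    // Non-regression test: captures a (made-up) bug where negative prices
    // lost their sign, so it can never silently come back.
    [Test]
    public void Format_should_keep_the_sign_of_negative_prices()
    {
        Check.That(PriceFormatter.Format(-42.5m)).IsEqualTo("-42.50");
    }
}
```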

When coverage is insufficient, there is a high probability that you will not be able to add this test, leaving the door open to future regressions!

If you want to build trust based on coverage metrics, you need to look into branch coverage and property-based testing at the very least. But I do not think this is a smart objective.

Note

  • This post focuses on new code! For legacy code, the approach should be to add tests before anything else, and never remove working code 🙂