The Sequencer (part 2.2)


In my previous post, I mentioned that I actually had 4 implementation proposals, and I commented on two of them, which were within the drafted requirements but had an issue with fairness: in both cases, one could bypass the thread pool queue altogether.

Let’s turn our attention to the other two, which are interesting because they share a common issue while taking radically different approaches.

C#, a very short one

using System;
using System.Threading.Tasks;

namespace Seq
{
    public class Sequencer
    {
        private readonly object _lock = new object();

        // Tail of the continuation chain: the last dispatched task.
        private Task _task = Task.FromResult(0);

        // Chains the action after the previously dispatched one, so actions
        // run one at a time, in dispatch order.
        public void Dispatch(Action action)
        {
            lock (_lock)
            {
                _task = _task.ContinueWith(_ => action());
            }
        }
    }
}

Java, a long implementation

Clearly, the .NET version benefits from the TPL’s features. That being said, the two are actually very close, as they share the same algorithm: ordering is secured by creating a private queue, and non-concurrency is ensured by making sure at most one task is enqueued or executing in the .NET thread pool.
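The submitted Java gist is not reproduced here, but the shared chaining idea is easy to sketch. The following is a hypothetical Java equivalent that mirrors the C# code above rather than the actual submission (ChainedSequencer and its member names are illustrative):

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

// Hypothetical Java sketch of the chaining approach (not the submitted gist).
public class ChainedSequencer {
    private final Executor pool;
    private final Object lock = new Object();
    // Tail of the chain: the last dispatched task.
    private CompletableFuture<Void> tail = CompletableFuture.completedFuture(null);

    public ChainedSequencer(Executor pool) {
        this.pool = pool;
    }

    public void dispatch(Runnable action) {
        synchronized (lock) {
            // Each action is handed to the pool only once its predecessor has
            // completed, which guarantees order and non-concurrency but also
            // introduces the latency discussed below.
            // (Error handling omitted: a failing action would short-circuit the chain.)
            tail = tail.thenRunAsync(action, pool);
        }
    }
}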

The contract is fulfilled, albeit at a steep price in terms of performance, or more precisely latency. As a task is actually dispatched only after the previous one has been executed, there is a minimum delay between two tasks. So, once again, there is a fairness issue: it is obvious that non-sequenced tasks would be executed faster than sequenced ones.

What the proper algorithm should be

For the ordering guarantee: you MUST capture the order when the task is submitted. IMO the best approach is to queue the task into a queue owned by the sequencer (hence private). Then you submit to the thread pool a task that will ensure sequenced execution. That task, when executed, will need to (a code sketch follows the list):

  1. Pop the next task to be executed
  2. Push it in the execution queue
  3. If no execution is in progress start the execution, otherwise end here
  4. Execution: while there are tasks in the execution queue, pop and execute them
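Here is a minimal sketch of these four steps in Java. It is illustrative only (the names, including rootDispatcher, are mine and not taken from any submitted implementation), with no error handling or shutdown logic:

import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;

// Illustrative sketch of the steps above.
public final class Sequencer {
    private final Queue<Runnable> pending = new ConcurrentLinkedQueue<>(); // the sequencer's private queue
    private final Queue<Runnable> executionQueue = new ArrayDeque<>();     // guarded by 'lock'
    private final Object lock = new Object();
    private final Executor rootDispatcher;                                 // the shared thread pool
    private boolean running;                                               // is an execution in progress?

    public Sequencer(Executor rootDispatcher) {
        this.rootDispatcher = rootDispatcher;
    }

    // The order is captured here, at submission time, by the private queue.
    public void dispatch(Runnable task) {
        pending.add(task);
        // One pool task per dispatched task: sequenced work still goes through
        // the pool's own queue, so fairness with non-sequenced tasks is preserved.
        rootDispatcher.execute(this::executeSequenced);
    }

    private void executeSequenced() {
        Runnable task = pending.poll();   // 1. pop the next task (never null: one add per pool task)
        synchronized (lock) {
            executionQueue.add(task);     // 2. push it into the execution queue
            if (running) {
                return;                   // 3. an execution is already in progress: end here
            }
            running = true;               // 3. otherwise, this thread starts the execution
        }
        for (;;) {                        // 4. drain the execution queue
            Runnable next;
            synchronized (lock) {
                next = executionQueue.poll();
                if (next == null) {
                    running = false;      // nothing left: release the flag before leaving
                    return;
                }
            }
            next.run();
        }
    }
}

Note that the emptiness check and the release of the running flag happen under the same lock as the enqueue in step 2, so a task pushed while another pool thread is draining is either picked up by that thread or triggers a new drain; it can never be left stranded.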

That’s it. I am not saying this is the most efficient implementation you can achieve (far from it), but these are the steps that must be executed to fulfil the contract (a small demonstration follows the list):

  1. Capture the order on submission
  2. Ensure non-concurrent execution
  3. Ensure tasks are executed roughly as fast as they are popped from the pool
  4. Allow tasks to be delayed only to prevent concurrent execution, and execute them as soon as possible
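As a quick illustration of points 1 and 2, here is a hypothetical harness run against the Sequencer sketch above. It is a demonstration of the contract, not a proof of correctness (concurrency bugs rarely show up in such a simple run):

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class SequencerContractDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Sequencer sequencer = new Sequencer(pool);

        AtomicInteger concurrent = new AtomicInteger();
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch done = new CountDownLatch(1_000);

        for (int i = 0; i < 1_000; i++) {
            final int id = i;
            sequencer.dispatch(() -> {
                // Point 2: never more than one sequenced task running at once.
                if (concurrent.incrementAndGet() != 1) {
                    throw new IllegalStateException("concurrent execution detected");
                }
                order.add(id);                // Point 1: record the execution order.
                concurrent.decrementAndGet();
                done.countDown();
            });
        }

        done.await();
        System.out.println("executed in dispatch order: " + isSorted(order));
        pool.shutdown();
    }

    private static boolean isSorted(List<Integer> values) {
        for (int i = 1; i < values.size(); i++) {
            if (values.get(i - 1) > values.get(i)) return false;
        }
        return true;
    }
}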

 

Guys, you have submitted excellent work so far; now I ask you to take these comments into account and submit improved versions.

9 thoughts on “The Sequencer (part 2.2)”

  1. “Execution: while there are tasks in the execution queue, pop and execute them”

    I’m not sure I understand. If the thread pool/executor is shared among several sequencers, then according to your statement the first sequencers can monopolize the threads if they have plenty of tasks to execute. That sounds really similar to having a dedicated thread per sequencer.


  2. Hi Arnaud.
    Your statement is correct, but I am not sure where to go from here.
    If you push enough work to keep a thread/CPU busy at 100%, then a sequencer provides little to no added value versus a dedicated thread.

    But the use case of the sequencer is to have many instances of it. I have seen production systems with 20,000 sequencers. You cannot do that with threads.

    If this does not answer your question, please tell me more about your concern.


  3. Hello Arnauld,

    Just in case you missed the task fairness point: the lifecycle of a task/lambda going through a sequencer has 2 main steps. Firstly, the task/lambda is dispatched (i.e. delivered) to the sequencer. Secondly, the task is executed, but only once the underlying executor/thread pool used by the sequencer (we call this pool the “root dispatcher”) is available for it.

    It means that, at the end of the day, every task executed through a sequencer always comes from the underlying thread pool (the one that executes any kind of task/lambda, not only the sequencer’s).

    Hence, no thread monopolization, and fairness (among the various kinds of tasks) is guaranteed.

    Hope this clarifies the sequencer approach a little bit more.


    1. That’s clear, but then I don’t understand the statement:
      “Execution: while there are tasks in the execution queue, pop and execute them”

      which sounds to me (I’m referring to the Java implementation 😉) like replacing:
      https://gist.github.com/Arnauld/e0dd490671a46694ce92#file-sequencer-java-L66

      with:

         while(hasTasks()) {
            executeNextTask();
         }
      

      the last statement can then monopolize the thread.

      I am probably missing something, and I don’t understand what your next step is to change the implementation.


    2. OK, I think I got it 🙂

      “So, once again, there is a fairness issue: it is obvious that non-sequenced tasks would be executed faster than sequenced ones.”


  4. Right now I am not really comfortable with the sequencer contract.

    You have a latency- and throughput-sensitive application with 20k components randomly creating tasks that need to be executed sequentially. Honestly, if I had to work on this problem, I would start by questioning the design of the application itself. Most of the time, when latency is an issue, you need very precise control over the execution paths, and the dedicated-threads solution is the best one. And when latency is not an issue, the Task solution is the way to go.

    That said, if your application design is perfectly legitimate and if your latency requirements are not extremely high, we have an interesting problem to solve 🙂


    1. Thanks for this feedback,
      If you want the lowest latency you can achieve AND you have a ton of money, then the sequencer will not help you. The fastest approach is to use dedicated threads, ensuring you have ONE thread per agent/stream.
      If you have 20k of them, it means you need 600-1200 servers.

      But as soon as you need to share threads amongst agents, the Sequencer is both the fastest and the most flexible approach, mostly because it secures the shortest path of execution (with some overhead, of course).
      Can you describe a system you have in mind? We could refine a design together.


    2. I did not mean it that way. I do not think there is an issue with the sequencer design itself. Most of the time when I face a hard coding problem (for example, the sequencer), I start by looking for a way to avoid it (for example, by refactoring the application so that it does not have 20k agents pushing tasks).

      I would also start by considering naive solutions, like using one thread to run all the tasks or sharding the sequencers with a few dedicated threads, etc.

      But now that I understand the context of your problem, I can start working on it with a clear mind.


      1. The ‘naive first, then refine’ approach is impractical for the threading dimension.
        TDD sadly does not work to cover concurrency issues. The world needs a magic testing framework for that, but nothing is available yet.
        OK, but once again, I’ll be happy to discuss any concrete design.


