The problem with locks (2)


In Why locks suck? I stressed several issues with locks. Today, I will elaborate on their main issue: they do not compose. But I need to finish my list first, as I skipped the performance-related issues.

Locks hurt performance badly,

it is a universally known fact. What is less well known is how badly they hurt performance, and why. The main myth is that locks are expensive because they are kernel objects and user/kernel transitions are expensive.

This argument still holds some truth, but both Intel and AMD have worked on improving the situation in their recent CPU lines, so the transition cost (back and forth) is now less than 300 cycles, the same order of magnitude as an access to non-cached memory.

But the main reason locks hurt performance is that they simply stall a core: under contention they trigger a context switch, which in turn trashes various CPU caches. Basically, a context switch is quite a catastrophic event from a performance point of view. Here is a complete list of what it may cost:

  • User/kernel transition (always)
  • Saving and restoring the core state: all registers are stored for the current context and restored for the target context (always)
  • Loss of execution context (very likely, unless target context uses same code)
    • loss of branch prediction caches
    • flush of the execution pipeline
    • stack entries are lost from the cache
  • Loss of cache entries for recently accessed data (likely, unless switching to a similar context within the same process)
  • Loss of cache entries for recently accessed code (likely, unless switching to a similar context within the same process)
  • Loss of TLB entries (likely). As a reminder, TLB stands for Translation Look-aside Buffer; it is used for the address translation required to implement virtual memory. This loss happens if you switch to a different process.
  • Scheduling cost: you have to factor in the execution time needed to elect the next thread to be run.

When you take all of those into account, you realize that the actual latency cost of a context switch is a matter of thousands of cycles, far above the user/kernel transition cost.

They do not compose

I assume most of you have had first-hand experience with some ‘thread-safe’ framework. Even so, I can confidently state that I have yet to see a ‘thread-friendly’ framework. Most thread-safe frameworks rely on locks and offer you events or subscription mechanisms, which are basically deadlock traps.

Want to see a thread-safe framework fail? Try to unsubscribe while being notified on that very same subscription. At best it will raise an exception, but it will probably just deadlock.
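To make the "at best, it will raise an exception" case concrete, here is a minimal sketch in Java. `NaivePublisher` and its `Listener` interface are hypothetical names made up for this illustration; they are not taken from any real framework. It assumes a design that many frameworks share: the listener list is guarded by one lock, and callbacks are invoked while that lock is held.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical publisher, for illustration only.
class NaivePublisher {
    interface Listener { void onEvent(String event); }

    private final List<Listener> listeners = new ArrayList<>();

    synchronized void subscribe(Listener l) { listeners.add(l); }
    synchronized void unsubscribe(Listener l) { listeners.remove(l); }

    // Callbacks run while the monitor is held and the list is being iterated.
    synchronized void publish(String event) {
        for (Listener l : listeners) {
            l.onEvent(event);
        }
    }

    // Returns true if unsubscribing from inside a callback made publish() throw.
    static boolean unsubscribeDuringNotificationFails() {
        NaivePublisher publisher = new NaivePublisher();
        Listener selfRemoving = new Listener() {
            @Override public void onEvent(String event) {
                // Same thread, so the reentrant monitor lets us in,
                // but we modify the list that publish() is iterating over.
                publisher.unsubscribe(this);
            }
        };
        publisher.subscribe(selfRemoving);
        publisher.subscribe(e -> { });
        publisher.subscribe(e -> { });
        try {
            publisher.publish("tick");
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }
}
```

Java monitors are reentrant, so the nested `unsubscribe` gets in and the fail-fast iterator complains. With a non-reentrant lock, or if the callback hands the unsubscription to another thread and waits for it, the same design would hang instead of throwing.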

So why do locks not compose? Simply because you need to acquire them in a proper, consistent order! And any significant chunk of code relies on some event/observer/inversion of control/interface implementation/virtual method override. We use those as extension points to alter and complete the behavior of existing objects.
And to develop those extensions, one needs to be aware of the locks used by those classes. The basic situation is that the classes are not thread aware and use no locks, which is typically called non-thread-safe. It can be a blessing, but it means you have to wrap them somehow. Why the extra work?
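The ordering problem can be sketched in a few lines of Java. Everything here is hypothetical and for illustration only (`LockOrderingDemo`, the two locks, the 200 ms timeout). Two threads each take one lock and then try to take the other one. The latches force the unlucky interleaving on every run, and `tryLock` with a timeout is used so that the sketch reports a failure instead of hanging forever as a plain `lock()` would.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class, for illustration only.
class LockOrderingDemo {
    // Returns how many threads failed to acquire their second lock.
    static int failedAcquisitions(boolean oppositeOrder) {
        ReentrantLock a = new ReentrantLock();
        ReentrantLock b = new ReentrantLock();
        AtomicInteger failures = new AtomicInteger();
        // Rendezvous points; with a count of 0 they are no-ops.
        CountDownLatch holdingFirst = new CountDownLatch(oppositeOrder ? 2 : 0);
        CountDownLatch doneTrying = new CountDownLatch(oppositeOrder ? 2 : 0);
        ReentrantLock t2First = oppositeOrder ? b : a;
        ReentrantLock t2Second = oppositeOrder ? a : b;

        Thread t1 = new Thread(() -> work(a, b, holdingFirst, doneTrying, failures));
        Thread t2 = new Thread(() -> work(t2First, t2Second, holdingFirst, doneTrying, failures));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return failures.get();
    }

    private static void work(ReentrantLock first, ReentrantLock second,
                             CountDownLatch holdingFirst, CountDownLatch doneTrying,
                             AtomicInteger failures) {
        first.lock();
        try {
            holdingFirst.countDown();
            holdingFirst.await();          // wait until both threads hold their first lock
            if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                failures.incrementAndGet(); // a plain lock() would block forever here
            }
            doneTrying.countDown();
            doneTrying.await();            // keep the first lock until both have tried
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }
}
```

Calling `failedAcquisitions(true)` makes both threads fail, while `failedAcquisitions(false)` lets both succeed because they agree on the order. The catch is that the author of an extension point rarely knows which locks the caller already holds, so the agreed order cannot be enforced locally.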

Then you have the so-called thread-safe classes, which typically use ‘synchronized’ (in Java) or ‘lock’ (in C#) extensively; you need to be careful when using their extension points. But the truth is, even if you are careful enough, the next version of the library, or the next developer who maintains this codebase, will probably encounter deadlocks or, even worse, race conditions.
I have been the initial ‘clever’ developer; I have also been the next one, and even the nth. I have made many, many mistakes, enough of them to be considered an expert in the multithreading field.

The bottom line is: whatever the locking model is, there is no way you can make it foolproof; even sadder, there is no way you can document it properly. The best you can do is hold long knowledge transfer sessions that cover both how the locks are used and the general threading model, so that newcomers can add their own locks. And pray for the day when a ‘clever’ one tries to refactor the model!

Key takeaway: locks are a dead end!
I came to this gloomy conclusion 7 years ago. I understood then that we needed a radically new approach, and none was available at the time. Therefore I decided to act upon this!
