Multithreading issues 2: Fun with locks

Locks and deadlocks

Remember the Scrabble game used to illustrate race conditions? Today we will fix the mess I left the players with using exclusivity. But I will also show you the serious limitations this approach has.

Step one: locks and exclusivity

So, we left the game when everybody was playing simultaneously, leading to a messy board full of meaningless words, plus angry players around. Well, fixing this is easy: we just need to ensure that players take turns! Now, everything falls back into place. No more conflicts, only real words on the board. If we look a bit closer into the process, we realize that a player first wait until nobody is playing before laying its own word, and that no other can proceed until he is finished. Another way to express this is that a player needs to have exclusive access to the board before playing, and this happens whenever a player states ‘it’s my turn’.

Let’s diverge a bit from the standard rules…

Now players are playing one after the other, nicely. To be honest,  there is one point that I will not cover today, which is to ensure they do play in proper order, we will get back to that later.

Now assume I want the players to fill the board as fast as possible. There is one thing that is clear: having only one player laying letters at any given time means I do not leverage having multiple players. Even worse, as player has to check when then can play, it just slows the game down. I would fare better with a single player.

So if I want to rip benefits from having multiple players, I must make sure that more than one can act at the same time, meaning it is time to reassess the rules (= algorithms) we put into place at the previous paragraph, starting with the sub-steps:

    1. read the game board
    2. look at one’s letter
    3. identify the best word
    4. lay down the letter
    5. pick random new letters

Only steps (A) and (D) requires access to the board, and step (A) only requires ‘read’ access.  We can try to leverage on that analysis and change the rule in: only one player can be doing either (A) or (D) at any given time.

Let’s evaluate this proposal:

  1. The rule is more complex (bad)
  2. Exclusive part as been reduced to roughly 40% (2 out of 5) of the activity (good)
  3. The player can do stupid things (bad).

A word as explanation is probably required for item 3. Here is an example:

  1. Alice (player one) reads the board.
  2. Bob (player two) reads the board too.
  3. Alice lays down her word
  4. Bob tries but cannot because Alice took his spot (include sexist comment here :-))

In the real world, we can expect Bob to figure out either another word or another location, but computer algorithms are not that smart, yet, and so they would probably ‘butcher’ the board by force replacing the letters or raising an exception.

The obvious fix is to keep the exclusive access not only during steps A and D, but from A to D. Then we will ensure consistency between when the player look at the board and when he/she lays down his/her word. But we can release the lock for step ‘E’. Exclusive access is now 80%, while only 40% was required. At least, while Alice is picking up random new letters, Bob can start thinking: Alice’s E steps can happen simultaneously with Bob’s A step.

From a performance perspective, we have ripped a 20% speed gain with 2 players instead of one. But adding new players will not give us any further boost as they will just be waiting on one another.

This bring us to an important element of parallel programming: Amdahl’s law which sates how much performance you can get by running an algorithm on multiple CPUs. We will discuss it later.

Conclusion

In this post, I did give a quick view on what locks are all about: removing concurrency related risks by permitting exclusive access to critical data. They were also some hints about the inherent complexity of locks, something I will dug further next.

Advertisements

Multithreading issues 1: race conditions

I was planning to post about the context and assumptions of this blog, but I realized that a page is a better format as I anticipate it will need completion and amendments.


Today, we will go through the 101 on multithreading issues.


The first ones every developer is bound to encounter are race conditions.


It covers all cases where two (or more) threads are competing for a variable (or object, or any resource for that matter): as threads are ruthless this often ends in a gory way.


Let’s use Scrabble as an analogy: a player first identify a word fitting the current game, then he put the required letters. Finally he picks as many letters he as used.
The normal rule is that players play one after the other. But suppose we remove that rule and that now players plays simultaneously. There will be of course a situation the word a player as selected is now longer fitting the game because of some other players action. And if two players are laying letters in the same area at the same time, it will get really messy, we can probably assume that those players would start to argue, and the game would be over, leaving a board with both meaningful and nonsense words on it.


That perfectly describes a race condition: the players represent the various threads and the board represents the application memory, players argument is an exception which leads to program termination.


Scrabble BoardWell, of course, actual players would find out fast that something is not right, but race conditions in a program may take some time to surface. The reason is simple, an actual scrabble board is 525 cells and a human will take several seconds to lay down a word of five to seven letters, while a computer program needs fraction of a milliseconds to modify a hundred bytes within a multimegabytes space!


What does it mean for the user of the system: that sometimes the application behaves strangely, displaying inconsistent data, garbage or simply crashes.


For the developer, it just means that he is entering a world of pain trying to chase the bug, because race conditions are as difficult to chase than the dreaded memory bugs, possibly more.
This is something that must be avoided at all costs, as no tolerance mechanism can be implemented against this. You need to be scared to death of race conditions.


I did face my share, directly or indirectly, most of them dragged on for weeks or months, usually until the code is significantly refactored.


Next post will talk about the traditional solution to race conditions, namely locks. It wil give us also the opportunity to talk about deadlock.