Multi-threading

Multi-threaded concepts are important: e.g., atomics, locks, issues with different designs, how to make things thread safe. Cache locality is another huge thing these days. Asynchronous architectures and callbacks are what you will be dealing with every day.

  • What is cache locality?
  • How do multicore systems ensure their caches are in sync?
  • How do you get around this problem?
  • Why are signals slow and why is context switching bad?
  • What exactly happens during a context switch?

"A computer program or subroutine is called reentrant) if multiple invocations can safely run concurrently on a single processor system."

"Race condition: an unfortunate order of execution causes undesirable behaviour."

Starting in C++11, scoped static initialization is now thread-safe, but it comes with a cost: reentrancy now invokes undefined behavior.

Issues

  • Ensure critical sections are as small as possible
  • Only a problem for multiple writers (multiple readers OK)
  • Too few threads: algorithm is sub-optimal
  • Too many threads: overhead of creating/managing and partitioning the data is greater than processing advantage; software threads outnumber the available hardware threads and the OS must intervene
  • Data races, deadlocks and livelocks -- unsynchonised access to shared memory can introduce race conditions and undefined behaviour (program results depend non-deterministically on the relative timings of two or more threads)
  • Threads versus processes
  • STO: strict timestamp ordering (younger transactions wait for older transactions)

DCLP

  • Singleton isn’t thread safe

Deadlocks

  • Deadlocks
  • "Livelock is a risk with some algorithms that detect and recover from deadlock"
  • Resource starvation)
  • Spinlock -- busy waiting, but useful in situations where the lock will become available in a very short time

Detecting deadlocks

Execution policy

  • std::execution::sequenced_policy: The execution policy type used as a unique type to disambiguate parallel algorithm overloading and require that a parallel algorithm's execution may not be parallelized. The invocations of element access functions in parallel algorithms invoked with this policy (usually specified as std::execution::seq) are indeterminately sequenced in the calling thread.
  • std::execution::parallel_policy: The execution policy type used as a unique type to disambiguate parallel algorithm overloading and indicate that a parallel algorithm's execution may be parallelized. The invocations of element access functions in parallel algorithms invoked with this policy (usually specified as std::execution::par) are permitted to execute in either the invoking thread or in a thread implicitly created by the library to support parallel algorithm execution. Any such invocations executing in the same thread are indeterminately sequenced with respect to each other.
  • std::execution::parallel_unsequenced_policy: The execution policy type used as a unique type to disambiguate parallel algorithm overloading and indicate that a parallel algorithm's execution may be parallelized, vectorized, or migrated across threads (such as by a parent-stealing scheduler). The invocations of element access functions in parallel algorithms invoked with this policy are permitted to execute in an unordered fashion in unspecified threads, and unsequenced with respect to one another within each thread.
  • std::execution::unsequenced_policy: The execution policy type used as a unique type to disambiguate parallel algorithm overloading and indicate that a parallel algorithm's execution may be vectorized, e.g., executed on a single thread using instructions that operate on multiple data items.

Prevention

  • Try to avoid calling out to external code while holding a lock
  • Try to avoid holding locks for longer than you need to
  • If you ever need to acquire two locks at once, document the ordering thoroughly and make sure you always use the same order
  • Immutability is great for multi-threading: immutable means thread safety; functional programming works well concurrently partly due to the emphasis on immutability

Mutex

See std::mutex or std::atomic in C++.

Exercise: write a producer-consumer solution.

  • A mutex is a locking mechanism used to synchronize access to a resource
  • A semaphore is signaling mechanism: "I am done, you can carry on" kind of signal

See Mutex vs Semaphore

C++ threads

Launch with std::thread, attach with std::join. C++20 introduced std::jthread which automatically rejoins on destruction.

std::async runs the function f asynchronously (potentially in a separate thread which might be a part of a thread pool) and returns a std::future that will eventually hold the result of that function call. Not entirely intuitive, see std::async.

Threads versus processes

A thread is a branch of execution. A process can consist of multiple threads.

Threads

  • Don't try to go lock-free unless you really have to. Locks are expensive, but rarely the bottleneck.
  • Document the thread-safety of your types. Most types don't need to be thread-safe, they just need to not be thread hostile: i.e., "You can use them from multiple threads, but it's your responsible for taking out locks if you want to share them.
  • Don't access the UI (except in documented thread-safe ways) from a non-UI thread. In Windows Forms, use Control.Invoke/BeginInvoke
  • Use locks when you access mutable shared data, both for reads and writes.

C++ algorithms that can take an execution policy

  1. std::sort
  2. std::copy
  3. std::transform
  4. std::accumulate
  5. std::for_each
  6. std::reduce
  7. std::inclusive_scan
  8. std::exclusive_scan
  9. std::transform_reduce
  10. std::replace_if
  11. std::replace
  12. std::remove_if
  13. std::remove
  14. std::count_if
  15. std::count
  16. std::max_element
  17. std::min_element
  18. std::find_if
  19. std::find
  20. std::generate

References

results matching ""

    No results matching ""