Multi-Threaded C++


Shared Memory Parallel (SMP) programs permit several active agents, called threads, to share one memory address space. Within the flow of control of each thread, side effects on memory continue to be stabilized at each "sequence point" as per the C++ standard. But the side effects from different threads may become interleaved at the finest granularity possible in the hardware. Code is said to be thread-safe when it meets its (serial) specifications even when used concurrently by several threads of an SMP program. The language and library features of KAI C++ can be rendered thread-safe by using the --thread_safe command line option. Making your application's components thread-safe may require further effort.

What is so difficult?

Thread-safety requires enforcing several programming restrictions throughout the parts of your application that will be active in more than one thread. A thread-safe algorithm cannot completely ignore the implementations of the objects it uses. Not only must they too be thread-safe, but their use of shared resources must coordinate with synchronizations done by the parent algorithm.

Data Races

In SMP programs, any action which can be decomposed into several separate updates of the memory might be observed in its partially complete phases by reads made from another concurrent thread. Likewise the value of a compound object or data ensemble cannot be read without some risk that another thread will intervene partway through the several read operations and change the data elements. There are two kinds of protection against these data race conditions found in thread-safe programs:

Concurrent threads must only use algorithms that cooperate with each other by using some form of synchronization.
Knowledge of the address of a shared object can be confined to one thread whenever an unsynchronized algorithm is active (e.g. during its construction or destruction).

There is always some risk that incorrect or uncooperative code will interfere with a compound action. The protection policies must be enforced program-wide.
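As a minimal sketch of the first kind of protection, the following makes a two-field update indivisible. It assumes the C++11 standard library (std::mutex, std::thread), which postdates this page and is not the KAI <mutex> header described later; the names Range and run_range_demo are hypothetical.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// A compound object whose invariant is (hi - lo == 10).
// Both fields must change together, so every access takes the same lock.
class Range {
public:
    // Shift both bounds as one indivisible action.
    void shift(int delta) {
        std::lock_guard<std::mutex> guard(lock_);
        lo_ += delta;  // without the lock, a reader could see lo_ moved...
        hi_ += delta;  // ...before hi_ has caught up
    }

    // Read both bounds as one consistent snapshot.
    int width() const {
        std::lock_guard<std::mutex> guard(lock_);
        return hi_ - lo_;
    }

private:
    mutable std::mutex lock_;
    int lo_ = 0;
    int hi_ = 10;
};

// Hypothetical driver: several writers shift the range while the calling
// thread checks the invariant. Returns true if no partially completed
// update was ever observed.
bool run_range_demo(int writers, int shifts_per_writer) {
    Range r;
    std::vector<std::thread> pool;
    for (int i = 0; i < writers; ++i) {
        pool.emplace_back([&r, shifts_per_writer] {
            for (int k = 0; k < shifts_per_writer; ++k) r.shift(1);
        });
    }
    bool consistent = true;
    for (int k = 0; k < 1000; ++k) {
        if (r.width() != 10) consistent = false;
    }
    for (auto& t : pool) t.join();
    return consistent && r.width() == 10;
}
```

The protection holds only because every reader and writer goes through the same lock; a single function that touched lo_ or hi_ directly would reintroduce the race.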

Deadlock

To "serialize" their side effects, threads wait for each other; most often by waiting for the resource(s) they share to stabilize in a recognizably "available" condition. Then, by using indivisible hardware instructions, each thread tries to acquire exclusive access to the resource(s) it needs one-by-one; success renders the resource temporarily unavailable to all other threads.

A thread that waits for a resource which can never become available is deadlocked. Although possibilities for deadlock abound, a typical example would be where Thread 1 tries to acquire two lockable resources in the order {A;B;} while Thread 2 tries to acquire them in the order {B;A;}. If they start in synchrony, they can both stall waiting for the resource now held by the other.

In addition to the obvious rule that you should (eventually) release every resource to which you acquire exclusive access, there are several sophisticated ways to avoid deadlock. The simplest is to stick to a (partial) ordering of the shared resources:

If a thread is (or could be) holding resource A, it can only acquire resources ranked below A in the ordering.

For example: only acquire multiple locks in alphabetical or numerical order, and do not sidestep the ordering by passing locks as function arguments, where the formal parameter gives them a different name. There is no standard syntax to declare that an object will use particular shared resources, so accurate comments are recommended for now. Beware of holding locks while calling functions and while constructing or destroying complex objects, since that code may acquire locks of its own.
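One way to get an ordering that survives renaming through formal arguments is to rank locks by address rather than by name. The sketch below assumes C++11 std::mutex and std::thread; Account, transfer, and run_transfer_demo are hypothetical names chosen for illustration.

```cpp
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

struct Account {
    std::mutex lock;
    long balance = 0;
};

// Move money between two accounts. Both locks are always acquired in
// address order, so two threads transferring in opposite directions
// cannot each hold one lock while waiting for the other.
void transfer(Account& from, Account& to, long amount) {
    if (&from == &to) return;  // nothing to move; also avoids locking twice
    Account* first = &from;
    Account* second = &to;
    if (std::less<Account*>()(second, first)) std::swap(first, second);

    std::lock_guard<std::mutex> hold_first(first->lock);
    std::lock_guard<std::mutex> hold_second(second->lock);
    from.balance -= amount;
    to.balance += amount;
}

// Hypothetical driver: thread 1 moves a -> b while thread 2 moves b -> a.
// Returns the combined balance, which must be unchanged.
long run_transfer_demo(int rounds) {
    Account a;
    Account b;
    a.balance = 1000;
    b.balance = 1000;
    std::thread t1([&a, &b, rounds] {
        for (int i = 0; i < rounds; ++i) transfer(a, b, 1);
    });
    std::thread t2([&a, &b, rounds] {
        for (int i = 0; i < rounds; ++i) transfer(b, a, 1);
    });
    t1.join();
    t2.join();
    return a.balance + b.balance;
}
```

Had transfer() simply locked `from` and then `to`, the two threads would reproduce the {A;B;} versus {B;A;} pattern described above.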

Whose Problem Is This?

Obviously writing --thread_safe on your command line is not going to magically make your program run correctly in parallel. What it does is ask KCC and its runtime library to hold up one end of a bargain, to wit:

Thread-safety: Objects allocated by KCC or its runtime library will be used in thread-safe ways by all language components and library methods.
Objects allocated by an application are the responsibility of their author, even if they are used entirely via C++ standard library templates (e.g. container classes).

For sanity's sake, a few application entities (primarily the I/O buffers) are locked when used via library calls, since nearly all such manipulations are via the library.

POSIX Threads and Mutexes

Of the many possible ways to package parallelism, on Unix platforms, the POSIX thread standards have been most widely adopted. Early implementations built atop the Common Multithreaded Architecture (CMA) have matured into stable offerings supported by many Unix vendors.

The pthreads Library

The IEEE Std 1003.1c-1995 POSIX System Application Program Interface (API) has been used by KAI C++ to synchronize the shared resources within the KCC Runtime Library. On our HP-UX 10.20 port, a somewhat older level of this interface is used. There is no standard C++ binding to the pthreads API beyond the ordinary extern "C" interfaces laid out in the <pthread.h> system header file. Some library vendors have introduced object-oriented support for multi-threading. Chances are that however you program threads, your code will actually be compiled down into the POSIX pthread interface. When linking executables, KCC's --thread_safe option causes the appropriate thread support library to be linked with your application. Note that on many systems this is available only as a shared object library.

The <mutex> Header file

Several shared objects in the KCC Runtime Library must be used from within template instantiations that are not built until you link your application. These templated functions use the POSIX threads library via a system header file named <mutex>. This file is not part of the C++ standard. It provides a very low level C++ cover for pthread's mutually exclusive locking mechanism known by the contraction "mutex". This header file only becomes active under the preprocessor setting -DMSIPL_MULTITHREAD, which is defined when you compile with the --thread_safe option.

The classes simple_mutex and mutex implement ownerless-locking and nestable (owned) locking. Because not all pthreads implementations support static initialization of mutexes, a separate templated class static_mutex is also defined to isolate locks which may require special attention at process start-up. The thread-safe KCC library initializes the static locks it needs, but on systems without true static initialization, your code will have to ensure that static mutex objects are constructed before their first use.
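One common way to guarantee that a lock exists before its first use is to construct it on demand. The sketch below is not the KAI static_mutex class; it assumes a C++11 or later compiler, where initialization of a function-local static is guaranteed to happen exactly once even if several threads race to it. The names registry_lock and next_registry_id are hypothetical.

```cpp
#include <mutex>

// Construct-on-first-use: the mutex is built the first time any thread
// calls this function, rather than at a link-order-dependent point
// during process start-up.
std::mutex& registry_lock() {
    static std::mutex lock;
    return lock;
}

// Hypothetical shared counter guarded by the lock above. Because the
// lock is created on demand, this is safe to call even from the
// constructors of other static objects.
int next_registry_id() {
    static int next_id = 0;
    std::lock_guard<std::mutex> guard(registry_lock());
    return ++next_id;
}
```

On a compiler without that guarantee, the same effect would need an explicit once-only initialization facility from the thread library.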

The acquire and release mutex methods are adequate for classical locking methodologies. The mutex header file also defines a handy template: block_mutex. The block_mutex<> class declares an auxiliary object that acquires a given mutex when it is constructed, and releases that mutex when it is destroyed. This gives threaded code a satisfying block structure that is much easier to prove is deadlock-free in the presence of exceptions.
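The block structure described above can be sketched with a small guard template. This is an illustration only, not the KAI block_mutex<> source: demo_mutex is a hypothetical stand-in with acquire and release methods (built here on C++11 std::mutex), and demo_block_guard, guarded_increment, and lock_released_after_throw are likewise made-up names.

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical stand-in for a mutex class with acquire/release methods.
class demo_mutex {
public:
    void acquire() { m_.lock(); }
    void release() { m_.unlock(); }
    bool try_acquire() { return m_.try_lock(); }

private:
    std::mutex m_;
};

// Acquires the mutex on construction and releases it on destruction, so
// the lock is dropped on every exit from the enclosing block, including
// exits caused by an exception.
template <class Mutex>
class demo_block_guard {
public:
    explicit demo_block_guard(Mutex& m) : m_(m) { m_.acquire(); }
    ~demo_block_guard() { m_.release(); }

    demo_block_guard(const demo_block_guard&) = delete;
    demo_block_guard& operator=(const demo_block_guard&) = delete;

private:
    Mutex& m_;
};

// Increments the counter under the lock; throws once the limit is hit.
void guarded_increment(demo_mutex& m, int& counter, int limit) {
    demo_block_guard<demo_mutex> hold(m);
    if (counter >= limit) throw std::runtime_error("limit reached");
    ++counter;
}

// Returns true if the mutex is free again after guarded_increment()
// exits by throwing.
bool lock_released_after_throw() {
    demo_mutex m;
    int counter = 0;
    try {
        guarded_increment(m, counter, 0);  // limit 0 forces the throw
    } catch (const std::runtime_error&) {
        // The guard's destructor has already run by this point.
    }
    bool was_free = m.try_acquire();
    if (was_free) m.release();
    return was_free;
}
```

With explicit acquire and release calls, the throw in guarded_increment() would skip the release and leave the mutex held forever.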

Mutex methods and the <mutex> header file are subject to change as both standards and programming practice progress.

Re-entrant Library Calls

Every support library which you use from within concurrently executing threads needs to be thread-safe. On many systems the header files for these libraries require a preprocessing macro like -D_REENTRANT to ensure you are served by thread-safe entrypoints. KCC defines the appropriate macro for your platform when you use the --thread_safe option. But you still must adapt to using the right entrypoints in these libraries, and to holding locks when you want compound actions to be indivisible.

B.Y.O.B.

On their original course from Multics to Unix, libc and other support libraries picked up some uniprocessing habits, notably a fondness for relying on static memory declared within the library to simplify the interfaces to some library functions. Consider, e.g., the definitions your system's libc may provide for strtok() vs the "reentrant" form strtok_r(). The strtok function keeps its place in the string you are working on in a static pointer in the library's scope. Both multithreaded and recursive use of this function require more than one placeholding pointer be in use at a time, so strtok_r takes a pointer to this pointer as an extra argument.

This pattern is repeated throughout older libraries: a get_system_data() call returns a pointer to a static struct, which must be read "before the next call". If you go back and check the current documentation you'll often find there is now a get_system_data_r() entrypoint to which you must "Bring Your Own Buffer". Welcome to the parallel universe!
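The strtok_r() interface mentioned above can be sketched as follows, assuming a POSIX system that declares it in <string.h>. The function name flatten_pairs and the "key=value;key=value" input format are hypothetical. Each loop brings its own placeholder pointer, which is what lets the two tokenizing passes nest.

```cpp
#include <string.h>  // strtok_r (POSIX)
#include <string>
#include <vector>

// Splits "k1=v1;k2=v2" into {"k1", "v1", "k2", "v2"}.
// The outer loop walks the ';'-separated pairs while the inner loop
// splits each pair on '='. With strtok() the inner calls would
// overwrite the outer loop's saved position in the library's static
// pointer; with strtok_r() each loop owns its placeholder.
std::vector<std::string> flatten_pairs(const std::string& text) {
    std::vector<std::string> out;

    // strtok_r writes NUL bytes into its input, so work on a copy.
    std::vector<char> buffer(text.begin(), text.end());
    buffer.push_back('\0');

    char* outer_place = nullptr;  // caller-owned placeholder, outer loop
    for (char* pair = strtok_r(buffer.data(), ";", &outer_place);
         pair != nullptr;
         pair = strtok_r(nullptr, ";", &outer_place)) {
        char* inner_place = nullptr;  // separate placeholder, inner loop
        for (char* part = strtok_r(pair, "=", &inner_place);
             part != nullptr;
             part = strtok_r(nullptr, "=", &inner_place)) {
            out.push_back(part);
        }
    }
    return out;
}
```

Because all of the state lives in the caller's local variables, two threads can run flatten_pairs() at the same time without any locking.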

Parallel I/O

The buffers used for I/O and other "streams" operations are static memory. The thread-safe versions of libKCC*-ts.* and libc.* lock their uses of stream buffers with the net effect that individual I/O operations are done atomically. That is to say that in a mix of calls to C++ iostreams via << and >>, or from the printf(), scanf() family of standard C, each call has exclusive use of its I/O buffer until its side effects (on the buffer) are completed. However, multiple calls (even within the same expression) may become interleaved with operations on the same buffer done by other threads.

This level of locking (the buzzword is "granularity") allows operations on distinct buffers to proceed in parallel. Stay tuned for future developments in parallel I/O, since this design serves as only the lowest level building block in a user-friendly parallel I/O package.
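Because only individual calls are atomic, a record built from several << insertions can be split by output from other threads. A sketch of one workaround, assuming C++11 threads: format the whole record in a private buffer, then emit it with a single insertion under an application-level lock. The names write_record and run_record_demo are hypothetical, and the shared stream here is a std::ostringstream so that the result can be inspected.

```cpp
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Formats one record privately, then writes it in one piece while
// holding a lock shared by every writer to this stream.
void write_record(std::ostream& out, std::mutex& out_lock,
                  int thread_id, int value) {
    std::ostringstream line;  // private to this call, needs no lock
    line << "thread " << thread_id << " value " << value << '\n';

    std::lock_guard<std::mutex> guard(out_lock);
    out << line.str();
}

// Hypothetical driver: several threads write records to one stream.
// Returns everything that was written.
std::string run_record_demo(int threads, int records_per_thread) {
    std::ostringstream shared;
    std::mutex shared_lock;
    std::vector<std::thread> pool;
    for (int id = 0; id < threads; ++id) {
        pool.emplace_back([&shared, &shared_lock, id, records_per_thread] {
            for (int v = 0; v < records_per_thread; ++v) {
                write_record(shared, shared_lock, id, v);
            }
        });
    }
    for (auto& t : pool) t.join();
    return shared.str();
}
```

The order of records across threads is still unspecified; the lock only keeps each record in one piece.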

Parallel Memory Management

Control over allocation points in the Heap Storage is maintained by static variables in libc. The fundamental operations of malloc() and free() are usually thread-safe within the right version of libc on your platform. Constructors for heap allocated objects are usually thread-safe by virtue of the fact that the address of the object has not yet been returned from new.

Container Classes

Standard Template Library container classes allow high-level compound operations on composite objects. When such objects are shared, concurrent accesses must be mutually exclusive. Since the containers themselves reside in your application's domain, you are responsible for synchronizing access to them. Your application should be able to tell when synchronization is needed, and only you can choose the most appropriate locking discipline for each container.
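One possible locking discipline, among many, is to pair each shared container with its own lock and route every access through a small wrapper. The sketch below assumes C++11 std::mutex and std::thread; locked_vector and run_container_demo are hypothetical names.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper that keeps a container and its lock together.
template <class T>
class locked_vector {
public:
    void push_back(const T& value) {
        std::lock_guard<std::mutex> guard(lock_);
        items_.push_back(value);
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(lock_);
        return items_.size();
    }

    // Compound action: the emptiness check and the removal happen under
    // one lock, so no other thread can empty the vector in between.
    bool try_pop_back(T& out) {
        std::lock_guard<std::mutex> guard(lock_);
        if (items_.empty()) return false;
        out = items_.back();
        items_.pop_back();
        return true;
    }

private:
    mutable std::mutex lock_;
    std::vector<T> items_;
};

// Hypothetical driver: several threads append to one shared vector.
// Returns the final element count.
std::size_t run_container_demo(int threads, int pushes_per_thread) {
    locked_vector<int> shared;
    std::vector<std::thread> pool;
    for (int id = 0; id < threads; ++id) {
        pool.emplace_back([&shared, pushes_per_thread] {
            for (int k = 0; k < pushes_per_thread; ++k) shared.push_back(k);
        });
    }
    for (auto& t : pool) t.join();
    return shared.size();
}
```

A coarse per-container lock like this is simple to reason about but serializes all access; whether that cost is acceptable is the application's call.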

Some early STL implementations of the multiset containers used thread-unsafe algorithms long ago abandoned by KAI C++. In particular, the bug pointed out in the July 1998 C++ Report in the red/black trees used to implement the set template class has never been present in KAI C++.

Handler State

Before the days of full C++ exception handling, the standard library provided an interim solution in the form of the set_new_handler(), set_terminate(), and set_unexpected() routines. These each set up a given "handler" routine for different exceptional conditions the program might encounter.

For the purposes of thread-safe programming, these library features should be regarded as anachronisms. The handler routine settings are global variables in the KCC runtime library, shared by all threads. Several alternative definitions for thread-private handlers are possible, but without a standard, you're just writing non-portable code.

Old-fashioned handler routines are suitable only as a way to force global termination in the cases they are assigned to cover. Since every handler routine is globally visible, they are no help in purely local exceptional conditions unless you write an extensively specialized handler routine that can distinguish one thread's local conditions from its siblings'.

Updates to the global handler state are mutually exclusive. So, if their use is inescapable, a conservative policy is to pick one handler for each condition and set it at the beginning of each thread's processing. This setting is redundant on KCC, but on systems that provide thread-private handlers, it forces a known choice of initial value on each thread.
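The conservative policy above might look like the following sketch. It assumes a C++11 library, which provides std::get_terminate() and std::get_new_handler() for reading the settings back; the handler and function names are hypothetical. Both handlers do nothing but force global termination.

```cpp
#include <cstdlib>
#include <exception>
#include <new>
#include <thread>

// One handler per condition, each forcing global termination.
void fatal_terminate() { std::abort(); }
void fatal_out_of_memory() { std::abort(); }

// Call this first in every thread's entry function. Where the handler
// state is global the repeated calls are redundant; where it is
// thread-private they give each thread a known initial value.
void install_handlers() {
    std::set_terminate(fatal_terminate);
    std::set_new_handler(fatal_out_of_memory);
}

// Hypothetical check: start a thread, install the handlers there, and
// report whether that thread then sees the expected settings.
bool handlers_installed_in_thread() {
    bool ok = false;
    std::thread t([&ok] {
        install_handlers();
        ok = std::get_terminate() == fatal_terminate &&
             std::get_new_handler() == fatal_out_of_memory;
    });
    t.join();  // join() makes the write to ok visible to the caller
    return ok;
}
```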

Limits on Thread-Safe Exception Handling

The C++ exception handling mechanism (keywords try, throw, and catch) offers the proper infrastructure for handling exceptions locally. Each thread has its own exception handling stack, which operates almost independently. Exceptions are never thrown between threads.

For the moment (i.e. KCC releases in the 3.x family, and possibly on into 4.x), there are some minor semantic limits on multithreaded exception handling:

The mechanism KCC uses permits only one thread at a time to be in the process of throwing and catching an exception. A library-level nestable mutex enforces this.
Thus, exception handling code (including catch clauses, copy construction and destruction of the exception object, and destruction of the objects in the intervening call chain) should not synchronize with code in another thread whose proper operation may also require throwing and catching an exception.
Threads must terminate outside of any catch clauses, so that the library code will eventually release its exception handling mutex.
Throwing an exception beyond the outermost handler in a thread produces undefined behavior.
The combination --thread_safe and --no_exceptions is not presently supported by the released runtime libraries.

All of these limitations are under review, and may be changed in future KCC releases.
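A thread entry function that respects these limits might be shaped like the sketch below: it acts as the thread's outermost handler, does only simple bookkeeping inside its catch clauses, and lets the thread end after the try block rather than inside a handler. It assumes C++11 std::thread; WorkerResult, risky_work, worker_main, and run_worker are hypothetical names.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <thread>

// Outcome reported by a worker. It is written only by the worker and
// read only after join(), so it needs no lock of its own.
struct WorkerResult {
    bool failed = false;
    std::string message;
};

// Stand-in for real work that may throw.
void risky_work(int input) {
    if (input < 0) throw std::invalid_argument("negative input");
}

// Thread entry point and outermost handler for its thread. Nothing can
// propagate past it, and the catch clauses do no synchronization with
// other threads.
void worker_main(int input, WorkerResult& result) {
    try {
        risky_work(input);
    } catch (const std::exception& e) {
        result.failed = true;
        result.message = e.what();
    } catch (...) {
        result.failed = true;
        result.message = "unknown exception";
    }
    // The thread ends here, outside any catch clause.
}

// Runs one worker and hands its outcome back to the calling thread,
// since the exception itself is never thrown between threads.
WorkerResult run_worker(int input) {
    WorkerResult result;
    std::thread t(worker_main, input, std::ref(result));
    t.join();
    return result;
}
```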


This file last updated on 4 June 1999.