Multi-Threaded C++
Shared Memory Parallel (SMP) programs permit several active agents, called threads, to
share one memory address space. Within the flow of control of each thread, side effects
on memory continue to be stabilized at each "sequence point" as per the C++ standard. But the side effects
from different threads may become interleaved at the finest granularity possible in the hardware.
Code is said to be thread-safe when it meets its (serial) specifications
even when used concurrently by several threads of an SMP program. The language and library features of KAI C++ can be
rendered thread-safe by using the --thread_safe
command line option. Making your
application's components thread-safe may require further effort.
In SMP programs, any action which can be decomposed into several separate updates of the memory might be observed in its partially complete phases by reads made from another concurrent thread. Likewise the value of a compound object or data ensemble cannot be read without some risk that another thread will intervene partway through the several read operations and change the data elements. There are two kinds of protection against these data race conditions found in thread-safe programs:
To "serialize" their side effects, threads wait for each other; most often by waiting for the resource(s) they share to stabilize in a recognizably "available" condition. Then, by using indivisible hardware instructions, each thread tries to acquire exclusive access to the resource(s) it needs one-by-one; success renders the resource temporarily unavailable to all other threads.
A thread that waits for a resource which can never become available is deadlocked. Although possibilities for deadlock abound, a typical example would be where Thread 1 tries to acquire two lockable resources in the order {A;B;} while Thread 2 tries to acquire them in the order {B;A;}. If they start in synchrony, they can both stall waiting for the resource now held by the other.
In addition to the obvious rule that you should (eventually) release every resource to which you acquire exclusive access, there are several sophisticated ways to avoid deadlock. The simplest is to stick to a (partial) ordering of the shared resources:
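A fixed acquisition order can be sketched with plain POSIX mutexes. This is a minimal illustration, not code from the KCC library: the names lock_a, lock_b, transfer, and run_transfers are hypothetical, and the mutexes are assumed to support static initialization via PTHREAD_MUTEX_INITIALIZER.

```cpp
#include <cassert>
#include <cstdlib>
#include <pthread.h>

// Two hypothetical shared resources, each guarded by its own lock.
static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static long balance_a = 100;
static long balance_b = 100;

// Abort on any pthread error rather than continue in an unknown state.
static void must(int rc) { if (rc != 0) std::abort(); }

// Every thread that needs both resources takes lock_a first, then lock_b,
// and releases them in the reverse order. No thread ever holds lock_b
// while waiting for lock_a, so the {A;B;} versus {B;A;} stall cannot occur.
static void transfer(long amount) {
    must(pthread_mutex_lock(&lock_a));
    must(pthread_mutex_lock(&lock_b));
    balance_a -= amount;
    balance_b += amount;
    must(pthread_mutex_unlock(&lock_b));
    must(pthread_mutex_unlock(&lock_a));
}

extern "C" void* transfer_worker(void* arg) {
    long amount = *static_cast<long*>(arg);
    for (int i = 0; i < 1000; ++i) transfer(amount);
    return 0;
}

// Runs two threads that move value in opposite directions and
// returns the combined balance, which the locking keeps constant.
long run_transfers() {
    long plus = 1, minus = -1;
    pthread_t t1, t2;
    must(pthread_create(&t1, 0, transfer_worker, &plus));
    must(pthread_create(&t2, 0, transfer_worker, &minus));
    must(pthread_join(t1, 0));
    must(pthread_join(t2, 0));
    return balance_a + balance_b;
}
```

Both threads want both locks, but because they agree on the order, one of them always gets through and releases, so neither can stall holding the lock the other needs.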
Obviously writing --thread_safe
on your command line is not going to
magically make your program run correctly in parallel. What it does is ask KCC and
its runtime library to hold up one end of a bargain, to wit:
For sanity's sake, a few application entities (primarily the I/O buffers) are locked when used via library calls, since nearly all such manipulations are via the library.
Of the many possible ways to package parallelism, on Unix platforms, the POSIX thread standards have been most widely adopted. Early implementations built atop the Common Multithreaded Architecture (CMA) have matured into stable offerings supported by many Unix vendors.
The IEEE Std
1003.1c-1995 POSIX System Application Program Interface (API) has been used by KAI C++
to synchronize the shared resources within the KCC Runtime Library.
On our HPUX 10.20
port, a somewhat older level of this interface is used.
There is no standard C++ binding to the pthreads API beyond the ordinary extern "C"
interfaces laid out in the <pthread.h> system header file.
Some library vendors have introduced object-oriented support for multi-threading.
Chances are that, however you program threads, your code will ultimately be compiled down
to the POSIX pthread interface.
When linking executables, KCC's --thread_safe
option causes the appropriate
thread support library to be linked with your application.
Note that on many systems this is available only as a shared object library.
Several shared objects in the KCC Runtime Library must be used from within template
instantiations that are not built until you link your application. These templated functions
use the POSIX threads library via a system header file named <mutex>. This file
is not part of the C++ standard. It provides a very low level C++ cover for pthread's mutually exclusive
locking mechanism known by the contraction "mutex". This header file only becomes active under
the preprocessor setting -DMSIPL_MULTITHREAD, which is defined when you compile
with the --thread_safe option.
The classes simple_mutex and mutex implement ownerless-locking and
nestable (owned) locking. Because not all pthreads implementations support static initialization
of mutexes, a separate templated class static_mutex is also defined to isolate
locks which may require special attention at process start-up. The thread-safe KCC library
initializes the static locks it needs, but on systems without true static initialization, your
code will have to ensure that static mutex objects are constructed before their first use.
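One way to guarantee that a lock exists before its first use is to route every access through pthread_once. The sketch below uses the raw POSIX 1003.1c interface rather than the KCC static_mutex class, whose exact interface is not shown here; registry_lock, registry_count, and register_item are hypothetical names.

```cpp
#include <cassert>
#include <cstdlib>
#include <pthread.h>

// A lock that is deliberately NOT statically initialized, as on a
// platform where PTHREAD_MUTEX_INITIALIZER is unavailable.
static pthread_mutex_t registry_lock;
static pthread_once_t  registry_once = PTHREAD_ONCE_INIT;
static int registry_count = 0;

// Runs exactly once, no matter how many threads race to get here.
extern "C" void init_registry_lock() {
    if (pthread_mutex_init(&registry_lock, 0) != 0) std::abort();
}

// Every entry point calls pthread_once before touching the lock, so the
// mutex is always constructed before its first use.
int register_item() {
    pthread_once(&registry_once, init_registry_lock);
    pthread_mutex_lock(&registry_lock);
    int id = ++registry_count;
    pthread_mutex_unlock(&registry_lock);
    return id;
}
```

The cost is one cheap pthread_once call per entry; in exchange the code no longer depends on the order in which static objects happen to be constructed at start-up.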
The acquire and release mutex methods are adequate for
classical locking methodologies. The mutex header file also defines a handy template:
block_mutex.
The block_mutex<> class declares an auxiliary object
that acquires a given mutex when it is constructed,
and releases that mutex when it is destroyed.
This gives threaded code a satisfying block structure that
is much easier to prove deadlock-free in the presence of exceptions.
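The idea behind such a block-scoped lock can be sketched directly over a pthread mutex. The scoped_lock class below is a hypothetical stand-in written for illustration; it is not the KCC block_mutex<> template and makes no claim about that template's actual interface.

```cpp
#include <cassert>
#include <pthread.h>
#include <stdexcept>

// Hypothetical guard: acquires in the constructor, releases in the destructor.
class scoped_lock {
public:
    explicit scoped_lock(pthread_mutex_t& m) : m_(m) { pthread_mutex_lock(&m_); }
    ~scoped_lock() { pthread_mutex_unlock(&m_); }
private:
    pthread_mutex_t& m_;
    scoped_lock(const scoped_lock&);             // not copyable
    scoped_lock& operator=(const scoped_lock&);  // not assignable
};

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

// Throws on negative input. The guard's destructor runs during stack
// unwinding, so the lock is released on both the normal and the throwing path.
int add_checked(int delta) {
    scoped_lock guard(counter_lock);
    if (delta < 0) throw std::invalid_argument("negative delta");
    counter += delta;
    return counter;
}

// Returns true if the lock is free again after an exception left add_checked.
bool lock_released_after_throw() {
    try { add_checked(-1); } catch (const std::invalid_argument&) {}
    if (pthread_mutex_trylock(&counter_lock) != 0) return false;
    pthread_mutex_unlock(&counter_lock);
    return true;
}
```

Because the release is tied to leaving the block rather than to reaching a particular statement, there is no exit path on which the lock can be forgotten.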
Mutex methods and the <mutex>
header file are subject to change as both standards
and programming practice progress.
Every support library which you use from within concurrently executing threads needs to
be thread-safe. On many systems the header files for these libraries require a
preprocessing macro like -D_REENTRANT
to ensure you are served by thread-safe
entrypoints. KCC defines the appropriate macro
for your platform when you use the --thread_safe
option. But you still must
adapt to using the right entrypoints in these libraries, and to holding locks when you
want compound actions to be indivisible.
On their original course from Multics to Unix, libc and other
support libraries picked up some uniprocessing habits, notably a fondness for relying on static
memory declared within the library to simplify the interfaces to some library functions.
Consider, e.g., the definitions your system's libc may provide for strtok() vs
the "reentrant" form strtok_r(). The strtok function keeps its place in the
string you are working on in a static pointer in the library's scope. Both multithreaded
and recursive use of this function require more than one placeholding pointer be in use
at a time, so strtok_r takes a pointer to this pointer as an extra argument.
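The recursive case is easy to demonstrate. The sketch below assumes a POSIX system that provides strtok_r in <string.h>; parse_pairs is a hypothetical helper that splits "k=v;k=v" text with two tokenizations in flight at once, each with its own placeholder pointer.

```cpp
#include <cassert>
#include <string.h>   // strtok_r is a POSIX function
#include <string>
#include <vector>

// Flattens "a=1;b=2" into the list a, 1, b, 2.
std::vector<std::string> parse_pairs(const std::string& text) {
    // strtok_r writes into its input, so work on a private, NUL-terminated copy.
    std::vector<char> buf(text.begin(), text.end());
    buf.push_back('\0');

    std::vector<std::string> out;
    char* outer = 0;   // placeholder for the ';' scan, owned by the caller
    for (char* pair = strtok_r(&buf[0], ";", &outer); pair != 0;
         pair = strtok_r(0, ";", &outer)) {
        char* inner = 0;   // separate placeholder for the nested '=' scan
        for (char* field = strtok_r(pair, "=", &inner); field != 0;
             field = strtok_r(0, "=", &inner)) {
            out.push_back(field);
        }
    }
    return out;
}
```

With plain strtok the inner scan would overwrite the library's single static pointer and the outer loop would lose its place; two threads tokenizing different strings would collide in the same way.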
This pattern is repeated throughout older libraries: a get_system_data() call returns a pointer to a static struct, which must be read "before the next call". If you go back and check the current documentation you'll often find there is now a get_system_data_r() entrypoint to which you must "Bring Your Own Buffer". Welcome to the parallel universe!
The buffers used for I/O and other "streams" operations are static memory.
The thread-safe versions of libKCC*-ts.* and libc.* lock their uses
of stream buffers with the net effect that individual I/O operations are done atomically.
That is to say that in a mix of calls to C++ iostreams via << and >>,
or from the printf(), scanf() family of standard C, each call has exclusive use of
its I/O buffer until its side effects (on the buffer) are completed.
However, multiple calls (even within the same expression) may become
interleaved with operations on the same buffer done by other threads.
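Given locking at the level of individual calls, one way to keep a multi-part message in one piece is to assemble it in a private buffer and hand it to the shared stream in a single call. A minimal sketch, with the hypothetical helpers format_record and report, and assuming the per-call atomicity described above:

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Builds the whole record in a stream private to the calling thread,
// so no shared buffer is touched while the pieces are assembled.
std::string format_record(int thread_id, int value) {
    std::ostringstream line;
    line << "thread " << thread_id << ": value=" << value << '\n';
    return line.str();
}

// Emits the finished record with one insertion on the shared stream.
// Written as  std::cout << "thread " << id << ...  this would be several
// separate calls, and another thread's output could land between them.
void report(int thread_id, int value) {
    std::cout << format_record(thread_id, value);
}
```

The alternative is to hold an application-level lock around the whole sequence of insertions, which costs more contention but avoids the extra copy.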
This level of locking (the buzzword is "granularity") allows operations on distinct buffers to proceed in parallel. Stay tuned for future developments in parallel I/O, since this design serves as only the lowest level building block in a user-friendly parallel I/O package.
Control over allocation points in the Heap Storage is maintained by static variables in libc.
The fundamental operations of malloc() and free() are usually thread-safe
within the right version of libc on your platform. Constructors for heap-allocated objects
are usually thread-safe because the address of the object has not yet
been returned from new.
Standard Template Library container classes allow high-level compound operations on composite objects. When such objects are shared, concurrent accesses must be mutually exclusive. Since the containers themselves reside in your application's domain, you are responsible for synchronizing access to them. Your application should be able to tell when synchronization is needed, and only you can choose the most appropriate locking discipline for each container.
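The simplest discipline is one lock per shared container, held around every access. The sketch below is only one possible choice; shared_log, append_entry, and run_log_workers are hypothetical names, and the lock is a plain pthread mutex rather than a KCC mutex class.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>
#include <vector>

// A container shared by all threads, plus the one lock that guards it.
static std::vector<int> shared_log;
static pthread_mutex_t shared_log_lock = PTHREAD_MUTEX_INITIALIZER;

// push_back may reallocate and move every element, so the whole call
// must be exclusive. An exception-safe version would use a scoped guard.
static void append_entry(int value) {
    pthread_mutex_lock(&shared_log_lock);
    shared_log.push_back(value);
    pthread_mutex_unlock(&shared_log_lock);
}

extern "C" void* log_worker(void*) {
    for (int i = 0; i < 500; ++i) append_entry(i);
    return 0;
}

// Starts four writers and returns the final size once all have joined.
std::size_t run_log_workers() {
    shared_log.clear();
    pthread_t threads[4];
    for (int t = 0; t < 4; ++t) pthread_create(&threads[t], 0, log_worker, 0);
    for (int t = 0; t < 4; ++t) pthread_join(threads[t], 0);
    return shared_log.size();
}
```

A container that is only read after start-up, or that is private to one thread, needs no lock at all, which is why the choice is left to the application.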
Some early STL implementations of the multiset containers used thread-unsafe algorithms
long ago abandoned by KAI C++. In particular, the bug pointed out in the July 1998 C++ Report
in the red/black trees used to implement the set template class has never
been present in KAI C++.
Before the days of full C++ exception handling, the standard library provided an interim solution
in the form of the set_new_handler(), set_terminate(),
and set_unexpected()
routines. These each set up a given "handler" routine for different exceptional conditions
the program might encounter.
For the purposes of thread-safe programming, these library features should be regarded as anachronisms. The handler routine settings are global variables in the KCC runtime library, shared by all threads. Several alternative definitions for thread-private handlers are possible, but without a standard, you're just writing non-portable code.
Old-fashioned handler routines are suitable only as a way to force global termination in the cases they are assigned to cover. Since every handler routine is globally visible, they are no help in purely local exceptional conditions unless you write an extensively specialized handler routine that can distinguish one thread's local conditions from its siblings'.
Updates to the global handler state are mutually exclusive. So, if their use is inescapable, a conservative policy is to pick one handler for each condition and set it at the beginning of each thread's processing. This setting is redundant on KCC, but on systems that provide thread-private handlers, it forces a known choice of initial value on each thread.
The C++ exception handling mechanism (keywords try, throw, and catch)
offers the proper infrastructure for handling exceptions locally.
Each thread has its own exception handling stack, which operates almost
independently. Exceptions are never thrown between threads.
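Since an exception cannot cross from one thread to another, a thread's start routine has to catch whatever its own work throws and report the outcome some other way. A minimal sketch, with the hypothetical names task, task_result, risky_step, and run_task:

```cpp
#include <cassert>
#include <pthread.h>
#include <stdexcept>
#include <string>

struct task_result {
    bool ok;
    std::string error;
};

struct task {
    int input;
    task_result result;
};

// Stand-in for real work that may fail.
static void risky_step(int input) {
    if (input < 0) throw std::runtime_error("negative input");
}

// Start routine: everything thrown on this thread is caught on this
// thread and converted into data that the joining thread can inspect.
extern "C" void* task_entry(void* arg) {
    task* t = static_cast<task*>(arg);
    try {
        risky_step(t->input);
        t->result.ok = true;
    } catch (const std::exception& e) {
        t->result.ok = false;
        t->result.error = e.what();
    } catch (...) {
        t->result.ok = false;
        t->result.error = "unknown exception";
    }
    return 0;
}

// Runs one task on its own thread and returns its recorded outcome.
task_result run_task(int input) {
    task t;
    t.input = input;
    t.result.ok = false;
    pthread_t id;
    pthread_create(&id, 0, task_entry, &t);
    pthread_join(id, 0);   // join before reading the result
    return t.result;
}
```

The joining thread reads the result only after pthread_join returns, so no extra lock is needed on the task structure.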
For the moment (i.e. KCC releases in the 3.x family, and possibly on into 4.x), there are some minor semantic limits on multithreaded exception handling:
Only one thread at a time can be throwing and catching an exception. A library-level nestable mutex enforces this.
Combining --thread_safe and --no_exceptions is not presently supported by the released runtime libraries.