Thread Safety

From Mpich
Revision as of 17:31, 18 November 2012 by Balaji

Published papers refer to the following links. Rather than duplicate the information on the wiki, we will refer you to the canonical pages:

Explanation of Categories in the table of Thread Safety Needs of MPI Routines

  • Comm/IO: The routine needs to access the communication or I/O system in a thread-safe way. This is a very coarse-grained category but is sufficient to provide thread safety. In other words, an implementation may (and probably should) use finer-grained controls within this category.
  • Collective: Collective routines require that the user not call collectives on the same communicator in different threads in a way that may make the order of invocation depend on thread timing (a race). A production MPI need not separately lock around the collective routines, but a debug version may want to detect races. The communication part of the collective routine is assumed to be handled separately through the communication thread locks.
  • Access Only: Access fixed data for an MPI object, such as the size of a communicator. This differs from the "None" case because an erroneous MPI program could free the object in a race with a routine that accesses the read-only data. A production MPI implementation need not guard this routine against changes in another thread. This category may also include replacing a value within a routine, such as setting the name of a communicator.
  • Update Ref: Update the reference count of an MPI object only. Typically used by a routine that returns a reference to an internal object, such as an errhandler or datatype.
  • Read List: Return an element from a list of items, such as an attribute or info value. A correct MPI program will not contain any race that might update or delete the entry that is being read. This allows the implementation to use a lock-free, thread-safe set of list update and access operations in the production version; a debug version can attempt to detect improper race conditions.
  • Update List: Update a list of items that may also be read (see the Read List entry). Multiple threads are allowed to update the list simultaneously, so the update implementation must be thread safe.
  • Allocate: Allocate an MPI object (may also need memory allocation such as malloc).
  • Own: The routine has its own thread-safety management. Examples are "global" state such as bsend buffers.
  • None: The routine has no thread-safety issues, or it has none in correct programs and must have low overhead, so an optimized (non-debug) version need not check for race conditions.
  • Other: Special cases, such as MPI_Abort and MPI_Finalize.
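
As an illustration of the "Update Ref" category, the following is a minimal sketch in C11 of a reference count updated with atomic operations, so that no lock around the whole object is needed. The names demo_object, demo_object_add_ref, demo_object_release and refcount_demo are hypothetical and are not taken from MPICH; an actual implementation may manage reference counts quite differently.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_MAX_THREADS 16

/* Hypothetical stand-in for an internal MPI object such as a datatype. */
typedef struct {
    atomic_int ref_count;
} demo_object;

/* "Update Ref": the only shared state touched is the reference count. */
static void demo_object_add_ref(demo_object *obj)
{
    atomic_fetch_add_explicit(&obj->ref_count, 1, memory_order_relaxed);
}

/* Returns 1 if the caller dropped the last reference, 0 otherwise. */
static int demo_object_release(demo_object *obj)
{
    return atomic_fetch_sub_explicit(&obj->ref_count, 1,
                                     memory_order_acq_rel) == 1;
}

typedef struct {
    demo_object *obj;
    int iterations;
} worker_args;

/* Each worker repeatedly takes and drops one reference. */
static void *worker(void *p)
{
    worker_args *args = p;
    for (int i = 0; i < args->iterations; i++) {
        demo_object_add_ref(args->obj);
        demo_object_release(args->obj);
    }
    return NULL;
}

/* Runs nthreads workers concurrently and returns the final count. */
int refcount_demo(int nthreads, int iterations)
{
    demo_object obj;
    pthread_t threads[DEMO_MAX_THREADS];
    worker_args args = { &obj, iterations };

    assert(nthreads >= 0 && nthreads <= DEMO_MAX_THREADS);
    atomic_init(&obj.ref_count, 1);   /* the creator holds one reference */

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, &args);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);

    return atomic_load(&obj.ref_count);
}
```

Because the creator keeps its own reference for the whole run, the count never reaches zero while the workers are active, and every add is matched by a release, so refcount_demo returns 1 regardless of how the threads interleave.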

Some routines fall into multiple categories; only the "major" category is marked. For example, the communicator creation routines must all access the global state for the context id, so they are marked in the "Own" category, although the "Collective" and "Access Only" categories are also relevant to them.
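
The "Own" category can be pictured with a small sketch in which a piece of global state carries its own dedicated lock, separate from any communication lock. The example below assumes a simple bitmask of context ids guarded by a mutex; the names demo_alloc_context_id and demo_free_context_id and the 64-id limit are hypothetical, and this is not how MPICH actually allocates context ids.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define DEMO_MAX_CONTEXT_IDS 64

/* Global state with its own lock: bit i set means id i is in use. */
static uint64_t context_mask = 0;
static pthread_mutex_t context_lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns the lowest free context id, or -1 if all ids are in use. */
int demo_alloc_context_id(void)
{
    int id = -1;

    pthread_mutex_lock(&context_lock);
    for (int i = 0; i < DEMO_MAX_CONTEXT_IDS; i++) {
        uint64_t bit = UINT64_C(1) << i;
        if (!(context_mask & bit)) {
            context_mask |= bit;
            id = i;
            break;
        }
    }
    pthread_mutex_unlock(&context_lock);
    return id;
}

/* Marks a previously allocated id as free again. */
void demo_free_context_id(int id)
{
    assert(id >= 0 && id < DEMO_MAX_CONTEXT_IDS);

    pthread_mutex_lock(&context_lock);
    context_mask &= ~(UINT64_C(1) << id);
    pthread_mutex_unlock(&context_lock);
}
```

A routine in this category takes only the lock for the state it owns, so callers do not need to hold a coarser Comm/IO lock just to obtain an id.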