Internal Instrumentation

From Mpich
Revision as of 23:07, 7 March 2011 by Goodell (talk | contribs) (Internal Instrumentation for MPICH2)

This text is out-of-date but is provided as a starting point for discussions. The major update needed is to make this interface compatible with the MPIT interface, which (currently) defines a handle to be passed to the routines that access or update performance information.

Internal Instrumentation for MPICH2

To understand and tune the performance of MPICH2, we need a uniform way to instrument the MPICH2 code and report the results. This section suggests an approach similar to the one used for adding debug messages, Debug Event Logging.


The design is based on the following requirements:

  1. Low to zero overhead for all operations that may be in a critical path.
    1. Compile-time selection for no overhead in the production version. That is, it must be possible to build MPICH2 with no instrumentation at all.
    2. Run-time selection with low overhead. This allows the inclusion of instrumentation in the "typical" builds. Run-time selection must also allow the instrumentation to be turned on and off in response to a number of events, including explicit control and automatic controls such as limits on the amount of data.
    3. Thread-safe as an option (see below).
  2. Simple instrumentation of the common cases. This both encourages the inclusion of instrumentation and ensures that the presence of instrumentation does not harm the readability or maintainability of the code.
  3. Easy method for adding or changing instrumentation.
  4. Modularity for the instrumentation (definitions must be local to the module that requires them)
  5. Easy hook for adding performance callbacks (but without adding overhead when callbacks are not required).
  6. Compatible with the proposed MPIT tool interface in MPI-3.

The requirement for compile-time selection implies that macros be used for any operations that may be in a performance-critical path.
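The compile-time and run-time selection requirements could be combined roughly as follows. This is only a sketch: DEMO_USE_INSTR, demo_instr_enabled, DEMO_INSTR_COUNT, DEMO_INSTR_SET_ENABLED, and demo_send are hypothetical names chosen for illustration and are not MPICH2 identifiers.

```c
/* Hypothetical names throughout; none of these are MPICH2 identifiers. */
#ifndef DEMO_USE_INSTR
#define DEMO_USE_INSTR 1   /* build with -DDEMO_USE_INSTR=0 to compile out all instrumentation */
#endif

#if DEMO_USE_INSTR
/* Run-time switch: one integer test on the instrumented path. */
static int demo_instr_enabled = 1;
#define DEMO_INSTR_COUNT(var, amount) \
    do { if (demo_instr_enabled) (var) += (amount); } while (0)
#define DEMO_INSTR_SET_ENABLED(flag) \
    do { demo_instr_enabled = (flag); } while (0)
#else
/* Production build: the macros expand to empty statements. */
#define DEMO_INSTR_COUNT(var, amount)  do { } while (0)
#define DEMO_INSTR_SET_ENABLED(flag)   do { } while (0)
#endif

static long demo_bytes_sent = 0;

/* Stand-in for a routine on a performance-critical path. */
static void demo_send(long nbytes)
{
    DEMO_INSTR_COUNT(demo_bytes_sent, nbytes);
    /* ... the actual send would go here ... */
}
```

With the default setting, each instrumented call costs one branch on demo_instr_enabled. With DEMO_USE_INSTR set to 0, the macro arguments are never evaluated, so no instrumentation code is generated.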

The requirement for compatibility with MPIT suggests that the macros take a handle that specifies the counter. The handle can be implemented as a pointer to the variable to update, or as a structure containing that pointer.
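A minimal sketch of the structure form of such a handle is shown here. The type demo_instr_handle_t and the DEMO_INSTR_HANDLE_* macros are assumptions made for illustration; the actual handle layout would have to follow whatever MPIT finally specifies.

```c
/* Hypothetical handle: a structure containing the pointer to the
   variable to update, plus a text description. */
typedef struct {
    long       *counter;   /* variable updated through the handle */
    const char *desc;      /* human-readable description */
} demo_instr_handle_t;

/* Bind the handle to a counter variable. */
#define DEMO_INSTR_HANDLE_INIT(h, varptr, text) \
    do { (h).counter = (varptr); (h).desc = (text); } while (0)

/* Update the counter through the handle. */
#define DEMO_INSTR_HANDLE_INCR(h, amount) \
    do { *(h).counter += (amount); } while (0)

/* Read the current value through the handle. */
#define DEMO_INSTR_HANDLE_READ(h) (*(h).counter)
```

Because the update goes through the handle rather than a global name, the same macros can serve both internal instrumentation and a tool that obtained the handle through another interface.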

Thread safety can introduce significant overheads that may be unnecessary for the purpose of the interface, which is tuning MPICH2. That is, in some cases, the extra overhead of ensuring thread safety may make the data less valuable than data that may have some errors (e.g., missing updates) due to thread races. Thus, the interface should allow the developer to make that tradeoff.
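One way to expose that tradeoff is a compile-time switch between an atomic update and a plain one. The sketch below assumes a C11 compiler with <stdatomic.h>; DEMO_INSTR_THREAD_SAFE, demo_counter_t, the DEMO_COUNTER_* macros, and demo_unexpected_msgs are hypothetical names, not part of MPICH2.

```c
/* Hypothetical switch: 1 = atomic updates, 0 = plain updates that may
   lose increments when threads race. */
#ifndef DEMO_INSTR_THREAD_SAFE
#define DEMO_INSTR_THREAD_SAFE 1
#endif

#if DEMO_INSTR_THREAD_SAFE
#include <stdatomic.h>
typedef atomic_long demo_counter_t;
/* Relaxed ordering: only the counter itself needs to be consistent. */
#define DEMO_COUNTER_INCR(c, amount) \
    atomic_fetch_add_explicit(&(c), (amount), memory_order_relaxed)
#define DEMO_COUNTER_READ(c) \
    atomic_load_explicit(&(c), memory_order_relaxed)
#else
typedef long demo_counter_t;
#define DEMO_COUNTER_INCR(c, amount) ((c) += (amount))
#define DEMO_COUNTER_READ(c)         (c)
#endif

/* Static storage duration, so the counter starts at zero. */
static demo_counter_t demo_unexpected_msgs;
```

The calling code is identical in both modes, so the developer can choose per build whether exact counts are worth the cost of the atomic operation.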

Possible Design

  • MPIU_INSTR_DURATION_DECL(handle) - Declare an instrumentation handle
  • MPIU_INSTR_DURATION_INIT(handle,ncounter,description) - Initialize a named duration and provide a text description
  • MPIU_INSTR_DURATION_START(handle) - Begin a timing "epoch" for handle
  • MPIU_INSTR_DURATION_END(handle) - End a timing "epoch" for handle and increment the time in the duration by the time since the corresponding start.
  • MPIU_INSTR_DURATION_INCR(handle,index,amount) - Increment the index'th counter in the named duration by amount

The description field is used to create the code that writes out the summary. Combined with the extractstrings script, this allows instrumentation to be added in a single location.
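To make the proposed macros concrete, the following single-threaded sketch shows one way they might expand and be used. The structure demo_duration_t, the bound DEMO_MAX_COUNTERS, the clock()-based time source demo_now, and the routines demo_pack and demo_report are all assumptions made for illustration; a real implementation would use the MPICH2 timer and generated reporting code.

```c
#include <stdio.h>
#include <time.h>

#define DEMO_MAX_COUNTERS 4   /* hypothetical fixed bound on counters per duration */

/* Hypothetical layout of a duration handle. */
typedef struct {
    const char *desc;                       /* text description */
    int         ncounter;                   /* number of counters in use */
    long        nepoch;                     /* completed START/END pairs */
    double      start;                      /* time of the last START */
    double      total;                      /* accumulated seconds */
    long        counter[DEMO_MAX_COUNTERS];
} demo_duration_t;

/* Stand-in time source (CPU time in seconds). */
static double demo_now(void) { return (double)clock() / CLOCKS_PER_SEC; }

#define MPIU_INSTR_DURATION_DECL(handle) static demo_duration_t handle
#define MPIU_INSTR_DURATION_INIT(handle, ncount, description) \
    do { int i_; \
         (handle).desc = (description); (handle).ncounter = (ncount); \
         (handle).nepoch = 0; (handle).start = 0.0; (handle).total = 0.0; \
         for (i_ = 0; i_ < DEMO_MAX_COUNTERS; i_++) (handle).counter[i_] = 0; \
    } while (0)
#define MPIU_INSTR_DURATION_START(handle) \
    do { (handle).start = demo_now(); } while (0)
#define MPIU_INSTR_DURATION_END(handle) \
    do { (handle).total += demo_now() - (handle).start; (handle).nepoch++; } while (0)
#define MPIU_INSTR_DURATION_INCR(handle, index, amount) \
    do { (handle).counter[(index)] += (amount); } while (0)

MPIU_INSTR_DURATION_DECL(demo_pack_time);

/* Stand-in for an instrumented routine; counter 0 holds bytes packed. */
static void demo_pack(long nbytes)
{
    MPIU_INSTR_DURATION_START(demo_pack_time);
    /* ... the work being timed would go here ... */
    MPIU_INSTR_DURATION_INCR(demo_pack_time, 0, nbytes);
    MPIU_INSTR_DURATION_END(demo_pack_time);
}

/* Hand-written summary; the design would generate this from the descriptions. */
static void demo_report(const demo_duration_t *d)
{
    int i;
    printf("%s: %ld epochs, %.6f s\n", d->desc, d->nepoch, d->total);
    for (i = 0; i < d->ncounter; i++)
        printf("  counter[%d] = %ld\n", i, d->counter[i]);
}
```

A caller would run MPIU_INSTR_DURATION_INIT(demo_pack_time, 1, "Time spent packing") once at startup and demo_report(&demo_pack_time) at shutdown.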

A sample implementation for the single-threaded case might be:

/* This simple form keeps a single value per name, so index is unused. */
#define MPIU_INSTR_DURATION_INCR(name,index,amount) \
    (MPIU_INSTRUM[MPIU_INSTRUM_##name].val += (amount))

A script, similar to the extractstates script, would determine the size of the array and define the various MPIU_INSTRUM_name values. A more complex version could be:

/* Also tracks the update count and the largest and smallest amounts. */
#define MPIU_INSTR_DURATION_INCR(name,amount) \
do { MPIU_Instrum_t *_p = MPIU_INSTRUM + MPIU_INSTRUM_##name; \
  _p->val += (amount); _p->count++; \
  if ((amount) > _p->max) _p->max = (amount); \
  if ((amount) < _p->min) _p->min = (amount); } while (0)
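The min/max version only works if each entry is initialized so that the first update replaces both extremes. The sketch below shows one possible setup. The field types of MPIU_Instrum_t, the enum that stands in for the script-generated MPIU_INSTRUM_name values, the entry name msgsize, and the routine demo_instrum_init are all assumptions for illustration.

```c
#include <float.h>

/* Hypothetical layout; the text does not define MPIU_Instrum_t. */
typedef struct {
    double val;     /* running sum of all amounts */
    long   count;   /* number of updates */
    double max;     /* largest amount seen */
    double min;     /* smallest amount seen */
} MPIU_Instrum_t;

/* Stand-in for the values a script would generate. */
enum { MPIU_INSTRUM_msgsize, MPIU_INSTRUM_NUM };

static MPIU_Instrum_t MPIU_INSTRUM[MPIU_INSTRUM_NUM];

/* Start min high and max low so the first update sets both. */
static void demo_instrum_init(void)
{
    int i;
    for (i = 0; i < MPIU_INSTRUM_NUM; i++) {
        MPIU_INSTRUM[i].val   = 0.0;
        MPIU_INSTRUM[i].count = 0;
        MPIU_INSTRUM[i].max   = -DBL_MAX;
        MPIU_INSTRUM[i].min   = DBL_MAX;
    }
}

/* Variant that evaluates amount once by copying it to a local. */
#define MPIU_INSTR_DURATION_INCR(name, amount) \
    do { MPIU_Instrum_t *_p = MPIU_INSTRUM + MPIU_INSTRUM_##name; \
         double _a = (amount); \
         _p->val += _a; _p->count++; \
         if (_a > _p->max) _p->max = _a; \
         if (_a < _p->min) _p->min = _a; } while (0)
```

Copying amount into a local avoids evaluating an argument with side effects more than once, at the cost of fixing the counter type to double in this sketch.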

Next steps: Determine whether these macros are adequate for the needed instrumentation. Note that the code that handles initialization and finalization is generated by reading the source code, in the same manner as extractstrings.


The original design document included mechanisms to instrument important internal states. This information is in the file stat.tex in the archived MPICH2 document (in /home/MPI). However, while designed and documented, it was not used in the initial implementation.

The original design has limitations; since that original design, there have been published papers on instrumentation of MPI, including one at IEEE Cluster 2006.