CH4 Overall Design

Shortcomings of CH3

MPICH has relied on the CH3 device as its primary communication device throughout the "MPICH2" series and part of the "MPICH-3.x" release series. Over time, however, the device has accumulated a number of hacks to accommodate newer communication models and network architectures, stretching it well beyond what it was originally designed to do. Some of the shortcomings of the CH3 design are listed here:

  • VC model: CH3 performs communication in the context of "virtual connections" (VCs), where each peer process has a VC associated with it. This architecture matched networks that relied on a connection-oriented protocol, where VCs were a convenient way to keep track of connection state and other peer-related information. Over time, VCs have accumulated additional fields, not all of which are equally useful. Some of these could be removed to reduce the size of each VC. There has also been some effort to make VC allocation more dynamic, so that VCs are created only for the processes we actually communicate with. However, neither approach solves the fundamental scalability limitation of the VC model: the number of VC structures still grows with the number of peer processes.
  • Netmod API: