BGQ

Revision as of 22:05, 12 December 2013 by Robl (talk | contribs) (How to migrate commits from mpich master to Blue Gene /Q release branch)

This page describes how to build mpich from the master branch of the mpich.git repository on git.mpich.org.

Blue Gene/Q build instructions

The bgq toolchain must be in the $PATH before configure is run; otherwise the GNU cross compiler will not be found.

export PATH=$PATH:/bgsys/drivers/V1R2M0/ppc64/gnu-linux/bin

The alternative is to provide the location of all compiler binaries using environment variables. For example,

CC=/bgsys/drivers/V1R2M0/ppc64/gnu-linux/bin/powerpc64-bgq-linux-gcc

Configuring is simpler when the toolchain is in the $PATH; however, the generated MPI compile scripts, such as ${prefix}/bin/mpicc, will then not contain the path to the cross compiler. Users of mpicc must therefore also have the cross compiler in their $PATH for the compile script to work.

If one wishes to build MPICH with the XL compilers, use the environment variable approach.
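
As a quick check before running configure, the following sketch looks for the GNU cross compiler first in the $PATH and then in a given toolchain directory, which indicates which of the two approaches is usable. The function name find_bgq_gcc is hypothetical; only the compiler name and the toolchain path come from this page.

```shell
# Sketch: locate the Blue Gene/Q GNU cross compiler before running configure.
# find_bgq_gcc is a hypothetical helper, not part of mpich or the bgq driver.
# usage: find_bgq_gcc TOOLCHAIN_BIN_DIR
find_bgq_gcc() {
    toolchain_bin=$1
    if command -v powerpc64-bgq-linux-gcc >/dev/null 2>&1; then
        # Already in $PATH: the plain "export PATH=..." approach works.
        command -v powerpc64-bgq-linux-gcc
    elif [ -x "$toolchain_bin/powerpc64-bgq-linux-gcc" ]; then
        # Not in $PATH, but usable through CC=<absolute path>.
        echo "$toolchain_bin/powerpc64-bgq-linux-gcc"
    else
        echo "cross compiler not found in \$PATH or in $toolchain_bin" >&2
        return 1
    fi
}

# Example (the driver level V1R2M0 is the one used on this page):
#   CC=$(find_bgq_gcc /bgsys/drivers/V1R2M0/ppc64/gnu-linux/bin) && export CC
```

Exporting CC this way bakes the absolute compiler path into the generated mpicc, so its users do not need the toolchain in their own $PATH.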

Required

Specify the bgq cross-compile host and the pamid device.

--host=powerpc64-bgq-linux
--with-device=pamid

Customize the ROMIO file system.

--with-file-system=bg+bglockless
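
Put together, the required options give a minimal configure line. The sketch below only prints the command so it can be reviewed first; the source and install locations are the example paths used in the testsuite section of this page, and the helper name bgq_configure_cmd is hypothetical.

```shell
# Sketch: print a minimal Blue Gene/Q configure command.
# MPICH_SRC and PREFIX are example locations; adjust them to your checkout.
MPICH_SRC=${MPICH_SRC:-/home/johndoe/mpich}
PREFIX=${PREFIX:-/home/johndoe/mpich/install}

# usage: bgq_configure_cmd [extra configure options...]
bgq_configure_cmd() {
    echo "$MPICH_SRC/configure" "--prefix=$PREFIX" \
        --host=powerpc64-bgq-linux \
        --with-device=pamid \
        --with-file-system=bg+bglockless "$@"
}

bgq_configure_cmd
```

Any of the optional settings described on this page can be appended as arguments, for example bgq_configure_cmd --disable-wrapper-rpath.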

Optional

Customize the required bgq system software libraries

The latest installed bgq system software is used by default. The location of the bgq system software can also be specified with the configure option below or the BGQ_INSTALL_DIR environment variable.

--with-bgq-install-dir=/bgsys/drivers/V1R2M0/ppc64

A pami installation outside of the bgq system software directory may be specified using the --with-pami configure option(s). For example:

--with-pami=/bgsys/drivers/V1R2M0/ppc64/comm/sys
--with-pami-include=/bgsys/drivers/V1R2M0/ppc64/comm/sys/include
--with-pami-lib=/bgsys/drivers/V1R2M0/ppc64/comm/sys/lib

Customize the bgq cross compile settings

A different cross-compile settings file for bgq pamid can be specified with the --with-cross-file configure option. The default cross file for a bgq pamid configuration is shown below.

--with-cross-file=src/mpid/pamid/cross/bgq8

Disable rpath

When shared libraries are installed, it is recommended to also disable the "wrapper rpath" configure option to take advantage of a shared library load optimization on the bgq io nodes.

--disable-wrapper-rpath

When a million processes each read from the filesystem individually, shared library load performance is poor. The io node shared library optimization "stages" shared libraries in a ramfs directory on the bgq io node that, in the absence of rpath information, is searched first by the bgq loader. Any rpath directories are searched before this io node ramfs location, which results in a query all the way down to the filesystem.

Shared libraries can be added to the io node ramfs directory by packaging the libraries into a *.tar.gz file and copying that file into the /bgsys/linux/bgfs directory.
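
The packaging step could be scripted roughly as follows. This is a sketch under assumptions: the helper name stage_shared_libs and the default archive name are hypothetical, and this page does not describe the layout expected inside the archive, so the sketch simply stores the libraries at the top level.

```shell
# Sketch: package shared libraries into a .tar.gz and copy it to the
# staging directory. Helper name, archive name and layout are assumptions.
# usage: stage_shared_libs LIBDIR [DEST [ARCHIVE_NAME]]
stage_shared_libs() {
    libdir=$1
    dest=${2:-/bgsys/linux/bgfs}
    name=${3:-mpich-libs.tar.gz}
    workdir=$(mktemp -d) || return 1

    # Archive only the shared objects found directly in LIBDIR.
    if ! ( cd "$libdir" && tar -czf "$workdir/$name" ./*.so* ); then
        rm -rf "$workdir"
        return 1
    fi

    cp "$workdir/$name" "$dest/"
    status=$?
    rm -rf "$workdir"
    return $status
}

# Example: stage_shared_libs /home/johndoe/mpich/install/lib
```

Building the archive in a scratch directory and copying it afterwards avoids leaving a partial file in the staging directory if tar fails.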

Enable common "no debug" and "performance" options

The xl.ndebug and xl.legacy.ndebug mpich versions installed with the bgq system software use the following options to eliminate debug and other error checks that would degrade performance.

--enable-fast=nochkmsg,notiming,O3
--with-assert-level=0
--disable-error-messages
--disable-debuginfo

Enable fine grain locking

The gcc, xl, and xl.ndebug mpich versions installed with the bgq system software use the following options to enable fine grain locking and synchronous progress mode.

--enable-thread-cs=per-object
--with-atomic-primitives
--enable-handle-allocation=tls
--enable-refcount=lock-free
--disable-predefined-refcount

Blue Gene/Q mpich testsuite instructions

From a filesystem location that is accessible to the Blue Gene/Q io nodes, for example /bgusr/johndoe, invoke the configure script in the test/mpi directory of the mpich source.

$ /home/johndoe/mpich/test/mpi/configure --srcdir=/home/johndoe/mpich/test/mpi --disable-spawn --with-mpi=/home/johndoe/mpich/install

The --srcdir configure option specifies the location of the testsuite source, the --with-mpi configure option specifies which mpi installation to use when compiling the tests, and the --disable-spawn configure option is required on Blue Gene/Q to skip unsupported functions.

Once configured, the tests can be compiled and executed using the make testing makefile rule. Certain make variables must be set depending on how the jobs are to be launched on a Blue Gene/Q system.

runjob

Before testing with runjob, which launches the jobs directly on a Blue Gene/Q system, the compute block must be allocated. Typically this is done using the bg_console command shell. For more information on bg_console see section "Creating and booting I/O blocks and compute blocks" in the IBM System Blue Gene Solution: Blue Gene/Q System Administration redbook.

To begin testing, change to the directory where the configure command was run (/bgusr/johndoe in this example) and invoke the following command:

 make testing MPITEST_PROGRAM_WRAPPER=" --block R00-M1-N06 : " MPIEXEC=runjob

The MPIEXEC variable is needed to specify the job launch mechanism, which on Blue Gene/Q is the runjob command. For more information on the runjob command see chapter 6, "Submitting jobs" in the IBM System Blue Gene Solution: Blue Gene/Q System Administration redbook.

The MPITEST_PROGRAM_WRAPPER variable is needed to supply additional information to the runjob command. This "wrapper" text is inserted after the $MPIEXEC command and its arguments, such as the number of processes in the job, and before the name of the test binary to launch. At a minimum the runjob command needs to have the compute block specified and the ':' separator character specified. Other runjob options can be specified as well, such as --timeout, although these are not required to launch the job.
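
To make the composition of the wrapper text concrete, the sketch below assembles the make command from a block name and any extra runjob options, always ending the wrapper with the ':' separator. The helper name bgq_testing_cmd is hypothetical, and the timeout value in the example is an arbitrary illustration.

```shell
# Sketch: build the "make testing" command line for runjob.
# bgq_testing_cmd is a hypothetical helper; it only prints the command.
# usage: bgq_testing_cmd BLOCK [extra runjob options...]
bgq_testing_cmd() {
    block=$1
    shift
    wrapper=" --block $block"
    for opt in "$@"; do
        wrapper="$wrapper $opt"      # e.g. --timeout 300
    done
    wrapper="$wrapper : "            # the ':' separator must come last
    printf 'make testing MPITEST_PROGRAM_WRAPPER="%s" MPIEXEC=runjob\n' "$wrapper"
}

bgq_testing_cmd R00-M1-N06
# With an extra runjob option (the value 300 is illustrative only):
#   bgq_testing_cmd R00-M1-N06 --timeout 300
```

Run the printed command from the directory where the testsuite was configured.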

Blue Gene/Q development instructions

The product release branches in the mpich-ibm.git git repository are based on the mpich2 1.5 release, and for esoteric historical reasons, the code in the repository is located in an mpich2 subdirectory that does not exist in the original mpich source. This extra directory makes a simple git cherry-pick of a commit from a Blue Gene/Q release branch onto another mpich branch challenging.

How to migrate commits from a previous Blue Gene/Q release branch

Use `git format-patch` to create patch files for each commit

For example:

% git checkout BGQ/IBM_V1R2M0
Checking out files: 100% (8574/8574), done.
Branch BGQ/IBM_V1R2M0 set up to track remote branch BGQ/IBM_V1R2M0 from origin.
Switched to a new branch 'BGQ/IBM_V1R2M0'

% git format-patch HEAD~4
0001-CPS-92XKPE-remove-fortran-interface-for-MPIX_Pset_io.patch
0002-CPS-92XKPE-Do-not-use-the-MPIX_Pset_io_node-function.patch
0003-CPS-97VH5U-do-not-disable-short-synchronous-sends.patch
0004-CPS-97RGJN-PAMID-only-fix-for-multi-threaded-MPI_Ibs.patch

Use `git am` to apply each commit

You may need to edit the commit message into an acceptable format using `git commit --amend`.

  • If the patch contains the leading `mpich2/` directory, this directory must be stripped as the patch is applied, using the `-p2` option; for example:
% git am -p2 0001-CPS-92XKPE-remove-fortran-interface-for-MPIX_Pset_io.patch
  • The "summary" line of the commit message must not contain any IBM "breadcrumbs" such as "Issue 1234", "CPS WXYZ", or "D12345". These breadcrumbs need to be moved to the body of the commit message and prepended with the "(ibm)" namespace. It is good form to add the original commit as well. This helps when tracing the history of the code change via gitweb, etc. For example,
% git log -n1 b68401e3c6ba3bbd2cc0626dac1604242a20f989 # The original commit to be migrated
commit b68401e3c6ba3bbd2cc0626dac1604242a20f989
Author: Michael Blocksome <blocksom@us.ibm.com>
Date:   Mon Apr 29 13:14:35 2013 -0500

    CPS 92XKPE: remove fortran interface for MPIX_Pset_io_node()
    
    The MPIX_Pset_io_node() function has been deprecated.

% git commit --amend
% git log -n1
commit ee5e30e4ed5cddd10e2ebf72087174284d8d590b
Author: Michael Blocksome <blocksom@us.ibm.com>
Date:   Mon Apr 29 13:14:35 2013 -0500

    Remove fortran interface for MPIX_Pset_io_node()
    
    The MPIX_Pset_io_node() function has been deprecated.

    (ibm) CPS 92XKPE
    (ibm) b68401e3c6ba3bbd2cc0626dac1604242a20f989
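
The message rewrite shown above can be partly automated. The following is a minimal sketch, assuming the breadcrumb appears as a "CPS WXYZ:", "Issue 1234:" or "D12345:" prefix on the summary line; the function name rewrite_ibm_message is hypothetical, and messages in any other shape still need manual editing.

```shell
# Sketch: move a leading IBM breadcrumb from the summary line into the body
# and append the original commit id. Reads the message on stdin.
# usage: rewrite_ibm_message ORIGINAL_COMMIT_ID
rewrite_ibm_message() {
    awk -v orig="$1" '
        NR == 1 {
            crumb = ""
            if (match($0, /^(CPS [A-Z0-9]+|Issue [0-9]+|D[0-9]+): */)) {
                len = RLENGTH
                crumb = substr($0, 1, len)
                sub(/: *$/, "", crumb)          # drop the trailing colon
                $0 = substr($0, len + 1)
            }
            # Capitalise the first letter of the remaining summary.
            print toupper(substr($0, 1, 1)) substr($0, 2)
            next
        }
        { print }
        END {
            print ""
            if (crumb != "") print "(ibm) " crumb
            print "(ibm) " orig
        }
    '
}

# Example, run right after "git am" has applied the patch:
#   git log -n1 --format=%B | rewrite_ibm_message b68401e3c6ba3bbd2cc0626dac1604242a20f989 \
#       | git commit --amend -F -
```

Review the result with `git log -n1` before pushing, since the pattern only covers the breadcrumb forms listed above.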

Use `git apply` to repair any merge conflicts

If `git am` fails to apply a patch, it must be applied manually. The `git am` command places the current patch in the `.git/rebase-apply/` directory in a file named `0001`. The `git apply` command must be used on this patch to create "reject" files that can be used to manually repair the files:

% git apply .git/rebase-apply/0001 -p2 --reject
% # edit edit edit
% git add FIXED_FILES
% git am --resolved


How to migrate commits from mpich master to a Blue Gene/Q release branch

The process is much the same as above:

  • Use `git format-patch` to get the changes in question.
  • Use `git am` to apply the changes, but add the `--directory=mpich2` flag, because the Blue Gene/Q release branch tree lives one directory lower. The `-p2` flag is not needed.
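
Both directions can be tried safely in throwaway repositories before touching a real branch. The sketch below is such a dry run: the repository names, file names, and commit messages are invented for illustration, and only the `-p2` and `--directory=mpich2` usage mirrors the instructions on this page.

```shell
# Sketch: exercise "git am -p2" and "git am --directory=mpich2" in
# scratch repositories. All names below are made up for the demo.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# "release" mimics the release-branch layout: sources under mpich2/.
git init -q release
mkdir -p release/mpich2/src
echo 'int a;' > release/mpich2/src/file.c
git -C release add .
git -C release commit -q -m 'initial import'

# "master" mimics the mpich master layout: no mpich2/ prefix.
git init -q master
mkdir -p master/src
echo 'int a;' > master/src/file.c
git -C master add .
git -C master commit -q -m 'initial import'

# release -> master: -p2 strips the leading "a/mpich2/" from patch paths.
echo 'int b;' >> release/mpich2/src/file.c
git -C release commit -q -am 'Add b'
git -C release format-patch -q -1 -o "$work/to-master"
git -C master am -q -p2 "$work"/to-master/*.patch

# master -> release: --directory=mpich2 prepends the missing directory.
echo 'int c;' >> master/src/file.c
git -C master commit -q -am 'Add c'
git -C master format-patch -q -1 -o "$work/to-release"
git -C release am -q --directory=mpich2 "$work"/to-release/*.patch

cat "$work/release/mpich2/src/file.c"
```

After the run, both trees contain the same three declarations, which confirms that each patch landed at the right path.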