mpirun, a Portable Startup Script



Each parallel computing environment provides some mechanism for starting
parallel programs. Unfortunately, these mechanisms are very different from
one another. In an effort to make this aspect of parallel programming
portable as well, mpich contains a script called mpirun. This
script is partially customized during the configuration process when mpich is
built. Therefore, the actual "source" for mpirun is in the file
mpirun.sh.in in the mpich/bin directory. The most common
invocation of mpirun just specifies the number of processes and the
program to run:
mpirun -np 4 cpi
The complete list of options for mpirun is obtained by running
mpirun -help
This is the result:
mpirun [mpirun_options...] <progname> [options...]
  mpirun_options:
    -arch <architecture>
            specify the architecture (must have matching machines.<arch>
            file in /home/MPI/mpich/bin/machines) if using the execer
    -h      This help
    -machine <machine name>
            use startup procedure for <machine name>
            Currently supported:
              chameleon
              meiko
              paragon
              p4
              sp1
              ibmspx
              anlspx
              ksr
              sgi_mp
              ipsc860
              inteldelta
              cray_t3d
              execer
              smp
              symm_ptx
    -machinefile <machine-file name>
            Take the list of possible machines to run on from the
            file <machine-file name>. This is a list of all available
            machines; use -np <np> to request a specific number of machines.
    -np <np>
            specify the number of processors to run on
    -nolocal
            don't run on the local machine (only works for
            p4 and ch_p4 jobs)
    -stdin filename
            Use filename as the standard input for the program. This
            is needed for programs that must be run as batch jobs, such
            as some IBM SP systems and Intel Paragons using NQS (see
            -paragontype below).
    -t      Testing - do not actually run, just print what would be
            executed
    -v      Verbose - throw in some comments
    -dbx    Start the first process under dbx where possible
    -gdb    Start the first process under gdb where possible
            (on the Meiko, selecting either -dbx or -gdb starts prun
            under totalview instead)
    -xxgdb  Start the first process under xxgdb where possible (-xdbx
            does not work)
    -tv     Start under totalview
    -ksq    Keep the send queue. This is useful if you expect later
            to attach totalview to the running (or deadlocked) job, and
            want to see the send queues. (Normally they are not maintained
            in a way which is visible to the debugger).
  Special Options for NEC - CENJU-3:
    -batch  Execute program as a batch job (using cjbr)
    -stdout filename
            Use filename as the standard output for the program.
    -stderr filename
            Use filename as the standard error for the program.
    -jid    Jobid from Job-Scheduler EASY.
            If this option is specified, mpirun directly executes
            the parallel program using this jobid.
            Otherwise, mpirun requests np nodes from the Job-Scheduler
            in interactive or batch mode.
            In interactive mode (i.e. option -batch is not
            specified), mpirun waits until the processors are
            allocated, executes the parallel program and
            releases the processors.
  Special Options for Nexus device:
    -nexuspg filename
            Use the given Nexus startup file instead of creating one.
            Overrides -np and -nolocal, selects -leave_pg.
    -nexusdb filename
            Use the given Nexus resource database.
  Special Options for Workstation Clusters:
    -e      Use execer to start the program on workstation
            clusters
    -pg     Use a procgroup file to start the p4 programs, not execer
            (default)
    -leave_pg
            Don't delete the P4 procgroup file after running
    -p4pg filename
            Use the given p4 procgroup file instead of creating one.
            Overrides -np and -nolocal, selects -leave_pg.
    -tcppg filename
            Use the given tcp procgroup file instead of creating one.
            Overrides -np and -nolocal, selects -leave_pg.
    -p4ssport num
            Use the p4 secure server with port number num to start the
            programs. If num is 0, use the value of the
            environment variable MPI_P4SSPORT. Using the server can
            speed up process startup. If MPI_USEP4SSPORT as well as
            MPI_P4SSPORT are set, then that has the effect of giving
            mpirun the -p4ssport 0 parameters.
  Special Options for Batch Environments:
    -mvhome Move the executable to the home directory. This
            is needed when all file systems are not cross-mounted.
            Currently only used by anlspx.
    -mvback files
            Move the indicated files back to the current directory.
            Needed only when using -mvhome; has no effect otherwise.
    -maxtime min
            Maximum job run time in minutes. Currently used only
            by anlspx. Default value is 15 minutes.
    -nopoll Do not use a polling-mode communication.
            Available only on IBM SPx.
    -mem value
            This is the per node memory request (in Mbytes). Needed for some
            CM-5s. (Default 32.)
    -cpu time
            This is the hard cpu limit used for some CM-5s in
            minutes. (Default 15 minutes.)
  Special Options for IBM SP2:
    -cac name
            CAC for ANL scheduler. Currently used only by anlspx.
            If not provided will choose some valid CAC.
  Special Options for Intel Paragon:
    -paragontype name
            Selects one of default, mkpart, NQS, depending on how you want
            to submit jobs to a Paragon.
    -paragonname name
            Remote shells to name to run the job (using the -sz method) on
            a Paragon.
    -paragonpn name
            Name of partition to run on in a Paragon (using the -pn name
            command-line argument)
On exit, mpirun returns a status of zero unless it detected a problem, in
which case it returns a non-zero status (currently always one, but this
may change in the future).
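Because the specific non-zero value may change, a calling script is safer testing for "non-zero" than for "equal to one". The following is a minimal sketch of such a check; the function name run_checked is hypothetical and is not part of mpich.

```shell
#!/bin/sh
# run_checked (hypothetical helper, not part of mpich):
# run the given command line and report a non-zero exit status.
run_checked() {
    status=0
    # Capture the status without aborting if the caller uses "set -e".
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        # Compare against zero only; the exact failure value is not
        # guaranteed to stay the same in future versions.
        echo "launch failed with status $status" >&2
    fi
    return "$status"
}
```

A script could then use it as `run_checked mpirun -np 4 cpi || exit 1`.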
Multiple architectures may be handled by giving multiple -arch and -np
arguments. For example, to run a program on 2 sun4s and 3 rs6000s, with
the local machine being a sun4, use
/home/MPI/mpich/util/mpirun -arch sun4 -np 2 -arch rs6000 -np 3 program
This assumes that program will run on both architectures. If different
executables are needed, the string '%a' will be replaced with the arch name.
For example, if the programs are program.sun4 and program.rs6000, then the
command is
/home/MPI/mpich/util/mpirun -arch sun4 -np 2 -arch rs6000 -np 3 program.%a
If instead the executables are in different directories, for example
/tmp/me/sun4 and /tmp/me/rs6000, then the command is
/home/MPI/mpich/util/mpirun -arch sun4 -np 2 -arch rs6000 -np 3 /tmp/me/%a/program
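The effect of the '%a' substitution can be previewed with a small shell function. This is only an illustration of the behavior described above, not the code mpirun itself uses; the name expand_arch is hypothetical, and the sketch assumes architecture names contain no characters that are special to sed (such as | or &).

```shell
#!/bin/sh
# expand_arch TEMPLATE ARCH (hypothetical helper):
# print TEMPLATE with every occurrence of %a replaced by ARCH.
# Assumes ARCH is a plain name such as sun4 or rs6000.
expand_arch() {
    printf '%s\n' "$1" | sed "s|%a|$2|g"
}

# Show which executable each architecture would be given.
for arch in sun4 rs6000; do
    expand_arch '/tmp/me/%a/program' "$arch"
done
```

Running the loop prints /tmp/me/sun4/program and /tmp/me/rs6000/program, which are the paths that must exist before the job is started.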
It is important to specify the architecture with -arch BEFORE specifying
the number of processors. Also, the FIRST -arch argument must refer to the
processor on which the job will be started. Specifically, if -nolocal is
NOT specified, then the first -arch must refer to the processor from which
mpirun is running.
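One way to avoid ordering mistakes is to generate the -arch and -np arguments from a list of pairs, so that each -arch always precedes its -np. The sketch below assumes a hypothetical arch:count notation and a hypothetical function name arch_args; neither is part of mpich. The caller is still responsible for listing the local machine's architecture first.

```shell
#!/bin/sh
# arch_args ARCH:COUNT... (hypothetical helper):
# print "-arch ARCH -np COUNT" for each pair, in the order given,
# so that every -arch comes before the matching -np.
arch_args() {
    args=""
    for pair in "$@"; do
        arch=${pair%%:*}     # text before the first colon
        count=${pair#*:}     # text after the first colon
        args="$args -arch $arch -np $count"
    done
    # Drop the leading space before printing.
    printf '%s\n' "${args# }"
}
```

For the example above, with the job started from a sun4, `arch_args sun4:2 rs6000:3` prints `-arch sun4 -np 2 -arch rs6000 -np 3`, which can be placed between mpirun and the program name.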
For backward compatibility with earlier versions of mpirun, each of
these arguments can also be used with the prefix mr_, as in
mpirun -mr_np 4 myprog



