mpirun, a Portable Startup Script
Each parallel computing environment provides some mechanism for starting
parallel programs. Unfortunately, these mechanisms are very different from
one another. In an effort to make this aspect of parallel programming
portable as well, mpich contains a script called mpirun. This script is
partially customized during the configuration process when mpich is
built. Therefore the actual "source" for mpirun is in the file
mpirun.sh.in in the mpich/bin directory. The most common
invocation of mpirun just specifies the number of processes and the
program to run:

    mpirun -np 4 cpi

The complete list of options for mpirun is obtained by running

    mpirun -help

This is the result:

    mpirun [mpirun_options...] <progname> [options...]

      mpirun_options:
        -arch <architecture>
                specify the architecture (must have matching machines.<arch>
                file in /home/MPI/mpich/bin/machines) if using the execer
        -h      This help
        -machine <machine name>
                use startup procedure for <machine name>
                Currently supported:
                  chameleon meiko paragon p4 sp1 ibmspx anlspx ksr sgi_mp
                  ipsc860 inteldelta cray_t3d execer smp symm_ptx
        -machinefile <machine-file name>
                Take the list of possible machines to run on from the
                file <machine-file name>. This is a list of all available
                machines; use -np <np> to request a specific number
                of machines.
        -np <np>
                specify the number of processors to run on
        -nolocal
                don't run on the local machine (only works for
                p4 and ch_p4 jobs)
        -stdin filename
                Use filename as the standard input for the program. This
                is needed for programs that must be run as batch jobs, such
                as some IBM SP systems and Intel Paragons using NQS (see
                -paragontype below).
        -t      Testing - do not actually run, just print what would be
                executed
        -v      Verbose - throw in some comments
        -dbx    Start the first process under dbx where possible
        -gdb    Start the first process under gdb where possible
                (on the Meiko, selecting either -dbx or -gdb starts prun
                under totalview instead)
        -xxgdb  Start the first process under xxgdb where possible (-xdbx
                does not work)
        -tv     Start under totalview
        -ksq    Keep the send queue. This is useful if you expect later
                to attach totalview to the running (or deadlocked) job, and
                want to see the send queues. (Normally they are not maintained
                in a way which is visible to the debugger).

      Special Options for NEC - CENJU-3:

        -batch  Execute program as a batch job (using cjbr)
        -stdout filename
                Use filename as the standard output for the program.
        -stderr filename
                Use filename as the standard error for the program.
        -jid    Jobid from Job-Scheduler EASY.
                If this option is specified, mpirun directly executes
                the parallel program using this jobid.
                Otherwise, mpirun requests np nodes from the Job-Scheduler
                in interactive or batch mode.
                In interactive mode (i.e. option -batch is not specified),
                mpirun waits until the processors are allocated,
                executes the parallel program and releases the processors.

      Special Options for Nexus device:

        -nexuspg filename
                Use the given Nexus startup file instead of creating one.
                Overrides -np and -nolocal, selects -leave_pg.
        -nexusdb filename
                Use the given Nexus resource database.

      Special Options for Workstation Clusters:

        -e      Use execer to start the program on workstation clusters
        -pg     Use a procgroup file to start the p4 programs, not execer
                (default)
        -leave_pg
                Don't delete the P4 procgroup file after running
        -p4pg filename
                Use the given p4 procgroup file instead of creating one.
                Overrides -np and -nolocal, selects -leave_pg.
        -tcppg filename
                Use the given tcp procgroup file instead of creating one.
                Overrides -np and -nolocal, selects -leave_pg.
        -p4ssport num
                Use the p4 secure server with port number num to start the
                programs. If num is 0, use the value of the
                environment variable MPI_P4SSPORT. Using the server can
                speed up process startup. If MPI_USEP4SSPORT as well as
                MPI_P4SSPORT are set, then that has the effect of giving
                mpirun the -p4ssport 0 parameters.

      Special Options for Batch Environments:

        -mvhome Move the executable to the home directory. This
                is needed when all file systems are not cross-mounted.
                Currently only used by anlspx
        -mvback files
                Move the indicated files back to the current directory.
                Needed only when using -mvhome; has no effect otherwise.
        -maxtime min
                Maximum job run time in minutes. Currently used only
                by anlspx. Default value is 15 minutes.
        -nopoll Do not use a polling-mode communication.
                Available only on IBM SPx.
        -mem value
                This is the per node memory request (in Mbytes). Needed for
                some CM-5s. (Default 32.)
        -cpu time
                This is the hard cpu limit used for some CM-5s in
                minutes. (Default 15 minutes.)

      Special Options for IBM SP2:

        -cac name
                CAC for ANL scheduler. Currently used only by anlspx.
                If not provided will choose some valid CAC.

      Special Options for Intel Paragon:

        -paragontype name
                Selects one of default, mkpart, NQS, depending on how you want
                to submit jobs to a Paragon.
        -paragonname name
                Remote shells to name to run the job (using the -sz method) on
                a Paragon.
        -paragonpn name
                Name of partition to run on in a Paragon (using the -pn name
                command-line argument)

    On exit, mpirun returns a status of zero unless mpirun detected a problem,
    in which case it returns a non-zero status (currently, all are one, but
    this may change in the future).

    Multiple architectures may be handled by giving multiple -arch and -np
    arguments. For example, to run a program on 2 sun4s and 3 rs6000s, with
    the local machine being a sun4, use

        /home/MPI/mpich/util/mpirun -arch sun4 -np 2 -arch rs6000 -np 3 program

    This assumes that program will run on both architectures. If different
    executables are needed, the string '%a' will be replaced with the arch
    name. For example, if the programs are program.sun4 and program.rs6000,
    then the command is

        /home/MPI/mpich/util/mpirun -arch sun4 -np 2 -arch rs6000 -np 3 program.%a

    If instead the executables are in different directories; for example,
    /tmp/me/sun4 and /tmp/me/rs6000, then the command is

        /home/MPI/mpich/util/mpirun -arch sun4 -np 2 -arch rs6000 -np 3 /tmp/me/%a/program

    It is important to specify the architecture with -arch BEFORE specifying
    the number of processors. Also, the FIRST -arch command must refer to the
    processor on which the job will be started. Specifically, if -nolocal is
    NOT specified, then the first -arch must refer to the processor from which
    mpirun is running.

For backward compatibility with earlier versions of mpirun, each of these
arguments can also be used with the prefix mr_, as in

    mpirun -mr_np 4 myprog
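
The help text above describes two conventions that are easy to get wrong in a job script: the '%a' placeholder in the program name, and the zero or non-zero exit status. The following is a minimal sketch of how a wrapper script might deal with both. The helper names expand_arch and run_checked are hypothetical and are not part of mpich. Replacing every occurrence of '%a' is an assumption about how mpirun behaves, made only for illustration.

```shell
#!/bin/sh
# Sketch only: expand_arch and run_checked are hypothetical helpers,
# not part of mpich or mpirun.

# Print TEMPLATE with '%a' replaced by ARCH, imitating the substitution
# described in the help text (replacing every occurrence is an assumption).
expand_arch() {
    template=$1
    arch=$2
    printf '%s\n' "$template" | sed "s|%a|$arch|g"
}

# Run a command, report a non-zero exit status on stderr, and pass the
# status back to the caller.
run_checked() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "command failed with status $status: $*" >&2
    fi
    return "$status"
}

# Show which executable name each architecture would resolve to.
for arch in sun4 rs6000; do
    expand_arch 'program.%a' "$arch"
done
```

Run as written, the loop prints program.sun4 and program.rs6000, which is a quick way to confirm that both executables exist before launching. In a real job script, assuming mpirun is on the PATH, the launch itself could then be wrapped as run_checked mpirun -arch sun4 -np 2 -arch rs6000 -np 3 program.%a, testing only for zero versus non-zero since the help text warns that the specific non-zero value may change.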