Software help:sge with simple mpi
This example shows how to compile and run a simple MPI job under SGE. The job script requests 60 processors (called slots in SGE) and compiles a program called mpi_hello_world.F90. It then executes the program through the MPI program launcher command mpirun. The compile command and the mpirun used cause the parallel execution to use the cluster's high-speed Myrinet network.
The source for the mpi_hello_world.F90 program is shown below:
PROGRAM HELLO
#include "mpif.h"
INTEGER RC
CALL MPI_Init( rc )
PRINT *, ' HELLO WORLD '
CALL MPI_Finalize( rc )
END
The text for the job script is as follows:
#!/bin/bash
#
# SGE queue system job script that requests some four-core nodes
# and runs a quick program on them.
# SGE directives begin with #$ in column 1
# o Directive #$ -j y
# means put standard output and standard error in the same file.
# o Directive #$ -cwd
# means start the job in the directory from which it was
# submitted.
# o Directive #$ -pe mpich_mx 60
# means run with exactly 60 "slots" using the parallel
# environment defined by the keyword "mpich_mx". This gives
# the job 60 CPUs with the CPUs selected to be on the
# minimum number of nodes.
# To list available parallel environments use the command
# qconf -spl
# o Directive #$ -N sge_demo_001
# means name this job "sge_demo_001".
# o Directive #$ -o sge_demo_001.out
# means write the job's standard output to the file
# "sge_demo_001.out".
#
#$ -j y
#$ -cwd
#$ -pe mpich_mx 60
#$ -N sge_demo_001
#$ -o sge_demo_001.out
#
# Lots more SGE information can be found by reading
# the SGE man pages - see
# man qsub
# man qconf
# man sge_pe
# However, these are very dense reading.
#
echo '*********************************************************'
THEDATE=`date`
echo 'Start job '$THEDATE
echo 'NHOSTS = '$NHOSTS
echo 'NSLOTS = '$NSLOTS
echo '======= PE_HOSTFILE ======='
cat $PE_HOSTFILE
echo '==========================='
echo '======= $TMPDIR/machines ===='
cat $TMPDIR/machines
# Add modules (broken at the moment)
source /etc/profile.d/modules.sh
module load intel
module load mpich-mx
# export PATH=${PATH}:/opt/intel/fce/9.1.051/bin
# Create test directory
\rm -fr sge_job_${JOB_ID}.d
mkdir sge_job_${JOB_ID}.d
cd sge_job_${JOB_ID}.d
# Compile MPI program
cp ../mpi_hello_world.F90 .
echo 'Using mpi compiler script '`which mpif90`
mpif90 mpi_hello_world.F90
# Run MPI program
echo 'Using mpi launcher script '`which mpirun`
mpirun -v \
-np ${NSLOTS} -machinefile $TMPDIR/machines \
--mx-kill 30 --mx-copy-env ./a.out
echo '==========================='
echo 'End job '`date`
echo '*********************************************************'
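The job script prints both $PE_HOSTFILE and $TMPDIR/machines. As a rough sketch of how the two files usually relate: each line of an SGE host file lists a host name, the number of slots granted on that host, a queue name and a processor range, and on many installations the parallel environment's start script expands this into a machines file with one host name per slot. Whether the mpich_mx environment on this cluster does exactly that is an assumption. The sample host names and the function names expand_hostfile and count_slots below are made up for illustration.

```shell
#!/bin/bash
# Sketch: expand an SGE-style host file into one line per slot.
# The sample data is hypothetical; a real job would read $PE_HOSTFILE.

# Stand-in for $PE_HOSTFILE: host, slots, queue, processor range.
sample_hostfile=$(mktemp)
cat > "$sample_hostfile" <<'EOF'
node001 4 all.q@node001 UNDEFINED
node002 4 all.q@node002 UNDEFINED
node003 2 all.q@node003 UNDEFINED
EOF

# expand_hostfile FILE: print each host name once per slot granted on it.
expand_hostfile() {
    awk '{ for (i = 0; i < $2; i++) print $1 }' "$1"
}

# count_slots FILE: sum the slot column (second field) over all hosts.
count_slots() {
    awk '{ total += $2 } END { print total + 0 }' "$1"
}

expand_hostfile "$sample_hostfile"
echo "total slots: $(count_slots "$sample_hostfile")"
rm -f "$sample_hostfile"
```

Inside a running job, the slot total computed this way from $PE_HOSTFILE should agree with $NSLOTS.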
To try the above job, copy and paste the Fortran source into a file called mpi_hello_world.F90. Then copy and paste the job script into a file called job.sge. To submit the job, type the command:
charles@beagle$ qsub job.sge
When this script executes, it writes its terminal output to a file called sge_demo_001.out. The script also creates a subdirectory called sge_job_NNNN.d, where NNNN is the unique job number that the SGE queuing system allocates to the job.
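Because every MPI process executes the PRINT statement once, a job that ran on all 60 slots should leave 60 HELLO WORLD lines in sge_demo_001.out. The following is a minimal sketch of a check for that, assuming no other line of the output file contains the same text. The function names count_hello and check_output are hypothetical helpers, and the demonstration uses a made-up three-line file rather than real job output.

```shell
#!/bin/bash
# Sketch: count greeting lines in a job output file and compare
# the count with the number of slots that were requested.

# count_hello FILE: number of lines containing "HELLO WORLD".
count_hello() {
    grep -c 'HELLO WORLD' "$1"
}

# check_output FILE EXPECTED: succeed if the greeting count equals EXPECTED.
check_output() {
    local found
    found=$(count_hello "$1")
    if [ "$found" -eq "$2" ]; then
        echo "OK: $found greetings"
    else
        echo "MISMATCH: expected $2, found $found"
        return 1
    fi
}

# Demonstration on a made-up output file from three processes.
demo=$(mktemp)
printf ' HELLO WORLD \n HELLO WORLD \n HELLO WORLD \n' > "$demo"
check_output "$demo" 3
rm -f "$demo"
```

For the job described here, the corresponding call would be check_output sge_demo_001.out 60, run from the directory in which the job was submitted.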
