Software help:sge with simple mpi

From Darwin


This example shows how to compile and run a simple MPI job under SGE. The job script requests 60 processors (called slots in SGE) and compiles a program called mpi_hello_world.F90. It then executes the program through the MPI program launcher command mpirun. The compile command and the mpirun used cause the parallel execution to use the cluster's high-speed Myrinet network.

The source for the mpi_hello_world.F90 program is shown below:

     PROGRAM HELLO

#include "mpif.h"
     INTEGER RC

!    Start MPI, print a greeting from every process, then shut MPI down.
     CALL MPI_Init( RC )
     PRINT *, ' HELLO WORLD '
     CALL MPI_Finalize( RC )

     END PROGRAM HELLO

The text for the job script is as follows:

#!/bin/bash
#
# SGE queue system job script that requests some four-core nodes 
# and runs a quick program on them.
# SGE directives begin with #$ in column 1
# o Directive #$ -j y
#   means put standard output and standard error in the same file.
# o Directive #$ -cwd 
#   means start the job in the directory from which it was 
#   submitted.
# o Directive #$ -pe mpich_mx 60
#   means run with exactly 60 "slots" using the parallel 
#   environment defined by the keyword "mpich_mx". This gives
#   the job 60 CPUs, with the CPUs selected to be on the
#   minimum number of nodes.
#   To list available parallel environments use the command
#   qconf -spl
# o Directive #$ -N sge_demo_001
#   means name this job "sge_demo_001".
# o Directive #$ -o sge_demo_001.out
#   means write the job's output to the file "sge_demo_001.out".
#
#$ -j y
#$ -cwd
#$ -pe mpich_mx 60
#$ -N sge_demo_001
#$ -o sge_demo_001.out
#
# Lots more SGE information can be found by reading
# the SGE man pages - see
# man qsub
# man qconf
# man sge_pe
# However, these are very dense reading.
#
echo '*********************************************************'
THEDATE=`date`
echo 'Start job '$THEDATE
echo 'NHOSTS = '$NHOSTS
echo 'NSLOTS = '$NSLOTS
echo '======= PE_HOSTFILE ======='
cat $PE_HOSTFILE
echo '==========================='
echo '======= $TMPDIR/machines ===='
cat $TMPDIR/machines

# Add modules (broken at the moment)
source /etc/profile.d/modules.sh
module load intel
module load mpich-mx
# export PATH=${PATH}:/opt/intel/fce/9.1.051/bin

# Create test directory
\rm -fr sge_job_${JOB_ID}.d
mkdir sge_job_${JOB_ID}.d
cd sge_job_${JOB_ID}.d

# Compile MPI program
cp ../mpi_hello_world.F90 .
echo 'Using mpi compiler script '`which mpif90`
mpif90 mpi_hello_world.F90

# Run MPI program
echo 'Using mpi launcher script '`which mpirun`
mpirun -v \
   -np ${NSLOTS} -machinefile  $TMPDIR/machines         \
   --mx-kill 30 --mx-copy-env ./a.out

echo '==========================='
echo 'End job '`date`
echo '*********************************************************'
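
The job script above simply prints the contents of $PE_HOSTFILE. As a rough sketch of how that file can be read, the helper below sums the slots per host. It assumes the usual four-column layout described in the sge_pe man page (host name, slot count, queue name, processor range). The function name summarize_pe_hostfile is hypothetical and not part of SGE.

```shell
#!/bin/bash
# Hypothetical helper (not part of SGE): summarize a PE_HOSTFILE.
# Assumes each line has the form: host slots queue processor_range
# Prints one "host slots" line per entry, then the total slot count.
summarize_pe_hostfile() {
    awk '{ printf "%s %d\n", $1, $2; total += $2 }
         END { printf "total %d\n", total }' "$1"
}

# Inside a running job, SGE sets PE_HOSTFILE; outside a job this is skipped.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    summarize_pe_hostfile "$PE_HOSTFILE"
fi
```

For the job above, the total reported should match the value of $NSLOTS if the hostfile follows the assumed layout.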

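To try the above job, cut and paste the Fortran source into a file called mpi_hello_world.F90. Then cut and paste the job script into a file called job.sge in the same directory, since the script copies the source from the directory the job was submitted from. To submit the job, type the command: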

charles@beagle$ qsub job.sge


When this script executes, it writes its terminal output (standard output and standard error) to a file called sge_demo_001.out. The script also creates a subdirectory called sge_job_NNNN.d, where NNNN is the unique job number that the SGE queuing system allocates to the job.
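
To locate the job's working directory after submission, the job number has to be recovered from what qsub prints. The sketch below assumes qsub answers with a line of the form `Your job NNNN ("sge_demo_001") has been submitted`, which is typical for SGE but may differ between versions and does not cover array jobs. The function names job_id_from_qsub and job_dir_name are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers, assuming qsub prints a line such as:
#   Your job 4721 ("sge_demo_001") has been submitted

# Read qsub's message on standard input and print only the job number.
job_id_from_qsub() {
    sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}

# Build the directory name that the job script creates for a given job number.
job_dir_name() {
    printf 'sge_job_%s.d\n' "$1"
}

# Possible use on the login node (left commented out because it needs SGE):
#   id=$(qsub job.sge | job_id_from_qsub)
#   echo "Job $id will work in $(job_dir_name "$id")"
```

If the message does not match the assumed pattern, job_id_from_qsub prints nothing, so the caller should check for an empty result before using it.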
