How to Run a Miniapp in SST - One User’s Experience

(Written by Daniel Barnette, Sandia National Laboratories, NM)

This article lists the steps I used to run miniSMAC2D in SST. The miniapp miniSMAC2D is a 2-D incompressible Navier-Stokes code that computes the flow field around an airfoil. The original grid is partitioned into as many subgrids as desired, with one subgrid per MPI rank.

Step 1. Building SST

The steps for building SST are listed in detail elsewhere in this wiki. It is recommended that you build SST on the machine sst-devel.sandia.gov; Mac builds are often problematic, whereas sst-devel builds typically go smoothly. If you do decide to install SST on a Mac, it is recommended that you start with a clean machine, since SST can reset environment variables during the installation process and render other programs inoperable.

Include the following libraries in your SST build: Boost and OpenMPI (the same versions that are loaded via modules in Step 5 below).

Step 2. Building the miniapp

Build your miniapp, keeping in mind the caveats listed in the MISCELLANEOUS NOTES section below.
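
As a minimal sketch of what this can look like on sst-devel, here is a hypothetical build sequence. The module names are the ones loaded in Step 5; the compiler flags, source files, and executable name are assumptions and will differ for your miniapp:

# load the same MPI and Boost modules that were used to build SST
$ module load mpi/openmpi-1.7.2
$ module load boost/boost-1.54.0_ompi-1.7.2

# hypothetical build line for a Fortran miniapp such as miniSMAC2D
$ mpif90 -O2 -fopenmp -o smac2d *.f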

Step 3. Move miniapp and input files to SST working directory

Move the miniapp executable and all input files into the directory containing the SST binary.
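
For example, assuming the SST binary directory is $HOME/local/sst-4.0/bin as in the PATH setting of Step 5 (the miniapp source path below is hypothetical):

$ cp ~/src/miniSMAC2D/smac2d    ~/local/sst-4.0/bin/    # miniapp executable
$ cp ~/src/miniSMAC2D/smac2d.in ~/local/sst-4.0/bin/    # input file(s)
$ cd ~/local/sst-4.0/bin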

Step 4. Modify file quads.py to conform to your miniapp

The ‘quads.py’ file I used for miniSMAC2D can be found here: quads.py. Note that the miniapp command specified in quads.py cannot use redirected input files. In other words, one cannot have

smac2d < smac2d.in

since SST will interpret smac2d.in as an input file to SST itself and not as input to the intended smac2d binary. If your code uses redirected input, change the miniapp so that input files can be listed as command-line arguments without using ‘<’, like this:

smac2d smac2d.in
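
Before editing quads.py, it is worth a quick standalone check that the modified miniapp really does accept its input file as a command-line argument (filenames as in the example above):

$ ./smac2d smac2d.in   # should run (or at least begin reading input) outside SST
$ echo $?              # a nonzero exit status means the argument handling still needs work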

Step 5. In the .bash_profile file, make sure you have the following, assuming that SST was built in the default directories specified elsewhere in this wiki:

# ------------------------------------------------------------
# This section is for Modules support
# ------------------------------------------------------------
# start .profile
if [ -f /etc/profile.modules ]
then
         . /etc/profile.modules
# put any module UNLOADS here
         module unload null
         module unload boost
         module unload mpi
         module unload gcc
# ... and LOADS here
         module load mpi/openmpi-1.7.2
         module load boost/boost-1.54.0_ompi-1.7.2
fi
 
PATH=$PATH:$HOME/bin:$HOME/local/sst-4.0/bin; export PATH
MANPATH=$MANPATH:$HOME/bin; export MANPATH
 
# for sst
export LD_LIBRARY_PATH=$HOME/scratch/src/sst-gem5-4.0.0/build/X86_SE:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$HOME/local/packages/boost-1.54/lib:$LD_LIBRARY_PATH
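
After editing .bash_profile, it is worth confirming that the environment actually picks up the intended modules and the SST binary (standard commands; the expected results below assume the default paths shown above):

$ source ~/.bash_profile
$ module list          # should show mpi/openmpi-1.7.2 and boost/boost-1.54.0_ompi-1.7.2
$ which sst            # should resolve to $HOME/local/sst-4.0/bin/sst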

Step 6. It is now time to run the miniapp within SST. Again, assuming you are on the machine sst-devel and are in the SST working directory, you can run interactively:

$ export OMP_NUM_THREADS=8                  # assumes 1 core per thread, so 8 cores used by SST
$ salloc -N1 --time=12:00:00 bash           # allocates 1 node, twelve hours run time, using a bash shell
$ nohup ./sst quads.py > my_output_file.out # runs sst using quads.py to specify all input parameters

Alternatively, the run can be submitted in batch mode. Create a batch file (here called my_batch_file, as used in the sbatch command below) containing the following:

#!/bin/bash
#SBATCH --nodes=1  # Number of nodes - all cores per node are allocated to the job
#SBATCH --time=12:00:00 # Wall clock time (HH:MM:SS) - once the job exceeds this 
                        #  time, the job will be terminated (default is 5 minutes)
#SBATCH --job-name=sst_smac2d   # Temporary name of job

# list modules to make sure correct ones are being used
module li

# check which mpiexec will be used
which mpiexec

# define parameters for this run
nodes=$SLURM_JOB_NUM_NODES    # Number of nodes you have requested above 
mpi_ranks_per_node=1          # DO NOT CHANGE! Number of MPI processes to run on each node (aka PPN)
export OMP_NUM_THREADS=8      # SST uses one thread per core

# run sst with quads.py as input; output to my_output_file.out
mpiexec --npernode $mpi_ranks_per_node --n $(($mpi_ranks_per_node*$nodes)) /home/dwbarne/local/sst-4.0/bin/sst quads.py > my_output_file.out

To submit the job to the compute nodes, type on the command line

$ sbatch my_batch_file

The typical SLURM output file will be generated in the SST working directory. One can also use the squeue command to check job status.
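
For example (standard SLURM and shell commands; the output filename is the one used in the redirection above):

$ squeue -u $USER               # show only your jobs and their current state
$ tail -f my_output_file.out    # follow the SST output as it is written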

MISCELLANEOUS NOTES

Below are some miscellaneous notes regarding running miniapps in SST, as of Dec 2014. Keep in mind that the miniapp used in this case is miniSMAC2D, written in FORTRAN:

  1. When using the lightweight processor model Ariel, run the miniapp in SST at least 10 times using the same input parameters. Why? Because PIN does not always attach to the same point in the binary, you may get different results from run to run, especially with threads. The development team is interested in getting user feedback on this.

  2. It is crucial to compile miniapps with the same environment and modules as were used to build SST. Otherwise, you will get “lib_xxxx.so” errors, where “xxxx” can vary. Before running, make sure the modules for MPI and Boost have been loaded on sst-devel by using the module load command (see the discussion of the .bash_profile file above); a quick check is shown below.
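
One simple way to spot a library mismatch before running under SST (standard module and ldd commands; the executable name is the one from the examples above):

$ module list                      # confirm the MPI and Boost modules are loaded
$ ldd ./smac2d | grep "not found"  # lists any shared libraries that cannot be resolved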

  3. Intel compilers have the file “libmpi_usempif08.so”, but GNU compilers do not. However, the GNU compilation process will still look for this file. It is believed that this problem arises from the MPI libraries used for SST conflicting with the MPI libraries used when compiling the miniapp. To avoid this problem:

     a. Get rid of all #include "omp_lib.h" compiler directives (no spaces in front) and replace them with a single use omp_lib FORTRAN statement (indented 6 spaces if using standard F77, for example) at the top of the main routine.

     b. Make sure the miniapp is capable of running without any MPI libraries or calls. If your code already has MPI calls, put -DHAVE_MPI in the compiler options line in your Makefile so that the user has the option of including the MPI-based code in the compilation. Then modify the miniapp to either include or exclude MPI code as follows:

#ifdef HAVE_MPI
      ...
      <MPI code>
      ...
#else
      ...
      <non-MPI code>
      ...
#endif

If -DHAVE_MPI is not included in the compilation process, and if the miniapp is properly coded, a binary file without any MPI calls will be generated.
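
As a sketch of the two resulting builds (the compilers, flags, and file names here are hypothetical; with GNU compilers the source must be preprocessed, e.g. via the -cpp flag or a .F file extension):

# MPI build: HAVE_MPI is defined, so the MPI code blocks are compiled in
mpif90   -cpp -DHAVE_MPI -fopenmp -O2 -o smac2d_mpi    smac2d.F

# serial/OpenMP build for SST: HAVE_MPI is not defined, so no MPI calls remain
gfortran -cpp            -fopenmp -O2 -o smac2d_serial smac2d.F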

Depending on your code, you may think of much easier ways to avoid having MPI calls in your code when running under SST. Remember, the objective is to keep your code’s MPI calls from conflicting with SST’s MPI calls.