Slurm Submission Examples

1. Requesting local scratch

Slurm does not define a default amount of scratch space per job, and that is fine for most jobs but not all. Many Python, Perl, R, and Java programs, as well as 3rd party commercial software, write to TMPDIR, which by default is set to /tmp or, less commonly, /var/tmp.

The problem of running out of local scratch arises when nodes are shared by multiple jobs and users: one job can cause the other jobs running on the same node to fail. Please be considerate of your colleagues by doing the following.

  • If your job creates just under a few GB of temporary data in /tmp, then you don’t need to do anything special.

  • If your job creates many GB of temporary data, please follow the steps below:

    1. In your Slurm submit script, define how much scratch space your job needs (you may need to determine this with a trial run) and request nodes that have a fast local scratch area via the following Slurm directives:

      #SBATCH --tmp=8G                  # requesting 8 GB (1 GB = 1,024 MB) local scratch
      #SBATCH --constraint=fastscratch  # requesting nodes with fast scratch in /tmp
    2. Ensure each job writes to a job-specific directory in /tmp by setting TMPDIR immediately after the #SBATCH directives in your job submission script (a complete example combining both steps follows the list):

      export TMPDIR=/tmp/$SLURM_JOBID   # set TMPDIR to be /tmp/yourJOBID
      mkdir -p $TMPDIR                  # create TMPDIR
      # your job commands go here
      rm -rf $TMPDIR                    # delete job created temp data
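
Putting both steps together, a minimal sketch of a complete submit script could look like the following (my_program is a placeholder for your own commands; adjust the partition, scratch size, and other resources for your job):

#!/bin/bash

#SBATCH --job-name=scratch_test       ## name of the job
#SBATCH -p standard                   ## partition/queue name
#SBATCH --nodes=1                     ## number of nodes the job will use
#SBATCH --ntasks=1                    ## number of processes to launch
#SBATCH --tmp=8G                      ## requesting 8 GB local scratch
#SBATCH --constraint=fastscratch      ## requesting nodes with fast scratch in /tmp

export TMPDIR=/tmp/$SLURM_JOBID       # set TMPDIR to a job-specific directory
mkdir -p $TMPDIR                      # create TMPDIR

my_program                            # placeholder: your job commands go here

rm -rf $TMPDIR                        # delete job-created temp data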

2. Requesting memory

Slurm has default and maximum settings for the memory allocation per core in each partition. The default memory allocation is 3 GB/core in most partitions; the maximum allocation per core differs between partitions. All settings can be found in the default settings for Slurm partitions.

The default settings are used when a job submission script does not specify a different memory allocation, and for most jobs this is sufficient. When a job requires more memory, the memory needs to be specified in the Slurm script or srun command using either the memory-per-core directive --mem-per-cpu or the total-memory directive --mem. These two directives are mutually exclusive; do not use them together.

Memory requests are written as an integer with an optional size suffix (M for megabytes, G for gigabytes, T for terabytes); the default unit is megabytes. For example, here are possible memory requests:

Specification for the job memory
#SBATCH --mem=500            # requesting 500 MB memory for the job
#SBATCH --mem=4G             # requesting 4 GB (1 GB = 1,024 MB) memory for the job

Specification for the memory per CPU
#SBATCH --mem-per-cpu=5000   # requesting 5,000 MB memory per CPU
#SBATCH --mem-per-cpu=2G     # requesting 2 GB memory per CPU

Job memory specifications cannot exceed the partition’s Max limit. If a job specifies a memory-per-CPU limit that exceeds the system limit, that job’s count of CPUs per task is automatically increased, which may result in the job failing due to CPU count limits.

Here are a few examples of memory requests for the srun command; the same directives are used in Slurm submit scripts.

[user@login-x:~]$ srun -p free --nodes=1 --ntasks=2 --pty /bin/bash -i                   (1)
[user@login-x:~]$ srun -p free --nodes=1 --cpus-per-task=2 --pty /bin/bash -i            (1)
[user@login-x:~]$ srun -p free --nodes=1 --ntasks=2 --mem-per-cpu=18G --pty /bin/bash -i (2)
[user@login-x:~]$ srun -p free --nodes=1 --ntasks=2 --mem=36G --pty /bin/bash -i         (3)
[user@login-x:~]$ srun -p free --nodes=1 --ntasks=4 --mem-per-cpu=10G --pty /bin/bash -i (4)
1 This request will be allocated 2 CPUs and the default 3 GB of memory per CPU; the total memory for the job is 2 x 3 GB = 6 GB. Note the two ways of requesting the same resources.
2 This request will be allocated 2 CPUs and the maximum allowed memory per CPU; the total memory for the job is 2 x 18 GB = 36 GB.
3 This request will be allocated 2 CPUs and a total memory of 36 GB, resulting in 18 GB of memory per CPU, the maximum allowed by the partition.
4 This request will be allocated 4 CPUs and 10 GB of memory per CPU; the total memory for the job is 4 x 10 GB = 40 GB.

Note that all the above examples ask for 1 node. This is important: it lets Slurm know that all your processes should be on a single node and not spread over multiple nodes. Only the few applications that are compiled and run with OpenMPI or MPICH can use multiple nodes; the rest of the applications, including interactive sessions, should use a single node.

Please see specific examples below for common applications.

3. Requesting time

Similar to memory limits, Slurm has default and maximum runtime settings for each partition. All settings can be found in the table that lists the default settings for Slurm partitions.

The default settings are used when a job submission script does not specify a different runtime, and for most jobs this is sufficient. When a job requires a longer runtime, it needs to be specified in the Slurm script using the --time (or short -t) option.

Acceptable time formats are minutes, minutes:seconds, hours:minutes:seconds, days-hours, days-hours:minutes and days-hours:minutes:seconds. For example:

--time=5        # 5 minutes
-t 36:30:00     # 36 hrs and 30 min
-t 5-00:00:00   # 5 days
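
The same option works for interactive jobs; for example, a 2-hour interactive session on the free partition (following the srun examples shown earlier) could be requested as:

[user@login-x:~]$ srun -p free --time=2:00:00 --pty /bin/bash -i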

4. Submit different job types

This section has several examples for how to submit different job types with Slurm.

4.1. Array Jobs

A Slurm job array provides a way for users to submit a large number of identical jobs at once, with an index parameter that can be used to alter how each task behaves.

#!/bin/bash

#SBATCH --job-name=test_array   ## name of the job.
#SBATCH -A panteater_lab        ## account to charge
#SBATCH -p standard             ## partition/queue name
#SBATCH --error=error_%A_%a.txt ## error log file name: %A is job id, %a is array task id
#SBATCH --output=out_%A_%a.txt  ## output filename
#SBATCH --nodes=1               ## number of nodes the job will use
#SBATCH --ntasks=1              ## number of processes to launch for each array iteration
#SBATCH --cpus-per-task=1       ## number of cores the job needs
#SBATCH --time=1:00:00          ## time limit for each array task
#SBATCH --array=1-100           ## number of array tasks is 100
                                ## $SLURM_ARRAY_TASK_ID takes values from 1 to 100 inclusive

# Each array task takes a different input file. Following the input filename
# convention data_file_1.txt, data_file_2.txt, ..., data_file_100.txt
# can use $SLURM_ARRAY_TASK_ID to specify the filename:

analyze data_file_$SLURM_ARRAY_TASK_ID.txt

# An application *simulate* takes as an argument an integer number
# which can be specified with $SLURM_ARRAY_TASK_ID, for example:

simulate $SLURM_ARRAY_TASK_ID

This job will be scheduled as 100 independent tasks. Each task has a separate time limit of 1 hour and each task may start at a different time. The script references $SLURM_ARRAY_TASK_ID to select an input file or to set a command line argument for the application.

Each array task will write to its own output and error log files.

If you only use '%A' in the log file specification, all array tasks will try to write to a single file and the performance of the run will be drastically reduced. Make sure to use both %A and %a in the log file name specification, as shown in the submit script above.

Using a job array instead of a large number of separate serial jobs can be advantageous since the scheduler does not have to analyze job requirements for each task in the array separately, so it runs more efficiently.

Other than the initial job-submission step with sbatch, the load on the scheduler is the same for an array job as for the equivalent number of non-array jobs. The cost of dispatching each array task is the same as dispatching a non-array job.

You should not use a job array to submit tasks with very short run times, e.g. much less than an hour. Tasks with run times of only a few minutes should be grouped into longer jobs using GLOST, GNU Parallel, or a shell loop inside a job (see Running many short tasks).

4.1.1. Array indexing

Slurm sets a $SLURM_ARRAY_TASK_ID variable for each array task. This variable can be used inside the job script to handle input and output files or arguments for the task.

In our array example script, the input files for a 100-task job array were named data_file_1.txt, data_file_2.txt, ..., data_file_100.txt, so one can use $SLURM_ARRAY_TASK_ID to specify the filename directly. Often files are not named in that precise manner, but they can still be referenced using the task ID.

For example, if you have a directory of 100 files that end in .txt, you can use the following approach to get the name of the file for each task automatically:

filename=$(ls *.txt | sed -n ${SLURM_ARRAY_TASK_ID}p)
MyProg  $filename

Here, in the first line, the ls command lists all the files with the needed naming convention (the list will be 100 names long) and pipes the output into the sed command, which takes the single line whose position in the list corresponds to $SLURM_ARRAY_TASK_ID and assigns it to the variable filename. The second line simply executes the needed program with the variable, which now holds the specific file name for this array task.

Array indexing can be used to supply multiple input parameters to a program. Let’s say a program X requires 3 parameters: a filename and 2 numbers. A user can create a simple text file that lists all the needed parameters for the desired tasks:

/a/path/to/fileM 23 14.5
/a/other/path/fileZ 12 11.2
/a/path/to/fileS 1 2.2
... cut for brevity, total 20 lines...

Then, in the Slurm submit script, a user can request 20 array tasks (the same number as the number of parameter lines in the created file) and provide the needed parameters for each task as:

INPUT=/path/to/params.txt
Args=$(awk "NR==$SLURM_ARRAY_TASK_ID" $INPUT)    # variable Args will contain 3 values
progX $Args
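
A minimal sketch of the corresponding submit script is shown below (params.txt and progX are the example names used above; adjust the account, partition, and resources for your own job):

#!/bin/bash

#SBATCH --job-name=param_array    ## name of the job
#SBATCH -p standard               ## partition/queue name
#SBATCH --nodes=1                 ## number of nodes the job will use
#SBATCH --ntasks=1                ## number of processes per array task
#SBATCH --error=error_%A_%a.txt   ## error log file name: %A is job id, %a is array task id
#SBATCH --output=out_%A_%a.txt    ## output filename
#SBATCH --array=1-20              ## one array task per line in params.txt

INPUT=/path/to/params.txt                        # file with one parameter set per line
Args=$(awk "NR==$SLURM_ARRAY_TASK_ID" $INPUT)    # take the line matching this task ID
progX $Args                                      # run the program with its 3 parameters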

4.2. Running many short tasks

When each array task is short (seconds or a few minutes), array jobs become inefficient and overload the scheduler, because the time spent managing tasks is longer than the time spent doing actual work. This negatively impacts not only your job but all other users.

When you have hundreds or thousands of very short tasks, it is better to combine a simple array with a loop that groups multiple tasks, for efficiency.

For example, the following submit script performs 4,000 runs of a program (here substituted by echo for simplicity), where each run takes just a few seconds to complete. Instead of running an array job with 4,000 tasks, it is much more efficient to run 4 array tasks where each completes 1,000 runs.

In this example, we specify 4 array tasks with SBATCH directives, then use shell variables and a loop to set the number of runs each array task will do and the starting and ending run numbers for each.

#!/bin/bash

#SBATCH --job-name=tinytask   ## Name of the job.
#SBATCH -p free               ## partition/queue name
#SBATCH --nodes=1             ## (-N) number of nodes the job will use
#SBATCH --ntasks=1            ## (-n) number of processes to be launched
#SBATCH --cpus-per-task=1     ## number of cores the job needs
#SBATCH --mem-per-cpu=1000    ## RAM per CPU is 1,000 MB
#SBATCH --error=%x.%A_%a.err  ## error log file: %x is job name, %A is job ID, %a is task ID
#SBATCH --output=%x.%A_%a.out ## output log file: %x is job name, %A is job ID, %a is task ID
#SBATCH --array=1-4
#SBATCH --qos=low

#Set the number of runs that each Slurm task should do
PER_TASK=1000

# Calculate the starting and ending values for this task based
# on the Slurm task ID  and the number of runs per task.
START_NUM=$(( ($SLURM_ARRAY_TASK_ID - 1) * $PER_TASK + 1 ))
END_NUM=$(( $SLURM_ARRAY_TASK_ID * $PER_TASK ))

# Print the task and run range
echo "Task $SLURM_ARRAY_TASK_ID: for runs $START_NUM to $END_NUM"

# Run the loop of runs for this task.
for (( run=$START_NUM; run<=END_NUM; run++ )); do
  echo "Slurm task $SLURM_ARRAY_TASK_ID, run number $run"
  #do your commands here
done

4.3. OpenMP jobs

OpenMP jobs use multiple cores on a single machine. Below is an example of a submission script that uses 6 physical cores to run an OpenMP program. In a strictly OpenMP job the number of nodes is always 1.

#!/bin/bash

#SBATCH -p standard             ## partition/queue name
#SBATCH --nodes=1               ## number of nodes the job will use
#SBATCH --ntasks=1              ## number of processes to launch
#SBATCH --cpus-per-task=6       ## number of OpenMP threads
                                ## total RAM request = 6 * 3 GB/core = 18 GB
module load openmpi/4.0.3/gcc.8.4.0
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
/path/to/openmp-program

Define the value of OMP_NUM_THREADS to be equal to the Slurm-defined environment variable $SLURM_CPUS_PER_TASK. This ensures that any change to the number of requested threads is automatically passed to OpenMP. #SBATCH --cpus-per-task must be defined.

4.4. MPI jobs

MPI jobs use multiple cores across different nodes. The following submit script will allocate 80 cores across 2 different nodes. Each core will be allocated the default 3 GB of RAM, for a total of 240 GB.

#!/bin/bash

#SBATCH -p standard             ## partition/queue name
#SBATCH --nodes=2               ## number of nodes the job will use
#SBATCH --ntasks=80             ## number of processes to launch
#SBATCH --cpus-per-task=1       ## number of CPUs per MPI process
                                ## total RAM request = 80 * 3 GB/core = 240 GB
# Run MPI application
module load mpich/3.3.2/gcc.8.4.0
mpirun -np $SLURM_NTASKS app_executable > output.txt

4.5. Hybrid MPI/OpenMP jobs

A hybrid job uses multiple processes and multiple threads within a process. Usually, MPI is used to start the multiple processes, and then each process uses a multi-threading library to do computations with multiple threads.

Here is an example of 8 MPI processes running on 2 nodes (4 MPI tasks per node) with 5 OpenMP threads per process; each OpenMP thread gets 1 physical core and needs 3 GB of memory. The job requests a total of 40 cores and 120 GB of RAM.

#!/bin/bash

#SBATCH -p standard             ## partition/queue name
#SBATCH --nodes=2               ## number of nodes the job will use
#SBATCH --ntasks-per-node=4     ## number of MPI tasks per node
#SBATCH --cpus-per-task=5       ## number of threads per task
                                ## total  RAM request = 2 x 4 x 5 x 3 GB/core = 120 GB

# You can use mpich or openmpi, per your program requirements
# only one can be active
# module load mpich/3.3.2/gcc.8.4.0
module load openmpi/4.0.3/gcc.8.4.0

export OMP_PROC_BIND=true
export OMP_PLACES=threads
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK  # Set num threads to num of
                                             # CPU cores per MPI task.
mpirun -np $SLURM_NTASKS app_executable > output.txt

4.6. Large memory jobs

Some jobs may need more memory. For these jobs you will need to set the memory requirement and the number of CPUs to use, for example:

#!/bin/bash

#SBATCH -A panteater_lab        ## account to use
#SBATCH -J test-0.50            ## job name
#SBATCH -p standard             ## partition/queue name
#SBATCH --nodes=1               ## number of nodes the job will use
#SBATCH --mem=100000            ## request 100,000 MB of memory
#SBATCH --ntasks-per-node=33    ## number of tasks to launch per node
                                ## is ceiling(100,000 MB / 3,072 MB/core ) or 33 tasks
module load gcc
module load mkl
module load julia
./run.sh 3 0.50 1.00

4.7. Dependent Jobs

Job dependencies are typically used to construct pipelines where jobs need to be launched in sequence upon the successful completion of previously launched jobs. Dependencies can also be used to chain together long simulations requiring multiple steps. In Slurm this is done with the --dependency feature. To familiarize yourself with it, take a look at the sbatch man page (man sbatch) and read the -d, --dependency section. Here is an example workflow of commands that shows how to use this feature; a combined script is sketched after the steps.

  1. Submit a 1st job that has no dependencies and set a variable to hold the job ID

    [user@login-x:~]$ jobid1=$(sbatch --parsable first_job.sub)
  2. Submit a 2nd job with the condition that it will launch only after the first one has completed successfully. Set a variable to hold this job ID.

    [user@login-x:~]$ jobid2=$(sbatch --parsable --dependency=afterok:$jobid1 second_job.sub)
  3. Submit a 3rd job that depends on a successful completion of the second job.

    [user@login-x:~]$ jobid3=$(sbatch --parsable --dependency=afterok:$jobid2 third_job.sub)
  4. Submit the last job that depends on second and third finishing successfully.

    [user@login-x:~]$ sbatch --dependency=afterok:$jobid2,afterok:$jobid3 last_job.sub
  5. Show dependencies in squeue output:

    [user@login-x:~]$ squeue -u $USER -o "%.8A %.4C %.10m %.20E"

If a dependency condition is not satisfied, the dependent job will remain in the Slurm queue with the reason DependencyNeverSatisfied. In this case, you need to cancel such jobs manually with the scancel command.
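
Putting the steps above together, the whole chain can be submitted from one small shell script. Here is a minimal sketch using the same hypothetical submit-script names; --parsable makes sbatch print only the job ID so that it can be captured in a variable:

#!/bin/bash
# Submit a simple four-step pipeline using Slurm job dependencies.

jobid1=$(sbatch --parsable first_job.sub)
jobid2=$(sbatch --parsable --dependency=afterok:$jobid1 second_job.sub)
jobid3=$(sbatch --parsable --dependency=afterok:$jobid2 third_job.sub)
sbatch --dependency=afterok:$jobid2,afterok:$jobid3 last_job.sub

# Check the queue and the dependency column
squeue -u $USER -o "%.8A %.4C %.10m %.20E"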

4.8. GPU Jobs

To run a GPU job one needs to request 2 elements:

  1. a GPU partition

    To see all available partitions, use the sinfo command, which lists all partitions and nodes managed by Slurm. All users can use the free-gpu and gpu-debug partitions.

  2. a GPU type and number of GPUs

    GPU type and number are specified with the --gres directive, for both interactive and batch jobs (an interactive example is shown below). Currently, HPC3 has a single GPU type: V100. Different GPU cards may be added in the future, and the SBATCH directives will need to reflect the new GPU type. To find out what Generic RESource (GRES) is available, use the following command:

    [user@login-x:~]$ sinfo -o "%60N %10c %10m  %20f  %10G" -e

In your Slurm submit script you will need to add

#SBATCH --nodes=1
#SBATCH --partition=free-gpu      # specify free-gpu partition
#SBATCH --gres=gpu:V100:1         # specify 1 GPU of type V100

The GPU number should be set to 1. Nearly 100% of applications on the cluster will use only 1 GPU, and no Perl-, Python-, or R-based applications need multiple GPUs. Very few applications can use multiple GPUs in P2P (peer-to-peer) mode; these need to be specially designed and compiled with very specific flags and options to use multi-GPU acceleration. A few examples of applications that can do P2P are Amber and NAMD.
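
The same partition and --gres specification apply to interactive jobs; for example, a single V100 GPU in the free-gpu partition could be requested with srun along these lines (a sketch following the interactive-session examples earlier in this guide):

[user@login-x:~]$ srun -p free-gpu --nodes=1 --ntasks=1 --gres=gpu:V100:1 --pty /bin/bash -i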

5. Submit Common Applications

This section has several examples of submission scripts for the most common applications. You will have to tailor the SBATCH options for your requirements (i.e., partition, cores/memory, email alert).

5.1. Jupyterhub Portal

Sometimes applications are available via containers on our Jupyterhub portal. This includes many bioinformatics applications, Rstudio, etc. Below are the steps to start a container that will provide Rstudio along with a few other applications.

  1. Using your favorite browser, go to https://hpc3.rcic.uci.edu/biojhub3/hub/login and you will see the following screen. Use your usual HPC3 credentials to sign in:

    Figure 1. Sign in screen
  2. After authentication you will see a screen with server options as in the figure below:

    Figure 2. Jupyter Server options screen
  3. Modify the Select Account to Charge to be one of your Slurm accounts, change number of CPUs and amount of memory if needed, and press Start.

    Once the notebook is done spawning, you will get a Launcher screen with a number of GUI apps you can use. One of those buttons is RStudio.

  4. Be sure to stop your Jupyterhub notebook server after you are done with RStudio. From the File menu choose Hub Control Panel, and you will be forwarded to a screen similar to the one below, where you can press Stop My Server to shut down the server:

    Figure 3. Jupyter Server stop screen

5.2. Jupyter Notebook

Sometimes people create specific conda environments with additional software and wish to run Jupyter notebooks. As we do not allow computational jobs on login nodes, here are the steps to run notebooks on interactive nodes.

  1. Once you log in to HPC3, get an interactive node using the srun command. The example below will give 1 CPU and the default 3 GB of memory. For most cases this is sufficient.

    [user@login-x:~]$ srun -p free --pty /bin/bash -i

    In some instances, users need to request more memory, which is done with the --mem= or --mem-per-cpu= directives (see more info in requesting memory), and to specify that all cores should be on a single node. For example, to get 20 GB for your request do:

    [user@login-x:~]$ srun -p free --mem=20G --nodes=1 --pty /bin/bash -i

    Please note, the above requests are for the free partition; depending on what work you do, you may need the standard partition.

  2. After executing the srun command you will be put on a compute node of the cluster. Take note of the host name; it is usually part of your shell prompt. If unsure, simply execute this command to find out:

    [user@hpc3-14-00:~]$ hostname -s

    In this example the node is hpc3-14-00.

  3. Load the anaconda module (whichever one you are using), for example:

    [user@hpc3-14-00:~]$ module load anaconda/2020.07
  4. Before starting the notebook, check if the port you want to use is free. Use any high number above 6000. For example, to check if port 8989 is free, run:

    [user@hpc3-14-00:~]$ ss -l -n | grep 8989

    If the port is free, there will be no output from the command. If there is output, the port is in use; pick another one and check again.

  5. Start the notebook with the --ip, --port, and --no-browser options. For the IP, the following command will automatically fill in the correct hostname, for example:

    [user@hpc3-14-00:~]$ jupyter notebook --no-browser --ip=$(hostname -s) --port=8989

    There will be output from the notebook command, including a localhost string and a token. Take note of this string and copy it; it will be used in the browser on your laptop in the last step. The string will look similar to:

    http://127.0.0.1:8989/?token=ae4ebf3bwd456780a047254898fabsd8234gefde11bb
  6. On your laptop, connect to the cluster in a terminal window using an ssh tunnel and the information from your Jupyter notebook startup. For our example, with host hpc3-14-00 and port 8989, you will do:

    ssh -L 8989:hpc3-14-00:8989 user@hpc3.rcic.uci.edu

    where "user" is your account (UCInetID). When asked for a password, use your usual credentials. Note, the first occurrence of the port in the ssh command is the port on your laptop, and the second is the port on the cluster node. They do not need to be the same; you can pick another free port on your laptop (see your laptop documentation for how to do this).

  7. On your laptop, open your browser and paste the string that was produced when you started your Jupyter instance into the URL address area:

    http://127.0.0.1:8989/?token=ae4ebf3bwd456780a047254898fabsd8234gefde11bb

    Your jupyter notebook will be running in the browser on your laptop.

5.3. Matlab

  1. Single core/CPU example matlab-single-cpu.sub:

    #!/bin/bash
    
    #SBATCH -p standard           ## run on the standard partition
    #SBATCH -N 1                  ## run on a single node
    #SBATCH -n 1                  ## request one task (one CPU)
    #SBATCH -t 5-00:00:00         ## 5-day run time limit
    
    module load MATLAB/R2020a
    matlab -nodesktop -nosplash -singleCompThread -r mycode -logfile mycode.out

    The above will submit the Matlab code mycode.m with the specified resources. Note: because the default is one CPU per task, -n 1 can be thought of as requesting just one CPU.

    The equivalent command-line method:

    [user@login-x:~]$ module load MATLAB/R2020a
    [user@login-x:~]$ sbatch -p standard -N 1 -n 1 -t 05-00:00:00 \
     --wrap="matlab -nodesktop -nosplash -singleCompThread -r mycode -logfile mycode.out"
  2. Multiple core/CPU example matlab-multi-cpu.sub:

    #!/bin/bash
    
    #SBATCH -p standard      ## run on the standard partition
    #SBATCH -N 1             ## run on a single node
    #SBATCH -n 12            ## request 12 tasks (12 CPUs)
    #SBATCH -t 02-00:00:00   ## 2-day run time limit
    
    module load MATLAB/R2020a
    matlab -nodesktop -nosplash -singleCompThread -r mycode -logfile mycode.out

    The above will submit the Matlab code mycode.m with the specified resources. Note: because the default is one CPU per task, -n 12 can be thought of as requesting 12 CPUs.

    The equivalent command-line method:

    [user@login-x:~]$ module load MATLAB/R2020a
    [user@login-x:~]$ sbatch -p standard -N 1 -n 12 -t 02-00:00:00 \
    --wrap="matlab -nodesktop -nosplash -singleCompThread -r mycode -logfile mycode.out"

5.4. R

  1. Single core/CPU example R-single-cpu.sub:

    #!/bin/bash
    
    #SBATCH -p standard  ## run on the standard partition
    #SBATCH -N 1         ## run on a single node
    #SBATCH -n 1         ## request 1 task (one CPU)
    #SBATCH -t 1-        ## 1-day run time limit
    
    module load R/3.6.2
    R CMD BATCH --no-save mycode.R

    The above will submit the R code mycode.R with the specified resources. Note: because the default is one CPU per task, -n 1 can be thought of as requesting just one CPU.

    The equivalent command-line method:

    [user@login-x:~]$ module load R/3.6.2
    [user@login-x:~]$ sbatch -p standard -N 1 -n 1 -t 1- --wrap="R CMD BATCH --no-save mycode.R"
  2. Multiple core/CPU example R-multi-cpu.sub:

    Nearly all R jobs will only use a single core. Please make sure your job is multi-threaded or is explicitly using R parallel libraries.

    #!/bin/bash
    
    #SBATCH -p standard  ## run on the standard partition
    #SBATCH -N 1         ## run on a single node
    #SBATCH -n 12        ## request 12 tasks (12 CPUs)
    #SBATCH -t 00:20:00  ## 20 min run time limit
    
    module load R/3.6.2
    R CMD BATCH --no-save mycode.R

    The above will submit the R code mycode.R with the specified resources. Note: because the default is one CPU per task, -n 12 can be thought of as requesting 12 CPUs.

    The equivalent command-line method:

    [user@login-x:~]$ module load R/3.6.2
    [user@login-x:~]$ sbatch -p standard -N 1 -n 12 -t 00:20:00 \
    --wrap="R CMD BATCH --no-save mycode.R"

5.5. Rstudio

There are a couple of ways to run Rstudio.

  1. Windows users: this method usually works for users who connect to the cluster using MobaXterm.

    Once logged in, claim an interactive session, load the Rstudio module and the R module of your choice, and start Rstudio via:

    [user@login-x:~]$ srun -p free --pty --x11 /bin/bash -i   # claim an interactive session
    [user@login-x:~]$ module load rstudio/1.4.1106            # load rstudio module
    [user@login-x:~]$ module load R/4.0.2                     # load R module
    [user@login-x:~]$ rstudio                                 # start Rstudio
  2. Mac users: Your local Mac needs to have XQuartz installed. This is a standard application that provides the X Window System for macOS. If you don’t have XQuartz installed, follow the Mac installation guide for installing applications.

    • Log in to the cluster using X forwarding. This means using the -X or -X -Y option in the ssh command. For example:

    ssh -X panteater@hpc3.rcic.uci.edu
    • Once logged in, claim an interactive session and load the Rstudio and R modules. Enforce the software rendering engine in the rstudio command:

      [user@login-x:~]$ srun -p free --pty --x11 /bin/bash -i   # claim an interactive session
      [user@login-x:~]$ module load rstudio/1.4.1106            # load rstudio module
      [user@login-x:~]$ module load R/4.0.2                     # load R module
      [user@login-x:~]$ QMLSCENE_DEVICE=softwarecontext rstudio # enforce rendering in rstudio
  3. All users: If the above method does not work for you (common for Mac users), the alternative way is to use our Jupyterhub portal and a container with Rstudio. Please see Jupyter portal for step by step instructions.

5.6. Python

  1. Single core/CPU example python-single-cpu.sub:

    #!/bin/bash
    
    #SBATCH -p standard                   ## run on the standard partition
    #SBATCH -N 1                          ## run on a single node
    #SBATCH -n 1                          ## request 1 task (1 CPU)
    #SBATCH -t 2:00:00                    ## 2 hr run time limit
    #SBATCH --mail-type=end               ## send email when the job ends
    #SBATCH --mail-user=UCInetID@uci.edu  ## use this email address
    
    module load python/3.8.0
    python myscript.py

    The above will submit the Python 3 code myscript.py with the specified resources. Make sure to use your actual UCI-issued email address. While Slurm sends emails to any email address, we prefer you use your UCInetID@uci.edu email address. System administrators will use UCInetID@uci.edu if they need to contact you about a job.

    Note: Because the default is one CPU per task, -n 1 can be thought of as requesting just one CPU.

    The equivalent command-line method:

    [user@login-x:~]$ module load python/3.8.0
    [user@login-x:~]$ sbatch -p standard -N 1 -n 1 -t 2:00:00 --mail-type=end \
    --mail-user=UCInetID@uci.edu --wrap="python3 myscript.py"
  2. Multiple core/CPU example python-multi-cpu.sub:

    #!/bin/bash
    
    #SBATCH -p standard                   ## run on the standard partition
    #SBATCH -N 1                          ## run on a single node
    #SBATCH -n 1                          ## request 1 task
    #SBATCH -c 12                         ## request 12 CPUs
    #SBATCH -t 5-                         ## 5 day run time limit
    
    module load python/3.8.0
    python myscript.py

    The above will submit the Python 3 code myscript.py with the specified resources. Note: The default setting of one CPU per task is not applicable here because -c overrides that default.

    The equivalent command-line method:

    [user@login-x:~]$ module load python/3.8.0
    [user@login-x:~]$ sbatch -p standard -N 1 -n 1 -c 12 -t 5- --wrap="python myscript.py"

5.7. Tensorflow

  1. Tensorflow CPU only example tensorflow-cpu.sub:

    #!/bin/bash
    
    #SBATCH -p free          ## run on the free partition
    #SBATCH -N 1             ## run on a single node
    #SBATCH -n 1             ## request 1 task
    #SBATCH -t 02-00:00:00   ## 2-day run time limit
    
    module load tensorflow/2.0.0
    python mycode.py

    The above will submit the Tensorflow 2.0 job mycode.py with the specified resources. Note: Because the default is one CPU per task, -n 1 can be thought of as requesting just one CPU or core.

    The equivalent command-line method:

    [user@login-x:~]$ module load tensorflow/2.0.0
    [user@login-x:~]$ sbatch -p free -N 1 -n 1 -t 02-00:00:00 --wrap="python mycode.py"
  2. Tensorflow GPU example tensorflow-gpu.sub:

    #!/bin/bash
    
    #SBATCH -p gpu              ## run on the gpu partition
    #SBATCH -N 1                ## run on a single node
    #SBATCH -n 1                ## request 1 task
    #SBATCH -t 02-00:00:00      ## 2-day run time limit
    #SBATCH --gres=gpu:V100:1   ## request 1 gpu of type V100
    
    module load tensorflow/2.0.0
    python mycode.py

    The above will submit the GPU Tensorflow 2.0 job mycode.py with the specified resources. Note: Because the default is one CPU per task, -n 1 can be thought of as requesting just one CPU or core.

    The equivalent command-line method:

    [user@login-x:~]$ module load tensorflow/2.0.0
    [user@login-x:~]$ sbatch -p gpu -N 1 -n 1 -t 02-00:00:00 \
    --gres=gpu:V100:1 --wrap="python mycode.py"

5.8. SAS

Single core/CPU example sas-single-cpu.sub

#!/bin/bash

#SBATCH -p standard                   ## run on the standard partition
#SBATCH -N 1                          ## run on a single node
#SBATCH -n 1                          ## request 1 task (1 CPU)
#SBATCH -t 00:60:00                   ## 60 min run time limit
#SBATCH --mail-type=end               ## send email when the job ends
#SBATCH --mail-user=UCInetID@uci.edu  ## use this email address

module load SAS/9.4
sas -noterminal mycode.sas

The above will submit your SAS code mycode.sas with the specified resources. Make sure to use your actual email address. While Slurm sends emails to any email address, we prefer you use your UCInetID@uci.edu email address. System administrators will use UCInetID@uci.edu if they need to contact you about a job. Note: Because the default is one CPU per task, -n 1 can be thought of as requesting just one CPU.

The equivalent command-line method:

[user@login-x:~]$ module load SAS/9.4
[user@login-x:~]$ sbatch -p standard -N 1 -n 1 -t 00:60:00 --mail-type=end \
--mail-user=UCInetID@uci.edu --wrap="sas -noterminal mycode.sas"

5.9. Stata

  1. Single core/CPU example stata-single-cpu.sub

    #!/bin/bash
    
    #SBATCH -p standard                   ## run on the standard partition
    #SBATCH -N 1                          ## run on a single node
    #SBATCH -t 72:00:00                   ## 3-day run time limit
    #SBATCH -n 1                          ## request 1 task (1 CPU)
    
    module load stata/16
    stata-se -b do mycode.do

    The above will submit the Stata job (mycode.do) with the specified resources. Note: Because the default is one CPU per task, -n 1 can be thought of as requesting just one CPU.

    The equivalent command-line method:

    [user@login-x:~]$ module load stata/16
    [user@login-x:~]$ sbatch -p standard -N 1 -t 72:00:00 -n 1 --wrap="stata-se -b do mycode.do"
  2. Multiple core/CPU example stata-multi-cpu.sub

    #!/bin/bash
    
    #SBATCH -p standard                   ## run on the standard partition
    #SBATCH -N 1                          ## run on a single node
    #SBATCH -t 01-00:00:00                ## 1-day run time limit
    #SBATCH -n 8                          ## request 8 tasks (8 CPUs)
    
    module load stata/16
    stata-mp -b do mycode.do

    The above will submit the Stata job (mycode.do) with the specified resources. Note: Because the default is one CPU per task, -n 8 can be thought of as requesting 8 CPUs.

    The equivalent command-line method:

    [user@login-x:~]$ module load stata/16
    [user@login-x:~]$ sbatch -p standard -N 1 -t 01-00:00:00 -n 8 \
    --wrap="stata-mp -b do mycode.do"