1. Beginner guide
This step-by-step beginner guide explains how to get an account, log in, and do a few simple things on the HPC3 cluster.
1.1. Your Laptop
You will need a few applications on your laptop; most are standard:
| Application Type | Your laptop model: Linux | macOS | Windows | Windows 10 |
|---|---|---|---|---|
| VPN access | Install client VPN software per the UCI campus VPN instructions (all laptop models) | | | |
| Terminal | use your favorite | Terminal or iTerm2 | Windows Terminal, Windows Subsystem for Linux or MobaXterm | |
| Secure Connect | ssh | ssh | ssh | |
| Secure Data Transfer | scp or rsync | scp or rsync | Filezilla or WinSCP | scp or rsync (in Windows Subsystem for Linux) |
Attention
MobaXterm users: DO NOT enable Remote monitoring! See MobaXterm monitoring for more info.
1.2. Get an account
Anyone with a valid UCINetID can have an account. Please see Getting an account. Once you have an account you can log in to the HPC3 cluster.
HPC3 is a shared facility. Please read the Acceptable use guide.
1.3. Logging in
The full name of the HPC3 cluster is hpc3.rcic.uci.edu and this is the name that should be used in all connections.
We require multi-factor authentication for all password-based logins. Please make sure you have enabled your DUO device using UCI’s Duo infrastructure.
- To log in directly to the HPC3 cluster you must:
- be connected to the UCI campus VPN (or be on the campus network)
- use the ssh command from your Terminal application
- have a way to use DUO authentication
- Step 1
Connect to the UCI campus VPN; see the UCI campus VPN instructions.
- Step 2
Open your Terminal application on your laptop.
- Step 3
In your Terminal application, start an ssh session and use your regular UCI credentials (UCINetID and password) to connect to the cluster login node.
For example, a user with UCINetID panteater will use the ssh command specifying login name and a cluster name:
$ ssh panteater@hpc3.rcic.uci.edu
When prompted for a password, enter your UCINetID password followed by the Return key. Note that the password is not visible when typed:
Password:
After that you are prompted to enter a code (a backup code or one generated by your DUO device) or to request a push to your enrolled DUO-enabled device. The prompt looks similar to:
Duo two-factor login for panteater
Enter a passcode or select one of the following options:
 1. Duo Push to XXX-XXX-1234
Passcode or option (1-1):
Type the desired option (1 in this example):
Passcode or option (1-1): 1
Now use the DUO app on your phone and respond to the received DUO notification. Press Approve on your DUO app when prompted. If the DUO authentication is successful you will see on your laptop:
Success. Logging you in...
And after a successful login you will see a screen similar to the following:
[ASCII-art banner showing the login node name]
Distro: Rocky 8.7 Green Obsidian
Virtual: NO
CPUs: 40
RAM: 191.8GB
BUILT: 2022-08-30 14:02
ACCEPTABLE USE: https://rcic.uci.edu/documents/RCIC-Acceptable-Use-Policy.pdf
[anteater@login-x:~]$
The text output above is called the motd (message of the day). It includes general information about the cluster login node (we have a few) plus important messages about the cluster, such as pending shutdowns.
The last line of the output, [anteater@login-x:~]$, is your shell prompt; this is where you type commands.
1.4. Simple commands
Users who are unfamiliar with the Linux environment will need to learn the basics of the Bash shell, file editing, and using a language such as R or Python. Please see the Tutorials page, which lists links to various beginner tutorials.
The cluster shell is bash, a command language interpreter that executes commands read from the standard input (what you type). The prompt is provided automatically by the bash shell; you don’t need to type it.
Below is a small set of simple but very useful commands to try. What you type is immediately after the prompt [user@login-x:~]$. Each command returns output that is displayed in your terminal window and will be similar to the following:
[user@login-x:~]$ pwd
/data/homezvol0/panteater
[user@login-x:~]$ date
Mon May 19 12:43:42 PDT 2023
[user@login-x:~]$ hostname
login-i15
[user@login-x:~]$ ls
perl5
[user@login-x:~]$ ls -la
drwx------ 3 panteater panteater 9 Jul 13 00:02 .
drwxr-xr-x 785 root root 785 Jul 16 10:32 ..
-rw-r--r-- 1 panteater panteater 183 Jul 12 14:42 .bash_profile
-rw-r--r-- 1 panteater panteater 541 Jul 12 14:42 .bashrc
-rw-r--r-- 1 panteater panteater 500 Jul 12 14:42 .emacs
-rw-r--r-- 1 panteater panteater 17 Jul 12 14:42 .forward
-rw------- 1 panteater root 1273 Jul 13 00:02 .hpc-selective-backup
-rw------- 1 panteater root 0 Jul 13 00:02 .hpc-selective-backup-exclude
drwxrwxr-x 2 panteater panteater 2 Jun 15 09:48 perl5
- pwd prints the name of the current (working) directory.
- date prints the current date and time in the default format.
- hostname prints the current host name. The cluster has a few login nodes and multiple working nodes; each has its own unique name.
- ls prints directory contents, here of the current directory.
- ls -la adds two optional flags: -l shows detailed info about each file, and -a includes hidden files whose names start with a dot (.).
By default, many commands need no arguments or additional flags, just like most of the examples above. Flags and arguments given to a command make its output more specific, as in the last command above.
To learn about specific commands, consult tutorials or use the manual pages via the man command. For example, to learn more about the ls command type:
[user@login-x:~]$ man ls
and use the space key to scroll through the output on the screen. Press q to leave the manual page.
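The commands above can be tried safely in a scratch directory. The sketch below is an illustration rather than part of the official guide: the names practice, project1, notes.txt, and .hidden_file are made up, and mktemp is assumed to be available, as it is on typical Linux systems.

```shell
# Create a throwaway directory so nothing in your home area is touched.
practice=$(mktemp -d)
cd "$practice"

# Make one directory, one regular file, and one hidden file (name starts with a dot).
mkdir project1
touch project1/notes.txt .hidden_file

pwd        # prints the full path of the scratch directory
ls         # lists project1 only; hidden files are skipped
ls -a      # also lists .hidden_file, plus the . and .. entries
ls -l      # long format: permissions, owner, size, and date for each entry
```

Comparing the output of ls and ls -a shows how a flag changes what a command reports without changing the command itself.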
1.5. GUI
The cluster environment is not well suited for GUI applications. Users type in most commands; there are no clickable icons and no pop-up windows.
1.6. File editing
Users will need to learn one of the file editors: vim or emacs.
Choose the editor that is more intuitive for you.
See the File Editors beginner tutorials; many more are available online.
Important
Please avoid using Special Characters in file and directory names.
1.7. Running applications
The cluster is a shared resource; at any time many users are logged in and hundreds of jobs are running. What you do can adversely affect others.
Please follow Simple conduct to avoid problems.
We use the Slurm scheduler to run CPU-intensive or long-running applications. The in-depth Slurm guide provides extensive info about using the scheduler. This page gives a short summary.
Slurm is an open-source workload manager for Linux clusters.
HPC3 has different kinds of nodes (servers) that are separated into groups according to their resources (memory, CPU, etc.). Slurm uses the term partition to signify a queue of resources; jobs are submitted to partitions.
We have a few partitions; most users will need just these two:
- The standard partition is for jobs that should not be interrupted. Usage is charged against the user’s Slurm bank account. Each user gets a FREE one-time allocation of 1000 core-hours to run jobs here. Users are NOT CHARGED FOR IT.
If the whole allocation is used, users can run jobs in this partition only if they are associated with labs that have core-hours in their lab banks. Usually, a lab bank is a PI lab account.
- The free partition is for jobs that can be preempted (killed) by standard jobs. Users can run jobs in this partition even if they have only 1 core-hour left. There are no charges for using this partition.
1.7.1. Using interactive job
To request an interactive job, use the srun command.
Suppose you are allowed to charge to the panteater_lab account. Then you can start an interactive session using one of 3 methods:
[user@login-x:~]$ srun --pty /bin/bash -i # 1
[user@login-x:~]$ srun -p free --pty /bin/bash -i # 2
[user@login-x:~]$ srun -A panteater_lab --pty /bin/bash -i # 3
The 3 commands above put your job on an available node:
- # 1: in the standard partition using your default Slurm bank account
- # 2: in the free partition using your default Slurm bank account
- # 3: in the standard partition using the panteater_lab account
Once you execute the command, Slurm puts you on a compute node and you will see a new shell prompt in the terminal, for example:
[panteater@hpc3-l18-04:~]$
Now you can run your applications and commands from the command line.
After you are done, log out from the interactive node:
[user@hpc3-l18-04:~]$ logout
This will end your Slurm interactive session and you will return to the terminal window on the login node.
1.7.2. Using batch job
Slurm batch jobs can be submitted to the same queues as interactive jobs.
A batch job is run by the scheduler at some time in the future; the scheduler picks an available time and node. Usually this is within minutes, or as soon as the requested resources become available. Slurm balances resource usage among many users and many jobs.
A user needs to use the sbatch command and a submit script.
In the steps below you will download an example Slurm script and an example Python script, submit the Slurm script to the scheduler, and check the job output file.
All commands are executed on the cluster and all files are downloaded from the web server to the filesystem that is allocated to you on the cluster. The Slurm script and the Python script don’t need editing after the download and can be used as is.
- Step: download an example batch script
Type all 4 commands exactly as they are shown.
[user@login-x:~]$ cd /pub/$USER
[user@login-x:~]$ wget https://rcic.uci.edu/_static/examples/firstjob.sub
[user@login-x:~]$ wget https://rcic.uci.edu/_static/examples/days.py
[user@login-x:~]$ cat firstjob.sub
The commands are:
- cd: to go to your DFS allocation area; here $USER is a shortcut for your UCINetID.
- wget: to download the example Slurm submit script and save it as the firstjob.sub file.
- wget: to download the example Python script and save it as the days.py file. It is a simple Python program that prints today’s date and a random day 1-365 days in the past.
- cat: to show the content of the Slurm script in the Terminal window.
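For orientation, a Slurm submit script generally has the shape sketched below. This is a hypothetical example, not the contents of the real firstjob.sub: the file name myfirst.sub, the job name, and the requested limits are assumptions chosen for illustration. The sketch writes the script into a scratch directory and runs it once with plain bash.

```shell
# Work in a scratch directory; on the cluster you would use /pub/$USER instead.
cd "$(mktemp -d)"

# Hypothetical submit script; NOT the contents of the real firstjob.sub.
cat > myfirst.sub <<'EOF'
#!/bin/bash
#SBATCH --job-name=myfirst      # job name, shown in squeue output
#SBATCH -p free                 # partition to run in
#SBATCH --nodes=1               # run on a single node
#SBATCH --ntasks=1              # run a single task
#SBATCH --time=00:05:00         # wall-clock limit (assumed value)
#SBATCH --output=%x.%j.out      # %x = job name, %j = job ID
#SBATCH --error=%x.%j.err

echo "Running job on host $(uname -n)"
date
EOF

# Slurm reads the #SBATCH lines as options; plain bash treats them as comments,
# so the commands in the script can be checked before submitting it.
bash myfirst.sub
```

Submitting the same file with sbatch myfirst.sub would hand it to the scheduler, which applies the #SBATCH options and writes the output to the files named by the --output and --error patterns.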
- Step: submit job to Slurm scheduler
[user@login-x:~]$ sbatch firstjob.sub
Submitted batch job 5776081
The output shows that the script was submitted as a job with ID 5776081. All job IDs are unique; yours will be different, and the output file name of your job will reflect a different ID.
- Step: Check the job status and output file
This test job will run very quickly (fraction of a second) because it executes a few very fast commands and has no computational component.
[user@login-x:~]$ squeue -u $USER
JOBID PARTITION NAME USER ACCOUNT ST TIME CPUS NODE NODELIST(REASON)
[user@login-x:~]$ ls
firstjob.5776081.err firstjob.5776081.out firstjob.sub
[user@login-x:~]$ cat firstjob.5776081.out
Running job on host hpc3-l18-05
Today is 2021-07-23 and 325 days ago it was 2020-09-01
The commands are:
- squeue: to check the status of your job. When the output shows only the single header line, as here, the job is finished; otherwise there will be info about your job in the output.
- ls: to list the files in the current directory. There will be 2 additional files listed. These are the error/output files produced by the Slurm job, as requested in the submit script.
- cat: to show the contents of the output file in the Terminal window. Here the text shows the output of the commands that were submitted with the firstjob.sub submit script.
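Because the job ID is part of the output file names, it can be handy to pull it out of the line that sbatch prints. The sketch below does this with bash parameter expansion; it assumes the firstjob.JOBID.out and firstjob.JOBID.err naming seen in the listing above, and the variable names are made up for illustration.

```shell
# Line printed by sbatch in the example above.
submit_line="Submitted batch job 5776081"

# The job ID is the last word on that line: strip everything up to the last space.
job_id="${submit_line##* }"

# Assumption: output files follow the naming seen in the ls listing.
out_file="firstjob.${job_id}.out"
err_file="firstjob.${job_id}.err"

echo "job ID:      ${job_id}"
echo "output file: ${out_file}"
echo "error file:  ${err_file}"
```

On the cluster, the first assignment could instead capture the real message with submit_line=$(sbatch firstjob.sub).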
1.8. Storage
The filesystem storage is generally in 3 areas. Please see the links below for detailed information about each filesystem.
- HOME:
All users have a 50GB quota in the $HOME area, which is in /data/homezvolX/ucinetid. Use it for storing important and rarely changed files.
- DFS:
All users have a 1TB quota in the /pub/ucinetid area. Use it for storing data sets, documents, Slurm scripts and job input/output.
Depending on a lab affiliation, some users may have space in additional DFS areas (/dfs2, /dfs3a, etc).
- CRSP:
By default users don’t have access to this area.
Depending on a lab affiliation, some users may have space in /share/crsp/lab/labname/ucinetid. Please see Getting CRSP Account for details.
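To get a rough picture of how much space a directory uses, the standard du and df tools are a reasonable starting point. This is a generic sketch, not an RCIC-specific procedure: the cluster may provide its own quota-reporting commands, and the numbers shown here are not necessarily what the quota system counts.

```shell
# Directory to inspect. On the cluster you might use "$HOME" or "/pub/$USER".
area="."

# Total size of the directory and everything below it, in human-readable units.
du -sh "$area"

# Used and available space on the filesystem that holds the directory.
df -h "$area"
```

Running du on a large directory tree can take a while, so it is best pointed at one project directory at a time.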
1.9. Transferring files
Often users need to bring data from other servers and laptops.
To transfer data one needs to use scp (secure copy) or rsync (file copying tool).
Please see detailed Data transfer examples.
Alternatively, one can use graphical tools on their laptops (Filezilla, MountainDuck, or WinSCP) to transfer files between a local laptop and the cluster. Please follow each program’s instructions on how to do this.
In all of the transfer applications you will need to use hpc3.rcic.uci.edu as the remote server (where you want to transfer your files to/from) and your UCINetID credentials for your user name and password.
- Simple examples of file transfers with scp:
The scp command is used to transfer files and directories between a local laptop and a remote server. The command has a simple structure:
scp OPTIONS SOURCE DESTINATION
We omit OPTIONS for the simple cases.
The SOURCE and DESTINATION may be specified as a local file name, or a remote host with optional path in the form user@server:path where
- user is your account on a cluster
- @server: is the server name delimited with 2 special characters: character @ separates user name from server name, and character : separates server name from path name
- path is a file path on the server
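To make the three parts concrete, the sketch below splits a remote location with bash parameter expansion. It only illustrates the syntax: scp does this parsing itself, and the variable names are made up for the example.

```shell
# A remote location in the user@server:path form used by scp.
remote="panteater@hpc3.rcic.uci.edu:/pub/panteater/myfile.txt"

# Everything before the first @ is the user name.
user="${remote%%@*}"

# Drop the user name and the @, then split the rest at the first colon.
rest="${remote#*@}"
server="${rest%%:*}"
path="${rest#*:}"

echo "user:   ${user}"
echo "server: ${server}"
echo "path:   ${path}"
```

If the @ or the colon is missing, scp reads the argument differently (for example, as a plain local file name), which is a common cause of files ending up in unexpected places.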
File path names can be made explicit using absolute or relative names. For example /Users/someuser/project1/input/my.fasta is an absolute or full name, and the same file can be referred to as my.fasta which is a relative file name when used from the directory where this file is located.
The examples below use the UCINetID panteater; use your own UCINetID credentials (username and password).
To transfer a single file myfile.txt from your laptop to HPC3 and put it in the directory /pub/panteater:
On your laptop, use a Terminal app and change into the directory where your file is located, then execute the scp command (use your UCINetID):
scp myfile.txt panteater@hpc3.rcic.uci.edu:/pub/panteater/myfile.txt
To transfer a single file j-123.fa from HPC3 to your laptop:
On your laptop, use a Terminal app and change into the directory where you want to transfer the file to, then execute the scp command (use your UCINetID):
scp panteater@hpc3.rcic.uci.edu:/pub/panteater/project1/j-123.fa j-123.fa
To transfer multiple files from your laptop to HPC3:
scp f1.py f2.py doc.txt panteater@hpc3.rcic.uci.edu:/pub/panteater
To transfer the /pub/panteater/results/ directory and all its contents from HPC3 to your laptop, into the directory where the command is executed (this creates the top-level results directory with its contents locally on your laptop), use the -r option, which copies directories recursively. Note, the single dot character at the end means copy to the current directory.
scp -r panteater@hpc3.rcic.uci.edu:/pub/panteater/results .
1.10. Logout
You can run many commands and submit many jobs.
After you are done with your work you need to log out from the cluster using the logout or exit command, for example:
[user@login-x:~]$ logout