3. DFS
3.1. Overview
DFS is network-based, multi-petabyte storage put in place so that researchers across UCI have a reliable and resilient location to store their research data and share it with defined groups.
Parallel file systems running BeeGFS on top of ZFS provide scalable data storage on HPC3 under the /dfsX and /pub file paths.
Warning
DFS filesystems must not be used to store personally identifiable information that falls under guidelines such as FERPA (student data) and HIPAA (health-care data).
If you are unsure whether DFS is suitable for your data, please refer to the general guidance for data security provided by the UCI Office of Research.
- Performance:
Each file system delivers good throughput (5-6 GByte/s) when used properly, but a single user can fairly easily exceed its inherent capabilities and wreck performance for everyone.
Together, the multiple DFS systems provide an aggregate throughput of more than 30 GByte/s.
- Take-home concepts about DFS parallel file systems:
They perform well when reading/writing large files in good-sized (> 128KB) chunks.
They perform very poorly when reading/writing many small files (see the packing example after this list).
All DFS systems are single-copy, high-performance storage intended for scratch data. No special backup is available. Deleted data are gone forever.
They are accessible only from HPC3, as regular filesystems for storing data on the cluster. There are a few separate storage pools, mounted on the cluster as /dfsX.
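To sidestep the many-small-files penalty, it often helps to bundle small files into a single archive before writing them to DFS. A minimal sketch (the many-small-files directory and results.tar archive names are hypothetical):
$ # one large sequential write instead of thousands of small ones
$ tar -cf /pub/panteater/results.tar ./many-small-files/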
For recharge allocations please see Purchase DFS storage.
3.2. Allocations
There is no separate account for DFS filesystems. Quotas are enforced using groups.
- No cost allocation - Private area
All users have access to the Private Area. Each user gets a fixed default allocation:
1TB quota per account in /pub/UCInetID.
1TB quota for Selective Backup.
- Recharge allocation - Group shared area
UCI Faculty members (PIs) can purchase low-cost recharge allocation(s) to fulfill their needs:
These group areas are quota allocations in /dfsX/group-lab-path based on the PI's purchase.
The PI is the storage owner:
can specify additional users who will have read and write access to the area.
Users can request to be added to the PI's group shared area; you can check your current memberships as shown below.
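To see which UNIX groups your account belongs to, and hence which group shared areas you may write in, you can list your memberships with the groups command (the output below is illustrative):
$ groups
panteater bio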
3.3. Storing Files
- What to Store
Any frequently changing files
Any large input data files that are used for computational jobs
Transient job input/output/error files
Large user-authored or third-party software installations
- Where to Store
Pick a location depending on the type of data (private or group access):
- /pub/UCInetID
is a unique Private area, never shared with other users
the organization of files and directories is up to the user
do NOT change this directory's permissions
- /dfsX/<group-lab-path>
is a specific Group shared area; users may have access to one or more group areas
the organization of files and directories is up to the group members
all group members have read and write access
do NOT change directory permissions or sticky bit settings; see the warning below
- File permissions
- File permissions are used in determining quotas. The permissions involve setting logical UNIX groups.
Warning
When we create Private areas and Group shared areas on DFS, we set correct permissions on the top-level directories.
Each Group shared area is initially configured with the group sticky bit set so that only allowed users can access this area.
We advise users NOT to change permissions on the directories and files when writing in the group area. Incorrect permissions lead to quota exceeded errors.
Please see UNIX primer to learn about UNIX groups and understand UNIX File permissions.
3.4. Quotas
- When writing in Private area:
Every user has a default personal group which is the same as their UCInetID (login).
1TB personal group quota on /pub/UCInetID.
1TB Selective Backup quota (a default for each user).
- When writing in Group shared area:
All members of the group contribute to the quota. It's the sum total usage that counts.
There are no individual user quotas; only the group quota is used.
If you create a file with the incorrect group, you will likely see over-quota errors.
When quotas are exceeded, all users in the group will no longer be able to write in the affected filesystem and will need to remove some files and directories to free space.
Important
Users can't change quotas and can't request quotas. A PI can submit a ticket asking to update the quota based on purchasing. Please see Purchase DFS storage.
3.4.1. How to check
For all DFS file systems, including Selective Backup, one can use the dfsquotas command to check user/group quotas on a particular DFS pool.
To see the quotas for user panteater on the private allocation in /dfs6:
$ dfsquotas panteater dfs6
==== Group quotas on dfs6 for user panteater
----------------------------------------------------------------------------
Group                  ||           Size           ||     Chunk Files
name          | id     ||    used    |  allocated  ||   used  | allocated
----------------------------------------------------------------------------
panteater_lab | 012345 || 26.25 GiB  | 1024.00 GiB || 1310459 | unlimited
alpha_users   | 158537 ||     0 Byte | 1024.00 GiB ||       0 | unlimited
panteater     | 000865 || 755.59 GiB | 1024.00 GiB ||  258856 | unlimited
The above shows that user panteater can write in the private area /pub/panteater using the groups listed in the output:
panteater_lab: a supplementary group; the user wrote 26.25 GiB of data.
alpha_users: a supplementary group; the user wrote no files, but can if needed.
panteater: the default group; the user wrote ~756 GiB of data.
Note
Groups listed in the output are logical UNIX groups associated with a user account. The primary use of such groups is to assign group ownership of files and directories. The 1TB quota allocation is the total space that can be used by all listed UNIX groups combined, not by each group individually (1TB = 1024GB).
- To see the quotas for user panteater in the lab shared allocation in /dfs9:
$ dfsquotas panteater dfs9
==== Group quotas on dfs9 for user panteater
----------------------------------------------------------------------------
Group                  ||          Size          ||     Chunk Files
name          | id     ||   used    | allocated  ||   used  | allocated
----------------------------------------------------------------------------
panteater_lab | 012345 || 38.36 TiB |  40.00 TiB || 1310459 | unlimited
alpha_users   | 158537 ||    0 Byte |     1 Byte ||       0 | 1
panteater     | 000865 ||    0 Byte |     1 Byte ||       0 | 1
The above shows that user panteater can write in the group allocation on dfs9
only when using the UNIX group panteater_lab, for which there is a 40TB allocation. Currently, the space used by all users allowed to write in this area is 38.36 TiB.
There is a zero quota (shown as 1 byte) for the default personal group panteater and the supplemental group alpha_users. If a user tries to write using these UNIX groups, it will result in permission and over-quota errors.
To see the quotas on all DFS filesystems:
$ dfsquotas panteater all
The output will show information for all available DFS filesystems. When a user has no quota on a particular filesystem, it will show as No quotas to report.
For more info on using this command, try:
$ dfsquotas -h
3.4.2. Over quotas
When the quota is filled, users will not be able to write any files or directories, and submitted jobs will fail with quota exceeded errors.
Quota is enforced by the file system based upon the UNIX group ownership of a particular file:
For example, a listing of the current directory shows:
$ ls -l
total 55524423
drwxrwsr-x 7 panteater bio                7 Aug  5  2019 biofiles
-rw-r--r-- 1 panteater panteater 4294967296 May 31  2019 performance.tst
drwxrwsr-x 3 panteater panteater          2 Oct  8 17:11 myfiles
The user panteater is storing files under two different groups:
bio: the subdirectory biofiles and its contents are charged to the group bio quota.
panteater: the file performance.tst and the subdirectory myfiles with its contents are charged to the group panteater quota.
Examine the permissions of the directories: drwxrwsr-x. Notice the s in the group permissions (character positions 5-7). This is called the sticky bit for the directory. The difference is subtle but important: s instead of x in the group execute position. Compare to permissions without the sticky bit:
Sticky bit | Directory mode | Description
---|---|---
is set | drwxrwsr-x | In the origin directory, created files and directories are written with the group permissions rws. The sticky bit s is set.
is NOT set | drwxrwxr-x | In the origin directory, created files and directories are written with the active UNIX group permissions rwx, which defaults to the user login.
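To check whether the sticky bit is set on a given directory, one can inspect its mode and group directly; a quick sketch (the /dfs6/panteater-lab path is hypothetical):
$ stat -c '%A %G %n' /dfs6/panteater-lab     # look for 's' in the group triplet
drwxrwsr-x panteater_lab /dfs6/panteater-lab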
The UNIX command newgrp can be used to change the active UNIX group.
For example, the user panteater by default has a group panteater. The following sequence of simple commands shows the ownership of files created under different groups, and how it changes when using the newgrp command:
$ id panteater                # list user and group IDs
uid=1234567(panteater) gid=1234567(panteater) groups=1234567(panteater),158571(bio)
$ touch aaa                   # create a new empty file
$ ls -l aaa                   # check file permissions
-rw-rw-r-- 1 panteater panteater 0 Nov  3 14:57 aaa
$ newgrp bio                  # change to a new group
$ touch bbb                   # create a new empty file
$ ls -l bbb                   # check file permissions
-rw-rw-r-- 1 panteater bio 0 Nov  3 14:57 bbb
Please type man newgrp to learn about this command.
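When a single command, rather than an interactive shell, needs to run under a different group, the sg utility can be used instead of newgrp. A sketch, assuming sg (from shadow-utils) is available on the login nodes; the group bio and the paths are placeholders:
$ sg bio -c 'cp -r dataset /dfs9/bio-lab/'   # run one command with active group bio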
- Reasons for Over Quota
Under normal operation, when the sticky bit is set on a directory, correct quota enforcement occurs automatically because files and subdirectories are written with the correct group; no newgrp command is needed. When all allocated space is used, an over-quota error is issued.
Very common quota problems on DFS result from:
inadvertently removing the sticky bit on a directory and then writing with the default personal group.
changing the group ownership of a file or directory and then trying to write to it with the default personal group.
In these cases, writing files and running jobs will fail with quota exceeded errors.
Transferring data to HPC3 with software that explicitly sets permissions is the most common way a sticky bit becomes overwritten.
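To spot files that are being charged to the wrong quota in a group area, one can search for anything not owned by the lab group; a sketch, where the path and group name are placeholders:
$ find /dfs9/panteater-lab -not -group panteater_lab -print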
Note
Please see Data transfer for information on how to move data to the cluster.
3.4.3. Fix over quotas
- Fixing Permissions
You can use the chmod command to fix directories that don't have a sticky bit set but should have. The following command will add the sticky bit to a particular directory:
$ chmod g+s directory-name
You can use the find command to locate all directories in a subtree and combine it with the chmod command to set the sticky bit on all found directories:
$ find . -type d -exec chmod g+s {} \; -print
- Fixing Group Ownership
You can also use the chgrp and chown commands to change the group ownership of a file or directory. For example, to change the group from panteater to bio on a specific file and directory:
$ ls -l
total 55524423
drwxrwsr-x 7 panteater bio                7 Aug  5  2019 biofiles
-rw-r--r-- 1 panteater panteater 4294967296 May 31  2019 performance.tst
drwxrwsr-x 3 panteater panteater          2 Oct  8 17:11 myfiles
$ chgrp bio performance.tst
$ chown -R panteater:bio myfiles
$ ls -l
total 55524423
drwxrwsr-x 7 panteater bio          7 Aug  5  2019 biofiles
-rw-r--r-- 1 panteater bio 4294967296 May 31  2019 performance.tst
drwxrwsr-x 3 panteater bio          2 Oct  8 17:11 myfiles
The ls -l command is used to show permissions before and after the change.
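To fix group ownership across an entire subtree in one pass, find can be combined with chgrp, mirroring the sticky-bit fix above (a sketch; the group name panteater_lab is a placeholder):
$ find . -not -group panteater_lab -exec chgrp panteater_lab {} \; -print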
3.5. Selective Backup
We cannot back up everything on the cluster. Selective Backup allows users to choose what is important and have it automatically saved. The physical location of the backup server is different from the cluster location, for extra protection.
Important
You will want to back up only critical data such as scripts, programs, etc.
DO NOT back up data you can get from other sources, especially large data sets.
If you go past your backup quota, backups stop for your account: the backup will fail because no new data can be written to the backup server once you reach your limit.
3.5.1. Default settings
The Selective Backup is based on rsync in conjunction with GNU Parallel. The combination maximizes the network throughput and server capabilities in order to back up hundreds of user accounts from multiple public and private filesystems.
The Selective Backup process will automatically start saving your home directory as well as some public and private disk spaces.
Note
Users manage their Selective Backup via two control files located in their $HOME directory:
.hpc-selective-backup lists (1) backup options and (2) the names of files/directories to be saved, in order of priority from the most to the least important. All backup options are initially commented out.
The files are backed up in the order in which they are listed. That way, if a user runs out of Selective Backup disk quota before all listed files have been backed up, at least their most prized data are saved. By default, this file contains the $HOME and /pub areas of your account:
/data/homezvolX/UCInetID
/pub/UCInetID
The following table lists all available backup options:
Selective Backup Option | What it does
---|---
HPC_SEND_EMAIL_SUMMARY | Sends you daily email summaries of your saves. Default is NO summary email notifications.
HPC_SEND_EMAIL_ON_ERROR | You will receive an email only if rsync completes with an error (a non-zero exit status from rsync). Consult the man rsync page for error values and their meaning. Default is NO email notifications.
HPC_KEEP_DELETED=X | Keep deleted files on the backup server for X days, where X is a number in the 0-90 range. Deleted files are files you removed from the source location. Default is 14 days.
.hpc-selective-backup-exclude lists the file/directory names you want to exclude from backup. By default, this file excludes ZFS snapshots from $HOME:
$HOME/.zfs
For more information on excludes, please see the ANCHORING INCLUDE/EXCLUDE PATTERNS section of the man rsync output.
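As an illustration, a user who wants to skip a conda package cache and scratch data might grow the exclude file to something like the following (the .conda/pkgs and scratch entries are hypothetical examples):
$HOME/.zfs
$HOME/.conda/pkgs
/pub/panteater/scratch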
3.5.2. Custom settings
To customize, edit the control files with your favorite editor. We highly recommend the following:
request email notifications to make sure things are working.
Choose one of the two SEND_EMAIL options in the .hpc-selective-backup file and uncomment it (remove the # sign at the beginning of the line). For example, if you choose to receive email notifications in the event of errors, edit your configuration file and change the line:
# HPC_SEND_EMAIL_ON_ERROR
to:
HPC_SEND_EMAIL_ON_ERROR
perform some spot checks of what you think is being saved to make sure your data is indeed being backed up. A complete example file is sketched after this list.
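A customized .hpc-selective-backup might look like the following (a sketch: the homezvol0 number and the paths under /pub/panteater are hypothetical; options come first, then paths in priority order):
HPC_SEND_EMAIL_ON_ERROR
HPC_KEEP_DELETED=30
/data/homezvol0/panteater
/pub/panteater/scripts
/pub/panteater/thesis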
3.5.3. Where backups are
A user can access backup files on the login nodes of the cluster from the following paths:
Where | What
---|---
/sbak/zvolX/backups/UCInetID/data/homezvolX/UCInetID | user $HOME
/sbak/zvolX/backups/UCInetID/pub/UCInetID | /pub/$USER/
/sbak/zvolX/backups/UCInetID/DELETED-FILES | deleted files by date (counts towards backup quota)
/sbak/zvolX/logs/$DATE/UCInetID | backup logs by date, available for the past Y days
3.6. Deleted Files Recovery
Below is a general procedure for user panteater to restore the accidentally deleted directory spring-2022, and the files in it, from /pub/panteater.
$ cd /sbak/zvol0/backups/panteater/DELETED-FILES # 1
$ find . -type d -name spring-2022 # 2
./2024-0214/pub/panteater/spring-2022
./2024-0213/pub/panteater/spring-2022
$ ls ./2024-0214/pub/panteater/spring-2022/ # 3
schedule1 schedule1.sub slurm.template
$ cp -p -r ./2024-0214/pub/panteater/spring-2022 /pub/panteater # 4
The above commands mean:
1. The cd puts you at the top level of the backup directory for your files.
2. The find locates all backups by date in which the desired directory exists. Here, two snapshots are found, dated 2024-0214 and 2024-0213.
3. Run ls for a specific snapshot to see if it has the needed files.
4. If the needed files exist in the backup, use cp to copy them back to the pub directory. It is recommended to use the -p and -r options:
-p preserves the time stamp and the ownership of a file.
-r copies recursively; this is needed when copying a directory and its contents.
Files and directories deleted from $HOME can be restored in a similar way.
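For instance, to recover a deleted file from $HOME (a sketch based on the paths in the table above; the file name myscript.sh and the zvol0/homezvol0 numbers are hypothetical):
$ cd /sbak/zvol0/backups/panteater/DELETED-FILES    # top of the deleted-files area
$ find . -type f -name myscript.sh                  # locate the file by snapshot date
./2024-0214/data/homezvol0/panteater/myscript.sh
$ cp -p ./2024-0214/data/homezvol0/panteater/myscript.sh $HOME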