3. DFS

3.1. Overview

DFS is network-based, multi-petabyte storage that gives researchers across UCI a reliable and resilient location to store their research data and share it with defined groups.

Parallel file systems running BeeGFS on top of ZFS provide scalable data storage on HPC3 in the /dfsX and /pub file paths.

Warning

DFS filesystems must not be used to store personally-identifiable information that would fall under guidelines such as FERPA (Student data) and HIPAA (health-care data).

If you are unsure whether DFS is suitable for your data, please refer to the general guidance for data security provided by the UCI Office of Research.

Performance:
  • Each file system performs well (5-6 GByte/s) when used properly, but a single user can easily exceed its inherent capabilities and wreck performance for everyone.

  • Multiple DFS systems provide an aggregate throughput > 30 GByte/s.

Take-home concepts about DFS parallel file systems are:
  • They perform well when reading/writing large files in good-sized (> 128KB) chunks.

  • They perform very poorly when reading/writing many small files.

  • All DFS systems are single copy, high-performance storage intended for scratch data. No special backup is available. Deleted data are gone forever.

  • DFS is accessible only from HPC3, as a regular filesystem for storing data on the cluster. Several separate storage pools are mounted on the cluster as /dfsX.

  • For recharge allocations please see Purchase DFS storage.
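
One common way to follow the advice above is to bundle many small files into a single archive before placing them on DFS, so that the file system handles one large file instead of many tiny ones. The sketch below is illustrative only: the directory and file names are hypothetical, and a temporary directory stands in for a real /pub or /dfsX path.

```shell
#!/bin/bash
# Sketch: pack many small files into one archive (all paths are hypothetical).
set -eu

workdir=$(mktemp -d)                 # stand-in for a directory under /pub/UCInetID
mkdir "$workdir/small-files"

# Create 100 tiny files to mimic a "many small files" data set.
for i in $(seq 1 100); do
    echo "sample $i" > "$workdir/small-files/part-$i.txt"
done

# Bundle them into one archive: a single larger file instead of 100 small ones.
tar -czf "$workdir/small-files.tar.gz" -C "$workdir" small-files

# List the archive contents without unpacking and count the archived files.
count=$(tar -tzf "$workdir/small-files.tar.gz" | grep -c 'part-.*\.txt$')
echo "archived files: $count"
```

The archive can later be unpacked with tar -xzf in a job's working directory when the individual files are needed.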

3.2. Allocations

There is no separate account for DFS filesystems. Quotas are enforced using groups.

No cost allocation - Private area

All users have access to the Private Area. Each user gets a fixed default allocation:

  • 1TB quota per account in /pub/UCInetID.

  • 1TB backup quota for a selective backup.

Recharge allocation - Group shared area

UCI Faculty members (PIs) can purchase low-cost recharge allocation(s) to fulfill their needs. Please see Purchase DFS storage.

3.3. Storing Files

What to Store
  • Any frequently changing files

  • Any large input data files that are used for computational jobs

  • Transient job input/output/error files

  • Large user-authored or third-party software installations

Where to Store

Pick a location depending on the type of data (private or group access):

/pub/UCInetID
  • is a unique Private area, never shared with other users

  • the organization of files and directories is up to the user

  • do NOT change this directory's permissions

/dfsX/<group-lab-path>
  • is a specific Group shared area, users may have access to one or more group areas

  • the organization of files and directories is up to the group members

  • all group members have read and write access

  • do NOT change directory permissions or sticky bit settings; see the warning below

File permissions
Quotas are determined by the UNIX group ownership of files and directories, so permissions and logical UNIX groups must be set correctly.

Warning

  • When we create Private areas and Group shared areas on DFS we set correct permissions on the top level directories.

  • Each Group shared area is initially configured with the group sticky bit set so that only allowed users can access this area.

  • We advise users to NOT change permissions on the directories and files when writing in the group area. Incorrect permissions lead to quota exceeded errors.

Please see UNIX primer to learn about UNIX groups and understand UNIX File permissions.

3.4. Quotas

All DFS-based file systems have quotas for all private and group shared areas.
Quotas are enforced using UNIX groups.
When writing in Private area:
  • Every user has a default personal group which is the same as their UCInetID (login).

  • 1TB personal group quota on /pub/UCInetID.

  • 1TB selective backup quota (a default for each user).

When writing in Group shared area:
  • All members of the group contribute to the quota. It’s the sum total usage that counts.

  • There are no individual user quotas, only the group quota is used.

  • If you create a file with the incorrect group, you will likely see over-quota errors.

  • When quotas are exceeded, all users in the group will no longer be able to write in the affected filesystem and will need to remove some files and directories to free space.

Important

Users cannot change or request quotas. A PI can submit a ticket asking to update the quota based on a purchase. Please see Purchase DFS storage.

3.4.1. How to check

For all DFS file systems, including selective backup, use the dfsquotas command to check user/group quotas on a particular DFS pool.

To see the quotas for user panteater on private allocation in /dfs6:

$ dfsquotas panteater dfs6
==== Group quotas on dfs6 for user panteater
----------------------------------------------------------------------------
               Group      ||          Size            ||    Chunk Files
            name | id     ||       used | allocated   ||    used | allocated
----------------------------------------------------------------------------
   panteater_lab | 012345 ||  26.25 GiB | 1024.00 GiB || 1310459 | unlimited
     alpha_users | 158537 ||     0 Byte | 1024.00 GiB ||       0 | unlimited
       panteater | 000865 || 755.59 GiB | 1024.00 GiB ||  258856 | unlimited

The above shows that user panteater can write in their private area /pub/panteater using the groups listed in the output:

  • panteater_lab: a supplementary group; user wrote 26.25 GiB of data.

  • alpha_users: a supplementary group; user wrote no files, but can if needed.

  • panteater: a default group; user wrote ~756 GiB of data.

Note

Groups listed in the output are logical UNIX groups associated with a user account. The primary use of such groups is to assign group ownership of files and directories. The 1TB quota allocation is the total space that can be used by all listed user UNIX groups combined, not by each group individually (1 TiB = 1024 GiB).
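
As a worked example of the combined quota described in the note, the sketch below adds up the "used" column from the dfs6 output and subtracts it from the 1024 GiB allocation. The numbers are taken from the sample output; the variable names are illustrative only.

```shell
#!/bin/bash
# Sketch: remaining private-area space, using the "used" values from the dfs6 sample output.
quota_gib=1024.00                    # 1 TiB default allocation, expressed in GiB
used_gib="26.25 0 755.59"            # panteater_lab, alpha_users, panteater

# Sum the per-group usage and subtract it from the shared allocation.
remaining=$(echo "$used_gib" | awk -v q="$quota_gib" '{ for (i = 1; i <= NF; i++) s += $i; printf "%.2f", q - s }')
echo "remaining: $remaining GiB"     # prints "remaining: 242.16 GiB"
```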

To see the quotas for user panteater in lab shared allocation in /dfs9:

$ dfsquotas panteater dfs9
==== Group quotas on dfs9 for user panteater
----------------------------------------------------------------------------
               Group      ||          Size            ||    Chunk Files
            name | id     ||       used | allocated   ||    used | allocated
----------------------------------------------------------------------------
   panteater_lab | 012345 ||  38.36 TiB |   40.00 TiB || 1310459 | unlimited
     alpha_users | 158537 ||     0 byte |      1 byte ||       0 |         1
       panteater | 000865 ||     0 byte |      1 byte ||       0 |         1

The above shows that user panteater can write in the group allocation on dfs9:

  • only when using the UNIX group panteater_lab, which has a 40 TiB allocation. All users allowed to write in this area currently use 38.36 TiB combined.

  • there is 0 quota (shown as 1 byte) for the default personal group panteater and the supplementary group alpha_users. Writing with these UNIX groups results in permission and over-quota errors.
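
To see at a glance how close each group is to its limit, the size columns can be turned into percentages. The sketch below is a hedged illustration, not a cluster-provided tool: it assumes the output layout matches the dfs9 sample, embeds that sample as text, and only compares values reported in the same unit.

```shell
#!/bin/bash
# Sketch: percent of quota used per group, parsed from dfsquotas-style output.
# The sample rows are based on the dfs9 example; the parsing rules are assumptions.
sample='   panteater_lab | 012345 ||  38.36 TiB |   40.00 TiB || 1310459 | unlimited
     alpha_users | 158537 ||     0 byte |      1 byte ||       0 |         1
       panteater | 000865 ||     0 byte |      1 byte ||       0 |         1'

report=$(printf '%s\n' "$sample" | awk -F'|' '
    {
        split($4, used, " ")         # e.g. "38.36" "TiB"
        split($5, alloc, " ")        # e.g. "40.00" "TiB"
        gsub(/ /, "", $1)            # group name without padding
        # Skip header lines and rows whose used/allocated units differ.
        if (used[1] ~ /^[0-9.]+$/ && alloc[1] ~ /^[0-9.]+$/ && alloc[1] > 0 &&
            tolower(used[2]) == tolower(alloc[2]))
            printf "%s %.1f%%\n", $1, 100 * used[1] / alloc[1]
    }')
echo "$report"
```

On the cluster, the same awk program could be fed from the live command (for example, dfsquotas panteater dfs9 piped into awk) instead of the embedded sample, provided the real output keeps this column layout.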

To see the quotas on all DFS filesystems:

$ dfsquotas panteater all

The output will show information for all available DFS filesystems. When a user has no quota on a particular filesystem it will show as No quotas to report.

For more info on using this command, try:

$ dfsquotas -h

3.4.2. Over quotas

When a quota is filled, users cannot write any files or directories, and submitted jobs fail with quota exceeded errors.

Quota is enforced by the file system based upon the UNIX group ownership of a particular file.

For example, a listing of a current directory shows:

$ ls -l
total 55524423
drwxrwsr-x  7 panteater bio                 7 Aug  5  2019 biofiles
-rw-r--r--  1 panteater panteater  4294967296 May 31  2019 performance.tst
drwxrwsr-x  3 panteater panteater           2 Oct  8 17:11 myfiles

The user panteater is storing files under two different groups:

  • bio: the files in the subdirectory biofiles with its content are charged to the group bio quota.

  • panteater: file performance.tst and subdirectory myfiles with its content are charged to the group panteater quota.

Examine the permissions of the directories: drwxrwsr-x. Notice the s in the group execute permission (the group permissions occupy character positions 5-7). This is the group sticky bit for the directory, known in standard UNIX terminology as the setgid bit. The difference is subtle but important: s instead of x in the group execute permission. Compare to permissions without the sticky bit:

Sticky bit | Directory mode | Description
-----------+----------------+------------
is set     | drwxrwsr-x     | Files and directories created in this directory inherit the directory's group. The group permissions are rws: the sticky bit s is set.
is NOT set | drwxrwxr-x     | Files and directories created in this directory are written with the user's active UNIX group, which defaults to the user login. The group permissions are rwx.
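
The behavior summarized in the table can be observed directly. The sketch below is a minimal illustration in a scratch directory, assuming GNU coreutils stat (for the -c option); the directory and file names are hypothetical.

```shell
#!/bin/bash
# Sketch: show the mode string and the inherited group for a directory with the group s bit.
# Assumes GNU coreutils stat (-c option); names are hypothetical.
set -eu

demo=$(mktemp -d)
mkdir "$demo/shared"
chmod 2775 "$demo/shared"            # rwxrwsr-x: same result as chmod g+s on a 775 directory

touch "$demo/shared/newfile"         # a file created inside the directory

dir_mode=$(stat -c '%A' "$demo/shared")
dir_group=$(stat -c '%G' "$demo/shared")
file_group=$(stat -c '%G' "$demo/shared/newfile")

echo "directory mode:  $dir_mode"
echo "directory group: $dir_group"
echo "new file group:  $file_group"
```

In a real group area the directory group would be the lab group, and the new file would carry that group even though the user's active group is their personal one.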

The UNIX command newgrp can be used to change the active UNIX group.

For example, the user panteater has the default group panteater. The following sequence of commands shows the group ownership of files created before and after changing the active group with newgrp.

$ id panteater                  # list user and group IDs
uid=1234567(panteater) gid=1234567(panteater) groups=1234567(panteater),158571(bio)
$ touch aaa                     # create a new empty file
$ ls -l aaa                     # check file permissions
-rw-rw-r-- 1 panteater panteater 0 Nov  3 14:57 aaa

$ newgrp bio                    # change to a new group
$ touch bbb                     # create a new empty file
$ ls -l bbb                     # check file permissions
-rw-rw-r-- 1 panteater bio 0 Nov  3 14:57 bbb

Please type man newgrp to learn about this command.

Reasons for Over Quota
  1. Under normal operation, when the sticky bit is set on a directory, quota enforcement works automatically because files and subdirectories are written with the correct group; no newgrp command is needed. When all allocated space is used, an over-quota error is issued.

  2. Very common quota problems on DFS result from:

    • inadvertently removing the sticky bit on a directory and then writing with the default personal group.

    • changing the group ownership of a file or directory and then trying to write to it with the default personal group.

    In these cases, writing files and running jobs will fail with quota exceeded errors.

  3. Transferring data to HPC3 with software that explicitly sets permissions is the most common way a sticky bit becomes overwritten.

    Note

    Please see Data transfer for information on how to move data to the cluster.
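
Both common causes listed above can be detected with find before they trigger quota errors. The sketch below is one possible audit, not a cluster-provided command: audit_tree is a hypothetical helper, and the demonstration runs in a scratch directory using the caller's current group.

```shell
#!/bin/bash
# Sketch: audit a directory tree for the two common causes of over-quota errors.
# audit_tree is a hypothetical helper, not a cluster-provided command.
audit_tree() {
    local top=$1 group=$2
    echo "Directories without the group s bit:"
    find "$top" -type d ! -perm -2000 -print
    echo "Entries not owned by group $group:"
    find "$top" ! -group "$group" -print
}

# Demonstration in a scratch directory using the caller's current group.
top=$(mktemp -d)
mkdir "$top/good" "$top/bad"
chmod g+s "$top" "$top/good"         # set the bit on the top level and on "good"
chmod g-s "$top/bad"                 # make sure "bad" lacks the bit
audit_output=$(audit_tree "$top" "$(id -gn)")
echo "$audit_output"
```

In a group area, the call would name the shared directory and the lab group, for example audit_tree /dfsX/<group-lab-path> panteater_lab.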

3.4.3. Fix over quotas

Fixing Permissions

You can use the chmod command to fix directories that should have the sticky bit set but do not. The following command adds the sticky bit to a particular directory:

$ chmod g+s directory-name

You can use the find command to find all directories in a subtree and combine it with chmod command to set the sticky bit on all found directories:

$ find . -type d -exec chmod g+s {} \; -print

Fixing Group Ownership

You can also use the chgrp and chown commands to change the group ownership of a file or directory. For example, to change the group from panteater to bio on a specific file or directory:

$ ls -l
total 55524423
drwxrwsr-x  7 panteater bio                 7 Aug  5  2019 biofiles
-rw-r--r--  1 panteater panteater  4294967296 May 31  2019 performance.tst
drwxrwsr-x  3 panteater panteater           2 Oct  8 17:11 myfiles

$ chgrp bio performance.tst
$ chown -R panteater:bio myfiles
$ ls -l
total 55524423
drwxrwsr-x  7 panteater bio                 7 Aug  5  2019 biofiles
-rw-r--r--  1 panteater bio        4294967296 May 31  2019 performance.tst
drwxrwsr-x  3 panteater bio                 2 Oct  8 17:11 myfiles

The ls -l command is used to show permissions before and after the change.

3.5. Selective Backup

We cannot back up everything on the cluster. Selective Backup lets users choose what is important and have it saved automatically. The backup server is in a different physical location from the cluster for extra protection.

Important

  • Back up only critical data such as scripts, programs, etc.

  • DO NOT back up data you can get from other sources, especially large data sets.

  • If you exceed your backup quota, backups stop for your account. The backup fails because no new data can be written to the backup server once you reach your limit.

3.5.1. Default settings

The Selective Backup is based on rsync in conjunction with GNU Parallel. The combination maximizes the network throughput and server capabilities to back up hundreds of user accounts from multiple public and private filesystems.

The Selective Backup process will automatically start saving your home directory as well as some public and private disk spaces.

Note

For a majority of users, defaults are sufficient.
There is nothing for you to do if you like the defaults.

Users manage their Selective Backup via two control files located in their $HOME directory:

  1. The .hpc-selective-backup file lists (1) backup options and (2) the names of files/directories to be saved, in order of priority from the most to the least important. All backup options are initially commented out.

    The files are backed up in the order they are listed. That way, if a user runs out of selective backup quota before all listed files have been backed up, at least their most prized data are saved. By default, this file contains the $HOME and /pub areas of your account:

    /data/homezvolX/UCInetID
    /pub/UCInetID
    

    The following table lists all available backup options:

    Selective Backup Option | What it does
    ------------------------+-------------
    HPC_SEND_EMAIL_SUMMARY  | Sends you daily email summaries of your saves. Default is NO summary email notifications.
    HPC_SEND_EMAIL_ON_ERROR | Sends you an email only if rsync completes with an error, that is, a non-zero exit status from rsync. Consult the man rsync page for error values and their meaning. Default is NO email notifications.
    HPC_KEEP_DELETED=X      | Keeps deleted files on the backup server for X days, where X is a number in the 0-90 range. Deleted files are files you removed from the source location. Default is 14 days.

  2. The .hpc-selective-backup-exclude file lists the names of files/directories you want to exclude from backup. By default, this file excludes ZFS snapshots from $HOME:

    $HOME/.zfs
    

    For more information on exclude please see the ANCHORING INCLUDE/EXCLUDE PATTERNS section of the man rsync output.

3.5.2. Custom settings

To customize, edit control files with your favorite editor. We highly recommend the following:

  1. request email notifications to make sure things are working

    Choose one of two SEND_EMAIL options in .hpc-selective-backup file and uncomment it (remove the # sign at the beginning of the line). For example, if you choose to receive email notifications in the event of errors, edit your configuration file and change the line:

    # HPC_SEND_EMAIL_ON_ERROR
    

    to:

    HPC_SEND_EMAIL_ON_ERROR
    
  2. perform some spot checks of what you think is being saved to make sure your data is indeed being backed up.
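
Putting the settings described in this section together, a customized .hpc-selective-backup file might look like the fragment below. This is a sketch under stated assumptions: the 30-day retention value and the critical-scripts subdirectory are hypothetical choices, and placing the option lines before the paths is assumed rather than confirmed. Keep whatever layout your existing file uses.

```
# HPC_SEND_EMAIL_SUMMARY
HPC_SEND_EMAIL_ON_ERROR
HPC_KEEP_DELETED=30
/data/homezvolX/UCInetID
/pub/UCInetID/critical-scripts
```

Here error emails are enabled, summary emails stay commented out, deleted files are kept for 30 days, and $HOME is listed first so it is saved before the selected /pub subdirectory.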

3.5.3. Where backups are

A user can access backup files on the login nodes of the cluster from the following paths:

Where                                                | What
-----------------------------------------------------+-----
/sbak/zvolX/backups/UCInetID/data/homezvolX/UCInetID | user $HOME
/sbak/zvolX/backups/UCInetID/pub/UCInetID            | /pub/$USER/
/sbak/zvolX/backups/UCInetID/DELETED-FILES           | deleted files by date (counts towards backup quota)
/sbak/zvolX/logs/$DATE/UCInetID                      | backup logs by date, available for the past Y days

Note

The X in /sbak/zvolX maps to the volume number shown in your $HOME variable. In other words:
/data/homezvol0 -> /sbak/zvol0/backups
/data/homezvol1 -> /sbak/zvol1/backups
/data/homezvol2 -> /sbak/zvol2/backups
/data/homezvol3 -> /sbak/zvol3/backups
The number of days Y is defined by HPC_KEEP_DELETED=Y in your .hpc-selective-backup file.
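
The volume mapping in the note can be expressed as a small shell function. The sketch below is illustrative: backup_root is a hypothetical helper, and the home path and user name are sample values.

```shell
#!/bin/bash
# Sketch: derive the backup location from a home directory path.
# backup_root is a hypothetical helper; the mapping follows the note above.
backup_root() {
    local home=$1 user=$2
    # Extract the volume number X from /data/homezvolX/...
    local vol=${home#/data/homezvol}
    vol=${vol%%/*}
    echo "/sbak/zvol${vol}/backups/${user}"
}

root=$(backup_root /data/homezvol2/panteater panteater)
echo "$root"                         # /sbak/zvol2/backups/panteater
echo "$root/pub/panteater"           # backed-up copy of /pub/panteater
```

On a login node, the result can drive a spot check, for example diff -rq /pub/panteater/scripts "$root/pub/panteater/scripts", where scripts is a hypothetical directory listed in the backup file.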

3.6. Deleted Files Recovery

Note

Deleted files and directories can be recovered provided they exist in the selective backup.
You have to be on a login node to access backup files.

Below is a general procedure for user panteater to restore the directory spring-2022 and its files, which were accidentally deleted from /pub/panteater.

$ cd /sbak/zvol0/backups/panteater/DELETED-FILES                  # 1
$ find . -type d -name spring-2022                                # 2
./2024-0214/pub/panteater/spring-2022
./2024-0213/pub/panteater/spring-2022

$ ls ./2024-0214/pub/panteater/spring-2022/                       # 3
schedule1    schedule1.sub   slurm.template

$ cp -p -r ./2024-0214/pub/panteater/spring-2022 /pub/panteater   # 4

The above commands mean:

  1. The cd command puts you at the top level of the backup directory for your deleted files.

  2. The find command finds all backups, by date, where the desired directory exists. Here, two snapshots are found: 2024-0214 and 2024-0213.

  3. Run ls on a specific snapshot to see whether it has the needed files.

  4. If the needed files exist in the backup, use cp to copy them back to the pub directory. The -p and -r options are recommended:
    -p preserves the time stamp and the ownership of a file.
    -r copies recursively, which is needed when copying a directory and its contents.

Files and directories deleted from $HOME can be restored in a similar way.
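
When several dated snapshots contain the same directory, the newest one is usually the one to restore. The sketch below shows one way to pick it automatically. It is a hedged illustration: latest_deleted_copy is a hypothetical helper, it assumes snapshot directory names sort in date order (as 2024-0213 and 2024-0214 do), and the demonstration uses a mock tree in a scratch directory.

```shell
#!/bin/bash
# Sketch: pick the most recent DELETED-FILES snapshot that contains a given path.
# latest_deleted_copy is a hypothetical helper; it assumes snapshot directory
# names sort lexicographically in date order (e.g. 2024-0213, 2024-0214).
latest_deleted_copy() {
    local deleted_root=$1 rel_path=$2 candidate latest=""
    for candidate in "$deleted_root"/*/"$rel_path"; do
        # Glob results are sorted, so the last existing match is the newest.
        [ -e "$candidate" ] && latest=$candidate
    done
    [ -n "$latest" ] && echo "$latest"
}

# Demonstration with a mock backup tree in a scratch directory.
mock=$(mktemp -d)
mkdir -p "$mock/2024-0213/pub/panteater/spring-2022" \
         "$mock/2024-0214/pub/panteater/spring-2022"
found=$(latest_deleted_copy "$mock" pub/panteater/spring-2022)
echo "$found"
```

On a login node, the first argument would be the real backup path, such as /sbak/zvol0/backups/panteater/DELETED-FILES, and the printed path can then be passed to cp -p -r as in step 4.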