2. HOME
2.1. Overview
2.2. Storing Files
- What to Store
Configuration files for logging in or for applications; many of these are placed in $HOME by default.
Store only important files here that change relatively infrequently.
- What NOT to Store
DO NOT store and then delete large data files. Such data is considered transient and should be stored on DFS filesystems.
DO NOT store any large input data files that are used for computational jobs, or any large output/error job files. Use your personal /pub/UCInetID or your group DFS file systems.
- NO Symbolic links
Many users have additional space on one or more CRSP or DFS filesystems. As a shortcut, some have created soft links from their $HOME to one or more of these filesystems. The links might look similar to this example:
    [user@login-x:~]$ pwd
    /data/homezvol0/panteater
    [user@login-x:~]$ ls -l
    lrwxrwxrwx 1 panteater panteater 27 Jun 12 12:18 shared_lab -> /dfsX/pilab/shared_lab
    lrwxrwxrwx 1 panteater panteater 25 Jun 12 12:18 crsplab -> /share/crsp/lab/pilab
- How to fix
Remove symbolic links from $HOME or its subdirectories (here shared_lab and crsplab).
Use aliases or environment variables in your .bashrc to create shortcuts to desired filesystems. See Symbolic Links for more info.
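As a rough illustration of the two steps above, the sketch below first lists any symbolic links under a directory and then defines shortcuts of the kind that could go in .bashrc. The function name list_links, the variable names SHARED_LAB and CRSP_LAB, and the alias names are hypothetical; the paths are the example targets shown earlier and should be replaced with your own.

```shell
# List symbolic links under a directory (default: $HOME) with their targets.
# list_links is a hypothetical helper name, not a cluster-provided command.
list_links() {
    find "${1:-$HOME}" -type l -exec ls -l {} + 2>/dev/null
}

# Shortcuts that could replace the links; add lines like these to ~/.bashrc.
# The paths are the example targets from above; substitute your own.
export SHARED_LAB=/dfsX/pilab/shared_lab
export CRSP_LAB=/share/crsp/lab/pilab
alias cdlab='cd "$SHARED_LAB"'
alias cdcrsp='cd "$CRSP_LAB"'
```

In a new login shell, cdlab changes to the lab directory and ls "$CRSP_LAB" lists it, with no link inside $HOME. Once list_links confirms a link such as shared_lab, running rm shared_lab (without a trailing slash) removes only the link, not the data it points to.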
Attention
A symbolic link from a ZFS filesystem to a CRSP/DFS filesystem is a DANGEROUS practice, as it often leads to an unnecessary increase in load on both the NFS server that hosts your ZFS-based $HOME and the linked CRSP or DFS filesystem:
Any command or process that you run in your $HOME, or that needs anything from your $HOME, has to resolve this symbolic link and verify the mount every single time.
This involves multiple operations and system calls between the filesystem where your $HOME is and the parallel filesystem (CRSP or DFS) where your link points.
When this is executed many times by many users, it creates a performance issue for everyone.
2.3. Quotas
Your $HOME together with ZFS snapshots has a fixed quota:
Quota | What for | Access | How to use |
---|---|---|---|
50GB | $HOME | read + write | Keep it clean and organized. |
50GB | ZFS snapshots | read only | ZFS snapshots are copies of added, deleted or rewritten data. This gives you some data protection/backup capability. |
Important
The total 100GB quota works as follows: if your snapshots consume X more space than their 50GB share, your $HOME quota is automatically reduced by that amount X.
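The arithmetic behind this rule can be sketched as a small shell function. The name effective_home_quota is hypothetical, sizes are whole gigabytes, and the 50GB figures are the ones from the table above; the function only illustrates the rule and does not query the real quota.

```shell
# Print the $HOME quota (in GB) left after snapshot overuse is subtracted.
# Argument: current snapshot usage in whole GB.
effective_home_quota() {
    local snap_gb="$1"
    local over=0
    # Only usage beyond the 50GB snapshot share counts against $HOME.
    if [ "$snap_gb" -gt 50 ]; then
        over=$(( snap_gb - 50 ))
    fi
    echo $(( 50 - over ))
}

effective_home_quota 30   # snapshots within their share
effective_home_quota 60   # snapshots 10GB over their share
```

With 30GB of snapshots the $HOME quota stays at 50GB; with 60GB of snapshots it drops to 40GB.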
2.3.1. ZFS Snapshots
Snapshots are kept in the $HOME/.zfs/snapshot/ directory.
- Snapshot schedule:
- daily:
keep last 14
Per this schedule, you have about 2 weeks before a file is permanently deleted. Any changes or file deletions that occurred more than 2 weeks ago are gone forever.
- Snapshots are point-in-time copies of data
All files and directories in your $HOME are included in snapshots.
You cannot exclude any files or directories from a snapshot.
A snapshot of your home area is taken daily at a random time.
Snapshots are kept for a period of time and then automatically deleted.
A file/directory is permanently deleted when the last snapshot that holds it is removed.
Under normal use, the 100GB total limit for $HOME+Snapshots is rarely reached.
- Snapshots do not protect you from all possible data loss
Lost data can only be recovered if it existed at the time a snapshot was taken and the snapshot is still available. If you create a file and delete it a few hours later, that file is likely irretrievable.
- ZFS snapshots capability is not the same as a selective backup
Selective backup was created for automatically saving important files that are located in various file paths, including DFS filesystems. See Selective Backup.
- Every time a snapshot is taken, a virtual copy of all files at that time resides in the snapshot
When you delete a file, it is still in the snapshot. If you constantly create and delete files, many of the creates/deletes will remain in snapshots and consume more space.
Important
This is why you should never put transient or frequently changed files in $HOME.
2.3.2. How to check
Changes to the contents of your $HOME are recorded daily and result in snapshots. How frequently and how much data you add/delete/overwrite affects how much data you can store in $HOME.
Attention
If you change the contents very often, the snapshots will exceed the quota very quickly.
- To see your $HOME quota usage do:
    $ df -h ~
    Filesystem                         Size  Used Avail Use% Mounted on
    10.240.61.77:/homezvol0/panteater   50G   14G   37G  27% /data/homezvol0/panteater
The ~ is a short notation for your $HOME. The output above shows that user panteater has used 14GB of the 50GB allocation.
Note
Snapshots do not show in the quota output.
To see the usage by files and directories in $HOME:
    $ cd                # change to your $HOME directory
    $ ls                # list contents of $HOME
    bin              examples  maintenance  perl5   sw            tst.pl
    copy-archive.sh  info      matlab       R       sys           writing
    database.py      local     modulefiles  README  testmodfiles
    $ du -s -h *        # find disk usage for files and directories in $HOME
    98K     bin
    2.0K    copy-archive.sh
    2.0K    database.py
    1.4M    examples
    5.1M    info
    64M     local
    140K    maintenance
    1.5K    matlab
    59K     modulefiles
    88K     perl5
    918M    R
    4.0K    README
    17K     sw
    31M     sys
    2.0K    testmodfiles
    1.0K    tst.pl
    32M     writing
The output shows disk usage in kilobytes (K), megabytes (M) or gigabytes (G). For directories, all their contents are included. For example, the directory R and everything under it use a total of 918MB of disk space.
The above list does not sum to 14GB, so where did the rest of the disk space go? The du -s -h * command does not take into account hidden files and directories, whose names start with a dot character. Many application configuration/setup files, as well as shell initialization files, are in $HOME by default. They are hidden by design.
To see the usage by hidden files and directories:
    $ du -s -h .[a-z,A-Z]*    # type command as shown here
    54M     .aspera
    20K     .bash_history
    2.0K    .bashrc
    7.1M    .beast
    37M     .cache            # used by many applications to store cached data
    5.0G    .conda            # user installed conda environments and packages
    37M     .config           # used by many applications for configuration files
    625M    .local            # user installed Python packages (by pip)
    2.0K    .MathWorks
    5.7M    .matlab
    1.5K    .Rhistory
    1.5K    .rnd
    42K     .rstudio
    6.5G    .singularity      # used as a cache by singularity containers
    512     .slurm
    22K     .ssh
    1.5K    .vim
    2.5K    .vscode-remote
    167M    .vscode-server    # used by VS Code
    ... <deleted lines> ...
The total of all hidden files and directories is close to 13GB, most of the storage for this user.
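A single listing that covers both visible and hidden entries, sorted by size, can make the large items easier to spot. The sketch below is one way to do that, assuming GNU find, du and sort are available (sort -h understands the K/M/G suffixes); home_usage is a hypothetical helper name.

```shell
# Show disk usage of every top-level entry in a directory (default: $HOME),
# hidden ones included, with the largest entries printed last.
home_usage() {
    local dir="${1:-$HOME}"
    # -mindepth 1 -maxdepth 1 selects only top-level entries, dot-files too.
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -s -h {} + 2>/dev/null | sort -h
}
```

Running home_usage with no arguments summarizes $HOME, while home_usage ~/.cache would break down one of the larger hidden directories.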
2.3.3. Over quotas
Important
Once you fill your quota, you will not be able to write in your $HOME until some of the space is freed. Your applications and jobs will exhibit various errors and will fail.
Most of the errors are (but not limited to):
Cannot write to …
Disk quota exceeded for …
The only way to free space is to remove some snapshots, and users CANNOT do this themselves. You will have to submit a ticket to hpc-support@uci.edu.
After your snapshots are removed you will be required to free enough space in your $HOME in order to continue to work.
2.4. Deleted Files Recovery
You can use snapshots to restore files and directories, provided that existing snapshots still hold the desired data. There is no way to restore files changed more than 2 weeks ago. Below is an example of how to restore an accidentally deleted file. A similar technique can be used for multiple files and directories.
File is accidentally deleted
    $ ls -l out
    -rw-rw-r-- 1 panteater panteater 4004 Sep 17 15:13 out
    $ rm -rf out
    $ ls -l out
    ls: cannot access out: No such file or directory
Check the existing snapshots
    $ ls .zfs/snapshot/
    zfs-auto-snap_daily-2020-09-16-1017
    zfs-auto-snap_daily-2020-09-17-1045
    zfs-auto-snap_daily-2020-09-18-1048
The output indicates there are 3 daily snapshots taken at different times. Snapshot names include a time stamp: year, month, day, hours and minutes.
The deleted file had a time stamp of Sep 17 15:13, which means the file was created or last modified at that time.
The time stamps of the first two snapshots, 2020-09-16-1017 and 2020-09-17-1045, are earlier than the time stamp of the deleted file, so these snapshots will either not contain the file or will contain an earlier version of it.
Search the snapshots whose time stamp is later than the time stamp of the deleted file:
    $ ls .zfs/snapshot/zfs-auto-snap_daily-2020-09-18-1048/out
    .zfs/snapshot/zfs-auto-snap_daily-2020-09-18-1048/out
- Restore file from a snapshot
Copy the found file:
    $ cp .zfs/snapshot/zfs-auto-snap_daily-2020-09-18-1048/out .
    $ ls -l out
    -rw-rw-r-- 1 panteater panteater 4004 Sep 18 10:53 out
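When it is not obvious which snapshot holds a file, the manual check above can be repeated over all snapshots. The sketch below is one possible approach, not a site-provided tool: find_in_snapshots is a hypothetical name, and it assumes that the snapshot directory is $HOME/.zfs/snapshot as in the example and that the path given is relative to $HOME.

```shell
# Print the path of every snapshot copy of a file or directory.
# Usage: find_in_snapshots RELATIVE_PATH [SNAPSHOT_DIR]
find_in_snapshots() {
    local rel="$1"
    local snapdir="${2:-$HOME/.zfs/snapshot}"
    local snap
    # The trailing slash in the glob restricts matches to snapshot directories.
    for snap in "$snapdir"/*/; do
        # -e is true for regular files and directories alike.
        if [ -e "$snap$rel" ]; then
            printf '%s\n' "$snap$rel"
        fi
    done
}
```

For the example above, find_in_snapshots out would print the path inside the 2020-09-18 snapshot. Because the daily snapshot names embed the date, the last line printed is normally the most recent copy, which can then be restored with cp as shown.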