You can use the following file systems:

| Mountpoint | /work/home/ (= /home/) | /work/projects/ and /work/groups/ | /work/scratch/ |
|---|---|---|---|
| Size | Σ 6.1 PByte (total across all file systems) | | |
| Performance | up to 1,200 Gbps (the bandwidth available to the individual user depends on the overall file system traffic) | | |
| Files accessible from/for | global, for all nodes | | |
| Persistence | permanent | during the project's validity term + 6 months grace time | 8 weeks after their last write access, files will be deleted unconditionally and without further notice |
| Quota | 50 GByte and/or 4 million files | 5 TByte and/or 200,000 files | 20 TByte and/or 20 million files |
| Backup | Snapshots (see below) + daily tape backup (for disaster recovery only) | Snapshots (see below) + daily tape backup (for disaster recovery only) | none! |
| Usage pattern | low-volume I/O: interactive use, static input data, results of finished jobs. Do not use home, groups or projects for running jobs! | low-volume I/O: interactive use, static input data, results of finished jobs. Do not use home, groups or projects for running jobs! | high-volume I/O: batch use, running jobs' input/output, intermediary files (CPR) |
Since October 2019, the global cluster file systems no longer differ in throughput or latency, as they share the same large pool of NVMe SSDs.
However, some internal optimizations determine two “classes” of file systems: low-volume I/O and high-volume I/O.
___________________________________________________________________________
Low-volume I/O
Caution
Their data volume would be far too large for snapshots and tape backup: do not use home, groups or projects for the I/O of running jobs!
Backup
For this class of file systems with low turnover in terms of number and size of files, the following backup mechanisms are in place:
- a daily tape backup
  This is mainly to protect against catastrophic damage to the file system; restores are not directly available to users.
- the snapshot mechanism
  For your own restore/recover purposes, the IBM Storage Scale file system (see “Technologies” below) takes regular snapshots of the content. Via the hidden folder “.snapshots/” in every subdirectory, you can copy back what has been lost, i.e. any file inadvertently deleted. For details, see “Technologies” below.
Do not use directories on these file systems for running jobs!
/home
The home directory should be used for all files that are important and need to be stored permanently.
Every user can store only a small amount of data here (see “Quota” above). In well-reasoned cases and on request, this quota can be increased.
The folder /home/$USER (“Home”) is created with each user account. You can reference it via the environment variable $HOME.
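For illustration, a minimal shell sketch of the intended use of home; the directory and file names are hypothetical:

```bash
# $HOME points to your personal folder /home/$USER
echo "$HOME"

# Hypothetical example: keep only important, long-lived data here,
# e.g. archive the results of a finished job (mind the small quota).
mkdir -p "$HOME/results/project-x"
cp summary.csv final_state.dat "$HOME/results/project-x/"
```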
/work/projects and /work/groups
On request, groups (institutes) can get a group folder to share static input data and common software (versions) with their members and coworkers.
Likewise, projects with more than a few members can request a projects folder for the same purposes.
___________________________________________________________________________
High-volume I/O
Backup
NONE!
This class of file systems is explicitly optimized for high I/O volume and high throughput for jobs and applications.
Due to the high turnover in created/deleted/changed files, an (expensive) legacy tape backup would “explode” in volume and meta data. For the same reason, even the GPFS-internal snapshots could rather quickly exceed the file system's physical capacity.
There is absolutely no backup for the following file systems. What you delete here is gone forever and cannot be restored or recovered, not even by administrators.
/work/scratch
Here, almost unlimited disk space is available for all users, but only for a limited time: After 8 weeks of not being written to, the files will be deleted unconditionally without further notice (automatic deletion/removal policy).
A plain read access is not sufficient to prevent deletion, since for performance reasons, the files' “last access date” is not written reliably and may thus not be current.
The folder /work/scratch/$USER (“scratch”) is created with each user account. In job scripts, it can be referenced via the environment variable $HPC_SCRATCH.
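To illustrate the intended division of labour between scratch and home, here is a minimal, hypothetical job-script sketch; the application name, file names and subfolder are made up, and scheduler directives are omitted since the batch system is not covered in this section:

```bash
#!/bin/bash
# Hypothetical job script: do all heavy I/O on scratch, keep only results in home.

# Per-user scratch folder, referenced via $HPC_SCRATCH (see above)
JOBDIR="$HPC_SCRATCH/myjob_$$"        # hypothetical job-specific subfolder
mkdir -p "$JOBDIR"
cd "$JOBDIR" || exit 1

# Stage static input data from home (or from a group/project folder)
cp "$HOME/input/config.dat" .

# Run the application; intermediary files and checkpoints stay on scratch
./my_solver config.dat > run.log      # hypothetical application

# Copy only the final results back to the permanent, backed-up home folder
mkdir -p "$HOME/results"
cp run.log result.dat "$HOME/results/"
```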
___________________________________________________________________________
Technologies
IBM Storage Scale
All shared, cluster-wide file systems above are based on IBM's Storage Scale (formerly known as General Parallel File System). This commercial product can share large disk arrays via InfiniBand among thousands of nodes.
Of course, arbitrating read/write requests from that many nodes to individual files takes somewhat longer than accessing local disks. In addition, all running jobs and all logged-in users are working on this shared storage resource at the same time.
That's why you sometimes see (hopefully short) “hiccups” when running “ls -l” or similar commands. This is perfectly normal and an expected consequence of the principle “common, shared file system, available everywhere”.
Snapshots
For folders on the low-volume I/O class of file systems, IBM Storage Scale automatically creates periodic snapshots, allowing you to access (and restore) older versions of your files without assistance from the admins. Snapshots are saved to the hidden folder .snapshots (you will not see this folder listed, not even with an 'ls -la'). Nonetheless, you can enter that hidden folder with an explicit “cd .snapshots” (<TAB>-completion does not work either; you have to type .snapshots in full).
Once in .snapshots/, you can use 'ls -l' and 'cd' as usual and access former versions (or states) of all your data (within the hourly.*/, 6hour.*/, daily.*/ and weekly.*/ directories).
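As a worked example, restoring an accidentally deleted file might look like this; the snapshot directory's timestamp and the file name are hypothetical, and the actual names on the system will differ:

```bash
# Change into the affected directory, then into the hidden snapshot folder.
# It is neither listed nor tab-completed, so type the name in full.
cd "$HOME/myproject"
cd .snapshots
ls -l                    # shows the hourly.*/, 6hour.*/, daily.*/ and weekly.*/ snapshots

# Copy the lost file from a suitable snapshot back into the live directory
# (snapshot and file names are hypothetical).
cp daily.2024-05-01/important_input.dat "$HOME/myproject/"
```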
Files in the snapshot folder still occupy storage space! It is therefore possible that your home folder's quota is exceeded even though the 'df' command still shows less usage.
Snapshots cannot be deleted (deleting data merely creates copies within the snapshot).
Frequently saving and deleting files fills up the snapshot area and consumes space in the containing folder. This should therefore be avoided if possible (so do not use the home, groups or projects folders for high-volume I/O, e.g. for the I/O of running jobs!).
In urgent cases, the snapshot folder can be cleaned/deleted by the administrators.