.. _modules:
Environment modules
===================
Overview
---------
Most software installed on the cluster can be accessed only after loading the appropriate
*environment module*. For the most part, cluster users only need to know (or discover) the
already-available software *environment module* and load it prior to using the application.
**What are Modules**
The word *module* has two meanings on the cluster:
1. **Language modules** or packages are collections of related variables, functions and subroutines that perform a set
of specific programming tasks. Simply put, they are files consisting of Python/R/Perl code.
To access a language module or find out what is installed, you need to
use the interface of that language (Python/R/Perl).
2. **Environment modules** are used to activate specific software (including Python/R/Perl)
that a user wants to use. Environment modules provide a way to control which versions of software are active.
Typically, a user only needs to know which module is needed to gain access to its applications.
Some applications that are installed with the system OS do not require any special settings
because they must always be accessible to users, so they are available without any modules.
A very large fraction of the scientific software on HPC3
can be used only after loading its associated environment module.
`Environment Modules `_
provide access to specific software and control which versions of software are active.
Modules are simple files or scripts that, when executed,
affect your Unix environment variables.
Modules are quite simple to use:
* **loading** a module is used to activate the software. It
modifies environment variables (``PATH`` for finding application binaries,
``LD_LIBRARY_PATH`` for shared object libraries, etc.).
* **unloading** a module is used to deactivate the software. It
reverses the action of loading and unsets the environment variables set by loading.
Modules are managed with the ``module`` command. For in-depth details about
this command, see ``module --help``.
Please see sections below to learn how to identify and use them.
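The effect of loading and unloading described above can be imitated in plain bash. This is only a conceptual sketch, not how the ``module`` command is implemented: the directory :tt:`/opt/apps/demo/1.0/bin` and the function names ``load_demo`` and ``unload_demo`` are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Conceptual sketch: mimic what load/unload do to PATH.
# The install location below is a made-up example, not a real package.
demo_bin="/opt/apps/demo/1.0/bin"

# "Loading": put the software's bin directory at the front of PATH.
load_demo() {
    PATH="${demo_bin}:${PATH}"
}

# "Unloading": rebuild PATH without that directory.
unload_demo() {
    local new="" entry
    local IFS=':'
    for entry in $PATH; do
        [ "$entry" = "$demo_bin" ] && continue
        new="${new:+${new}:}${entry}"
    done
    PATH="$new"
}
```

After ``load_demo`` the shell searches the demo directory first; after ``unload_demo`` the original search path is back. A real module does the same kind of bookkeeping for every variable it touches.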
.. _list modules:
List modules
------------
``module avail``
**To list all available modules:**
.. code-block:: console
[user@login-x:~]$ module avail
The output will list all installed modules by categories.
We try to broadly classify software into categories by what they do so
that it is a bit easier to find a module you are interested in.
**To list all modules for specific software**:
.. code-block:: console
[user@login-x:~]$ module avail eigen
-------------------- /opt/rcic/Modules/modulefiles/LIBRARIES --------------------
eigen/3.3.7 eigen/3.4.0
The output shows 2 modules for *eigen* software. These modules are in :tt:`LIBRARIES` category.
.. _search modules:
Search modules
--------------
| ``module avail``
| ``module keyword``
You might not know the name of a particular module or versions available.
You can use one of the following commands to find some information.
Note that by default the module search is case-sensitive.
To make the search case-insensitive, add the ``-i`` switch to your
``module`` commands.
**Partial name lookup**:
Case sensitive
.. code-block:: console
[user@login-x:~]$ module avail eig
-------------------- /opt/rcic/Modules/modulefiles/LIBRARIES --------------------
eigen/3.3.7 eigen/3.4.0
Case insensitive
.. code-block:: console
[user@login-x:~]$ module avail -i BWA
-------------------- /opt/rcic/Modules/modulefiles/BIOTOOLS ---------------------
bwa/0.7.8 bwa/0.7.17
**Keyword lookup**
Find the keyword if it shows up anywhere in the module definition:
.. code-block:: console
[user@login-x:~]$ module keyword eigen
-------------------- /opt/rcic/Modules/modulefiles/LIBRARIES --------------------
eigen/3.3.7: Category_______ LIBRARIES
eigen/3.3.7: Name___________ eigen
eigen/3.3.7: Version________ 3.3.7
eigen/3.3.7: Description____ Eigen is a C++ template library for linear
...
eigen/3.4.0: Category_______ LIBRARIES
eigen/3.4.0: Name___________ eigen
eigen/3.4.0: Version________ 3.4.0
eigen/3.4.0: Description____ Eigen is a C++ template library for linear
...
scalapack/2.1.0: Category_______ LIBRARIES
scalapack/2.1.0: Name___________ scalapack
scalapack/2.1.0: Version________ 2.1.0
scalapack/2.1.0: Description____ ScaLAPACK 2.1.0 is a library of high-performance
...
The partial output above shows, in the first column, the names of modules
that contain the keyword and, in the second column, the line
of the module file where the keyword was found. The keyword :tt:`eigen` is found in 3 different
modules.
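As an alternative to the lookups above, the listing can be filtered with ``grep``. The helper below is a minimal sketch; the function name ``find_module`` is hypothetical. It redirects standard error into the pipe because some versions of the ``module`` command print their listing there, which is an assumption about your installation rather than something this guide states.

```shell
#!/usr/bin/env bash
# find_module PATTERN: case-insensitive search of the 'module avail' listing.
# Hypothetical helper; it only wraps commands that already exist.
find_module() {
    if ! command -v module >/dev/null 2>&1; then
        echo "module command is not available in this shell" >&2
        return 1
    fi
    # 2>&1 captures the listing whether it is printed on stdout or stderr.
    module avail 2>&1 | grep -i -- "$1"
}
```

For example, ``find_module bwa`` prints every line of the listing that mentions bwa in any letter case.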
.. _display modules:
Display modules
---------------
| ``module whatis``
| ``module display``
**Find information about a specified module**
.. code-block:: console
[user@login-x:~]$ module whatis hdf5/1.10.5/gcc.8.4.0
hdf5/1.10.5/gcc.8.4.0: Category------- TOOLS
hdf5/1.10.5/gcc.8.4.0: Name----------- hdf5
hdf5/1.10.5/gcc.8.4.0: Version-------- 1.10.5
hdf5/1.10.5/gcc.8.4.0: Description---- HDF5 is a data model, library and file format
hdf5/1.10.5/gcc.8.4.0: for storing and managing data. It supports an
hdf5/1.10.5/gcc.8.4.0: unlimited variety of datatypes, and is designed
hdf5/1.10.5/gcc.8.4.0: for flexible and efficient I/O and for high
hdf5/1.10.5/gcc.8.4.0: volume and complex data. HDF5 is portable and
hdf5/1.10.5/gcc.8.4.0: is extensible, allowing applications to evolve
hdf5/1.10.5/gcc.8.4.0: in their use of HDF5. The HDF5 Technology suite
hdf5/1.10.5/gcc.8.4.0: includes tools and applications for managing,
hdf5/1.10.5/gcc.8.4.0: manipulating, viewing, and analyzing data in
hdf5/1.10.5/gcc.8.4.0: the HDF format. Environment var:
hdf5/1.10.5/gcc.8.4.0: HDF5_HOME=/opt/apps/hdf5/1.10.5/gcc/8.4.0
hdf5/1.10.5/gcc.8.4.0: Load modules--- java/1.8.0
hdf5/1.10.5/gcc.8.4.0: gcc/8.4.0
hdf5/1.10.5/gcc.8.4.0: Prerequisites-- java8-module
hdf5/1.10.5/gcc.8.4.0: gcc_8.4.0-module
hdf5/1.10.5/gcc.8.4.0: rcic-module-support
hdf5/1.10.5/gcc.8.4.0: hdf5_1.10.5_gcc_8.4.0
The output shows
| :tt:`Name`, :tt:`Category`, :tt:`Version` for the software this module provides
| :tt:`Description` what this software does
| :tt:`Load modules` prerequisite modules that will be automatically loaded
| :tt:`Prerequisites` list of RPM packages that must be installed for the software to work
Both prerequisite modules and RPMs are found automatically; the user does not
need to do anything.
**Find more info about a specific module**
.. code-block:: console
[user@login-x:~]$ module display foundation/v8
/opt/rcic/Modules/modulefiles/TOOLS/foundation/v8:
module-whatis {Category------- TOOLS}
module-whatis {Name----------- foundation}
module-whatis {Version-------- v8}
module-whatis {Description---- This module provides access to up-to-date versions of commonly}
module-whatis { used tools for building software including cmake v.3.22.1,}
module-whatis { curl v.7.81.0, git v.2.34.1, git-lfs v.3.0.2, ninja v.1.10.2,}
module-whatis { and swig v.4.0.2.}
module-whatis {Prerequisites-- rcic-module-support}
setenv foundation__PREFIX /opt/foundation/v8
setenv foundation__CPPFLAGS -I/opt/foundation/v8/include
setenv foundation__LDFLAGS {-L/opt/foundation/v8/lib -Wl,-rpath,/opt/foundation/v8/lib}
prepend-path GEM_PATH /opt/foundation/v8/share/gems
prepend-path PATH /opt/foundation/v8/bin
prepend-path MANPATH /opt/foundation/v8/share/man
prepend-path LD_LIBRARY_PATH /opt/foundation/v8/lib
prepend-path PKG_CONFIG_PATH /opt/foundation/v8/lib/pkgconfig
The display command gives additional info (compared to whatis):
| the full path of the module file (first output line)
| :tt:`setenv` lines show the environment variables that will be set
| :tt:`prepend-path` lines show changes added to the PATH-like variables
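To see the effect of those :tt:`setenv` and :tt:`prepend-path` lines on a live shell, the environment can be compared before and after a command. The function below is a rough sketch under that idea; the name ``env_changes`` is hypothetical and is not part of the ``module`` command.

```shell
#!/usr/bin/env bash
# env_changes CMD...: run CMD in the current shell, then print the
# environment variables that were added, removed or changed.
env_changes() {
    local before after
    before="$(mktemp)"
    after="$(mktemp)"
    env | sort > "$before"
    "$@"                                  # e.g. module load gcc/8.4.0
    env | sort > "$after"
    diff "$before" "$after" || true       # diff exits 1 when files differ
    rm -f "$before" "$after"
}
```

On the cluster, ``env_changes module load foundation/v8`` would be expected to show changes to the variables listed by ``module display``, such as ``PATH`` and ``LD_LIBRARY_PATH``.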
.. _use modules:
Using modules
-------------
RCIC-authored modules follow a uniform build, formatting and :ref:`module names` schema.
There are a few specifics about how the modules are built that are unique to HPC3.
* Nearly all modules have version numbers to specify the software version they provide.
:red:`Version numbers are important!`
* We use a notion of :tt:`Category` to group some modules together. This is
only a convenience and the categories are shown in the output of module display commands.
* Many modules are compiled with the GCC compiler. For some we do not
specify the compiler in the module name, for others we do. This is dictated by
the software build specifics. The prerequisite compiler will be automatically loaded if needed.
* If a module name contains :tt:`gcc` or :tt:`intel`, it was compiled with this compiler.
* If a module name contains :tt:`openmpi` or :tt:`mpich`, it was compiled with this MPI implementation enabled.
* If a module name contains :tt:`cuda`, the module provides GPU-enabled
software that needs to be run on GPU-enabled nodes (any of the GPU partitions).
* Automatic prerequisites loading: if a module has any prerequisite modules, they are automatically added when
the module is loaded.
* Automatic prerequisites unloading: the prerequisite modules are automatically removed when the module is
unloaded. Our modifications to modules add smart unloading: when a prerequisite
was already loaded, unloading the higher-level module will leave the prerequisite intact.
* We provide a convenient way for users to add their own modules. Please see :ref:`user installed modules`.
**Rules of module loading/unloading**:
1. .. attention:: | Always include the version number when loading a module:
| ``module load X/1.2.3``
| This ensures you will get the version you need.
|
| Without a version, the default behavior is to load the latest version currently available:
| ``module load X`` - :red:`DANGEROUS`
| This can silently switch you to a different version of the software when a new
| version is added or an old version is removed. Always use the module name with the version.
2. You can load multiple modules, loading order is not important.
3. You need to load modules:
* in Slurm submit scripts for batch jobs
* in your interactive shell for interactive jobs
In these cases modules are automatically unloaded when your batch or
interactive job exits.
4. You can unload modules that you explicitly loaded:
.. code-block:: console
[user@login-x:~]$ module load bwa/0.7.17
[user@login-x:~]$ module unload bwa/0.7.17
Never unload modules that were auto-loaded by a module itself.
5. If you loaded multiple modules and need to unload them (rare cases),
**always unload modules in the reverse order of loading**:
last-loaded should be first unloaded. Not doing
this can result in an unexpected or broken environment.
For example, if you loaded modules as:
.. code-block:: console
[user@login-x:~]$ module load bwa/0.7.17
[user@login-x:~]$ module load proj/9.0.0
[user@login-x:~]$ module load bracken/2.6.0
You will need to unload them in reverse:
.. code-block:: console
[user@login-x:~]$ module unload bracken/2.6.0
[user@login-x:~]$ module unload proj/9.0.0
[user@login-x:~]$ module unload bwa/0.7.17
It is easier to unload all loaded modules via
.. code-block:: console
[user@login-x:~]$ module purge
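Rules 1 and 3 above can be combined in a batch job. The snippet below is a minimal sketch that writes a Slurm submit script to a file; the file name :tt:`bwa_demo.sub`, the job name and the resource values are illustrative assumptions, so adjust them to your own job.

```shell
#!/usr/bin/env bash
# Write a minimal Slurm submit script that loads its modules explicitly.
# File name, job name and resource values are illustrative assumptions.
cat > bwa_demo.sub <<'EOF'
#!/bin/bash
#SBATCH --job-name=bwa_demo
#SBATCH --ntasks=1
#SBATCH --time=00:10:00

# Load an exact version so the job does not change when versions are added.
module load bwa/0.7.17

# Record what is active in the job log, which helps when debugging.
module list
command -v bwa
EOF

# Submit on the cluster with: sbatch bwa_demo.sub
```

Keeping the ``module load`` lines inside the submit script means the job documents its own software environment and does not depend on what happened to be loaded in the submitting shell.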
Suppose you want access to GCC compiler version 8.4.0.
The following sequence shows which version of gcc is active prior
to module loading (default gcc is installed with the system OS),
after module load, and after the unloading.
.. code-block:: console
[user@login-x:~]$ module list # 1
No Modulefiles Currently Loaded.
[user@login-x:~]$ gcc --version | grep ^gcc
gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-10)
[user@login-x:~]$ module load gcc/8.4.0 # 2
[user@login-x:~]$ module list
Currently Loaded Modulefiles:
1) gcc/8.4.0
[user@login-x:~]$ gcc --version | grep ^gcc
gcc (GCC) 8.4.0
[user@login-x:~]$ module unload gcc/8.4.0 # 3
[user@login-x:~]$ gcc --version | grep ^gcc
gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-10)
| 1 check which modules are loaded, and which gcc version is active
| 2 load desired gcc module, verify gcc version
| 3 unload the module, this restores the environment, active gcc version is reverted to default
.. _module names:
Module Names
------------
You will notice in ``module avail`` command output that the module names have a few different formats.
The module naming schema makes it more apparent what version is available and what the key differences among versions are.
This also shows the potential *combinatorial* number of variants of any software.
We do not attempt to build every variant of *compiler x mpi* for these kinds of software. We build what is needed.
.. centered:: Module name formats
.. table::
:class: noscroll-table
+---+----------------------------------------------------------+----------------------------------------+
| No| Naming schema | Example full name |
+===+==========================================================+========================================+
| 1 | name | dot |
+---+----------------------------------------------------------+----------------------------------------+
| 2 | name/version | python/3.8.0 |
+---+----------------------------------------------------------+----------------------------------------+
| 3 | name/version/compiler.compiler_version | boost/1.78.0/gcc.8.4.0 |
+---+----------------------------------------------------------+----------------------------------------+
| 4 | name/version/compiler.compiler_version-mpi.mpi_version | hdf5/1.10.5/intel.2020u1-openmpi.4.0.3 |
| | | hdf5/1.10.5/gcc.8.4.0-openmpi.4.0.3 |
+---+----------------------------------------------------------+----------------------------------------+
| 5 | name/version/compiler.compiler_version-cuda.cuda_version | namd/2.14b2/gcc.8.4.0-cuda.10.1.243 |
+---+----------------------------------------------------------+----------------------------------------+
| 1 - module with only a name without version. *Reserved for a few OS-installed modules*.
| 2 - module for a specific version of python.
| 3 - module for a specific version of boost built with a specific compiler.
| 4 - two modules for the same hdf5 version, built with the Intel and GCC compilers and openmpi.
| 5 - module for a specific namd version built with gcc compiler and cuda.
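The naming formats in the table can be taken apart with ordinary shell parameter expansion. The function below is a small sketch for illustration; the name ``parse_module_name`` and its output format are hypothetical, and it assumes the compiler part itself contains no ``-`` character.

```shell
#!/usr/bin/env bash
# parse_module_name FULLNAME: split a module name into the parts
# described in the naming table (formats 1 through 5).
parse_module_name() {
    local full="$1"
    local name="${full%%/*}"             # text before the first '/'
    local rest="${full#*/}"              # text after the first '/'
    local version="" build="" compiler="" extra=""
    if [ "$rest" != "$full" ]; then      # formats 2-5 carry a version
        version="${rest%%/*}"
        build="${rest#*/}"
        [ "$build" = "$rest" ] && build=""    # format 2: no build part
    fi
    if [ -n "$build" ]; then
        compiler="${build%%-*}"          # e.g. gcc.8.4.0
        extra="${build#*-}"              # e.g. openmpi.4.0.3 or cuda.10.1.243
        [ "$extra" = "$build" ] && extra=""   # format 3: compiler only
    fi
    echo "name=${name} version=${version:-none} compiler=${compiler:-none} extra=${extra:-none}"
}

parse_module_name "hdf5/1.10.5/gcc.8.4.0-openmpi.4.0.3"
# prints: name=hdf5 version=1.10.5 compiler=gcc.8.4.0 extra=openmpi.4.0.3
```

The ``extra`` field holds the MPI or CUDA part of formats 4 and 5, and the fields that a shorter format lacks are reported as ``none``.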
.. _module limitations:
Limitations
-----------
**Caveat emptor** - Environment modules do their job, but have limitations.
You can easily render your environment into a completely
broken mess if you randomly unload modules.
For example, if you unload one of the prerequisite modules that were
automatically loaded when you did :tt:`module load PkgName/1.2.3` you won't see any errors or
complaints until you attempt to run :tt:`PkgName` program. Needed libraries or binaries
that were provided by the unloaded module will not be available.
The solution is to unload only the modules that you explicitly loaded. Please
follow the rules in :ref:`use modules`.
.. THIS is no longer valid
2. Some modules overwrite environment variables, **clang** module is an
example of this. Clang requires LLVM and GCC, but overwrites the :tt:`CC` environment variable.
If you you load gcc module and then load clang module, that will work just fine. However,
if you then unload clang module, the :tt:`CC` environment variable will NOT be restored to [.tt]*CC=gcc*. Instead the environment
variable will no longer be defined.
IMPORTANT: When a module changes an environment variable, _it does not record the previous version of it._
.. _user installed modules:
User installed modules
----------------------
You don't need to create a new module if you are installing
packages (a.k.a. language modules) for Python/R/Perl, or when adding packages with conda.
Please see the install guides in :ref:`user installed` that explain
how to create conda environments, or to install Python/R/Perl packages.
Users can install additional software and add an environment module for it
either for themselves or for their groups.
.. attention:: Software install and module install are two separate
tasks. The installation location of a module file is
different from the location where the software is installed.
There are a few basic steps:
1. **Compile and install your desired software**
Do this in your user/group area :underline:`per your software instructions`.
Verify that the software is working.
* if installing for yourself, your user area is in :tt:`/pub/ucinetid`
.. important:: | Do not install in :tt:`$HOME`.
| Do not install in :tt:`$HOME/modulefiles/`
* if installing for the group, your group area is in one of the DFS
file systems. Make sure that directory and file permissions
are set correctly for group access.
2. **Create an environment module template**
The environment module file is a text file in a specific format that provides
information about the software and sets up the environment needed to use it.
We suggest using an existing software module file as a template.
Run command ``module display`` for one of the available modules,
the output shows the full path to the module (output first line).
Copy this file to your user area, for example:
.. code-block:: console
[user@login-x:~]$ module display clang/13.0.0
----------------------------------------------------
/opt/rcic/Modules/modulefiles/COMPILERS/clang/13.0.0:
module-whatis {Category_______ COMPILERS}
module-whatis {Name___________ clang}
module-whatis {Version________ 13.0.0}
module-whatis {Description____ Clang version 13.0.0.