Python virtual environment
Caution
Making a new Python virtual environment creates a new subfolder, inside which all the binaries, scripts and libraries are copied. After that those are available to be called, which initializes the environemnt. The initialization of the environment needs to be done before every use and (even more important) before the installation of packages. Failing to do so, will result in failed commands and (in case of package installation) full home directory, which prevents normal usage of the cluster.
Introduction
Using Python venv
module creates lightweight virtual environment
, with own independent set of Python packages installed in the specified directory. A virtual environment is created on top of an existing Python installation, known as the virtual environment's base
Python. Packages are isolated from the packages in the base environment, so only those explicitly installed in the virtual environment are available.
More information about Python virtual environment
Python has several different options on how to install and manage Python packages - here is described the recommended way for our clusters. The steps are
- first creating the Python virtual environment
- then initialization of the environment
- installing custom package(s) inside this environment
- deactivation of the virtual environment
- and how to submit a slurm job that uses virtual Python environment
Creating new Python virtual environment
To create a new Python virtual environment on the user interface machine (which can be later also used on compute nodes) use
- Load latest version of Python
module load Python
- Check the Python's version
python3 --version
- Build a Python Virtual Environment
python3 -m venv /groups/umcg-GROUP/TMPXX/MY_PYTHON_SUBDIR
and replace theGROUP
,TMPXX
andMY_PYTHON_SUBDIR
with appropriate values. RemembertmpXX
folders are also available on compute nodes and therefore the compute jobs can access this Python environment.
To use the existing environment
- Source the environments
source /groups/umcg-GROUP/TMPXX/MY_PYTHON_SUBDIR/bin/activate
To install packages inside the Python environment
- (Recommended) Upgrade
pip
andwheel
packages, so that that you can install all the latest package versionspip install --upgrade pip wheel
- Install package
pip install pypackage
wheremypackage
is one or more packages from thehttps://pypi.org/
.
Deactivation of the virtual environment
- to deactivate the virtual environment, simply run command
deactivate
Using Python virtual environment inside slurm job
Create a slurm script with appropriate fields. For more information check also the documentation page about batch jobs.
#!/bin/bash
#SBATCH --job-name=pyvenv
#SBATCH --output=pyvenv-%N-%u-%j.out
#SBATCH --error=pyvenv-%N-%u-%j.err
#SBATCH --time=00:15:00
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1gb
#SBATCH --nodes=1
#SBATCH --open-mode=truncate
#SBATCH --export=NONE
#SBATCH --get-user-env=60L
# clear all loaded modules
module purge
# to load the latest version of Python
module load Python
# initialize the Python virtual environment
source /groups/umcg-GROUP/TMPXX/MY_PYTHON_SUBDIR/bin/activate
# check the Python version (useful for debugging)
python3 --version
# print the location of Python3 executable binary (also useful for debugging)
which python3
# run the Python command or in this case some custom Python script
python3 ./my_python_script.py
Then submit this slurm SLURM job
sbatch pyvenv.sh