Notebooks#

Notebook Platform of VUB-HPC

notebooks.hpc.vub.be

Computational notebooks are an alternative to the traditional terminal interface for accessing and using the HPC. We provide a notebook platform integrated with our Tier-2 HPC cluster (Hydra) at notebooks.hpc.vub.be. This platform is based on the popular Jupyter project and allows you to manage and launch your notebooks directly on the HPC from a JupyterLab environment.

Access to the notebook platform#

All VSC users can use the notebook platform of VUB-HPC. Requesting a new VSC account to use the notebook platform is subject to the same access policies as for the regular terminal interface.

Access to the notebook platform does not require the upload of any SSH key to your VSC account. This means that if you will only use the HPC through the notebook interface, the process of creating your VSC account is much simpler and you can skip all steps related to the creation and upload of the SSH key. This can be especially useful for teaching, as students carrying out exercises on the HPC can now create their VSC accounts and access the cluster entirely from the web browser.

Once you log in to notebooks.hpc.vub.be, the following screen will request read access to your VSC account:

../../_images/jupyterhub-oauth-request.png

Request to get read access to your VSC account.#

Click on Authorize and you will be automatically redirected to the notebook platform.

Computational resources#

The notebook platform allows you to launch JupyterLab environments directly on the Tier-2 HPC cluster of VUB (Hydra). After a successful login, you will be presented with a panel to select the computational resources dedicated to your notebooks.

../../_images/jupyterhub-moss-simple.png

Panel with simple selection of computational resources for JupyterLab session.#

Notebooks can be launched on almost all cluster partitions of Hydra. You can start your JupyterLab session on a generic compute node (Intel Broadwell or Skylake), on GPUs (Nvidia Pascal) or even on nodes with InfiniBand interconnect (advanced). The only exception is the Nvidia Ampere GPUs, which are left out of the pool of resources for notebooks as they are already in very high demand by regular computational jobs.

The maximum amount of resources available to notebooks is smaller than that of regular jobs due to the interactive nature of this interface. Hence, each user is only allowed to start a single JupyterLab session at a time, with a maximum of 10 dedicated CPU cores (e.g. to run 10 notebooks simultaneously) and an execution time of at most 12 hours on generic nodes or 6 hours on GPUs. We consider that these restrictions fit well in one day of work running multiple notebooks. Longer or bigger simulations should continue to use regular computational jobs.

The resources available to the notebook platform are subject to change. If the current options are not sufficient for your workflow, please contact us at VUB-HPC Support.

Jupyter environment#

The main work environment provided by the notebook platform is JupyterLab. If you are not familiar with it, please check its official documentation at jupyterlab.readthedocs.io.

You will find several options in the menu Jupyter environment of the resource selection panel. All the options will launch a JupyterLab interface running on the HPC, integrated with the software module system and with all notebook kernels available. The differences between these lab environments concern:

  • version of Python and JupyterLab used in the environment

  • pre-installed lab extensions

  • available software modules (software toolchain)

Users are not allowed to install JupyterLab extensions on their own; those are managed by VUB-HPC. Therefore, you will typically find a default lab environment with just the software module extension, plus some other environments with extra extensions.

Environments are found in the menu Jupyter environment where they are grouped by their Python version. You might find multiple versions of the same environment, which differ in the year of their creation and will use older or newer software modules. We typically provide the following environments:

Default: minimal with all modules available

This is a default JupyterLab environment without any extensions beyond the integration with the module system of the HPC. It uses modules in the corresponding year toolchains and the indicated version of Python for the kernel of its Python notebooks. All other notebook kernels can be loaded on-demand through the module system.

DataScience: SciPy-bundle + matplotlib + dask

Default JupyterLab environment with pre-loaded data science Python packages such as numpy, scipy, pandas and matplotlib, plus the capability to display interactive matplotlib graphs in your notebooks and an integrated Dask dashboard to manage and monitor your workflows with Dask.

Molecules: DataScience + nglview + 3Dmol

DataScience JupyterLab environment plus the nglview and 3Dmol lab extensions to visualize molecular structures in 3D.

RStudio with R

Default JupyterLab environment plus a pre-loaded R kernel and a lab extension to launch RStudio from within the lab interface.

MATLAB

Default JupyterLab environment plus a pre-loaded MATLAB kernel and a lab extension to launch MATLAB Desktop from within the lab interface.

File browsing#

JupyterLab and notebooks will be launched from your VSC_DATA storage by default. You can change this starting location in the configuration file of jupyter-server, which is located in your home directory at ~/.jupyter/jupyter_server_config.py. You can create this file with the contents below or add the ServerApp.root_dir and ContentsManager.root_dir settings to an existing one.

Example jupyter_server_config.py to change starting location of JupyterLab to VSC_SCRATCH#
# Configuration file for jupyter-server.
import os
c = get_config()  #noqa

## Starting directory for lab, notebooks and kernels.
c.ServerApp.root_dir = os.environ['VSC_SCRATCH']
c.ContentsManager.root_dir = os.environ['VSC_SCRATCH']
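
The configuration above raises a KeyError if VSC_SCRATCH is not defined in the environment of the server. A minimal sketch of a more defensive lookup is shown below. The helper name pick_root_dir and the fallback order are hypothetical choices made for illustration and are not part of the platform's documented configuration.

```python
import os


def pick_root_dir(candidates=("VSC_SCRATCH", "VSC_DATA"), env=None):
    """Return the value of the first variable in `candidates` that is set.

    Falls back to the user's home directory when none of them is set.
    `env` defaults to the real environment; a plain dict can be passed
    instead to try the function out.
    """
    env = os.environ if env is None else env
    for name in candidates:
        value = env.get(name)
        if value:  # skip variables that are unset or empty
            return value
    return os.path.expanduser("~")
```

In jupyter_server_config.py, both settings could then be assigned from the same call, for instance c.ServerApp.root_dir = pick_root_dir(), so that the two values never diverge.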

Regardless of your starting directory in the lab, you can access all your files and folders in the HPC from your notebooks. We recommend navigating your personal storage partitions and those of your Virtual Organization (VO) by relying on the environment variables $VSC_HOME, $VSC_DATA, $VSC_SCRATCH and their variants for VOs. See below for an example:

import os

vsc_scratch = os.environ['VSC_SCRATCH']

# change current working directory
os.chdir(vsc_scratch)

# open file by absolute path
filename = os.path.join(vsc_scratch, 'some_folder', 'some_data.txt')
with open(filename) as f:
    content = f.readlines()
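
The same idea can be written with pathlib. The sketch below collects the storage variables that are set in the current session and reports whether each location exists. The function name storage_paths is hypothetical, and the default list of variables only covers the three personal ones named above; VO variants can be passed through the names argument.

```python
import os
from pathlib import Path


def storage_paths(names=("VSC_HOME", "VSC_DATA", "VSC_SCRATCH"), env=None):
    """Map each storage variable that is set and non-empty to a Path object."""
    env = os.environ if env is None else env
    return {name: Path(env[name]) for name in names if env.get(name)}


# Report which storage locations are visible from this session.
# Outside the cluster, where the variables are unset, nothing is printed.
for name, path in storage_paths().items():
    status = "ok" if path.is_dir() else "missing"
    print(f"{name}: {path} ({status})")
```

Path objects can be combined with the / operator, for example storage_paths()["VSC_SCRATCH"] / "some_folder" / "some_data.txt" on the cluster.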
../../_images/jupyterhub-file-browser-2023a.png

File browser in JupyterLab.#

The file browser in JupyterLab can be accessed through the tab on the left panel (see screenshot on the right). This file browser can be used to navigate your starting location (VSC_DATA by default) and all its sub-folders, but it does not allow you to jump to other storage partitions.

You can open notebooks in any storage partition in Hydra from the menu File > Open from Path…. A pop-up will open where you can write the absolute path to the notebook file.

Alternatively, you can use symbolic links to quickly access any other storage from the file browser of the lab. Symbolic links are a feature of the underlying Linux system that allows you to link any existing file or folder from any location. You can create new symbolic links from a Linux shell on the HPC by using the ln -s command. The screenshot on the right shows the resulting home and scratch symbolic links on VSC_DATA from the commands below:

Create symbolic links in VSC_DATA to your scratch and home#
ln -s $VSC_SCRATCH $VSC_DATA/scratch
ln -s $VSC_HOME $VSC_DATA/home
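
The links can also be created from a notebook cell. The sketch below is one possible approach using os.symlink; the helper name ensure_symlink is hypothetical. It skips any link name that already exists, so the cell can be re-run safely. The demonstration uses a temporary directory as a stand-in for the real storage locations; on the cluster the arguments would instead come from os.environ['VSC_SCRATCH'] and os.environ['VSC_DATA'].

```python
import os
import tempfile


def ensure_symlink(target, link_path):
    """Create link_path -> target unless something already exists at link_path.

    Returns True if a new link was created, False otherwise.
    """
    if os.path.lexists(link_path):  # also detects dangling symbolic links
        return False
    os.symlink(target, link_path)
    return True


# Demonstration with stand-in directories instead of real storage.
with tempfile.TemporaryDirectory() as tmp:
    scratch = os.path.join(tmp, "scratch_storage")
    data = os.path.join(tmp, "data_storage")
    os.makedirs(scratch)
    os.makedirs(data)

    link = os.path.join(data, "scratch")
    created = ensure_symlink(scratch, link)        # first call creates the link
    created_again = ensure_symlink(scratch, link)  # second call leaves it alone
    link_target = os.readlink(link)
```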

Software modules#

The JupyterLab environment launched by the notebook platform is integrated with the software module system in the HPC. This means that you can load and use in your notebooks the same software packages used in your computational jobs.

../../_images/jupyterhub-lmod-tab-2023a.png

Module tab in JupyterLab.#

You can load software modules from the tab with a hexagon icon on the left panel of JupyterLab. This tab opens a list of loaded modules followed by a list of available modules.

Upon launch, the list of loaded modules will already show some modules loaded by JupyterLab itself. For instance, you will always see a module of Python loaded which determines the version of Python of the kernel used by your Python notebooks on this session.

Warning

Modules already loaded when your JupyterLab environment starts are necessary for the correct functioning of the lab and notebooks. They should not be unloaded.

../../_images/jupyterhub-lmod-load-2023a.png

Loading a module from the module tab in JupyterLab.#

Below loaded modules, you will find the list of available modules that can be loaded on-demand. Point your cursor to the right of the module name and a Load button will appear (see screenshot on the right). All modules shown in the list are compatible with each other, so you can load any combination of modules.

All available JupyterLab environments use a single module toolchain. You can select the toolchain of your JupyterLab session on launch from the resource selection panel. The menu Jupyter environment lists the available environments indicating the Python version and the generation of its toolchain.

Note

Any change to the list of loaded modules requires rebooting the kernel of any open notebook. After loading/unloading modules, click on the top-right button of the notebook toolbar, labelled Python 3 (ipykernel) in the screenshot below, and re-select your notebook kernel from the menu.

../../_images/jupyterhub-kernel-reload.png

Notebook toolbar.#
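
After restarting the kernel, you may want to check from inside the notebook which modules it sees. The sketch below assumes that the module system exports the colon-separated LOADEDMODULES environment variable (as Lmod and Environment Modules do) and that the kernel inherits it. This is an assumption about the platform, not documented behaviour, and the function names are hypothetical.

```python
import os


def loaded_modules(env=None):
    """Return the entries of LOADEDMODULES as a list of 'name/version' strings."""
    env = os.environ if env is None else env
    raw = env.get("LOADEDMODULES", "")
    return [entry for entry in raw.split(":") if entry]


def is_loaded(name, env=None):
    """True if a module whose name (the part before '/') equals `name` is listed."""
    return any(entry.split("/")[0] == name for entry in loaded_modules(env))


# Print what the current kernel sees; the list is empty if the variable is unset.
for module in loaded_modules():
    print(module)
```

If a module that was just loaded in the module tab is missing from this list, the kernel has most likely not been restarted yet.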

Notebook kernels#

The following table shows the notebook kernels available in all JupyterLab environments of this platform and the corresponding modules that have to be loaded to enable them:

Notebook kernels provided by software modules#

Notebook Kernel    Software Module
Python             (loaded by default)
R                  IRkernel
Julia              IJulia
MATLAB             jupyter-matlab-proxy + MATLAB

The default lab environment only loads the Python kernel on launch. You can activate any other kernel by loading its corresponding software module. Once a module providing a new kernel is loaded, a new icon will automatically appear on your lab launcher to start a notebook with that kernel.

Some specific Jupyter environments have extra kernels already loaded by default. For instance, the RStudio with R environments also load IRkernel, which makes R notebooks readily available in them.

../../_images/jupyterhub-all-kernels.png

Launchers for notebooks for Python, Julia, MATLAB and R.#
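
Each launcher corresponds to a kernel specification, that is, a directory containing a kernel.json file. As a rough illustration, the sketch below lists the kernels found in a set of directories using only the standard library. The function name list_kernels is hypothetical. The per-user directory ~/.local/share/jupyter/kernels is the usual Jupyter default on Linux, while the location of kernels provided by software modules is not covered here; running jupyter kernelspec list in a terminal shows all of them.

```python
import json
import os


def list_kernels(kernel_dirs):
    """Return {kernel_name: display_name} for each kernel.json under kernel_dirs.

    Directories that do not exist are skipped. When the same kernel name
    appears in several directories, the first one found wins.
    """
    kernels = {}
    for base in kernel_dirs:
        if not os.path.isdir(base):
            continue
        for name in sorted(os.listdir(base)):
            spec_file = os.path.join(base, name, "kernel.json")
            if os.path.isfile(spec_file):
                with open(spec_file) as f:
                    spec = json.load(f)
                kernels.setdefault(name, spec.get("display_name", name))
    return kernels


# Kernels registered for the current user, if any.
user_dir = os.path.expanduser("~/.local/share/jupyter/kernels")
print(list_kernels([user_dir]))
```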

TensorBoard#

You can start TensorBoard from within your notebooks in all Jupyter environments of the notebook platform.

  1. Load the tensorboard software module using the module panel. Make sure that it is loaded last to avoid conflicts.

  2. Reboot the kernel of any open notebook

  3. Load the TensorBoard extension in your notebook

    %load_ext tensorboard
    
  4. Launch the TensorBoard panel using some existing logs

    %tensorboard --logdir /path/to/directory/with/logs
    
../../_images/jupyterhub-tensorboard.png

TensorBoard running in a Jupyter notebook#

See also

Official guide Using TensorBoard in Notebooks

RStudio#

You can launch RStudio from the notebook platform. This environment is specific to the R language. If you are not familiar with it, please check its documentation site at education.rstudio.com.

RStudio is available through any Jupyter environment with RStudio in its name. Launching any of these environments will start a JupyterLab with the R kernel readily available for your notebooks and a launcher for RStudio.

../../_images/jupyterhub-rstudio-launcher.png

Launchers of Python notebook, R notebook and RStudio.#

MATLAB#

The Desktop interface of MATLAB is available on the notebook platform as well. This graphical interface is analogous to MATLAB Desktop but it works in the web browser. If you are not familiar with this environment, please check its documentation site at mathworks.com/help/matlab.

MATLAB Desktop is available through any Jupyter environment with MATLAB in its name. Launching these environments will start a JupyterLab with the MATLAB kernel readily available for your notebooks and a launcher for MATLAB Desktop.

../../_images/jupyterhub-matlab-launcher.png

Launchers of Python notebook, MATLAB notebook and MATLAB Desktop.#

Custom Python environments#

You can use Python virtual environments to generate custom kernels for your notebooks. Virtual environments provide a layer of isolation allowing users to install additional Python packages on top of the software modules without conflicts. Each of your virtual environments can be added as a new kernel for your notebooks and launched from the lab interface.

The main step in adding a new kernel to your JupyterLab environment from one of your virtual environments is to create the virtual environment itself.

  1. Start a new session on notebooks.hpc.vub.be in the cluster partition and with the Jupyter environment of your choice

    Note

    Software installed in virtual environments will only work in the cluster partition and Jupyter environment used for its creation.

  2. Open the Terminal from your lab interface

  3. Follow the instructions in Python virtual environments to create a new virtual environment and install any Python packages in it. Keep in mind that loading the Python module is not necessary as that is already done by the JupyterLab session. This new virtual environment can be placed anywhere you like in the storage of the cluster.

    Example sequence of commands to create a new virtual environment in the directory myenv#
    $ virtualenv --system-site-packages myenv
    $ source myenv/bin/activate
    (myenv) $
    (myenv) $ python -m pip install --upgrade pip
    (myenv) $ python -m pip install <insert_cool_package>
    
  4. Add your new virtual environment as a new Jupyter kernel (from the same terminal shell)

    $ python -m ipykernel install --user --name=myenv
    
  5. A new launcher will appear in the lab interface to start notebooks using this new virtual environment

    ../../_images/jupyterhub-custom-launcher.png

    Launchers of standard Python notebook and custom Python kernel from virtual environment#

Note

Whenever you want to reuse your existing virtual environments in the lab, remember to load any software modules used in their creation beforehand.
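
To confirm that a notebook is really running on the custom kernel, you can inspect the interpreter from a notebook cell. The sketch below assumes an environment created with a recent virtualenv or with venv, where sys.prefix differs from sys.base_prefix inside the environment. The function name in_virtual_environment is hypothetical.

```python
import sys


def in_virtual_environment():
    """True when the running interpreter belongs to a virtual environment."""
    base = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base


# In a notebook started from the custom launcher, the interpreter path
# is expected to point inside the directory of the virtual environment.
print("interpreter:", sys.executable)
print("virtual environment active:", in_virtual_environment())
```

A notebook started from the standard Python launcher should report False here, which makes it easy to tell the two kernels apart.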

Jupyter extension manager#

Extensions for JupyterLab are installed and managed by the SDC team. You will find the list of available extensions in the extension tab on the left panel (puzzle piece icon) and you can enable or disable any of them. Different Jupyter environments provide different extensions and those are only available in their corresponding environments.

The store of Jupyter extensions is disabled on the notebook platform as the available extensions for download on the store are unreviewed and they can contain malicious or malfunctioning software. If you need any Jupyter extension not yet available on the notebook platform, please contact VUB-HPC Support.