3. Data Transfer

3.1. How can I share files with other users?

The main platform to share files in Hydra is a so-called Virtual Organization (VO). It provides extra disk space in Hydra that is shared with the other members of your research group, where you can easily share any data files with them.

If you want to set up a shared directory with another user in Hydra, but joining a common VO is not an option, contact VUB-HPC Support. We will evaluate the best solution for your use case.

3.2. Can I copy files between Hydra and VUB’s OneDrive directly?

You can copy files between Hydra and the VUB OneDrive directly, using the third-party sync app OneDrive Client for Linux. This avoids copying the files to/from your local computer as an intermediate step.

Warning

Hydra and OneDrive use different filesystems which are not 100% compatible:

  • OneDrive does not distinguish capitalization in file names. Avoid having two files in the same folder whose names differ only in capitalization.

  • OneDrive does not allow filenames that contain any of the characters \/:*?"<>|. Files whose names contain any of these characters will not be synced.
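
Before the first sync, it can help to scan the sync directory for names that would trigger either of these problems. The following is a minimal sketch and not part of the OneDrive client: the function name check_onedrive_names is hypothetical, and it assumes that the standard find, grep, tr, sort and uniq tools are available.

```shell
# Hypothetical helper (not part of the OneDrive client): report names in a
# directory tree that OneDrive would reject or treat as identical.
check_onedrive_names() {
    dir="${1:-.}"

    echo "Names with characters that OneDrive does not allow:"
    # '/' cannot occur inside a file name on Linux, so only the remaining
    # characters of the forbidden set are searched for.
    (cd "$dir" && find . -print | grep -E '[\:*?"<>|]') || true

    echo "Paths that differ only in capitalization (shown in lower case):"
    # Lower-case every path and print those that then occur more than once.
    (cd "$dir" && find . -print | tr '[:upper:]' '[:lower:]' | sort | uniq -d)
}
```

For example, check_onedrive_names $VSC_DATA/onedrive/hydra-sync would scan the sync directory used later in this section.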

3.2.1. Client Authorization

  1. Run the onedrive command for the first time.

    onedrive
    

    Upon execution, a URL starting with https://login.microsoftonline.com is shown to authorize the client to access your VUB Office 365 account. The URL contains the client_id of the sync app, which should be exactly ‘d50ca740-c83f-4d1b-b616-12c519384f0c’:

    Output:
    $ onedrive
    Configuring Global Azure AD Endpoints
    Authorize this app visiting:
    
    https://login.microsoftonline.com/[...]
    
    Enter the response uri:
    
  2. Copy/paste the full URL into your browser.

  3. Log in with your credentials if necessary. You should be redirected to a blank page in your browser.

  4. Copy/paste the URL of the blank page into the prompt of onedrive in Hydra.

At this point, if there is no error, your client should have access to your account. By default, the access token to Office 365 is stored in the file ~/.config/onedrive/refresh_token.
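
To confirm that the authorization left a token behind, a check along the following lines can be used. This is only a sketch: the function name check_onedrive_token is hypothetical, and tightening the file permissions is a precaution assumed here, not a requirement stated by the client.

```shell
# Hypothetical helper: verify that the token file exists and is not empty,
# and make it readable by its owner only.
check_onedrive_token() {
    token="${1:-$HOME/.config/onedrive/refresh_token}"
    if [ -s "$token" ]; then
        chmod 600 "$token"   # the token grants access to your OneDrive
        echo "Token found: $token"
    else
        echo "No token at $token; run onedrive again to authorize" >&2
        return 1
    fi
}
```

Calling check_onedrive_token without arguments checks the default location mentioned above.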

3.2.2. Synchronize with personal OneDrive

  1. Create a directory that will be synced with your OneDrive.

    The following command creates the sync directory hydra-sync inside $VSC_DATA/onedrive (avoid using $HOME as it is small).

    mkdir -p $VSC_DATA/onedrive/hydra-sync
    
  2. Create the configuration file ~/.config/onedrive/config.

    The following commands generate the config file. The entry sync_dir is mandatory and points to the parent directory of the sync directory. We also recommend skipping symlinks and dotfiles (files whose names start with a dot) by default, to avoid unexpected data transfers, unless you know that you need them.

    config=~/.config/onedrive/config
    echo sync_dir = \"$VSC_DATA/onedrive\" > $config
    echo 'skip_symlinks = "true"' >> $config
    echo 'skip_dotfiles = "true"' >> $config
    
  3. Create the sync_list file ~/.config/onedrive/sync_list.

    The following command adds the sync directory hydra-sync to the sync_list file. This ensures that only data inside the sync directory is synced.

    echo hydra-sync > ~/.config/onedrive/sync_list
    
  4. Check if the OneDrive client has been configured correctly.

    onedrive --resync --synchronize --verbose --dry-run
    
  5. If the dry-run succeeded, re-run the above command but remove the --dry-run option to do the real sync.

    onedrive --resync --synchronize --verbose
    

    If the sync is successful, the sync directory (here: hydra-sync) should show up under My files in your VUB OneDrive.

  6. For subsequent synchronizations, also remove the --resync option to avoid another full synchronization. A resync is only needed after modifying the config or sync_list file.

    onedrive --synchronize --verbose
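
For repeated use, the dry run and the real sync from the steps above can be combined so that the real sync only starts if the dry run succeeds. The following is a minimal sketch that uses only the options shown above; the function name sync_onedrive and the ONEDRIVE_CMD override are hypothetical and exist only to make the wrapper easy to try out.

```shell
# Hypothetical wrapper: run a dry run first and start the real
# synchronization only if the dry run exits without error.
sync_onedrive() {
    # ONEDRIVE_CMD is an assumed override, handy for testing the wrapper.
    cmd="${ONEDRIVE_CMD:-onedrive}"
    if "$cmd" --synchronize --verbose --dry-run; then
        "$cmd" --synchronize --verbose
    else
        echo "Dry run failed; skipping the real sync" >&2
        return 1
    fi
}
```

Note that this wrapper omits --resync on purpose, so it is only suitable after the first full synchronization has completed.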
    

3.3. Can I copy files between Hydra and VUB’s ownCloud directly?

You can indeed copy files between Hydra and the ownCloud of VUB directly. This avoids copying the files to/from your local computer as an intermediate step. The davix tool communicates with the ownCloud server via the WebDAV protocol.

For security reasons, you should never use your netID password. Instead, generate a dedicated App password for davix in the ownCloud web interface:

  1. Go to your personal Settings page

  2. In the sidebar, click on Security

  3. Under the heading App passwords / tokens, type davix in the App name field

  4. Click on Create new app passcode

  5. Copy the generated password and save it in a secure place. This password should always be used with davix

Example: Using davix to copy files between Hydra and ownCloud

Copy the file my_awesome_text.txt from your ownCloud home directory to Hydra
 davix-get https://owncloud.vub.ac.be/remote.php/webdav/my_awesome_text.txt my_awesome_text.txt --userlogin <netID> --userpass <passwd>
Copy the file my_awesome_text.txt from Hydra to your ownCloud home directory
 davix-put my_awesome_text.txt https://owncloud.vub.ac.be/remote.php/webdav/my_awesome_text.txt --userlogin <netID> --userpass <passwd>
Copy the directory mydir recursively (-r) from Hydra to your ownCloud home directory, using 4 concurrent threads for increased speed
 davix-put mydir https://owncloud.vub.ac.be/remote.php/webdav/mydir --userlogin <netID> --userpass <passwd> -r 4
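
To avoid typing the app password on every command, and to keep it out of your shell history, the commands above can be wrapped in small helper functions. This is a sketch under assumptions: the function names, the variable DAVIX_PASSFILE and the file ~/.davix_app_password are hypothetical and not davix settings. The password file is assumed to contain only the app password and to be readable by you alone.

```shell
# Base WebDAV URL of the VUB ownCloud, as used in the examples above.
OWNCLOUD_BASE="https://owncloud.vub.ac.be/remote.php/webdav"
# Hypothetical location of a file holding the davix app password.
DAVIX_PASSFILE="${DAVIX_PASSFILE:-$HOME/.davix_app_password}"

# Build the WebDAV URL of a path in your ownCloud home directory.
# Paths with spaces or special characters would need URL encoding,
# which this sketch does not do.
owncloud_url() {
    printf '%s/%s\n' "$OWNCLOUD_BASE" "$1"
}

# Download: owncloud_get <netID> <remote path> [local path]
owncloud_get() {
    davix-get "$(owncloud_url "$2")" "${3:-$(basename "$2")}" \
        --userlogin "$1" --userpass "$(cat "$DAVIX_PASSFILE")"
}

# Upload: owncloud_put <netID> <local file> [remote path]
owncloud_put() {
    davix-put "$2" "$(owncloud_url "${3:-$(basename "$2")}")" \
        --userlogin "$1" --userpass "$(cat "$DAVIX_PASSFILE")"
}
```

With these definitions, owncloud_get <netID> my_awesome_text.txt corresponds to the first example above. The password is still passed to davix as a command-line argument, so it may remain visible to other processes on the same node while the command runs.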

See also

The davix documentation.

3.4. How can I transfer data to/from Hydra with Globus?

Hydra is already available in Globus with its own collection. The name of Hydra’s collection is VSC VUB Tier2. Follow the steps below to access Hydra from your Globus account:

  1. Install and configure Globus Connect Personal on your local computer following VSC Docs: Globus

  2. Open Globus and select the File Manager in the left panel

  3. Write VSC VUB Tier2 in the Collections field and select it

  4. At this point, the storage of Hydra will open and you can navigate it within Globus. Only data in your $VSC_DATA and $VSC_SCRATCH will be accessible

    • Path to your VSC_SCRATCH: /~/scratch/brussel/<vsc_first_3_digits>/<vsc_username>/

    • Path to your VSC_DATA: /~/data/brussel/<vsc_first_3_digits>/<vsc_username>/

Tip

Create bookmarks in Globus to easily access your data in Hydra

3.5. How can I automate the transfer of data to/from Hydra?

Automatic (scripted) data transfer between Hydra and external SSH servers can be safely done using rsync in Hydra with a secure SSH connection without password. The authentication with the external server is done with a specific pair of keys not requiring any additional password or passphrase from the user. Once the passwordless SSH connection between Hydra and the external server is configured, rsync can use it to transfer data between them.

Important

The only caveat of this method is that anybody gaining access to your Hydra account will automatically gain access to your account in the external server as well. Therefore, it is very important that you use a user account in the external server that is exclusively used for sending/receiving files to/from Hydra and that has limited user rights.

The following steps show the easiest way to set up a secure passwordless connection to an external server:

  1. Check the connection to the external server from Hydra: Log in to Hydra and try to connect to the external server with a regular SSH connection using a password. If this step does not work, your server may not be reachable from Hydra and you should contact the administrators of the external server to make it accessible:

    $ ssh <username>@<hostname.external.server>
    
  2. Create an SSH key pair without passphrase: Log in to Hydra and create a new pair of SSH keys that will be used exclusively for data transfers with external servers. The new keys have to be stored inside the .ssh folder in your home directory. In the example below, the new key is called id_filetransfer. Leave the passphrase field empty to avoid any password prompt on authentication:

    $ ssh-keygen
    Generating public/private rsa key pair.
    Enter file in which to save the key (/your/home/.ssh/id_rsa): </your/home/.ssh/id_filetransfer>
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in id_filetransfer.
    Your public key has been saved in id_filetransfer.pub.
    [...]
    
  3. Transfer the public key to the external server: The public key of the new pair created in Hydra has to be installed on the external server, while the private key stays in Hydra. In this step you will have to provide your password to connect to the external server:

    $ ssh-copy-id -i ~/.ssh/id_filetransfer <username>@<hostname.external.server>
    
  4. Configure the connection to the external server: The specific keys used in the connection with the external server can be defined in the file ~/.ssh/config. This avoids having to explicitly set the option -i ~/.ssh/id_filetransfer on every SSH connection. Add the following lines at the bottom of your ~/.ssh/config file in Hydra (create the file if it does not exist):

      Host <hostname.external.server>
          User <username>
          IdentityFile ~/.ssh/id_filetransfer
    
  5. Check the passwordless connection: At this point it should be possible to connect from Hydra to the external server with the new keys and without any prompt for a password:

    $ ssh <username>@<hostname.external.server>
    
  6. Automatic copy of files: Once the passwordless SSH connection is properly configured, rsync will automatically use it. You can execute the following commands in Hydra to either transfer data to the external server or from the external server:

    Transfer from Hydra to external server
      $ rsync -av /path/to/source <username>@<hostname.external.server>:/path/to/destination
    
    Transfer from external server to Hydra
      $ rsync -av <username>@<hostname.external.server>:/path/to/source /path/to/destination
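
For unattended use, the rsync commands above can be wrapped in a function that never waits for input and that reports the outcome. The following is a minimal sketch: the function name transfer and the RSYNC_CMD override are hypothetical. It relies on the ssh option BatchMode=yes, which makes ssh fail instead of prompting for a password if the key-based login stops working.

```shell
# Hypothetical wrapper around the rsync commands shown above.
# Usage: transfer <source> <destination>
transfer() {
    src="$1"
    dest="$2"
    # BatchMode=yes disables password prompts, so a broken key setup
    # makes the transfer fail quickly instead of hanging in a script.
    if "${RSYNC_CMD:-rsync}" -av -e "ssh -o BatchMode=yes" "$src" "$dest"; then
        echo "$(date '+%F %T') transfer of $src succeeded"
    else
        status=$?
        echo "$(date '+%F %T') transfer of $src failed (rsync exit code $status)" >&2
        return "$status"
    fi
}
```

A call such as transfer /path/to/source '<username>@<hostname.external.server>:/path/to/destination' then behaves like the first command above, and the exit code can be checked by whatever script or scheduler runs it.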