Main Page

Revision as of 09:10, 8 April 2020 by Admin

Welcome to UFAL COVIDcloud

UFAL COVIDcloud is a Linux-based cluster providing approximately 1000 CPU cores to support researchers working on solutions to the COVID-19 crisis. Most of our computational nodes have 123 GB of memory (reserved for jobs) and 31 CPU cores. The available storage space is 10 TB.

Notes:

Cluster availability: April 1 - June 30, 2020
Please make sure that your jobs do not run for more than three months, or that you will be able to move them elsewhere once the cluster is returned to internal use.

Access

Contact us at covidcloud@ufal.mff.cuni.cz to request an account. Please describe your project and state the number of CPU cores, the amount of data storage, and the amount of server memory you are requesting.

Basic HOWTO

The following HOWTO provides only a simplified overview of cluster usage. We strongly recommend reading the further documentation (CPU) before running serious experiments. Larger experiments tend to consume more resources, so to avoid unexpected failures please make sure that you do not exceed your quota.

Suppose we want to run some computations described by a script called job_script.sh:

#!/bin/bash
echo "This is just a test."
echo "printing parameter1: $1"
echo "printing parameter2: $2"
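
Before submitting, it can help to run the script directly and confirm that it handles its parameters as expected. The following is only a sketch: it writes a copy of the example script into a temporary directory (so nothing in your working directory is touched) and runs it with the same two parameters that are later passed to qsub. The variable names workdir and local_output are chosen here for illustration.

```shell
# Work in a throwaway directory so existing files are not overwritten.
workdir=$(mktemp -d)

# A copy of the example script; the quoted 'EOF' keeps $1 and $2 literal.
cat > "$workdir/job_script.sh" <<'EOF'
#!/bin/bash
echo "This is just a test."
echo "printing parameter1: $1"
echo "printing parameter2: $2"
EOF

# Run the script locally with the same parameters we will give to qsub.
local_output=$(bash "$workdir/job_script.sh" Hello World)
echo "$local_output"

# Clean up the temporary directory.
rm -rf "$workdir"
```

A local run like this only checks the script logic. It does not show the prologue and epilogue lines that the grid adds to the job output.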


To submit the job to the grid, log in to the submit host covid.ufal.mff.cuni.cz and issue the command:
qsub -cwd -j y job_script.sh Hello World

This enqueues our job in the default queue, which is cpu.q@*. The scheduler decides which machine in the specified queue has the resources needed to run the job. Typically, we will see a message reporting the ID of our job (82 in this example):

Your job 82 ("job_script.sh") has been submitted
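
When jobs are submitted from a wrapper script, the job ID can be extracted from this confirmation message, for example to look the job up later. The sketch below parses the message format shown above. The function name extract_job_id is hypothetical, and the message text is assumed to match the example exactly.

```shell
# Hypothetical helper: print the numeric job ID found in a qsub confirmation message.
extract_job_id() {
    # Expects a line such as: Your job 82 ("job_script.sh") has been submitted
    printf '%s\n' "$1" | sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}

# In real use the message would be captured from qsub, e.g.:
#   message=$(qsub -cwd -j y job_script.sh Hello World)
# Here we use the example message so the snippet runs without a grid.
message='Your job 82 ("job_script.sh") has been submitted'
job_id=$(extract_job_id "$message")
echo "$job_id"
```

With the ID in hand, qstat -j followed by the job ID should report details of the job while it is known to the scheduler. Grid Engine's qsub also has a -terse option that prints only the job ID, which would make the parsing step unnecessary.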

The basic options used in this example are:

  • -cwd - the script is executed in the current directory (the default is your $HOME)
  • -j y - stdout and stderr outputs are merged and redirected to a file (job_script.sh.o82)
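
Grid Engine can also read such options from special comment lines inside the script itself, so that they need not be repeated on every qsub call. The sketch below assumes that the default "#$" directive prefix is in effect on this cluster. The file name job_script_opts.sh is made up for illustration.

```shell
# Work in a throwaway directory so existing files are not overwritten.
workdir=$(mktemp -d)

cat > "$workdir/job_script_opts.sh" <<'EOF'
#!/bin/bash
# Lines starting with "#$" are read by qsub as if given on the command line.
#$ -cwd
#$ -j y
echo "This is just a test."
echo "printing parameter1: $1"
EOF

# To bash the directive lines are ordinary comments, so the script still runs locally.
directive_check=$(bash "$workdir/job_script_opts.sh" Hello)
echo "$directive_check"

rm -rf "$workdir"
```

Under that assumption, submitting such a file with qsub job_script_opts.sh Hello would be expected to behave like passing -cwd -j y on the command line.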

We passed two parameters, Hello and World. Once the job has run, its output file is written to the directory from which the job was submitted, because we used -cwd (without -cwd it would appear in your $HOME directory). The file contains stdout merged with stderr and should look like this:

AIC:ubuntu 18.04: SGE 8.1.9 configured...                                                                                              
This is just a test.
printing parameter1: Hello
printing parameter2: World
======= EPILOG: Tue Jun 4 12:41:07 CEST 2019
== Limits:   
== Usage:    cpu=00:00:00, mem=0.00000 GB s, io=0.00000 GB, vmem=N/A, maxvmem=N/A
== Duration: 00:00:00 (0 s)
== Server name: cpu-node13
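
The epilogue lines can be picked out of the output file with standard tools, for example to record which node a job ran on. The sketch below works on a shortened copy of the output shown above. The temporary file stands in for job_script.sh.o82, and the "== Server name: " prefix is assumed to be the same for every job.

```shell
# Stand-in for the real output file (e.g. job_script.sh.o82).
sample=$(mktemp)
cat > "$sample" <<'EOF'
This is just a test.
printing parameter1: Hello
======= EPILOG: Tue Jun 4 12:41:07 CEST 2019
== Duration: 00:00:00 (0 s)
== Server name: cpu-node13
EOF

# Print only the text that follows the "== Server name: " prefix.
server_name=$(sed -n 's/^== Server name: //p' "$sample")
echo "$server_name"

rm -f "$sample"
```

The same pattern, with a different prefix, could be used for the Duration or Usage lines.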