Revision as of 16:04, 28 April 2017
All information on this page is subject to change! In particular, the hostnames are going to be replaced by nicer alias names.
The cluster consists of two partitions:
- image1 with 11 compute nodes, each with 2x20 Intel cores
- image2 with 1 compute node with 8x8 AMD cores
Basic Access and First Time Setup
To access the cluster, you need access to ssh-diku-image.science.ku.dk. Ask one of the admins to grant you access. You then log in in two steps, with <kuid> being your KU username:
ssh <kuid>@ssh-diku-image.science.ku.dk
ssh a00552
The simplest way is to add an entry to .ssh/config:
Host cluster
    HostName a00552
    User <kuid>
    ProxyCommand ssh -q -W %h:%p <kuid>@ssh-diku-image.science.ku.dk
With this in place, you can log in directly via

ssh cluster
This also comes in handy if you want to copy files via scp:
scp -r my_file1 my_file2 my_folder/ cluster:~/Dir
This will copy my_file1, my_file2, and my_folder/ into the path /home/<kuid>/Dir/. All files in your home directory are available on all compute nodes. You can also copy files back simply by
scp -r cluster:~/Dir ./
After your first login, you have to set up a private key that allows password-free login to any of the other nodes. This is required for Slurm to function properly! Simply execute the following; when asked for a passphrase, leave it blank:
ssh-keygen
ssh-copy-id a00553
Slurm is a batch processing manager that allows you to submit tasks and request a specific amount of resources, which are reserved for the job. Resources are, for example, memory, number of processing cores, GPUs, or even a number of machines. Moreover, Slurm makes it easy to start arrays of jobs, for example to benchmark an algorithm with different parameter settings. When a job is submitted, it is placed in the waiting queue and stays there until the required resources are available. Slurm is therefore perfectly suited for executing long-running tasks.
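As a sketch of such a parameter sweep as a job array (not from the cluster docs: my_program and the parameter values are hypothetical; Slurm sets SLURM_ARRAY_TASK_ID for each array element, and the fallback lets the snippet run outside Slurm too):

```shell
#!/bin/bash
#SBATCH --job-name=ParamSweep
#SBATCH --ntasks=1
# Start four array elements, indexed 0..3
#SBATCH --array=0-3

# Hypothetical parameter values to benchmark
PARAMS=(0.1 0.5 1.0 5.0)
# Slurm exports SLURM_ARRAY_TASK_ID per array element; default to 0 outside Slurm
IDX=${SLURM_ARRAY_TASK_ID:-0}
echo "running with parameter ${PARAMS[$IDX]}"
# ./my_program --alpha "${PARAMS[$IDX]}"   # hypothetical program call
```

Each array element becomes its own job, so the four parameter settings are queued and scheduled independently.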
To see how many jobs are queued, type

squeue
To submit a job, use

sbatch sbatchscript.sh
where sbatchscript.sh is a normal bash or sh script that also contains information about the resources to allocate. A minimal version looks like:
#!/bin/bash
#SBATCH --job-name=MyJob
# Number of independent tasks we are going to start in this script
#SBATCH --ntasks=1
# Number of CPUs we want to allocate for each program
#SBATCH --cpus-per-task=4
# We expect that our program should not run longer than 2 days
# Note that a program will be killed once it exceeds this time!
#SBATCH --time=2-00:00:00
# Skipping many options! See man sbatch.
# From here on, we can start our program
./my_program option1 option2 option3
./some_post_processing
Jobs run on the node in the same path from which you submitted them, so storing files relative to your current path works flawlessly.
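To sketch what this means in practice (the file and directory names here are hypothetical), a job script can write its results relative to wherever you ran sbatch, and they end up next to your input files:

```shell
#!/bin/bash
# Paths below resolve relative to the directory you submitted the job from
mkdir -p results
echo "42" > results/output.txt   # hypothetical result file
cat results/output.txt
```

After the job finishes, results/output.txt sits in your submission directory in your home folder, so it is visible from every node and from the login host.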