Image Cluster

From ImageWiki


The following is a description of the image group cluster computer.


Image cluster setup

Old cluster

A description of the old image cluster can be found on the Image cluster setup page.

Installed software includes:

  • Operating system: Gentoo 64-bit Linux
  • OfficeGrid
  • Matlab 64-bit
  • Maya

New cluster

A description of the new image cluster can be found on the New image cluster setup page.

Installed software includes:

How to use the cluster

Policy

Old cluster

The usage policy for the image cluster is:

  • All computation jobs should be run via the OfficeGrid resource management system (as far as this is possible; ask the maintainers if you have problems).
  • Computation jobs should be written as batch jobs that only read and write files and print text to the console (due to the design of OG).
  • You may compile programs on all machines, but preferably do this on the imagediskserver1-3 machines (they are identical to imageserver1-3).
  • Visualising results, e.g. in Matlab, should be done on one of the imagediskserver1-3 machines and NOT on the imageserver1-3 machines (in order to leave resources for computations).
  • To help the OfficeGrid scheduler, you should provide as much information as possible about your resource usage (number of cores, memory, time, etc.).

How many jobs can I run at the same time?

  • If your jobs need more than 1 hour computation time, you should only run jobs taking up no more than 4 cores.
  • If your jobs need less than 1 hour computation time, you may use more than 4 cores, but adjust your core usage to the current load on the cluster (think of your colleagues).
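
One way to judge the current load before taking more than 4 cores is to compare the 1-minute load average with the number of cores on a machine. The following is only a rough sketch: the helper name free_cores is made up for illustration, it looks at a single machine rather than the whole cluster, and it assumes a Linux host with /proc/loadavg and nproc. It does not query OfficeGrid.

```shell
#!/bin/sh
# Estimate idle cores from a 1-minute load average and a core count.
# Usage: free_cores <load> <cores>   (hypothetical helper, not part of OG)
free_cores() {
    awk -v load="$1" -v cores="$2" 'BEGIN {
        free = int(cores - load)   # drop the fractional part
        if (free < 0) free = 0     # an overloaded machine has no idle cores
        print free
    }'
}

# On a Linux machine, take the values from /proc/loadavg and nproc.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r load1 _rest < /proc/loadavg
    echo "Roughly $(free_cores "$load1" "$(nproc)") idle cores on $(uname -n)"
fi
```

The load average also counts processes waiting for disk I/O, so treat the number as a hint rather than an exact count.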

Benefits:

  • It is easy for you to run and manage several simultaneous jobs.
  • Your job is guaranteed to run when enough resources are available; otherwise it is queued until the cluster has free capacity.

New cluster

TBD

Usage

Benchmark

Some benchmarks of the image cluster can be found here.


Maintenance

Who are the administrators?

Currently the image cluster administrators are Kim, Christian, Jan, Niklas, and Jon. If you need any help, contact imsupport@diku.dk.

Adding new users (New cluster)

1. Ask Kim to ask SCIENCE-IT to add the user to the AD group sec-nat-l-ssh-diku-image_users.

Local accounts

1. Create local accounts. For employees: On nfs2-diku-image do

 sudo /usr/sbin/useradd -m -g users <KU-login>-l -c '<The real name of the user>'
 id <KU-login>-l # Remember the user ID assigned to this new user

On the rest of the machines do:

 sudo /usr/sbin/useradd -u <userID> -g users <KU-login>-l -c '<The real name of the user>'

For students: On nfs2-diku-image do

 sudo /usr/sbin/useradd -m -g users -e <YYYY-MM-DD> -f 30 <KU-login>-l -c '<The real name of the user>, student of <supervisor name>'
 id <KU-login>-l # Remember the user ID assigned to this new user

On the rest of the machines do:

 sudo /usr/sbin/useradd -u <userID> -g users -e <YYYY-MM-DD> -f 30 <KU-login>-l  -c '<The real name of the user>, student of <supervisor name>'


2. Set the password on all machines together with the new user. On each machine do

 sudo passwd <KU-login>-l
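
Because the same useradd line has to be repeated on every remaining machine, it can help to generate it once and copy it around. The sketch below only prints the command and does not run it. The function name local_useradd_cmd, the host names in HOSTS, and the example uid, login and real name are all made-up placeholders.

```shell
#!/bin/sh
# Print (not run) the useradd command for one of the remaining machines.
# Usage: local_useradd_cmd <uid> <KU-login> <comment> [<expiry YYYY-MM-DD>]
local_useradd_cmd() {
    uid=$1; login=$2; comment=$3; expiry=${4:-}
    opts="-u $uid -g users"
    # Student accounts: -e sets the account expiry date, and -f 30 disables
    # the account 30 days after its password has expired.
    if [ -n "$expiry" ]; then
        opts="$opts -e $expiry -f 30"
    fi
    # Assumes the comment contains no single quote.
    printf "sudo /usr/sbin/useradd %s %s-l -c '%s'\n" "$opts" "$login" "$comment"
}

# HOSTS is a hypothetical placeholder list; replace it with the real machine names.
HOSTS="hostA hostB"
for host in $HOSTS; do
    printf '%s: ' "$host"
    local_useradd_cmd 1234 abc123 'Example User'
done
```

Passing a fourth argument, e.g. 2030-01-31, produces the student variant. The password still has to be set with sudo passwd on each machine afterwards.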

Adding new users (Old cluster)

Create the user account on imagediskserver1 first. This should create the home directory.

Now add the user to all the other machines, making sure that the user ID and group ID are the same on all machines.

New groups must likewise be created with the same group ID on all machines.
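
A quick way to confirm that the IDs really match is to collect them from every machine and compare them against the first one. This is a minimal sketch: check_ids is a made-up helper name, hostB and hostC are placeholder host names, and the ID values are invented for the demonstration.

```shell
#!/bin/sh
# Read lines of the form "<host> <uid> <gid>" on standard input and report
# every host whose IDs differ from those on the first line.
# Returns non-zero if any mismatch is found.
check_ids() {
    awk '
        NR == 1 { uid = $2; gid = $3; next }
        $2 != uid || $3 != gid {
            printf "MISMATCH on %s: uid=%s gid=%s (expected uid=%s gid=%s)\n", $1, $2, $3, uid, gid
            bad = 1
        }
        END { exit bad + 0 }
    '
}

# Demonstration with made-up values; hostB and hostC are hypothetical names.
check_ids <<'EOF' || echo "IDs are not consistent"
imagediskserver1 1100 100
hostB 1100 100
hostC 1101 100
EOF

# With ssh access to the machines (an assumption), the input could come from:
#   for h in <machines>; do
#       printf '%s %s %s\n' "$h" $(ssh "$h" 'id -u myuser; id -g myuser')
#   done | check_ids
```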


Creating Image Group member users

All members of the Image Group (VIP, PhD, research assistants, etc.) should be in the group 'users' and possibly any other relevant groups (e.g. system administrators should be in the group 'wheel').


Example: How to create the user myuser
On imagediskserver3 using sudo:

useradd -m -g users -p '<crypt output>' -c '<real name of user>' myuser

This will create a new account with the name myuser in the group 'users' with home directory in /home/myuser.

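<crypt output> is the crypt-encoded password string. Put it in single quotes on the command line so that the shell does not interpret any '$' characters in it. You get it from the DIKU system by running the following command on ask.diku.dk (or any other host on the DIKU internal network):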

ypmatch myuser passwd

The crypted password is the second string in the ':' separated list.
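
The second field can be cut out directly instead of being copied by hand. A small sketch: crypt_field is a made-up helper name, and the passwd entry in the demonstration is invented, not a real hash.

```shell
#!/bin/sh
# Print the second ':'-separated field (the crypted password) of each
# passwd-format line read from standard input.
crypt_field() {
    cut -d: -f2
}

# Demonstration on a made-up entry (not a real hash):
printf '%s\n' 'myuser:abJnggxhB/yWI:1100:100:Real Name:/home/myuser:/bin/bash' | crypt_field

# On a host where ypmatch is available, the same filter would be used as:
#   ypmatch myuser passwd | crypt_field
```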


On the rest of the machines:
Check the user ID (called uid) - can be found using the command "id":

imagediskserver1 ~ # id myuser
uid=1100(myuser) gid=100(users) groups=100(users)

Let's assume it is 1100 (don't use the same user ID for all users!), then on each machine do

useradd -g users -u 1100 -p '<crypt output>' myuser


Alternatively, consider using newusers, which uses a format similar to that of the passwd file (man 5 passwd).

NOTE: We should probably make a script for this in the future or use some other authentication model.


Adding a user to other groups:
Example: Adding myuser to the group wheel

usermod -a -G wheel myuser 
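
After changing groups it is worth checking the result, since usermod without -a would replace the supplementary groups instead of extending them. A minimal sketch: has_group is a made-up helper name, and the demonstration inspects the current user rather than myuser.

```shell
#!/bin/sh
# Succeed if <group> appears in the space-separated <group list>.
# Usage: has_group "<group list>" <group>   (hypothetical helper)
has_group() {
    # $1 is left unquoted on purpose so that it splits into one name per line.
    printf '%s\n' $1 | grep -qxF -- "$2"
}

# id -nG prints the group names of the current user; use "id -nG myuser"
# to inspect another account.
if has_group "$(id -nG)" wheel; then
    echo "current user is in wheel"
else
    echo "current user is not in wheel"
fi
```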

Creating student users

Students doing projects within the image group can get access to the cluster, provided that they have the need for the computational resources or access to data on the cluster. The project supervisor must approve this access. The student account will only be active for the duration of the project.

A student user should always be in the group 'students'.

Example: How to create the user myuser
On imagediskserver3 as root you can create a new account with the name myuser in the group 'students' with home directory in /home/myuser by:

useradd -m --expiredate <YYYY-MM-DD> -g students -p '<crypt output>' -c '<real name of the user and reason for the account>' myuser

Here <YYYY-MM-DD> is the date on which the account will be disabled. Set this date shortly after the project deadline.

<crypt output> refers to the crypt-encoded password string. You get it from the DIKU system by running the following command on ask.diku.dk (or any other host on the DIKU internal network):

ypmatch myuser passwd

The crypted password is the second string in the ':' separated list.


On the rest of the machines:
Check the user ID (called uid) - can be found using the command "id":

imagediskserver1 ~ # id myuser
uid=1100(myuser) gid=443(students) groups=443(students)

Let's assume it is 1100 (don't use the same user ID for all users!), then on each machine do

useradd --expiredate <YYYY-MM-DD> -g students -u 1100 -p '<crypt output>' myuser

Automatic synchronization of passwords with the diku system

Old Cluster

None!

New cluster

We use the KU login system (a.k.a. the Swedish license plate), so if you change your KU password then this is automatically updated on these machines.

Backup

Currently there is no backup of the cluster. The only safety lies in the use of RAID 5 on all disk servers, which in theory should provide tolerance towards one faulty disk per disk server. We are working on a backup solution.
