Image Cluster

The following is a description of the image group cluster computer.

Image cluster setup

You can find a description of the image cluster on the Image cluster setup page.

The installed software includes:

  • Operating system: Gentoo 64-bit Linux
  • OfficeGrid
  • Matlab 64-bit
  • Maya

How to use the cluster

Policy

The policy for using the image cluster is:

  • All computation jobs should be run via the OfficeGrid resource management system (as far as possible; ask the maintainers if you have problems).
  • Computation jobs should be written as batch jobs that only read and write files and print text to the console (due to the design of OfficeGrid); a minimal skeleton is sketched at the end of this section.
  • You may compile programs on all machines, but preferably do this on the imagediskserver1-3 machines (they are identical to imageserver1-3).
  • Visualising results, e.g. in Matlab, should be done on one of the imagediskserver1-3 machines and NOT on the imageserver1-3 machines (in order to leave resources for computations).

Benefits:

  • It is easy for you to run and manage several simultaneous jobs.
  • You are guaranteed that your job will run when enough resources are available; otherwise it will be queued until the cluster is vacant.
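
Since OfficeGrid's actual submission interface is not documented on this page, the following is only a generic sketch of the kind of batch job the policy asks for: it reads an input file, writes an output file, and prints progress text to the console. All names (run_job.sh, my_computation) are illustrative placeholders, not part of OfficeGrid.

#!/bin/sh
# run_job.sh: generic batch-job skeleton; file in, file out, text to console.
# my_computation is a placeholder for your actual program.
IN="$1"
OUT="$2"
echo "processing $IN"
./my_computation < "$IN" > "$OUT"
echo "wrote $OUT"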

Usage

Benchmark

You can find some benchmarks for the image cluster here.

Installation notes

Things to do in the future:

  • Link bonding / aggregation
  • VMware Server / Player and Windows XP
  • April 2, 2008, sporring: On 113, the USE flags have been updated in /etc/make.conf, so we need to run emerge --newuse world on this machine. But since Portage has updated its understanding of USE flags, we need to do the following on all cluster machines (see the sketch after this list):
    • emerge -C sys-apps/setarch
    • emerge -uDav --newuse world
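
A sketch of running these steps on every node over ssh. The host names and passwordless root ssh between the machines are assumptions; adapt them to the actual setup. Note that -a (--ask) prompts for confirmation, hence ssh -t for an interactive terminal; drop -a to run unattended.

for host in imageserver1 imageserver2 imageserver3 \
            imagediskserver1 imagediskserver2 imagediskserver3; do
    # remove setarch first, then rebuild world with the new USE flags
    ssh -t root@$host "emerge -C sys-apps/setarch && emerge -uDav --newuse world"
done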

Maintenance

Who are the administrators?

Currently the image cluster administrators are Pechin, Kim, and Jon. If you need any help, try contacting them.

Adding new users

Create the user account first on imagediskserver1. This should create the home directory.

Now add the user on all the other machines, making sure that the user ID and group ID are the same on all machines.

When creating new groups, likewise use the same group ID on all machines. A sketch of the whole procedure is given below.
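
A minimal sketch of creating a user with a consistent uid across the cluster. The host names, the group, the fixed uid, and passwordless root ssh are all assumptions; adapt them to the actual setup, and see the sections below for the real commands and for how to obtain the crypted password.

#!/bin/sh
# Sketch: create the same account on every cluster machine with a fixed uid.
USERNAME=myuser
FIXED_UID=1100
CRYPT='<crypt output>'   # crypt-encoded password string, see below

# Create the account with home directory on imagediskserver1 first.
ssh root@imagediskserver1 "useradd -m -g users -u $FIXED_UID -p '$CRYPT' $USERNAME"

# Then create it (without -m) on the remaining machines.
for host in imagediskserver2 imagediskserver3 imageserver1 imageserver2 imageserver3; do
    ssh root@$host "useradd -g users -u $FIXED_UID -p '$CRYPT' $USERNAME"
done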


Creating Image Group member users

All members of the Image Group (VIPs, PhD students, research assistants, etc.) should be in the group 'users' and possibly any other relevant groups (e.g. system administrators should be in the group 'wheel').


Example: How to create the user myuser. On imagediskserver1 as root:

useradd -m -g users -p <crypt output> myuser

This will create a new account with the name myuser in the group 'users' with home directory in /home/myuser.

<crypt output> is the crypt-encoded password string. You get it from the DIKU system by running the following command on ask.diku.dk (or any other host on the DIKU internal network):

ypmatch myuser passwd

The crypted password is the second field in the ':'-separated list.
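
For convenience, you can extract just that field with cut (optional, just a one-liner):

ypmatch myuser passwd | cut -d: -f2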


On the rest of the machines:
Check the user ID (uid), which can be found using the command "id":

imagediskserver1 ~ # id myuser
uid=1100(myuser) gid=100(users) groups=100(users)

Let's assume it is 1100 (don't reuse the same user ID for other users!), then on each machine do

useradd -g users -u 1100 -p <crypt output> myuser


Alternatively, consider using newusers, which reads a format similar to the passwd file (man 5 passwd).
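
For example (the field values here are illustrative): newusers takes one passwd-style line per account, from a file or standard input. Note that, unlike useradd -p, newusers expects the password field in clear text and encrypts it itself.

# name:password:uid:gid:gecos:home:shell; newusers encrypts the password itself
echo 'myuser:cleartextpassword:1100:100:My User:/home/myuser:/bin/bash' | newusers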

NOTE: We should probably make a script for this in the future or use some other authentication model.


Adding a user to other groups:
Example: Adding myuser to the group wheel

usermod -a -G wheel myuser 


Creating student users

Students doing projects within the image group can get access to the cluster, provided that they need the computational resources or access to data on the cluster. The project supervisor must approve this access. The student account will only be active for the duration of the project.

A student user should always be in the group 'students'.

Example: How to create the user myuser. On imagediskserver1 as root you can create a new account with the name myuser in the group 'students' with home directory in /home/myuser by:

useradd -m --expiredate <YYYY-MM-DD> -g students -p <crypt output> myuser

Here <YYYY-MM-DD> is the date on which the account will be disabled. Set this date to shortly after the project deadline.

<crypt output> refers to the crypt-encoded password string. You get it from the DIKU system by running the following command on ask.diku.dk (or any other host on the DIKU internal network):

ypmatch myuser passwd

The crypted password is the second field in the ':'-separated list.


On the rest of the machines:
Check the user ID (uid), which can be found using the command "id":

imagediskserver1 ~ # id myuser
uid=1100(myuser) gid=443(students) groups=443(students)

Let's assume it is 1100 (don't reuse the same user ID for other users!), then on each machine do

useradd --expiredate <YYYY-MM-DD> -g students -u 1100 -p <crypt output> myuser
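
To check on each machine that the expiry date took effect, chage can be used (a read-only query):

chage -l myuser | grep 'Account expires'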

Automatic synchronization of passwords with the diku system

Rune is working on this.

Backup

Currently there is no backup of the cluster. The only safety lies in the use of RAID 5 on all disk servers, which in theory should tolerate one failed disk per disk server. We are working on a backup solution.
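
Until a proper solution exists, one interim sketch could be a nightly rsync mirror of the home directories to an off-cluster host. The host name backuphost and the destination path are hypothetical, and note that a mirror with --delete protects against disk loss but not against accidental deletion.

# one-way nightly mirror of /home; not versioned, just a second copy
rsync -aH --delete /home/ root@backuphost:/backup/imagecluster/home/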
