Image Cluster
This page describes the Image Group's cluster computer.
Image cluster setup
You can find a description of the image cluster on the Image cluster setup page.
Installed software includes:
- Operating system: Gentoo 64-bit Linux
- OfficeGrid
- Matlab 64-bit
- Maya
How to use the cluster
Policy
The policy for usage of the image cluster is that:
- All computation jobs should be run via the OfficeGrid resource management system (as far as possible - ask the maintainers if you have problems).
- Computation jobs should be written as batch jobs that only read and write files and print text to the console (due to the design of OG); see the sketch after this list.
- You may compile programs on all machines, but preferably do this on the imagediskserver1-3 machines (they are identical to imageserver1-3).
- Visualising results, e.g. in matlab, should be done on one of the imagediskserver1-3 machines and NOT on the imageserver1-3 machines (in order to leave resources for computations).
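A minimal sketch of such a batch job is shown below; run_experiment, input.mat, and output.mat are hypothetical names for illustration, not actual cluster specifics:
#!/bin/sh
# Sketch of an OfficeGrid-friendly batch job: it only reads and writes
# files and prints text to the console (no GUI, no user interaction).
echo "starting job on $(hostname)"
matlab -nodisplay -nosplash -r "run_experiment('input.mat','output.mat'); exit"
echo "job finished, results written to output.mat"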
Benefits:
- It is easy for you to run and manage several simultaneous jobs.
- You are guaranteed that your job will run when enough resources are available; otherwise it will be queued until the cluster has free capacity.
Usage
- How to access the cluster
- Using Office Grid (resource management)
- How to use software on the cluster
- Temporary (scratch) disk space
- Data disk space
- Using X11 forwarding and Cygwin-X in the cluster
- Mounting a USB harddisk in the cluster
Benchmark
You can find some benchmarks of the image cluster here.
Installation notes
- Partitioning the disks
- Installation of Gentoo (minimal 2007.0 installation, amd64 profile)
- Installation of matlab
- Installation of NFS
- Installation of Maya
- Installation of ntfs-3g driver for Big disk
- Installation of Office Grid
- Installation note for imageserver3
Things to do in the future:
- Link bonding / aggregation
- VMware server / player and Windows XP
- On 113, the USE flags have been updated, so we need to run emerge --newuse world on that machine. And since Portage has updated its understanding of USE flags, we need to do the following on all cluster machines:
- emerge -C sys-apps/setarch
- emerge -uDav --newuse world
Maintenance
Who are the administrators?
Currently the image cluster administrators are Pechin, Kim, and Jon. If you need any help, try contacting them.
Adding new users
Create the user account first on imagediskserver1; this should create the home directory.
Then add the user to all the other machines, making sure that the user ID and group ID are the same on all machines.
When creating new groups, the group ID should likewise be the same on all machines.
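A quick way to check that the IDs are consistent is to compare them across machines, for example (root ssh access to all hosts is an assumption here):
for host in imagediskserver1 imagediskserver2 imagediskserver3 \
            imageserver1 imageserver2 imageserver3; do
    echo -n "$host: "
    ssh root@$host id myuser
done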
Creating Image Group member users
All members of the Image Group (VIP, PhD, research assistants, etc.) should be in the group 'users' and possibly any other relevant groups (e.g. system administrators should be in the group 'wheel').
Example: How to create the user myuser
On imagediskserver1 as root:
useradd -m -g users -p <crypt output> myuser
This will create a new account with the name myuser in the group 'users' with home directory in /home/myuser.
<crypt output> is the crypt-encoded password string. You can get it from
the diku system by running the following command on ask.diku.dk (or any other host on the diku internal network):
ypmatch myuser passwd
The crypted password is the second field in the ':'-separated list.
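If you want to extract the field directly, something like the following should work (a sketch, assuming ypmatch prints a standard passwd-format line):
ypmatch myuser passwd | cut -d: -f2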
On the rest of the machines:
Check the user ID (called the uid), which can be found using the command "id":
imagediskserver1 ~ # id myuser
uid=1100(myuser) gid=100(users) groups=100(users)
Let's assume it is 1100 (don't use the same user ID for all users!); then on each of the other machines do
useradd -g users -u 1100 -p <crypt output> myuser
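To avoid repeating this by hand on every machine, the remaining accounts can be created in one loop, for example (hostnames and root ssh access are assumptions):
for host in imagediskserver2 imagediskserver3 \
            imageserver1 imageserver2 imageserver3; do
    ssh root@$host "useradd -g users -u 1100 -p '<crypt output>' myuser"
done
Keep the crypt string single-quoted, since it usually contains '$' characters that the shell would otherwise try to expand.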
Alternatively, consider using newusers, which reads a file in a format similar to the passwd file (man 5 passwd).
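For example, a minimal newusers input file (here called newusers.txt, a hypothetical name) could look like this; note that, unlike useradd -p, newusers expects a plaintext password in the second field and encrypts it itself:
myuser:plaintextpassword:1100:100:My User:/home/myuser:/bin/bash
and would be applied with:
newusers newusers.txt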
NOTE: We should probably make a script for this in the future or use some other authentication model.
Adding a user to other groups:
Example: Adding myuser to the group wheel
usermod -a -G wheel myuser
Creating student users
Students doing projects within the image group can get access to the cluster, provided that they need the computational resources or access to data on the cluster. The project supervisor must approve this access. The student account will only be active for the duration of the project.
A student user should always be in the group 'students'.
Example: How to create the user myuser
On imagediskserver1 as root you can create a new account with the name myuser in the group
'students' with home directory in /home/myuser by:
useradd -m --expiredate <YYYY-MM-DD> -g students -p <crypt output> myuser
Here <YYYY-MM-DD> is the date on which the account will be disabled. Set this date to shortly after the project deadline.
<crypt output> refers to the crypt-encoded password string. You can get it from
the diku system by running the following command on ask.diku.dk (or any other host on the diku internal network):
ypmatch myuser passwd
The crypted password is the second field in the ':'-separated list.
On the rest of the machines:
Check the user ID (called the uid), which can be found using the command "id":
imagediskserver1 ~ # id myuser
uid=1100(myuser) gid=443(students) groups=443(students)
Let's assume it is 1100 (don't use the same user ID for all users!); then on each of the other machines do
useradd --expiredate <YYYY-MM-DD> -g students -u 1100 -p <crypt output> myuser
Automatic synchronization of passwords with the diku system
Rune is working on this.
Backup
Currently there is no backup of the cluster. The only safeguard is the use of RAID 5 on all disk servers, which in theory should tolerate one faulty disk per disk server. We are working on a backup solution.