Setting up a GROMACS cluster

2. Set up networking

We need to set up the cluster so that the headnode assigns each compute node a static private IP via DHCP, and each compute node reaches the outside network through the headnode using NAT. Most of the work here is on the headnode.

Headnode

At present the public IP of the headnode is assigned by DHCP. Edit /etc/network/interfaces to give the second ethernet port (eth1), which faces the private network, a static address.

$ sudo vim /etc/network/interfaces

After editing, the file should look like

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet dhcp

auto eth1
iface eth1 inet static
address 192.168.0.100
network 192.168.0.0
netmask 255.255.255.0
broadcast 192.168.0.255
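
The network and broadcast lines follow from the address and netmask by bitwise arithmetic, so they are easy to double-check. A minimal sketch in POSIX shell, assuming a shell with 64-bit arithmetic; the helper names (ip_to_int, int_to_ip, network_of, broadcast_of) are made up for illustration and are not standard tools.

```shell
# Hypothetical helpers for checking the values in /etc/network/interfaces.

# Convert a dotted-quad IPv4 address to an integer.
# The body runs in a subshell so the IFS change does not leak out.
ip_to_int() (
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Convert an integer back to a dotted-quad address.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Network address: address AND netmask.
network_of() {
    int_to_ip $(( $(ip_to_int "$1") & $(ip_to_int "$2") ))
}

# Broadcast address: network address OR the inverted netmask.
broadcast_of() {
    mask=$(ip_to_int "$2")
    net=$(( $(ip_to_int "$1") & $mask ))
    int_to_ip $(( $net | (~$mask & 4294967295) ))
}

network_of 192.168.0.100 255.255.255.0      # prints 192.168.0.0
broadcast_of 192.168.0.100 255.255.255.0    # prints 192.168.0.255
```

Both results match the network and broadcast values used for eth1.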

Then edit the hosts file

$ sudo vim /etc/hosts

127.0.0.1 localhost
192.168.0.100 bioch6054
192.168.0.1 node01
192.168.0.2 node02
192.168.0.3 node03
..
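
With more than a handful of nodes, typing the entries by hand gets tedious. A small sketch that prints them, assuming the same 192.168.0.x addresses and nodeNN names as above; make_host_entries is a made-up helper name.

```shell
# Print /etc/hosts lines for compute nodes 1..N on the 192.168.0.x network.
# make_host_entries is a hypothetical helper, not a standard tool.
make_host_entries() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        # %02d zero-pads the node number: node01, node02, ...
        printf '192.168.0.%d node%02d\n' "$i" "$i"
        i=$((i + 1))
    done
}

make_host_entries 3
```

The output can be reviewed and then pasted into /etc/hosts.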

And finally check that the hostname is consistent with the hosts file, editing it if necessary.

$ sudo vim /etc/hostname

A reboot here wouldn’t go amiss. Now we can set up the headnode to use DHCP/NAT. First, find the addresses of the DNS servers that the headnode has obtained via DHCP and make a note of them.

$ cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 129.68.1.150
nameserver 163.1.4.1
nameserver 129.67.71.244
search bioch.ox.ac.uk

Edit

$ sudo vim /etc/sysctl.conf

so this line is uncommented

net.ipv4.ip_forward=1

Save the file and run the following command to make the change effective without a reboot.

$ sudo sysctl -w net.ipv4.ip_forward=1

Now edit /etc/rc.local

$ sudo vim /etc/rc.local

Make sure the following two lines appear before the exit 0 line in the file.

/sbin/iptables -P FORWARD ACCEPT
/sbin/iptables --table nat -A POSTROUTING -o eth0 -j MASQUERADE

To make these iptables rules active without rebooting, run the following commands:

$ sudo iptables -P FORWARD ACCEPT
$ sudo iptables --table nat -A POSTROUTING -o eth0 -j MASQUERADE
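
It is worth confirming that forwarding really is switched on before moving on. A minimal sketch that reads the kernel's live setting from /proc; forwarding_enabled is a made-up helper name, and the optional file argument is only there so the check can be tried against a test file.

```shell
# Report whether IPv4 forwarding is on. Reads /proc/sys/net/ipv4/ip_forward
# by default; a different file can be passed as the first argument.
# forwarding_enabled is a hypothetical helper, not a standard tool.
forwarding_enabled() {
    file=${1:-/proc/sys/net/ipv4/ip_forward}
    [ "$(cat "$file" 2>/dev/null)" = "1" ]
}

if forwarding_enabled; then
    echo "IPv4 forwarding is enabled"
else
    echo "IPv4 forwarding is NOT enabled"
fi

# To list the NAT rules as well (needs root):
#   sudo iptables -t nat -S POSTROUTING
```

The MASQUERADE rule on eth0 should appear in the iptables listing.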

Now install the DHCP server

$ sudo apt-get install isc-dhcp-server

and edit its configuration file

$ sudo vim /etc/dhcp/dhcpd.conf

These are the bits of the file I changed. First I needed to add the DNS servers (use the addresses noted from /etc/resolv.conf)

option domain-name-servers 129.67.1.180, 163.1.2.1, 129.67.72.244;

Mark this as the authoritative server

authoritative;

And define the subnet

subnet 192.168.0.0 netmask 255.255.255.0 {
option routers 192.168.0.100;
option subnet-mask 255.255.255.0;
option broadcast-address 192.168.0.255;
group {
    host node01 { hardware ethernet 00:24:35:f3:db:86; fixed-address 192.168.0.1; }
    host node02 { hardware ethernet 00:24:35:f3:df:7e; fixed-address 192.168.0.2; }
    host node03 { hardware ethernet 00:24:35:f3:dc:74; fixed-address 192.168.0.3; }
    }
}

The bit in the group is important: here each compute node is identified by its MAC address and assigned a fixed IP address. The Xserves have a handy pull-out card at the back that tells you the MAC address of each ethernet port, so there is no need to go digging around with ifconfig.

Finally, tell DHCP that requests will come via the second ethernet port. Edit /etc/default/isc-dhcp-server so it reads

INTERFACES="eth1"
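
The host lines in dhcpd.conf follow a regular pattern, so for a larger cluster they can be generated from a list of names and MAC addresses. A sketch, assuming the fixed addresses are simply numbered 192.168.0.1, 192.168.0.2, ... in input order; make_dhcp_hosts is a made-up helper name.

```shell
# Print dhcpd.conf host entries from "name MAC" pairs read on standard input.
# Fixed addresses are numbered 192.168.0.1, 192.168.0.2, ... in input order.
# make_dhcp_hosts is a hypothetical helper, not part of isc-dhcp-server.
make_dhcp_hosts() {
    i=1
    while read -r name mac; do
        [ -n "$name" ] || continue      # skip blank lines
        printf '    host %s { hardware ethernet %s; fixed-address 192.168.0.%d; }\n' \
            "$name" "$mac" "$i"
        i=$((i + 1))
    done
}

make_dhcp_hosts <<'EOF'
node01 00:24:35:f3:db:86
node02 00:24:35:f3:df:7e
node03 00:24:35:f3:dc:74
EOF
```

After pasting the output into the group block, running dhcpd -t -cf /etc/dhcp/dhcpd.conf checks the file's syntax without starting the server.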

Now we can set up NFS. What caught me out here is that Ubuntu 12.04 installs NFS version 4 by default, but nearly all the guides I found were written for version 3. Following a version 3 guide with the version 4 defaults sort of worked, but nodes would start to become unresponsive. Here we shall specify that we are using NFS version 3.

On the headnode, the directories that are being exported must exist; we’ve already created /opt/software above and /home already exists. Install the NFS server

$ sudo apt-get install nfs-kernel-server

Now we add the following lines to /etc/exports

/home 192.168.0.0/24(rw,sync,no_subtree_check,no_root_squash)
/opt/software 192.168.0.0/24(rw,sync,no_subtree_check,no_root_squash)

and edit the server settings

$ sudo vim /etc/default/nfs-kernel-server

making sure the line reads

RPCMOUNTDOPTS="--manage-gids --no-nfs-version 4"

and start the service

$ sudo service nfs-kernel-server start
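
Since every exported directory must exist on the headnode, a quick check of /etc/exports can catch a typo before the compute nodes try to mount anything. A minimal sketch that only handles plain, unquoted paths; missing_exports is a made-up helper name.

```shell
# List directories named in an exports file that do not exist.
# Defaults to /etc/exports; a different file can be passed as an argument.
# missing_exports is a hypothetical helper, not part of nfs-kernel-server.
missing_exports() {
    file=${1:-/etc/exports}
    [ -r "$file" ] || return 0
    while read -r dir rest; do
        case $dir in
            ''|'#'*) continue ;;        # skip blank lines and comments
        esac
        [ -d "$dir" ] || echo "$dir"
    done < "$file"
    return 0
}

missing_exports
```

No output means every exported directory was found. The active exports can then be listed with sudo exportfs -v.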

Compute nodes

There is no need to change /etc/network/interfaces on the compute nodes: they use DHCP, and by default the file is already set up as follows.

# The loopback network interface
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet dhcp

Instead, copy the /etc/hosts file over (but remember that the loopback address, 127.0.1.1, needs to refer to the name of the compute node, not the headnode), and edit the hostname. Again, we’ve already created the mount points, so add the following lines to /etc/fstab to ensure the directories are mounted at boot

bioch6054:/home /home nfs nfsvers=3
bioch6054:/opt/software /opt/software nfs nfsvers=3
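
Once the shares are mounted, it is worth checking that the compute node really negotiated version 3, given the trouble version 4 caused. A sketch that pulls the vers= option for a mount point out of /proc/mounts; nfs_version_of is a made-up helper name.

```shell
# Print the NFS protocol version in use for a mount point, taken from the
# vers= option in /proc/mounts (a different file can be given for testing).
# nfs_version_of is a hypothetical helper, not a standard tool.
nfs_version_of() {
    mountpoint=$1
    file=${2:-/proc/mounts}
    [ -r "$file" ] || return 0
    while read -r dev mnt fstype opts rest; do
        [ "$mnt" = "$mountpoint" ] || continue
        case $fstype in
            nfs|nfs4) ;;                # only look at NFS mounts
            *) continue ;;
        esac
        # Pull the value of vers= out of the comma-separated options.
        echo "$opts" | tr ',' '\n' | sed -n 's/^vers=//p'
    done < "$file"
    return 0
}

nfs_version_of /home
nfs_version_of /opt/software
```

On a compute node set up as above, each call should print 3; no output means the directory is not mounted over NFS.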

A reboot here seems to do the trick. Now that /home/fowler is shared via NFS between all the machines, we can set up an ssh keypair for logging in without passwords.

$ ssh-keygen -t rsa
$ cd .ssh/
$ cat id_rsa.pub >> authorized_keys
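
If ssh still asks for a password, permissions are a common culprit: with its default StrictModes setting, sshd ignores authorized_keys when the file or the .ssh directory is writable by other users. A minimal sketch for tightening them; secure_ssh_dir is a made-up helper name, and it takes the directory as an argument so it can be tried on a scratch copy first.

```shell
# Tighten permissions on an .ssh directory so that sshd's StrictModes
# check accepts the key files. Defaults to ~/.ssh.
# secure_ssh_dir is a hypothetical helper, not part of OpenSSH.
secure_ssh_dir() {
    dir=${1:-$HOME/.ssh}
    [ -d "$dir" ] || return 0
    chmod 700 "$dir"                                        # owner only
    [ -f "$dir/authorized_keys" ] && chmod 600 "$dir/authorized_keys"
    [ -f "$dir/id_rsa" ] && chmod 600 "$dir/id_rsa"         # private key
    return 0
}
```

Running secure_ssh_dir ~/.ssh on the headnode is enough, since /home is shared with every compute node.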

You should now have a networked cluster. Each compute node should pick up a static IP from the headnode, and everything should be reachable by a consistent naming scheme (e.g. ssh node02). Files written to the shared directories /home and /opt/software should be visible on all machines. The user fowler should be able to read and write files in /home/fowler on any machine, and you should have sudo access to any file in the shared directories.
