How to set up a Gramble

Philip Fowler, 14th April 2016

This is a Gramble, which of course is short for a GROMACS Bramble, or, in other words, a Raspberry Pi 2 model B cluster running GROMACS. Given that the ARM processor in a Raspberry Pi 2 cannot match the SIMD instructions of the more complex (and expensive) Intel chips, why would I want to do such a thing? Well, I wanted to learn how to set up a simple compute cluster. And this is what I did. Unless stated otherwise you need to do this on both machines (or however many you are using).

1. Install Ubuntu 14.04 LTS Server onto the microSD cards

Each Raspberry Pi 2 runs off a microSD card. On a computer with a microSD slot (I used an iMac), download the Ubuntu 14.04 LTS image and copy it onto the microSD card as described here. Then you simply push the card into the slot on the Raspberry Pi and power it up. Note that Ubuntu does not run on the model A; at the time of writing, it only ran on the model B. If your microSD card is bigger than 2GB you might want to resize the partition.

2. Update

First, let’s update the installed software and also install the ssh server so we can connect remotely.

    sudo apt-get update -y
    sudo apt-get upgrade -y
    sudo apt-get install openssh-server

3. Set up the network

Now my setup was a bit strange. I was using an old Apple Time Capsule I had; both Raspberry Pis were connected to this via ethernet cables and the Time Capsule itself was in “Extend Wireless Network” mode since our main wireless router is somewhere else in the house. Ideally, I’d want a dual-homed headnode with one public IP address and then a private network for communication within the cluster. Instead each of the two Raspberry Pis has its own IP address that is dynamically assigned by my router, but this will do for now.

Edit /etc/hosts

    sudo nano /etc/hosts

so it reads

    127.0.0.1 localhost
    192.168.0.12 rasp0
    192.168.0.18 rasp1

Also edit the hostname

    sudo nano /etc/hostname

so it matches the name given to that machine in /etc/hosts (rasp0 on the headnode, rasp1 on the compute node)

    rasp0

and reboot

    sudo reboot

4.
Add an MPI user

We will need a special user that can log in without passwords to all the nodes that SLURM will use later on. As I understand it, giving it a uid less than 1000 stops the user appearing in any login GUI.

    sudo adduser mpiuser --uid 999

5. Install NFS

We will share a folder on the headnode with all the compute nodes using the NFS protocol. This means we’ll only need to install applications on the headnode and they will be accessible from any compute node. This is also where the GROMACS output files will be written.

On the headnode (rasp0)

    sudo apt-get install nfs-kernel-server

On the compute node (rasp1)

    sudo apt-get install nfs-common

On the headnode (rasp0) add the following to /etc/exports

    /home/mpiuser *(rw,sync,no_subtree_check)
    /apps *(rw,sync,no_subtree_check)

This will export the folders /apps and /home/mpiuser on rasp0 to all the compute nodes (in this case just rasp1). You need to make sure all folders shared by NFS exist on both machines. So on both machines

    sudo mkdir /apps

You don’t need to mkdir /home/mpiuser as creating this user will have automatically created a home directory for it. Now on the headnode

    sudo service nfs-kernel-server start

On the compute node (rasp1),

    sudo ufw allow from 192.168.0.0/24
    sudo mount rasp0:/home/mpiuser /home/mpiuser
    sudo mount rasp0:/apps /apps

The first line allows traffic from the local subnet through the firewall, although I’m not sure I actually needed to do this. The last two manually mount the NFS shares from the headnode (rasp0). To set it up so this happens automatically

    sudo nano /etc/fstab

and add

    rasp0:/home/mpiuser /home/mpiuser nfs
    rasp0:/apps /apps nfs

then we can force a remount via

    sudo mount -a

6. Create an SSH key pair to allow passwordless login

Because we now have /apps and /home/mpiuser shared with all nodes of the cluster (ok, just rasp1, but you know what I mean), we can simply create an ssh keypair as mpiuser on the headnode and it will be shared with all the compute nodes.
So on rasp0

    su mpiuser
    ssh-keygen -t rsa
    cd .ssh/
    cat id_rsa.pub >> authorized_keys

I didn’t use a passphrase during key generation. I expect this is a bad thing and I did read you could use a key chain, but as this is a toy cluster I’m going to stick my fingers in my ears and pretend I didn’t read that.

If you haven’t created an ssh keypair before, it is fairly simple: it creates a public and a private key. This is described in more detail here. The key things are, first, that the private key (.ssh/id_rsa) should only be readable by the mpiuser and no-one else. In Linux-land, this means it should have permissions of 600 (or stricter, such as 400); ssh-keygen creates it with 600. Secondly, any remote machine will allow a passwordless login if the public key for that user is in .ssh/authorized_keys; this explains the last line above.

Let’s test it. Since we are already the mpiuser and we are on rasp0

    ssh rasp1

should automatically log you into rasp1. If you try the same thing as the default ubuntu user it will prompt you for your password as that user doesn’t have an ssh keypair set up.

7. Compile GROMACS

As we are thinking about NFS, let’s compile GROMACS in /apps so the gmx binary can be run from any of the compute node(s). We need a few things before we begin.

    sudo apt-get install build-essential cmake

The first package contains the compilers you’ll need to, well, compile GROMACS, and cmake is the build tool GROMACS uses. So as the mpiuser,

    cd /apps
    mkdir src
    cd src/
    wget ftp://ftp.gromacs.org/pub/gromacs/gromacs-5.1.2.tar.gz
    tar -xzf gromacs-5.1.2.tar.gz
    cd gromacs-5.1.2/
    mkdir build-gcc48
    cd build-gcc48
    cmake .. -DGMX_BUILD_OWN_FFTW=ON -DCMAKE_INSTALL_PREFIX=/apps/gromacs/5.1.2 -DBUILD_SHARED_LIBS=off
    make -j 4
    sudo make install

Note the unusual -DBUILD_SHARED_LIBS flag in the GROMACS cmake command; this is to get around an error when compiling GROMACS on the Raspberry Pi. You shouldn’t normally need this flag. Now the make command will take at least ten minutes, so put the kettle on.
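Since the build takes a while, it can be worth checking beforehand that the tools pulled in by build-essential and cmake are actually on the PATH. This is only a minimal sketch, assuming a bash shell; missing_tools is a hypothetical helper name, not something provided by GROMACS or Ubuntu, and the list of tools checked is my guess at what the steps above rely on.

```shell
#!/bin/bash
# Hypothetical helper: print the name of every tool in the argument list
# that cannot be found on the PATH (prints nothing if all are present).
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed list of tools used by the download and build steps.
missing=$(missing_tools gcc g++ make cmake wget tar)
if [ -n "$missing" ]; then
    # Unquoted on purpose so the names are printed on one line.
    echo "Missing build tools:" $missing
else
    echo "All build tools found"
fi
```

Running the same check on a compute node should show quickly whether build-essential still needs installing there.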
Because of dynamic linking, you’ll also need to install the compilers on the compute nodes via

    sudo apt-get install build-essential

I suspect using environment modules might avoid this issue; I’m going to play with these and if I can get it to work will write another post. Once you’ve done this, then on any machine you should be able to run GROMACS via

    source /apps/gromacs/5.1.2/bin/GMXRC
    gmx mdrun

8. Install a cluster management and job scheduling system (SLURM)

Despite the fact the machines I’ve used in the past have tended to use PBS or SGE (and so my fingers can type qstat really, really fast), I chose to use SLURM as

- it is available as an Ubuntu package
- our University high performance computing centre has recently started using it and they recommended it
- it is actively developed and I can’t work out, or at least remember for more than five minutes, what is going on with SGE
- it has documentation! and tutorials!
- it is open source (GPL2)
- I liked the name

To install on rasp0

    sudo apt-get install slurm-llnl

This also installs MUNGE as a pre-requisite (see the next section). Now SLURM appeared to want to use /usr/bin/mail and complained when it couldn’t find it so I also installed

    sudo apt-get install mailutils

which drops you into a setup screen, where I chose the “local” option. Also on rasp1

    sudo apt-get install slurm-llnl

9. Get MUNGE working

MUNGE creates and validates credentials and SLURM uses it. On the headnode

    sudo /usr/sbin/create-munge-key

This creates a key, /etc/munge/munge.key. Now copy this key to /etc/munge/ on all nodes (you may need to fiddle with permissions etc. to use the NFS share). There appears to be a bug with Ubuntu and MUNGE, but the workaround is to do the following on all nodes

    sudo nano /etc/default/munge

and add the line

    OPTIONS="--force"

Now start the service

    sudo service munge start

Check it is running

    ps -e | grep munge

10. Get SLURM working

This was the bit I wasn’t looking forward to as job schedulers have, frankly, scared me.
But it turns out this was one of the easiest steps. SLURM comes with a very simple configuration file that we can edit. As the mpiuser on rasp0,

    cp /usr/share/doc/slurm-llnl/examples/slurm.conf.simple.gz .
    gunzip slurm.conf.simple.gz
    nano slurm.conf.simple

All I did was change the lines so they read

    ControlMachine=rasp0
    ..
    NodeName=rasp[0-1] Procs=4 State=UNKNOWN
    PartitionName=test Nodes=rasp[0-1] Default=YES MaxTime=INFINITE State=UP

You’ll notice I’ve identified rasp0 as both the ControlMachine (i.e. headnode) and also a Node (compute node) belonging to the test Partition. On a regular cluster you probably don’t want the headnode also being a compute node, but I only had two Raspberry Pis so I thought why not? This also shows the syntax for referring to multiple nodes. If you want a more complex configuration an online configuration tool is provided. There is also an even more complicated online configurator. A note of caution: these may not work with the version of SLURM installed by apt-get (2.6.5) since the current version is 15.08. That doesn’t mean 2.6.5 is old; they’ve changed the numbering system recently.

Finally copy the file to the right place on all nodes

    sudo cp slurm.conf.simple /etc/slurm-llnl/slurm.conf

and (on the headnode, rasp0)

    sudo service slurm start

On the compute node

    sudo slurmd -c

Test!

    srun -N1 hostname

or

    sinfo

11. Submit a GROMACS job to the queue

I’m going to assume you have prepared a TPR file called md.tpr and have copied it into /home/ubuntu (and we are now the default user, ubuntu). Let’s do some simple benchmarking; remember a Raspberry Pi 2 has 4 cores. So first, let’s create a series of TPR files

    cp md.tpr md-1.tpr
    cp md.tpr md-2.tpr
    cp md.tpr md-4.tpr

Now let’s create some SLURM job submission files. This is the one for running on two cores; for the other core counts you’ll need to change the --cpus-per-task and --job-name SBATCH flags and the -deffnm and -ntmpi GROMACS flags.
    sudo nano md-2.slurm.sh

and copy in

    #!/bin/bash
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=1
    #SBATCH --cpus-per-task=2
    #SBATCH --time=00:15:00
    #SBATCH --job-name=md-2
    source /apps/gromacs/5.1.2/bin/GMXRC
    srun gmx mdrun -deffnm md-2 -ntmpi 2 -ntomp 1 -maxh 0.1 -resethway -noconfout

This will run for 6 minutes (-maxh 0.1), resetting the GROMACS timers after 3 minutes (-resethway). It won’t write out a final GRO file (-noconfout) as this can affect the timings. Hopefully you’ll find that, whilst useful and fun machines, Raspberry Pis are really slow at running GROMACS!

To submit the job

    sbatch md-2.slurm.sh

To check the queue we can issue squeue and to cancel we can use scancel.

Ta da!
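Rather than editing three nearly identical files by hand, the job scripts for 1, 2 and 4 cores could be generated from one template. The following is only a sketch, assuming a bash shell and the same paths and flags as the md-2 example; make_job_script and write_job_scripts are hypothetical helper names of my choosing, not part of SLURM or GROMACS.

```shell
#!/bin/bash
# Hypothetical helper: print a SLURM job script for a benchmark on the
# given number of cores. Only the four core-dependent flags vary
# (--cpus-per-task, --job-name, -deffnm and -ntmpi).
make_job_script() {
    local cores=$1
    cat <<EOF
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=${cores}
#SBATCH --time=00:15:00
#SBATCH --job-name=md-${cores}
source /apps/gromacs/5.1.2/bin/GMXRC
srun gmx mdrun -deffnm md-${cores} -ntmpi ${cores} -ntomp 1 -maxh 0.1 -resethway -noconfout
EOF
}

# Hypothetical helper: write md-1.slurm.sh, md-2.slurm.sh and md-4.slurm.sh
# into the given directory (defaults to the current directory).
write_job_scripts() {
    local outdir=${1:-.}
    local cores
    for cores in 1 2 4; do
        make_job_script "$cores" > "${outdir}/md-${cores}.slurm.sh"
    done
}
```

Calling write_job_scripts in the directory holding the md-1.tpr, md-2.tpr and md-4.tpr files would produce the three scripts, each of which can then be passed to sbatch in turn.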