Setting up a GROMACS cluster
Philip Fowler, 28th April 2016

4. Installing SLURM

As I wrote before, this was the part that scared me most, but it turned out to be the easiest. We’ve already installed SLURM using apt-get on all machines; this also installed MUNGE, which is needed for authentication.

The first step is to set up MUNGE on all machines. We need to create a MUNGE key on the headnode and distribute it to all the other machines.

$ sudo /usr/sbin/create-munge-key

To copy it to the compute nodes, the easiest thing is to put a copy in your user home directory, as this is shared by NFS.

$ sudo cp /etc/munge/munge.key /home/fowler/

Then on each compute node we can copy the file into the right directory

/home/fowler/$ sudo cp munge.key /etc/munge/

There is a problem with ownership here: the original key is owned by the user munge, which is part of the munge group. As we’ve been moving it around with sudo, the copy is now owned by root. So we also need to change the owner and group back on each compute node.

$ sudo chown munge /etc/munge/munge.key
$ sudo chgrp munge /etc/munge/munge.key

There also appears to be a bug with Ubuntu; this is worked around by adding the following line, on all machines, to /etc/default/munge

OPTIONS="--force"

Now on each machine we can start the MUNGE service

$ sudo service munge start

and check it is running

$ ps -e | grep munge

Now we are in a position to set up SLURM.

/home/fowler$ cp /usr/share/doc/slurm-llnl/examples/slurm.conf.simple.gz .
/home/fowler$ gunzip slurm.conf.simple.gz
/home/fowler$ mv slurm.conf.simple slurm.conf
/home/fowler$ vim slurm.conf

This is a basic example SLURM configuration file which you can just alter. All I did was change the following lines from their defaults to match my cluster, which is currently

ControlMachine=bioch6054
..
NodeName=node0[1-2] Procs=16 State=UNKNOWN
PartitionName=production Nodes=node0[1-2] Default=YES MaxTime=INFINITE State=UP

This file needs to be copied to every machine, including the headnode.

/home/fowler/$ sudo cp slurm.conf /etc/slurm

Now we can start the SLURM controller on the headnode

$ sudo service slurm-llnl start

and the SLURM daemon on each compute node

$ slurmd -c

For a simple check, run the following on the headnode

$ srun -N1 hostname
node01

Or to try using two nodes

$ srun -N2 hostname
node01
node02

I’m going to assume you have a GROMACS TPR file called at-1-1.tpr that you’ve made using GROMPP. Let’s benchmark the cluster. On the headnode, create and edit a SLURM job file

/home/fowler$ sudo vim at-1-1.sh

Mine looks like

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --time=00:15:00
#SBATCH --job-name=at-1-1

source /etc/profile.d/modules.sh
module load gromacs/5.1.2

gmx mdrun -deffnm at-1-1 -ntmpi 1 -ntomp 1 -maxh 0.1 -resethway -noconfout -stepout 100 -v

This has some benchmarking-specific flags: -maxh 0.1 stops the run after about 0.1 hours, -resethway resets the performance counters halfway through the run so that start-up costs are excluded from the timing, and -noconfout skips writing the final configuration. Make sure at-1-1.sh and at-1-1.tpr are both present, then submit the job to the queue

$ sbatch at-1-1.sh
Submitted batch job 1

and then we can check the queue

$ squeue
JOBID PARTITION   NAME   USER ST  TIME NODES NODELIST(REASON)
    1 productio at-1-1 fowler  R  0:28     1 node02

Or, let’s run an MPI GROMACS job instead.

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --cpus-per-task=1
#SBATCH --time=00:15:00
#SBATCH --job-name=at-1-1

source /etc/profile.d/modules.sh
module load gromacs/5.1.2

mpirun -np 8 mdrun_mpi -deffnm at-1-1 -maxh 0.1 -resethway -noconfout -stepout 100 -v

Job done!
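If you want to benchmark more than one core count, writing each job file by hand gets tedious. Here is a minimal sketch of how the MPI job file above could be generated for a range of rank counts. The helper name make_job, the list of rank counts and the at-1-1-nN file naming are my own assumptions for illustration; the only change to the mdrun command is passing the input explicitly with -s so that each run can write its output under its own -deffnm prefix.

```shell
#!/bin/bash
# Sketch: write one SLURM job file per MPI rank count for a scaling benchmark.
# make_job is a hypothetical helper, not part of SLURM or GROMACS.

make_job() {
    # $1 = number of MPI ranks to request on a single node
    local ntasks="$1"
    cat <<EOF
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=${ntasks}
#SBATCH --cpus-per-task=1
#SBATCH --time=00:15:00
#SBATCH --job-name=at-1-1-n${ntasks}

source /etc/profile.d/modules.sh
module load gromacs/5.1.2

mpirun -np ${ntasks} mdrun_mpi -s at-1-1.tpr -deffnm at-1-1-n${ntasks} -maxh 0.1 -resethway -noconfout -stepout 100 -v
EOF
}

# Assumed rank counts; 16 matches Procs=16 in the slurm.conf above.
for n in 1 2 4 8 16; do
    make_job "$n" > "at-1-1-n${n}.sh"
    # Uncomment to submit each job once the files look right:
    # sbatch "at-1-1-n${n}.sh"
done
```

Each run then leaves its own log (at-1-1-n8.log and so on), so the performance figures reported at the end of the logs can be compared without the runs overwriting each other.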