
GROMACS on AWS: compiling against CUDA

If you want to compile GROMACS to run on a GPU-equipped Amazon Web Services (AWS) EC2 instance, please first read these instructions on how to compile GROMACS on an AMI without CUDA. These instructions then explain how to install the CUDA toolkit and compile GROMACS against it.

The first few steps are loosely based on these instructions, except that rather than downloading the NVIDIA driver on its own, we shall download the CUDA toolkit, since this includes an NVIDIA driver. First we need to make sure the kernel development headers matching the running kernel are installed

sudo yum install kernel-devel-`uname -r`
sudo reboot

Safest to do a quick reboot here. Assuming you are in your HOME directory, move into your packages folder.

cd packages/

And download the CUDA toolkit (version 7.5 at present)

wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda_7.5.18_linux.run
sudo /bin/bash cuda_7.5.18_linux.run

The installer will ask you to accept the license and then ask you a series of questions. I answer Yes to everything except installing the CUDA samples. Now add the following to the end of your ~/.bash_profile using a text editor

export PATH="/usr/local/cuda-7.5/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-7.5/lib64:$LD_LIBRARY_PATH"
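
Before building anything, it is worth a quick sanity check that the toolkit and driver are visible; a minimal sketch, assuming the paths above (log out and back in first so ~/.bash_profile is re-read):

nvcc --version     # should report release 7.5 of the CUDA toolkit
nvidia-smi         # on a g2 instance this should list the GRID K520 GPU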

Now we can build GROMACS against the CUDA toolkit. I’m assuming you’ve already downloaded a version of GROMACS and probably installed a non-CUDA build (so you’ll already have one build directory). Let’s make another build directory. You can call it whatever you want, but some kind of consistent naming is helpful. The -j 4 flag assumes you have four cores to compile on – this will depend on the EC2 instance you have deployed. Obviously the more cores, the faster, but compiling GROMACS only takes minutes, not hours.

mkdir build-gcc48-cuda75
cd build-gcc48-cuda75
cmake .. -DGMX_BUILD_OWN_FFTW=ON -DCMAKE_INSTALL_PREFIX=/usr/local/gromacs/5.0.7-cuda/  -DGMX_GPU=ON -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda
make -j 4
sudo make install

To load all the GROMACS tools into your $PATH, run this command and you are done!

source /usr/local/gromacs/5.0.7-cuda/bin/GMXRC

If you run this mdrun binary on a GPU instance it should automatically detect the GPU and run on it, assuming your MDP file options support this. If it does, you will see lines like these zip past in the log file as GROMACS starts up

1 GPU detected:
  #0: NVIDIA GRID K520, compute cap.: 3.0, ECC:  no, stat: compatible

1 GPU auto-selected for this run.
Mapping of GPU to the 1 PP rank in this node: #0

Will do PME sum in reciprocal space for electrostatic interactions.
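
For reference, here is a minimal sketch of how such a run might be launched once GMXRC has been sourced; benchmark.tpr is a hypothetical, already-prepared run input file:

source /usr/local/gromacs/5.0.7-cuda/bin/GMXRC
# -nb gpu asks mdrun to offload the short-range non-bonded work to the GPU
gmx mdrun -deffnm benchmark -nb gpu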

Depending on the size of your system and the forcefield you are using, you should get a speedup of at least a factor of two, and realistically three, using a GPU in combination with the CPUs. For example, see these benchmarks.

GROMACS on AWS: compiling GCC

These are some quick instructions on how to build a more recent version of GCC than the one provided by the devel-tools package on the CentOS-based Amazon Linux AMI (currently GCC 4.8.3). You may, for example, wish to use a more recent version to compile GROMACS – that is my interest. If so, these instructions assume you have done all the steps up to, but not including, compiling GROMACS in this post. Compiling GCC needs several GB of disk space, so if you use the default 8 GB for an EC2 AMI it will run out of disk space; increasing this to 12 GB is sufficient.

First let’s find out what versions of GCC are available.

[ec2-user@ip-172-30-0-42 ~]$ svn ls svn://gcc.gnu.org/svn/gcc/tags | grep gcc | grep release
...
gcc_4_8_3_release/
gcc_4_8_4_release/
gcc_4_8_5_release/
gcc_4_9_0_release/
gcc_4_9_1_release/
gcc_4_9_2_release/
gcc_4_9_3_release/
gcc_5_1_0_release/
gcc_5_2_0_release/
gcc_5_3_0_release/

As you can see, when I wrote this 5.3.0 was the most recent stable version, so let’s try that one. I’m going to compile everything inside a folder called packages/, so let’s create that and then use subversion to check out version 5.3.0 (this is going to download a lot of files so will take a minute or two)

[ec2-user@ip-172-30-0-42 ~]$ mkdir ~/packages
[ec2-user@ip-172-30-0-42 ~]$ cd ~/packages
[ec2-user@ip-172-30-0-42 packages]$ svn co svn://gcc.gnu.org/svn/gcc/tags/gcc_5_3_0_release/
A    gcc_5_3_0_release/config-ml.in
A    gcc_5_3_0_release/libitm
...
A    gcc_5_3_0_release/fixincludes/fixopts.c
A    gcc_5_3_0_release/install-sh
A    gcc_5_3_0_release/ylwrap
 U   gcc_5_3_0_release
Checked out revision 232268.
[ec2-user@ip-172-30-0-42 packages]$ cd gcc_5_3_0_release/

GCC needs some prerequisites which are installed by this script.

[ec2-user@ip-172-30-0-42 gcc_5_3_0_release]$ ./contrib/download_prerequisites 
--2016-01-12 13:24:23--  ftp://gcc.gnu.org/pub/gcc/infrastructure/mpfr-2.4.2.tar.bz2
       => ‘mpfr-2.4.2.tar.bz2’
Resolving gcc.gnu.org (gcc.gnu.org)... 209.132.180.131
...
isl-0.14.tar.bz2    100%[=====================>]   1.33M   693KB/s   in 2.0s   

2016-01-12 13:24:39 (693 KB/s) - ‘isl-0.14.tar.bz2’ saved [1399896]

Go up a level, make a build directory and move there.

[ec2-user@ip-172-30-0-42 gcc_5_3_0_release]$ cd ..
[ec2-user@ip-172-30-0-42 packages]$ mkdir gcc_5_3_0_release_build/
[ec2-user@ip-172-30-0-42 packages]$ cd gcc_5_3_0_release_build/

Now we are in a position to compile GCC 5.3.0. This took about 50 min using all eight cores of a c3.2xlarge instance, so this is a good moment to go and have lunch. Note that since the instance I am compiling on has 8 virtual CPUs, I can use the -j 8 flag to tell make to use up to 8 threads during compilation, which speeds things up. If you are using a micro instance, just omit the -j 8 (but good luck, as that would take a long time).

[ec2-user@ip-172-30-0-42 gcc_5_3_0_release_build]$ ../gcc_5_3_0_release/configure && make -j 8 && sudo make install && echo "success" && date
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking target system type... x86_64-unknown-linux-gnu
checking for a BSD-compatible install... /usr/bin/install -c
checking whether ln works... yes
...

Hopefully you now have a newer version of GCC to compile binaries with. With any luck it might even give you a performance boost.
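
A quick way to check, assuming GCC’s default install prefix of /usr/local; the commented cmake line shows one way you might point a GROMACS build at the new compiler (these are standard cmake options, not GROMACS-specific ones):

/usr/local/bin/gcc --version    # should now report 5.3.0
# e.g. when configuring GROMACS:
# cmake .. -DCMAKE_C_COMPILER=/usr/local/bin/gcc -DCMAKE_CXX_COMPILER=/usr/local/bin/g++ ...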

GROMACS on AWS: Performance and Cost

So we have created an Amazon Machine Image (AMI) with GROMACS installed. In this post I will examine the sort of single-core performance you can expect and how much this is likely to cost compared to other compute options you might have.

Benchmark

To test the different types of instances you can deploy our GROMACS image on, we need a benchmark system. For this I’ve chosen a peptide MFS transporter in a simple POPC lipid bilayer solvated by water. This is very similar to the simulations found in this paper. Or, to put it another way: 78,000 atoms in a cube, most of which belong to water, some to lipids and the rest to protein. It is fully atomistic and is described using the CHARMM27 forcefield.

Computing Resources Tested

I tried to use a range of compute resources to provide a good comparison for AWS. First, and most obviously, I used the workstation on my desk, a late-2013 MacPro with 12 Intel Xeon cores. In our department we also have a small compute cluster, each node of which has 16 cores; some of these nodes also have a K20 GPU. I also have access to a much larger computing cluster run by the University. Unfortunately, since the division I am in has decided not to contribute to its running, I have to pay for any significant usage.

Rather than test all the different types of instances available on EC2, I tested an example from each of the current (m4) and older generation (m3) of non-burstable general purpose instances. I also tested an example from the latest generation of compute instances (c4) and finally the smaller instance from the GPU instances (g2).

Performance

[Figure: single-core GROMACS performance for each compute resource]

The performance, in nanoseconds per day for a single compute core, is shown in the figure above (bigger is better).

One worry about AWS EC2 is that, for a highly-optimised compute code like GROMACS, performance might suffer due to the layers of virtualisation. But, as you can see, even the current generation of general purpose instances is as fast as my MacPro workstation. The fastest machine, perhaps unsurprisingly, is the new University compute cluster. On AWS, the compute c4 class is faster than the current general purpose m4 class, which in turn is faster than the older generation general purpose m3 class. Finally, as you might expect, using a GPU boosts performance by slightly more than 3x.

Cost

[Figure: cost per core-hour for each compute resource]

I’m going to do a “real” comparison. If I buy a compute cluster and keep it in the department, I only have to pay the purchase cost and none of the running costs. So I’m assuming the workstation costs £2,500 and a single 16-core node £4,000, and that both have a five-year lifetime. Alternatively, I can use the university’s high performance computing clusters at 2p per core-hour. This is obviously unfair on the university facility as its price does include operational costs, like electricity and staff, and you can see that reflected in the difference in costs.

So is AWS EC2 more or less expensive? This hinges on whether you use it in the standard “on demand” manner or instead get access through bidding on the spot market. The latter is significantly cheaper, but you only have access whilst your bid price is above the current “spot price”, so you can’t guarantee access and your simulations have to be able to cope with restarts. Since the spot price varies with time, I took the average of two prices at different times on Wed 13 Jan 2016.

As you can see, AWS is more expensive per core-hour if you use it “on demand”, but is cheaper than the university facility if you are willing to surf the market. Really, though, we should be considering the cost efficiency, i.e. the cost per nanosecond, as this also takes into account the performance.
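
The conversion is simple arithmetic; a sketch with purely illustrative numbers (not the benchmark results):

# cost per ns = (price per core-hour) x (core-hours needed per ns)
price=0.02        # £ per core-hour, illustrative
ns_per_day=1.0    # ns/day on a single core, illustrative
echo "scale=3; $price * 24 / $ns_per_day" | bc    # £ per nanosecond on one core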

 

Cost efficiency

[Figure: cost efficiency (cost per nanosecond) for each compute resource]

When we do this an interesting picture emerges: using AWS EC2 via bidding on the spot market is cheaper than using the university facility and can be as cheap as buying your own hardware, even when you don’t have to pay the running costs on that hardware. Furthermore, as you’d expect, using a GPU decreases the cost per nanosecond and so should be a no-brainer for GROMACS.

Of course, this assumes lots of people don’t start using the EC2 market, thereby driving the spot price up…

GROMACS on AWS

In this post I’m going to show how I created an Amazon Machine Image (AMI) with GROMACS 5.0.7 installed for use in the Amazon Web Services cloud.

I’m going to assume that you have signed up for Amazon Web Services (AWS), created an Identity and Access Management (IAM) user (each AWS account can have multiple IAM users), created an SSH key pair for that user, downloaded it, given it an appropriate name and the correct permissions, and placed it in ~/.ssh. Amazon have a good tutorial that covers the above actions. One thing that confused me is that if you already have an amazon.com or amazon.co.uk account then you can use this to sign up to AWS. In other words, depending on your mood, you can order a book or 10,000 CPU hours of compute. I felt a bit nervous about setting up an account backed by my credit card – if you also feel nervous, Amazon offer a Free Tier which at present permits you to use up to 750 hours a month, as long as you only use the smallest virtual machine instance (t2.micro). If you use more than this, or use a more powerful instance, then you will be billed.
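
“The correct permissions” here means the private key must be readable only by you, otherwise ssh will refuse to use it; a sketch, using my key name from later in this post as the example:

mv ~/Downloads/PhilFowler-key-pair-euwest.pem ~/.ssh/
chmod 400 ~/.ssh/PhilFowler-key-pair-euwest.pem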

First, log in to your AWS console. This will have a strange URL like

https://123456789012.signin.aws.amazon.com/console

where 123456789012 is your AWS account number. You should get something that looks like this.

AWS Management Console

Next we need to create an EC2 (Elastic Compute Cloud) instance based on one of the standard virtual machine images and download and compile GROMACS on it. In the AWS Management Console, choosing “EC2” in the top left should bring you here

AWS EC2 dashboard

Now click the blue “Launch Instance” button.

Step 1. Choose an Amazon Machine Image (AMI).

Here we can choose one of the standard virtual machine images to compile GROMACS on. Let’s keep it simple and use the standard Amazon Linux AMI.


Step 2. Choose an Instance Type.

The important thing to remember here is that the image we create can be run on any instance type. So if we want to compile on multiple cores to speed things up we can choose an instance with, say, 8 vCPUs; or if we don’t want to be billed and are willing to wait, we can choose the t2.micro instance. Let’s choose a c4.2xlarge instance, which has 8 vCPUs. You could hit “Review and Launch” at this stage, but it is worth checking the amount of storage allocated to the instance, so hit Next: Configure Instance Details. I’m not going to fiddle with these options. Hit Next: Add Storage.


Step 4. Add storage.

What I have found here is that if you use the version of gcc installed via yum (4.8.3) then 8 GB is fine, but if you want to compile a more recent version you will need at least 12 GB.
I’m going to accept the defaults for the rest of the steps, so will click “Review and Launch” now.


Step 7. Review Instance Launch.

Check it all looks ok and hit “Launch”. This will bring up a window. Here it is crucial that you choose the name of the key pair you created and downloaded. As you need a different key pair for each IAM user in each Amazon region, it is worth naming them carefully, as you will otherwise rapidly get very confused. Also, Amazon don’t let you download a key pair again, so you have to be careful with them. You can see mine is called

PhilFowler-key-pair-euwest.pem

which contains the name of my IAM user and the name of the AWS region it will work for, here EU West (Ireland). Hit Launch.


Launch Status

This window gives you some links on how to connect to the AWS instance. Hit “View Instances”. It may take a minute or two for your instance to be created; during this time the status is given as “Initializing”. When it has finished, you can click on your new instance (you should have only one) and it will give you a whole host of information. We need the public IP address and the name of our SSH key pair so we can ssh to the instance (note that the default user is called ec2-user).


lambda 508 $ ssh -i "PhilFowler-key-pair-euwest.pem" ec2-user@54.229.73.128
The authenticity of host '54.229.73.128 (54.229.73.128)' can't be established.
ECDSA key fingerprint is SHA256:N+B3toLxLE3vRuuzLZWF44N9qb3ucUVVU/RD00W3iNo.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '54.229.73.128' (ECDSA) to the list of known hosts.

       __|  __|_  )
       _|  (     /   Amazon Linux AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-ami/2015.09-release-notes/
11 package(s) needed for security, out of 27 available
Run "sudo yum update" to apply all updates.
[ec2-user@ip-172-30-0-42 ~]$

Installing pre-requisites

Amazon Linux is based on CentOS so uses the yum package manager. You might be more familiar with apt-get if you use Ubuntu, but the principles are similar. It is worth following their recommendation and applying all the updates – this will spew out a lot of information to the terminal and ask you to confirm.

[ec2-user@ip-172-30-0-42 ~]$ sudo yum update
Loaded plugins: priorities, update-motd, upgrade-helper
Resolving Dependencies
--> Running transaction check
---> Package aws-cli.noarch 0:1.9.1-1.29.amzn1 will be updated
---> Package aws-cli.noarch 0:1.9.11-1.30.amzn1 will be an update
---> Package binutils.x86_64 0:2.23.52.0.1-30.64.amzn1 will be updated
---> Package binutils.x86_64 0:2.23.52.0.1-55.65.amzn1 will be an update
---> Package ec2-net-utils.noarch 0:0.4-1.23.amzn1 will be updated
...
sudo.x86_64 0:1.8.6p3-20.21.amzn1
vim-common.x86_64 2:7.4.944-1.35.amzn1
vim-enhanced.x86_64 2:7.4.944-1.35.amzn1
vim-filesystem.x86_64 2:7.4.944-1.35.amzn1
vim-minimal.x86_64 2:7.4.944-1.35.amzn1

Complete!

This instance is fairly basic and there is no gcc, cmake etc., but we can install them via yum

[ec2-user@ip-172-30-0-42 ~]$ sudo yum install gcc gcc-c++ openmpi-devel mpich-devel cmake svn texinfo-tex flex zip libgcc.i686 glibc-devel.i686
...
texlive-xdvi.noarch 2:svn26689.22.85-27.21.amzn1
texlive-xdvi-bin.x86_64 2:svn26509.0-27.20130427_r30134.21.amzn1
zziplib.x86_64 0:0.13.62-1.3.amzn1

Complete!

Next we need to add the OpenMPI executables to the $PATH. These settings will only persist for this session; to make them permanent, add them to your .bashrc.

export PATH=/usr/lib64/openmpi/bin:$PATH
export LD_LIBRARY_PATH=/usr/lib64/openmpi/lib
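
One way to make them permanent, for example, is to append the same two lines to ~/.bashrc (the single quotes stop $PATH being expanded now rather than at login):

echo 'export PATH=/usr/lib64/openmpi/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/lib64/openmpi/lib' >> ~/.bashrc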

Now we hit a potential problem. The version of gcc installed by yum is fairly old

[ec2-user@ip-172-30-0-42 ~]$ gcc --version
gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9)
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Having said that, 4.8.3 should be good enough for GROMACS. I’ll push ahead using this version, but in a subsequent post I also detail how to download and install gcc 5.3.0.

Compiling GROMACS

First, let’s get the GROMACS source code using wget. I’m going to compile version 5.0.7 since I’ve got benchmarks for this one, but you could equally install 5.1.X.

[ec2-user@ip-172-30-0-42 ~]$ mkdir ~/packages
[ec2-user@ip-172-30-0-42 ~]$ cd ~/packages
[ec2-user@ip-172-30-0-42 packages]$ wget ftp://ftp.gromacs.org/pub/gromacs/gromacs-5.0.7.tar.gz
[ec2-user@ip-172-30-0-42 packages]$ tar zxvf gromacs-5.0.7.tar.gz
[ec2-user@ip-172-30-0-42 packages]$ cd gromacs-5.0.7

Now let’s make a build directory, move there and then issue the cmake directive

[ec2-user@ip-172-30-0-42 gromacs-5.0.7]$ mkdir build-gcc48
[ec2-user@ip-172-30-0-42 gromacs-5.0.7]$ cd build-gcc48
[ec2-user@ip-172-30-0-42 build-gcc48]$ cmake .. -DGMX_BUILD_OWN_FFTW=ON -DCMAKE_INSTALL_PREFIX='/usr/local/gromacs/5.0.7/'

The compilation step will take a good few minutes on a single core machine, but as I’ve got 8 virtual CPUs to play with I can give make the “-j 8” flag which is going to speed things up.

[ec2-user@ip-172-30-0-42 build-gcc48]$ make -j 8
...
Building CXX object src/programs/CMakeFiles/gmx.dir/gmx.cpp.o
Building CXX object src/programs/CMakeFiles/gmx.dir/legacymodules.cpp.o
Linking CXX executable ../../bin/gmx
[100%] Built target gmx
Linking CXX executable ../../bin/template
[100%] Built target template

This took 90 seconds using all 8 cores. Now we can install the binary. Note that I told cmake to install it in /usr/local/gromacs/5.0.7, rather than just /usr/local/gromacs, so I can keep track of different versions.

[ec2-user@ip-172-30-0-51 build-gcc48]$ sudo make install
...
-- Installing: Creating symbolic link /usr/local/gromacs/5.0.7/bin/g_velacc
-- Installing: Creating symbolic link /usr/local/gromacs/5.0.7/bin/g_wham
-- Installing: Creating symbolic link /usr/local/gromacs/5.0.7/bin/g_wheel

To add this version of GROMACS to your $PATH (add this to .bashrc to avoid doing this each time)

[ec2-user@ip-172-30-0-51 build-gcc48]$ source /usr/local/gromacs/5.0.7/bin/GMXRC

Now you have all the GROMACS tools available!
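
As a quick check that the install worked, for example:

which gmx        # should point at /usr/local/gromacs/5.0.7/bin/gmx
gmx --version    # prints the version and build information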

Analysing Simulation Data CECAM Workshop, Jülich, 14-15 October 2015


This two-day workshop on Analysing Simulation Data was part of the larger CECAM Macromolecular Simulation Software Workshop at the Forschungszentrum Jülich that I co-organised. It was the second workshop in the series and immediately followed an introductory Software Carpentry workshop.

Until a few years ago I analysed all my simulation data using either VMD, often by writing Tcl scripts, or a GROMACS g_tool, if one was available. Then I started using MDAnalysis, a python module. This enabled me to do two things: first, MDAnalysis has its own analysis routines, so you can often do all the analysis you need in a simple python script. More powerfully, since it can read many different simulation formats, it can also act as a gateway to the huge range of powerful python modules. The net result is that I have been able to analyse my data in ways that previously would not have been possible (i.e. I would have had to write C code).

For example, we presented a paper (open access) at a Faraday Discussion meeting where we used the image-processing tools in scikit-image to analyse whether the presence of a small cell-signalling protein retarded the rate at which a three-component lipid bilayer phase separated. I posted some example code on GitHub. In other work, not all published, I have used scipy and numpy to, for example, calculate the power spectra of fluctuations in lipid bilayers (using fast Fourier transforms).

Aim

The workshop brought together researchers, especially PhD students and postdoctoral researchers, and academic software developers.

The hope was that the researchers would come out of it not only feeling more confident about developing their own software (and maybe even starting to contribute to an academic open source project) but also able to use the python ecosystem to analyse their data in new and interesting ways.

For the developers, the hope was that they would get to talk to a range of current and prospective users and gain a better understanding of how people are using their code (and maybe pick up some contributors along the way).


Structure

I felt that a traditional didactic approach wouldn’t work, so there were no sessions of talks plus questions. In the end I stole shamelessly from the excellent series of Collaborations Workshops run by the Software Sustainability Institute in the UK. The workshop worked towards, and culminated in, a HackDay. I now believe HackDays are a great way not only of teaching but also of building teams — I am writing a post on this for the SSI and will link to it here when it is up.

I invited developers from two biomolecular python projects: MDAnalysis and pmx. Given more budget I would have loved to invite other developers, e.g. from mdtraj. On the first day each project gave a short talk followed by around two hours of guided tutorial. Then, at the end of day one, I invited participants to present analysis problems drawn from their own research that they would like to solve. Teams were allowed to form around six ideas. On day two these teams had around six hours to solve their problem before presenting their solution to the rest of the workshop. The winning project, MDARTINI, aimed to make MDAnalysis more aware of the coarse-grained forcefield MARTINI.

Feedback

Overall,

  • 94% of participants enjoyed the workshop,
  • 100% learnt something useful that will help their research,
  • 100% would recommend a workshop like this one to other researchers, and
  • 88% feel confident enough to contribute to an academic open source project.

“I now understand enough to try using the following tools”

I then asked “I now understand enough to try using the following tools”. Given that most participants had heard of MDAnalysis but only a few had used it – and very few had heard of pmx – this is an encouraging shift. This was then followed up by “I intend using the tools and methods to help my research”. Usually the answers here are a bit more pessimistic, as people might understand a tool but not have any intention of using it. Here, though, it goes the other way.


“I intend using the tools and methods to help my research.”

To try and understand which parts of the workshop went well, I asked participants to rate the statement “I enjoyed the following components of the workshop”. The talks were rated ok, then the HackDay, but the tutorials and meeting other researchers were the most highly rated.


“I enjoyed the following components of the workshop.”

Finally, to find out if there were any practical problems I asked “The following elements contributed to making the workshop a success”.


“The following elements contributed to making the workshop a success.”

The big problem here was the network; we had better connectivity in the small hotel in Jülich. It turned out there was a problem with the wireless router in the room, and this was fixed a few days after the workshop. Not many people liked the location in Jülich either; however, the various coffee breaks – which we were grateful to the Software Sustainability Institute for sponsoring – and the general social atmosphere were appreciated.

Lessons for next time

This type of workshop is very complicated and plenty can go wrong. Always have a Plan B. For example, assume that not everyone will be able to install all the necessary software on their laptops, so come prepared with a (Linux) virtual machine image that will work in all the tutorials. And don’t assume that the network will “just work”.

Software Carpentry Workshop, Jülich, 12-13 October 2015

Last week David Dotson from ASU and I ran a two-day Software Carpentry workshop to kick off the CECAM Macromolecular Simulation Software Workshop at the Forschungszentrum Jülich. The idea was to give participants who were less well versed in python and in working collaboratively with e.g. git a crash course to bring them up to speed for the following five mini-workshops. As you can imagine, coffee and tea are essential for running an intensive bootcamp and we owe thanks to The Software Sustainability Institute for sponsoring our coffee breaks.

 
As we were a self-organised workshop, there was no centrally coordinated surveying of the participants to gauge their level of experience, so instead I sent out a questionnaire very similar to one I’d sent before the first workshop I organised back in 2012. As is often the case, the learners were more comfortable with bash and simple python, but hadn’t heard of or used testing or version control. Interestingly, compared to that previous workshop, a higher proportion of learners were experienced in bash and python. Both groups were drawn from the biomolecular simulation community, so this may reflect an increasing level of expertise.

[Figure: learners’ pre-workshop experience]

The workshop itself was the smoothest I’ve been involved in; I think it helped that both David and I have taught several now. Also, devoting three hours to each of bash and version control and then six hours to python (including coffee breaks) meant it wasn’t quite as rushed. The last workshop I taught was in January 2015, and since then the course materials have been overhauled, updated and separated from the workshop GitHub repository. The latest version of the materials seemed to work well.
It also meant I was unfamiliar with the evolution of ipython notebooks into jupyter notebooks, which David used to teach. Interestingly, although there was only one helper, Charlie Laughton, we were never overwhelmed. At each workshop I have taught or organised, the ratio of helpers to learners has decreased, which may reflect improvements in installation and the course materials.
Finally, I was live coding on my Mac laptop and the new Split View in Mac OS 10.11 worked really well.

That is what I thought: what about the learners? I had fifteen responses to the questionnaire, which is about a two-thirds response rate. All of them agreed with the statements “I enjoyed the Software Carpentry workshop” and “I feel I learnt something useful that will help my research”, but, as we know, enjoyment does not necessarily translate into learning! As before, I asked two key questions. First, “I now understand enough to try using the following tools/approaches.” As you can see, there is a big shift in attitude compared to before the workshop, with the majority of people feeling that they understood the tools covered during the workshop.

But will this translate into a change in behaviour? To try and test this I also asked “I intend using the tools listed below to help my research”. The results are pretty similar but, interestingly, people’s intentions were stronger than their understanding, i.e. there was a slightly stronger response to the intention question than the understanding question. Compared to the workshop I ran in Oxford in January 2015, the shift in behaviour was more dramatic, although the two groups were drawn from different research areas so can’t be directly compared.

CECAM Macromolecular simulation software workshop

I’m co-organiser of this slightly different CECAM workshop in October 2015 at the Forschungszentrum Jülich, Germany. Rather than following the traditional format of a 3-4 day meeting populated by talks with the odd poster session, this is an extended workshop made up of six mini-workshops. Since it is focussed on python-based tools for biomolecular simulations, of which there are an increasing number, the first mini-workshop will be a Software Carpentry bootcamp that I will be the lead instructor on (helped by David Dotson from ASU). I’m also leading the next mini-workshop, on analysing biomolecular simulation data.

Running GROMACS on an AMD GPU using OpenCL

I first used an Apple Mac when I was eight. Apart from a brief period in the 1990s when I had a PC laptop I’ve used them ever since.

Until last year I had an old MacPro with four PCI slots, so you could add a GPU-capable NVIDIA card, although you were limited by the power supply. A GPU can accelerate the molecular dynamics code I use, GROMACS, by up to 2-3 times.

Unfortunately, when Apple designed the new MacPro, they put in AMD FirePro GPUs so although it is a lovely machine, you can’t run CUDA applications.

But this morning I saw that the next release candidate of GROMACS 5.1 supported OpenCL. Although OpenCL applications are usually a bit slower than CUDA applications, this would, in theory, allow me to accelerate GROMACS on my MacPro.

So I downloaded the code, compiled it with the appropriate OpenCL flag and it just works! I benchmarked the code on an atomistic and a coarse-grained benchmark that I use. Running on a single core, using a single AMD FirePro D300 accelerated GROMACS by 2.0 and 2.5x for the atomistic and coarse-grained benchmarks, respectively.
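
For anyone wanting to try this, the build is essentially the same as a CUDA build but with the OpenCL option switched on; a minimal sketch, assuming a 5.1 release-candidate source tree (directory names are illustrative):

cd gromacs-5.1-rc1
mkdir build-opencl && cd build-opencl
cmake .. -DGMX_BUILD_OWN_FFTW=ON -DGMX_GPU=ON -DGMX_USE_OPENCL=ON
make -j 4
sudo make install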

[Figure: GROMACS 5.1 OpenCL benchmarks on the AMD FirePro D300]

Here’s looking forward to the final release of GROMACS 5.1!

New Publication: Alchembed

In much of my research I’ve looked at how proteins embedded in cell membranes behave. An important part of any simulation of a membrane protein is, obviously, putting it into a model membrane, often a square patch of several hundred lipid molecules. This is surprisingly difficult: although a slew of methods have been published, none of them can embed several proteins simultaneously into a complex (non-flat) arrangement of lipids – for example, a virus, as shown in our recent paper.

Here we introduce a new method, dubbed Alchembed, that uses an alternative approach, borrowed from free energy calculations, of “turning on” the van der Waals interactions between the protein and the rest of the system. We show how it can be used to embed five different proteins into a model vesicle on a standard workstation. If you want to try it out, there is a tutorial on GitHub; this assumes you have GROMACS set up.

You can get the paper for free from here.

Is Software a Method?

Last month I went to the Annual Meeting of the US Biophysical Society. As a Software Sustainability Institute fellow I was interested not only in my research area, but also in how my community viewed software. Were there talks and posters on how people had improved important pieces of community software? After all, there would be talks and posters on improving experimental methods. Turns out, not so much. Click here to read the full post.