GROMACS in DOCKER: First Steps

DOCKER is cool. But what is it? From the DOCKER webpage:

Docker containers wrap up a piece of software in a complete filesystem that contains everything it needs to run: code, runtime, system tools, system libraries – anything you can install on a server. This guarantees that it will always run the same, regardless of the environment it is running in.

I like to think of it as somewhere in between virtualenv and a virtual machine. Although the DOCKER website is focussed on commercial software development, and so talks about building and shipping applications, DOCKER could be of huge use to me as a computational scientist. For example, rather than make a series of input files for my simulations available, along with a list of which software versions I used, I could instead simply make a DOCKER image available that contains all the compiled software I used along with all the input files. Then anyone should, in principle, be able to reproduce my research.

Make no mistake: reproducibility is, rightly, a coming trend. But surely all scientific results are reproduced? It turns out that if the experiment or simulation was difficult to do, the answer is: not so much. And when concerted efforts have been made to reproduce results reported in high impact journals, the answer is often, well, disconcerting at the very least. In a now famous study, Begley & Ellis from a pharmaceutical company, Amgen, reported that their in-house scientists were unable to reproduce 47 out of 53 landmark experimental studies in haematology and oncology. They were looking at novel, exciting findings, which are more likely to be challenging to reproduce (although the pressure to over-sell is also stronger). I have no reason to think computational studies are much better. In the past few years there has been a flurry of papers, comments and best practices. One can even now make a DOCKER image available via GitHub with a DOI so it can be cited independently of an article.

As I’d like to do this in the future, I’ve started to play with DOCKER and GROMACS. Since my workstation is a Mac, the DOCKER host has to run within a lightweight Linux virtual machine. First I installed DOCKER. Then I opened a DOCKER Quick Terminal and checked everything was working by downloading the hello-world image and running it:

$ docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world

4276590986f6: Pull complete 
a3ed95caeb02: Pull complete 
Digest: sha256:4f32210e234b4ad5cac92efacc0a3d602b02476c754f13d517e1ada048e5a8ba
Status: Downloaded newer image for hello-world:latest

Hello from Docker.
This message shows that your installation appears to be working correctly.

Let’s try something more real, like an Ubuntu 16.04 Server image.

$ docker run -it ubuntu bash

This drops me inside the Ubuntu image. Let’s compile GROMACS!

root@4b511a41dbf0:/# apt-get update -y
root@4b511a41dbf0:/# apt-get upgrade -y
root@4b511a41dbf0:/# apt-get install build-essential cmake wget openssh-server -y
root@4b511a41dbf0:/# wget
root@4b511a41dbf0:/# tar zxvf gromacs-5.1.2.tar.gz 
root@4b511a41dbf0:/# cd gromacs-5.1.2
root@4b511a41dbf0:/# mkdir build
root@4b511a41dbf0:/# cd build
root@4b511a41dbf0:/# cmake .. -DGMX_BUILD_OWN_FFTW=ON
root@4b511a41dbf0:/# make 
root@4b511a41dbf0:/# make install
root@4b511a41dbf0:/# cd
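
The interactive steps above could also be collected into a single function, so the build can be repeated without retyping. This is only a minimal sketch: the function name `build_gromacs` and the `GROMACS_URL` variable are my own placeholders (the download location is not given above, so the caller has to supply it), and it assumes the tarball is named like `gromacs-5.1.2.tar.gz` and unpacks into a directory of the same name.

```shell
# Sketch: the interactive build steps as one function, meant to be run as root
# inside the container. GROMACS_URL is a placeholder the caller must set.
build_gromacs() (
    set -eu   # stop at the first failing step, and on unset variables

    if [ -z "${GROMACS_URL:-}" ]; then
        echo "set GROMACS_URL to the location of the GROMACS source tarball" >&2
        exit 1   # exits only this subshell, so the function returns 1
    fi

    tarball=${GROMACS_URL##*/}    # file name part of the URL (assumed *.tar.gz)
    srcdir=${tarball%.tar.gz}     # assumed name of the unpacked directory

    apt-get update -y
    apt-get upgrade -y
    apt-get install -y build-essential cmake wget openssh-server
    wget "$GROMACS_URL"
    tar zxf "$tarball"
    mkdir -p "$srcdir/build"
    cd "$srcdir/build"
    cmake .. -DGMX_BUILD_OWN_FFTW=ON
    make
    make install
)
```

Because the body runs in a subshell, the `cd` and `set -eu` do not leak into the calling shell; calling `build_gromacs` without `GROMACS_URL` set prints a message and returns 1.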

Now let’s copy over a TPR file to see how fast GROMACS is within a DOCKER container:

root@4b511a41dbf0:/# scp fowler@somewhere.else:benchmark.tpr .
root@4b511a41dbf0:/# source /usr/local/gromacs/bin/GMXRC
root@4b511a41dbf0:/# gmx mdrun -s benchmark -resethway -noconfout -maxh 0.1

Note that this is a single-CPU DOCKER image. I was worried that, since the DOCKER host was running inside a Linux VM, it would be slow compared to running natively in Mac OS X, so I ran three repeats of each and DOCKER was only 1.7% slower…
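
For anyone wanting to repeat this kind of comparison, here is a rough sketch of the arithmetic. The helper names `mean_ns_per_day` and `percent_slower` are hypothetical, and the first one assumes each md.log contains a line starting with `Performance:` whose first number is the throughput in ns/day.

```shell
# Mean of the ns/day column over the "Performance:" lines of the given md.log files.
# Fails (exit status 1) if no such line is found.
mean_ns_per_day() {
    awk '/^Performance:/ { sum += $2; n++ }
         END { if (n == 0) exit 1; printf "%.3f\n", sum / n }' "$@"
}

# Percentage by which the second throughput (e.g. DOCKER) is lower than the
# first (e.g. native), to one decimal place.
percent_slower() {
    awk -v native="$1" -v docker="$2" \
        'BEGIN { printf "%.1f\n", (native - docker) / native * 100 }'
}
```

With made-up numbers, `percent_slower 20 15` prints `25.0`. In practice one would pass the two means, each computed by calling `mean_ns_per_day` on the three log files from the native and the DOCKER runs.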

To save this DOCKER image locally, quit the session and commit the container:

$ docker commit -m "Installed GROMACS 5.1.2 for benchmarking" -a "Philip W Fowler" 4b511a41dbf0 philipwfowler/gromacs-5.1.2
$ docker images
REPOSITORY                    TAG                 IMAGE ID            CREATED             SIZE
philipwfowler/gromacs-5.1.2   latest              73e44c120bfa        6 seconds ago       809 MB
ubuntu                        latest              c5f1cf30c96b        2 weeks ago         120.8 MB
hello-world                   latest              94df4f0ce8a4        3 weeks ago         967 B
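
Once the image is saved, it can be reused without going through an interactive session. The sketch below is one possible way to do that and not something I ran for the benchmark above: `run_in_gromacs` is a hypothetical helper, and `/data` is an arbitrary mount point. It mounts the current directory into the container with `-v`, so input files such as benchmark.tpr do not need to be copied in with scp, and `--rm` removes the container when the command finishes.

```shell
# Hypothetical helper: run a command inside the committed image, with the
# current directory mounted at /data and used as the working directory.
run_in_gromacs() {
    docker run --rm \
        -v "$PWD":/data -w /data \
        philipwfowler/gromacs-5.1.2 \
        bash -c 'source /usr/local/gromacs/bin/GMXRC && "$@"' gromacs "$@"
}
```

For example, `run_in_gromacs gmx mdrun -s benchmark -resethway -noconfout -maxh 0.1` would repeat the benchmark from a directory containing benchmark.tpr. When the DOCKER host runs inside a Linux VM, the directory may need to be one that is shared with that VM.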

Done. More soon on multiple cores, can-we-use-the-GPU? and using DOCKER on Amazon Web Services.

One comment

  1. This is a great article! I was able to get a rather tricky to compile gromacs based tool (trj_cavity) up and running on a docker image so others in my lab don’t have to deal with compiling it themselves. This is extremely helpful.
    Also, it seems that docker desktop automatically provides running images with a default pool of threads and gromacs seems to be able to take advantage of this easily by just using the ‘-nt’ flag at runtime (via openMP threads).
    I am looking forward to seeing how to use GPU whenever you get to it.
