GROMACS in DOCKER: First Steps

Philip Fowler, 23rd May 2016

DOCKER is cool. But what is it? From the DOCKER webpage:

"Docker containers wrap up a piece of software in a complete filesystem that contains everything it needs to run: code, runtime, system tools, system libraries – anything you can install on a server. This guarantees that it will always run the same, regardless of the environment it is running in."

I like to think of it as somewhere in between virtualenv and a virtual machine.

Although the DOCKER website is focussed on commercial software development, and so talks about building and shipping applications, DOCKER could be of huge use to me as a computational scientist. For example, rather than make a series of input files for my simulations available, along with a list of which software versions I used, I could instead simply make a DOCKER image available that contains all the compiled software I used along with all the input files. Then anyone should, in principle, be able to reproduce my research.

Make no mistake: reproducibility is, rightly, a coming trend. But surely all scientific results are reproduced? It turns out that if the experiment or simulation was difficult to do, the answer is: not so much. And when concerted efforts have been made to reproduce results reported in high impact journals, the answer is often, well, disconcerting at the very least. In a now famous study, Begley & Ellis from a pharmaceutical company, Amgen, reported that their in-house scientists were unable to reproduce 47 out of 53 landmark experimental studies in haematology and oncology. They were looking at novel, exciting findings, which are more likely to be challenging to reproduce (although the pressure to over-sell is also stronger). I have no reason to think computational studies are much better.

Over the past few years there has been a flurry of papers, comments and best practices.
One can even now make a DOCKER image available via GitHub with a DOI so it can be cited independently of an article. As I’d like to do this in the future, I’ve started to play with DOCKER and GROMACS.

Since my workstation is a Mac, the DOCKER host has to run within a lightweight Linux virtual machine. First I installed DOCKER. Then I opened a DOCKER Quick Terminal and checked everything was working by downloading the hello-world image and running it:

    $ docker run hello-world
    Unable to find image 'hello-world:latest' locally
    latest: Pulling from library/hello-world
    4276590986f6: Pull complete
    a3ed95caeb02: Pull complete
    Digest: sha256:4f32210e234b4ad5cac92efacc0a3d602b02476c754f13d517e1ada048e5a8ba
    Status: Downloaded newer image for hello-world:latest
    Hello from Docker.
    This message shows that your installation appears to be working correctly.

Let’s try something more real, like an Ubuntu 16.04 Server image.

    $ docker run -it ubuntu bash

This drops me inside the Ubuntu image. Let’s compile GROMACS!

    root@4b511a41dbf0:/# apt-get update -y
    root@4b511a41dbf0:/# apt-get upgrade -y
    root@4b511a41dbf0:/# apt-get install build-essential cmake wget openssh-server -y
    root@4b511a41dbf0:/# wget ftp://ftp.gromacs.org/pub/gromacs/gromacs-5.1.2.tar.gz
    root@4b511a41dbf0:/# tar zxvf gromacs-5.1.2.tar.gz
    root@4b511a41dbf0:/# cd gromacs-5.1.2
    root@4b511a41dbf0:/# mkdir build
    root@4b511a41dbf0:/# cd build
    root@4b511a41dbf0:/# cmake .. -DGMX_BUILD_OWN_FFTW=ON
    root@4b511a41dbf0:/# make
    root@4b511a41dbf0:/# make install
    root@4b511a41dbf0:/# cd

Now let’s copy over a TPR file to see how fast GROMACS is within a DOCKER container:

    root@4b511a41dbf0:/# scp fowler@somewhere.else:benchmark.tpr .
    root@4b511a41dbf0:/# source /usr/local/gromacs/bin/GMXRC
    root@4b511a41dbf0:/# gmx mdrun -s benchmark -resethway -noconfout -maxh 0.1

Note that this is a single CPU DOCKER image.
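The interactive build above could also be captured in a Dockerfile, so the image can be rebuilt from a text recipe rather than from a hand-typed session. The following is only a minimal sketch under stated assumptions: the function name `write_gromacs_dockerfile`, the output directory and the `ubuntu:16.04` base tag are illustrative choices that do not appear in the post, and the download URL is simply the one used above, which may have moved since.

```shell
#!/bin/sh
# Sketch: write a Dockerfile that repeats the interactive build steps.
# Assumptions (not from the original post): the ubuntu:16.04 base tag,
# the function name and the output directory are all hypothetical.

# write_gromacs_dockerfile VERSION OUTDIR
#   VERSION  GROMACS release to build, e.g. 5.1.2
#   OUTDIR   directory that will receive the Dockerfile
write_gromacs_dockerfile() {
    version="$1"
    outdir="$2"
    mkdir -p "$outdir"
    # Unquoted heredoc: ${version} is expanded, and each "\\" is written
    # out as a single backslash (a Dockerfile line continuation).
    cat > "$outdir/Dockerfile" <<EOF
FROM ubuntu:16.04
RUN apt-get update -y && \\
    apt-get install -y build-essential cmake wget
RUN wget ftp://ftp.gromacs.org/pub/gromacs/gromacs-${version}.tar.gz && \\
    tar zxf gromacs-${version}.tar.gz && \\
    mkdir gromacs-${version}/build && \\
    cd gromacs-${version}/build && \\
    cmake .. -DGMX_BUILD_OWN_FFTW=ON && \\
    make && \\
    make install
EOF
}
```

With the function defined, something like `write_gromacs_dockerfile 5.1.2 ./gromacs-image` followed by `docker build -t philipwfowler/gromacs-5.1.2 ./gromacs-image` should produce an image comparable to the hand-built one, although I have not benchmarked an image built this way.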
I was worried that, since the DOCKER host was running inside a Linux VM, it would be slow compared to running natively in Mac OS X, so I ran three repeats of each and DOCKER was only 1.7% slower…

To save this DOCKER image locally, quit the session and commit the container, using the container ID shown in its prompt:

    $ docker commit -m "Installed GROMACS 5.1.2 for benchmarking" -a "Philip W Fowler" 4b511a41dbf0 philipwfowler/gromacs-5.1.2
    $ docker images
    REPOSITORY                    TAG      IMAGE ID       CREATED         SIZE
    philipwfowler/gromacs-5.1.2   latest   73e44c120bfa   6 seconds ago   809 MB
    ubuntu                        latest   c5f1cf30c96b   2 weeks ago     120.8 MB
    hello-world                   latest   94df4f0ce8a4   3 weeks ago     967 B

Done. More soon on multiple cores, can-we-use-the-GPU? and using DOCKER on Amazon Web Services.
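The 1.7% figure quoted above is a relative difference between the mean performance of the native runs and the DOCKER runs. A minimal sketch of that arithmetic follows, assuming each run reports a figure where larger is faster (for example ns/day). The function names `mean` and `percent_slower` are hypothetical helpers, not part of GROMACS or DOCKER.

```shell
#!/bin/sh
# Sketch: compare two sets of benchmark repeats.
# Inputs are performance figures where larger means faster.

# mean VALUE... : arithmetic mean of the arguments
mean() {
    printf '%s\n' "$@" | awk '{ s += $1 } END { printf "%.6f", s / NR }'
}

# percent_slower NATIVE_MEAN DOCKER_MEAN
#   How much slower the second figure is than the first,
#   as a percentage of the first, to one decimal place.
percent_slower() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%.1f", (n - d) / n * 100 }'
}
```

For three repeats of each, this would be called as `percent_slower "$(mean N1 N2 N3)" "$(mean D1 D2 D3)"`, where `N1`…`N3` and `D1`…`D3` are placeholders for the native and DOCKER figures respectively.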
This is a great article! I was able to get a rather tricky-to-compile GROMACS-based tool (trj_cavity) up and running on a docker image so others in my lab don’t have to deal with compiling it themselves. This is extremely helpful. Also, it seems that Docker Desktop automatically provides running images with a default pool of threads, and GROMACS seems to be able to take advantage of this easily by just using the ‘-nt’ flag at runtime (via OpenMP threads). I am looking forward to seeing how to use the GPU whenever you get to it.