GROMACS 4.6: Scaling of a very large coarse-grained system

Philip Fowler, 23rd October 2013

So if I have a particular system I want to simulate, how many processing cores can I harness to run a single GROMACS version 4.6 job? Use too few and the simulation takes a long time to finish; use too many and the cores end up waiting for communications from other cores, so the simulation is inefficient (and also takes a long time to finish). In between is a regime where the code, in this case GROMACS, scales well. Ideally, of course, you'd like linear scaling, i.e. running on 100 cores in parallel is 100x faster than running on just one.
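
To make "scales well" concrete, here is a minimal sketch of how speedup and parallel efficiency are usually calculated from wall-clock times; linear scaling corresponds to an efficiency of 1. The timings below are made-up placeholders, not our benchmark numbers.

    # Minimal sketch: speedup and parallel efficiency from benchmark timings.
    # The wall-clock times below are hypothetical, not measured values.

    def speedup(t_ref, t_n):
        """How many times faster the run on n cores is than the reference run."""
        return t_ref / t_n

    def efficiency(t_ref, t_n, n_ref, n):
        """Fraction of ideal (linear) scaling retained; 1.0 is perfect."""
        return speedup(t_ref, t_n) * n_ref / n

    timings = {64: 40.0, 128: 21.0, 256: 11.5, 512: 7.0}   # cores -> hours
    n_ref = 64

    for n, t in sorted(timings.items()):
        print(f"{n:4d} cores: speedup {speedup(timings[n_ref], t):5.1f}x, "
              f"efficiency {efficiency(timings[n_ref], t, n_ref, n):4.0%}")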

The rule of thumb for GROMACS 4.6 is that it scales well until there are fewer than ~130 atoms per core. In other words, the more atoms or beads in your system, the more computing cores you can run on before the scaling performance starts to degrade.
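
As a quick back-of-the-envelope check (assuming the ~130 particles-per-core figure applies to coarse-grained beads as well as atoms), this rule puts a rough ceiling on the useful core count for the benchmark system described below:

    # Rough ceiling on the useful core count from the ~130 particles/core
    # rule of thumb; assumes it applies to MARTINI beads as well as atoms.
    n_beads = 2_100_000        # the 54,000-lipid bilayer benchmark
    particles_per_core = 130   # approximate GROMACS 4.6 scaling limit

    max_useful_cores = n_beads // particles_per_core
    print(f"~{max_useful_cores:,} cores before scaling is expected to degrade")
    # prints: ~16,153 cores before scaling is expected to degrade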

As you might imagine, there is a hierarchy of computers we can run our simulations on: it starts at humble workstations, passes through departmental, university and regional computing clusters, and ends at national (Tier 1) and international (Tier 0) high-performance computers (HPC).

In our lab we applied for, and were given access to, a set of European Tier 0 supercomputers through PRACE. These are currently amongst the fastest and largest supercomputers in the world. We tested five supercomputers in all: CURIE (Paris, France; green lines), MareNostrum (Barcelona, Spain; black line), FERMI (Bologna, Italy; lilac), SuperMUC (Munich, Germany; blue) and HERMIT (Stuttgart, Germany; red). Each has a different architecture and inevitably some are slightly newer than others. CURIE has three different partitions, called thin, fat and hybrid: the thin nodes constitute the bulk of the system, the fat nodes have more cores per node, and the hybrid nodes combine conventional CPUs with GPUs.

We tested a coarse-grained 54,000-lipid bilayer (2.1 million MARTINI beads) on all seven architectures and the performance is shown in the graph – note that both axes are logarithmic. Some machines did better than others. FERMI, which is an IBM BlueGene/Q, appears not to be well-suited to our benchmark system, but then one doesn't expect fast per-core performance on a BlueGene as that is not how they are designed. Of the others, MareNostrum was fastest for small numbers of cores, but its performance began to suffer once more than 256 cores were used. SuperMUC and the CURIE thin nodes were the fastest conventional supercomputers, with the CURIE thin nodes performing better at large core counts. Interestingly, the CURIE hybrid GPU nodes were very fast, especially bearing in mind that the CPUs on these nodes are older and slower than those in the thin nodes. One innovation introduced in GROMACS 4.6 that I haven't discussed previously is that one can now run either with pure MPI processes or with a combination of MPI processes and OpenMP threads. We were somewhat surprised to find that, in nearly all cases, the pure MPI approach remained slightly faster than the new hybrid parallelisation.
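
To illustrate the difference (the 16-core node here is hypothetical, chosen only to keep the arithmetic simple): a node's cores can be filled either with one MPI rank per core (pure MPI) or with fewer ranks each driving several OpenMP threads, the thread count per rank being set in GROMACS 4.6 via mdrun's -ntomp option. A small sketch of the possible decompositions on one node:

    # Sketch: exact ways of filling one node's cores with MPI ranks and
    # OpenMP threads. A hypothetical 16-core node is assumed; pure MPI is
    # the special case of one thread per rank.
    cores_per_node = 16

    print(f"{'MPI ranks/node':>15}  {'OpenMP threads/rank':>20}")
    for ranks in range(1, cores_per_node + 1):
        if cores_per_node % ranks == 0:          # only exact decompositions
            threads = cores_per_node // ranks
            note = "   <- pure MPI" if threads == 1 else ""
            print(f"{ranks:>15}  {threads:>20}{note}")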

Of course, you may see very different performance using your system with GROMACS 4.6. You just have to try and see what you get! In the next post I will show some detailed results on using GROMACS on GPUs.
