Analysing Simulation Data CECAM Workshop, Jülich, 14-15 October 2015

IMG_0987 (1)

This two day workshop on Analysing Simulation Data was part of the larger CECAM Macromolecular Simulation Software Workshop at the Forschnungzentrum, Jülich that I co-organised. It was the second workshop and immediately followed an introductory Software Carpentry workshop.

Prior to a few years ago I analysed all my simulation data using either VMD, often by writing Tcl scripts, or using a GROMACS g_tools, if one was available. Then I started using MDAnalysis, a python module. This enabled me to do two things: first MDAnalysis has its own analysis routines and therefore you could often do all the analysis you needed to in a simple python script. More powerfully, since it can read many different simulation formats, it can also act as a gatekeeper to the huge range of powerful python modules. The net result is I have been able to analyse my data in ways that previously would not have been possible (i.e. I would have had to write C code.)

For example we presented a paper (open access) at a Faraday Discussion meeting where we used the image-processing tools in scikt-image to analyse whether the presence of a small cell-signalling protein retarded the rate at which a three-component lipid bilayer phase separated. I posted some example code on GitHub. In other work, not all published, I have used scipy and numpy to, for example, calculate the power spectra of fluctuations in lipid bilayers (using fast fourier transforms).

Aim

The workshop brought together researchers, especially PhD students and postdoctoral researchers, and academic software developers.

The hope was that the researchers would come out of it feeling not only more confident about developing their own software and maybe even start contributing to an academic open source project but also that they could use the python ecosystem to analyse their data in new and interesting ways.

For the developers the hope was they would get to talk to a range of current and prospective users and gain a better understanding of how people are using their code (and maybe pick up some contributors along the way)

IMG_0970

Structure

I felt that a traditional didactic approach wouldn’t work; so no sessions of talks + questions. In the end I stole shamelessly from the excellent series of Collaborations Workshops run by the Software Sustainability Institute in the UK. The workshop worked towards and cumulated in a HackDay. I now believe HackDays are great ways of not only teaching but also as a way of building teams — I am writing a post of this for the SSI and will link to it here when it is up.

IMG_0988 (1)I invited developers from two biomolecular python projects: MDAnalysis and pmx. Given more budget I would have loved to invite other developers, e.g. from mdtraj. On the first day each project gave a short talk followed by around two hours of guided tutorial. Then at the end of day one, I invited participants to present analysis problems drawn from their own research that they would like to solve. Teams were allowed to form around six ideas. On day two these teams had around six hours to solve their problem, before presenting their solution to the rest of the workshop. The winning project, MDARTINI, aimed to make MDAnalysis more aware of the coarse-grained forcefield, MARTINI.

Feedback

Overall,

  • 94% of participants enjoyed the workshop,
  • 100% learnt something useful that will help their research
  • 100% would recommend a workshop like this one to other researchers.
  • 88% feel confident enough to contribute to an academic open source project.
graph-understand-enough-try

“I now understand enough to try using the following tools”

I then asked “I now understand enough to try using the following tools”. Given most participants had heard of MDAnalysis, but only a few had used it – and very few had heard of pmx – this is an encouraging shift. This was then followed up by: “I intend using the tools and methods to help my research”. Usually, the answers are a bit more pessimistic as people might understand a tool, but not have any intention of using it. Here, though it goes the other way.

graph-intend-using

“I intend using the tools and methods to help my research.”

To try and understand which parts of the workshop went well: “I enjoyed the following components of the workshop”. So, talks were ok, then the HackDay but the tutorials and meeting other researchers were most highly rated.

graph-enjoyed-components

“I enjoyed the following components of the workshop.”

Finally, to find out if there were any practical problems I asked “The following elements contributed to making the workshop a success”.

graph-elements-contributed-success

“The following elements contributed to making the workshop a success.”

The big problem here was the network; we had better connectivity in the small hotel in Jülich. It turned out there was a problem with the wireless router in the room and this was fixed a few days after this workshop. Nor did many people like the location in Jülich, however the various coffee breaks – which we were grateful to the Software Sustainability Institute for sponsoring – and the general social atmosphere were appreciated.

 

 

Lessons for next time

This type of workshop is very complicated and plenty can go wrong. Always have a Plan B. For example, assume that not everyone will be able to install all the necessary software on their laptops so come prepared with a (linux) virtual machine image that will work in all the tutorials. And don’t assume that the network will “just work”.

Leave a Comment

Your email address will not be published. Required fields are marked *