skills

Setting up a GROMACS cluster

Recently I’ve moved to the John Radcliffe hospital and my old lab kindly let me have some old servers that were switched off. This pushed me to learn how to set them up as a compute cluster with a scheduler for running GROMACS jobs. I’ve wanted to learn this for years, having used many clusters myself, but haven’t plucked up the courage until now.

This post is a detailed walk-through on how I chose to do this. During the process I did a lot of Googling and have written the post I would have liked to have found; long, comprehensive and a bit verbose.

Since it is long, let’s break it down into four tasks.

Fingerless gloves and a woolly hat can be useful in a cold, noisy machine room


1. Install Ubuntu on each machine

2. Set up networking, including sharing directories on the headnode via NFS

3. Using environment modules, compile GROMACS into one of the shared directories so all the machines in the cluster can run gmx

4. Install SLURM
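
To give a feel for where these four tasks end up, here is a minimal Python sketch that writes a SLURM batch script for a GROMACS run. The module name `gromacs`, the job name, the core count, the walltime and the `md` file prefix are all assumptions made for illustration; they are not taken from the cluster described here.

```python
from textwrap import dedent


def slurm_script(job_name="md-test", cpus=8, walltime="24:00:00",
                 module="gromacs", deffnm="md"):
    """Return the text of a SLURM batch script that runs gmx mdrun.

    Every default here is hypothetical; change them to suit your cluster.
    """
    return dedent(f"""\
        #!/bin/bash
        #SBATCH --job-name={job_name}
        #SBATCH --nodes=1
        #SBATCH --ntasks=1
        #SBATCH --cpus-per-task={cpus}
        #SBATCH --time={walltime}

        # Load GROMACS from the shared directory via environment modules
        module load {module}

        # Run the simulation; expects {deffnm}.tpr in the submission directory
        gmx mdrun -deffnm {deffnm} -nt {cpus}
        """)


if __name__ == "__main__":
    # Print the script; redirect it to a file to submit it
    print(slurm_script())
```

Saving the output to a file (say `run.sh`, another made-up name) and submitting it from the headnode with `sbatch run.sh` would queue the job, provided the module and input file exist on your system.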

HackDay: Data on Acid

Every year the Software Sustainability Institute (SSI) run a brilliant meeting called the Collaborations Workshop, usually in Oxford. This is an unconference lasting two days. At first glance it doesn’t look like it would be relevant to my research, but I always learn something new, meet interesting people and start, well, collaborations. The latest edition was last week and was the fourth I’ve attended. (Disclaimer: for the last year-and-a-bit I’ve been an SSI fellow which has been very useful – this is how I managed to train up to be a Software Carpentry Instructor. Alas my tenure has now ended).

For the last two years the workshop has been followed by a hackday which I’ve attended. Now I’m not a software developer, I’m a research scientist who uses million-line community-developed codes (like GROMACS and NAMD), but I do write code, often Python, to analyse my simulations and also to automate my workflows. A hackday therefore, where many of the participants are research software engineers, pushes me clear out of my comfort zone. I remember last year trying to write Python to access GitHub using its API and thinking “I’ve never done anything like this before and I’ve no idea what to do.” This year was no different, except I’d pitched the idea so felt responsible for the success of the project.

The name of the project, Data on Acid, was suggested by Boris Adryan and the team comprised myself, Robert Haines, Alys Brett, Joe Parker and Ian Emsley. The input was data produced by a proof-of-principle project I’ve run to test if I can predict whether individual mutations to S. aureus DHFR cause resistance to trimethoprim. The idea was to then turn it into abstract forms, either visual or sound, so you can get an intuitive feel for the data. Or it could just be aesthetic.

To cut a long story short, we did it, it is up on GitHub and we came third in the competition! In the long term I’d like to develop it further and incorporate it into my volunteer crowd-sourced project, bashthebug, that aims to predict whether bacterial mutations cause antibiotic resistance or not (when it is funded that is).

Rushing here and there: planning an itinerary for a large scientific meeting

I’ve just returned from the 58th annual meeting of the US Biophysical Society in San Francisco. With around 7,000 scientists, multiple simultaneous sessions of talks and nearly a thousand posters every day, it is a large event, but not as big as many. Even so, working out what talks and posters you might want to see is a difficult task. Of course, you might not wish to prepare a schedule as this is somewhat of a personal thing, but I find it helpful just to know how the days will ebb and flow – if today is busy, will tomorrow be a bit quieter and let me recover? If I bump into someone I want to talk to, what will I miss? What posters might be interesting on the other side of the exhibition hall? I emphasise that attending interesting talks and seeing posters are not the only things one does at these conferences – talking to people is useful and fun too – but having that side of it organised does, I find, take your mind off “the next thing”.

Although this was my fifth BPS meeting, I feel it was the first time I was adequately organised, and believe me, I’ve been trying. Before I describe what I found worked for me this time, I’ll quickly describe the options available in the order you encounter them in the run-up to the meeting itself.

Options

1. Online itinerary planner. A useful tool this. You can search by presenter, keyword etc and then add selected talks and posters to your itinerary that is saved against your user account. I usually search by both scientific labs (i.e. surname of the group leader) and keywords in the title or abstract. Worth starting several weeks in advance and coming back to as you tend to remember topics or groups you’d forgotten the first time. Better still, you can download your saved itinerary into your calendar program of choice. Unfortunately this is poorly implemented. For example, say you’ve picked out 30 posters on one day (very easy to do). The planner creates an “appointment” for every single poster session at the same time in your calendar. Then imagine you have a default alarm setting, as many people do, and finally picture the mayhem when your smart phone / laptop / tablet tries alerting you to 30 simultaneous events. The talks don’t fare much better: even if you only pick out one 15 minute talk the planner puts the whole 2 hour session into your calendar. Not helpful.

2. Online version of the program. On the first day of the conference you get a copy of the program as a soft-bound book, but this list is in chronological order so we haven’t got the book in our hands just yet. But you can see the program before you start travelling as an online book. Unfortunately this is one of those “worst-of-both-worlds” things: it is on the screen of my computer but it wants me to flick the pages? And then it makes a flicking noise to show the pages are being turned over? In other words, the UI slows you down in an effort to make you think it is a book. I would love to know if anyone actually used this in the intended way. Fortunately you could download the whole book as a PDF, although finding the correct button took a little searching.

3. BPS mobile app. An innovation this year. It let you search for sessions on your device and “check in” in a social way to show what session you were in. This boiled down to me seeing what sessions people I didn’t know were in. Not very helpful. The idea is good, but it wasn’t written from the point of view of someone trying to navigate the myriad of sessions and thousands of posters whilst heavily jetlagged. Didn’t use it much so won’t write any more.

4. Paper version of the program. The old standby. Even without the abstracts it weighs in at a hefty 298 pages – some laptops are lighter. The traditional approach to meeting planning is to leaf through the program at breakfast in your hotel drawing circles around the talks you want to go to. I wouldn’t try to go through the posters this way though. Solid and dependable but you don’t get it until you register at the convention centre so you have to be pretty speedy if you want to develop a schedule more than a day in advance.

5. Going with the flow. Perhaps the easiest method: just follow other people who have similar interests to you, or find the room where the clapping seems loudest. Not very reliable but requires little in the way of preparation. Maybe in the coming years social media, like Twitter, will allow you to gauge where to go, in real-time, as the conference progresses. Bizarrely (in my mind) most scientists are extremely conservative when it comes to social media and so, despite having a hashtag (#bps14) and an official blog, there were only 518 tweets making use of the hashtag (and exactly half of these came from three accounts). On average this is a single tweet for every 13 scientists in attendance over the five days of the conference. For social media such as Twitter to provide an evolving picture of the conference, for example to show which talks appear especially interesting, you need a rapidly updating timeline, say a tweet a minute, which works out at 600 tweets a day, or 3,000 tweets over the course of the conference. So I’d say we are still some way from any kind of social media “tipping point” that would let you go with the flow.
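
The back-of-the-envelope numbers in that last option can be checked in a few lines of Python. The attendance (around 7,000), the tweet count (518) and the length of the conference (five days) come from the text; the ten-hour conference day is an assumption, chosen because it is what makes a tweet a minute come to 600 a day.

```python
attendees = 7000        # approximate attendance quoted above
tweets = 518            # tweets that used the #bps14 hashtag
days = 5                # length of the conference
hours_per_day = 10      # assumption: length of one conference day

# How thinly the actual tweets were spread across the attendees
scientists_per_tweet = attendees / tweets

# What a steadily updating timeline (one tweet a minute) would need
target_per_day = 60 * hours_per_day
target_total = target_per_day * days

# How far short the real volume fell
shortfall = target_total / tweets

print(f"One tweet per {scientists_per_tweet:.1f} scientists")
print(f"Target: {target_per_day} tweets a day, {target_total} in total")
print(f"Actual volume was about {shortfall:.1f} times too low")
```

On these figures the hashtag would have needed roughly six times as many tweets to reach a tweet a minute.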

My recipe

Again, I stress this is what worked for me this year so probably won’t work for you, and maybe won’t even work for me another year, but hopefully will give you some ideas or get you thinking about how to organise an itinerary.

(a). Search the online itinerary planner in advance. I make a list on a piece of paper of keywords and groups that I am interested in. Search on each term and add appropriate talks and posters to your itinerary – be selective and don’t just add all the posters from one group that you admire. Put a line through each term when you’ve searched for it! Come back to your list several times as you’ll find you come up with new groups or keywords to search on. Don’t add it to your calendar! Instead, save the list, perhaps as a PDF.

(b). Search through the program. Download the program book as a PDF, don’t use the online version. Now go through the talks day-by-day and highlight any that look interesting (this is especially easy on an iPad using an App such as GoodReader). Don’t worry if you highlight a few that are in your itinerary.

(c). Add talks to your calendar manually. This takes a bit of time, and is a bit dull (and so is perfect for doing on the aeroplane). Go through the program and add talks to your calendar (iCal, Google Calendar, Outlook etc), remembering to include the room number. You can copy the text off the program PDF and paste it into your calendar to cut down on typing. Setting a default alarm can be helpful here as it can easily take 5 minutes to get from one room to another. Now go through the saved version of your online itinerary and add in any talks you’ve missed. It is also helpful to make a list of the posters you’ve picked out for each day at the same time so you can walk from one to the other in the Exhibition Hall.

(d). Look for patterns. Chances are you’ll have quite a few clashes. You’ll also find that you’ve picked out say 3 talks from a session of 8 talks in total. If so, it might be worth getting a coffee and staying for the whole session. Or you might choose to duck out and see another talk in another session and then come back. I prefer to wait until the day before deciding exactly which sessions to hit and how to deal with my clashes as what I decide will depend on where other people are going and, crucially, just how far it is between Room X and Room Y which you won’t find out until you are in the Convention Centre itself.

(e). Go with the flow in an organised way. Having put in this work to organise my schedule and put it in my calendar on my iPad, I was able to go from talk to talk without feeling a tiny bit panicked that I was missing a crucial session. Having a list of poster numbers for each day was hugely helpful too as a cluster of posters together would indicate that that section would be interesting. For me the crucial point is that all this organisation freed me up during the conference. For example I bumped into several people as I was leaving a session but because I had a good idea of what I had next and how interesting it was I was able to decide whether to talk to them, perhaps over a coffee, or whether I really should hit that talk. It also meant I recognised that Monday, for example, was going to be a really tough day but Tuesday would be easier and then Wednesday, whilst shorter, was also going to be busy. In the end I left the big book in my hotel and just carried around my iPad which was much easier. I guess the final thing I found helpful is this: if you think of something or meet someone, write it down immediately. You are busy, tired and probably also jetlagged – chances are you will not remember that idea, person or reference in a day’s time, let alone when back in the lab. Personally, I use Evernote for everything as my notes are synced across all my devices so if, for example, the battery on my iPad runs out, I just swap to the laptop or whatever I have with me.

That was my recipe this year at the Annual Meeting of the US Biophysical Society. It only took me five goes to get there (and some improvements in technology since I first went) but I feel I learnt a lot more, met a lot more people and had a lot more ideas than the previous four times I have attended. Take my recipe with a pinch of salt as it might not work for you. I am sure though that attending these large meetings can be made to be more manageable; you just have to find what works for you.

Habits of Highly Productive Academic Writers

Not my title, but the title of the half-day workshop I’ve just attended led by Helen Sword, an academic from New Zealand. It was extremely thought-provoking and has made me question how I write. Please see her webpage, Writer’s Diet, for more information and resources. I’d especially recommend the “Writer’s Diet Test” where you can paste up to 1000 words and get it rated for how flabby or fit your prose is (and no I’m not telling).

As a research scientist my main measurable output is peer-reviewed papers published in scientific journals. I’ve always found writing papers difficult and frustrating so it was a relief to find that pretty much everyone else also finds writing hard. One thing that particularly struck home was the idea that there is no right way to write; on another training course I had previously been encouraged to write every day, even if only for 30 min, but this has never felt like a good idea as I have always needed to get in the flow. One of Helen’s conclusions from interviewing a large number of successful academic writers is that there is no one single right way – you have to find what works for you. This made me realise that over the years I have been, perhaps subconsciously, experimenting with different approaches. Below I’ve described what works for me, as well as listing some ideas presented in the workshop that I found particularly appealing.

  1. Clarity and speed. I’ve found that when I’m writing up a research project if I don’t have a clear understanding of not just what I’ve done but what it means for the broader area then when I try to write it is hard, slow and my language gets very turgid and waffly. This I now take as a warning sign to go and think more and read more. I’ve found giving a talk or seminar about the work forces me to put everything in a logical order and makes me think about the weaknesses. Constructive questions from the audience are always a help too, as are questions where it is apparent I didn’t explain something well enough. Then I get the figures “camera ready” – this includes writing the legends. Finally I make a mindmap of the paper which outlines what points I want to make and, crucially, the order that I want to make them. Only then do I start writing. Here I’ve found the key thing for me is to write almost as fast as I can type. Then my language is closer to how I speak: more natural and direct, and less turgid, waffly and impersonal. Once I have a complete version I pace the lab reading it under my breath (i.e. out loud but not so anyone can hear me). I instinctively know when the language is garbled and I correct it straightaway at the computer. Sometimes I print out a version and scribble on it but this seems to take more time. Editing is always hard, but at least you have something almost ready in front of you.
  2. Frame of mind. I hadn’t really thought of this, but the quality of your writing is very likely to depend on your mood which in turn can depend on what you’ve just done and the surroundings you are in. For example I love writing on trains (in the quiet carriage!). There is something about staring out the window and then writing another couple of sentences. I have tried early in the morning as I’m usually up early but it never works too well. Instead I will try going for a coffee or doing some exercise and then start writing. It is all too easy to get sucked into “more hours spent staring at screen = more writing done”. It is a creative activity, after all.
  3. Writing as a craft. I can’t really remember being taught much in the way of style or formal grammar at school beyond verbs, nouns and adjectives etc. I think I learnt more grammar from learning foreign languages and comparing back to English as is natural to do. Anyway, I have certainly never been directly taught style, so I intend reading some books on style and prose. Let’s hope some of my stylistic instincts are not too far off the mark. The other aspect to thinking of writing as a craft is that the main way to get better is practice. So I guess this blog will help…
  4. Monitor your progress. I admit my heart sank at the thought of counting how many words I’d written each week. Then I remembered RescueTime: this is an app you can put on your PC or Mac and it monitors what other applications you have active and then it emails you every week and tells you how much time you spent on each one. Since I typically only write papers using one application (TeXShop) I can get a good estimate of how much time I’ve spent in any week on writing.
  5. Prioritise my writing. Want to write? Put it in my diary. Then let nothing else take priority.
  6. Using the Pomodoro technique to get over the hill. I find once I’m in the flow with writing I enjoy it and want to keep going, but, and you knew there was a but, it can be hard to get going. Pomodoro is a simple time management technique where you work on a specific task (writing) for 25 mins with no interruptions, no emails, no nothing. Then you take a 5 min break and do it again. I have a simple Pomodoro app for my Mac. You can’t live by Pomodoro, but I have found it helpful to get me into my flow when I’m starting writing.

How (not to) present a poster at a scientific conference

Ok, so you are presenting a poster at a scientific conference. You’ve done the research, prepared and printed the poster and pinned it to the board, the poster session is approaching and you really want some feedback on your results and ideas. How do you maximise the number of people you talk to?

1. The Guard

Stand next to your poster at attention and wait. This signals readiness and a willingness to discuss any aspects of your work. Do not make eye contact with approaching colleagues but as soon as they stop, pounce. This attentiveness is appreciated by all scientists, especially senior group leaders.

2. The Lure

Put some sweets in a paper cup pinned to the board and pretend to read your neighbour’s poster. Wait for someone to take one, then pounce. A weaker variant is to provide copies of your poster pinned to the board (tip: make sure they are FIRMLY attached – it buys you a few more seconds).

3. The Fake Crowd

Bribe your friends or colleagues to stand around your poster and talk loudly. Ask them to point and gesticulate wildly to indicate interest and controversy. This will naturally attract people.

4. The Suspiciously Quiet Poster

Not everyone will be attracted by the fake crowd. Try alternating it with the suspiciously quiet poster. Go and hide by looking at the poster pinned to the back of the board. Watch for feet appearing at your poster and, well you guessed it, pounce.

Congratulations! You are now equipped to thrive in the cut-and-thrust world of the poster session. Especially if you don’t do any of these suggestions…..

Good science

“There was some good science in that seminar.”

“Yes? Sorry I fell asleep shortly after the first slide of maths.”

Familiar? I expect every scientist occasionally gets the feeling that perhaps the person speaking, say at a conference, is saying something interesting and important but why can’t they understand it? And why show us so much maths in such a small typeface?

Why do I feel slightly uncomfortable with this exchange? Well I think my discomfort begins with what is implied by “good science”. The exchange above implies that it is an abstract output such as an idea, an experimental result or a theory.

But is that all there is to being a good scientist? Producing “good science”? I hope not. I believe being a good scientist means having a wide range of skills, such as working well with other scientists and mentoring students. This is in addition to producing good science. Crucially it includes the ability to clearly explain one’s research to anyone (or at a bare minimum a PhD student studying in your field). Or, as was put to me during my undergraduate degree by an influential professor: “If you can’t explain something to anybody, then you haven’t understood it.” Let me be clear: there are of course very difficult concepts in all fields. What I have primarily in mind is seminars in your department or talks at conferences. In both cases the audience is either in your field, learning your field or is at a remove of one or two steps from your field. I am a computational biophysicist – I should be able to explain how (and why) I am simulating the dynamics of a particular protein to any molecular biologist.

The tension I am feeling I think therefore originates from the subtle distinction between what we mean by “good science” and “good scientist”: producing the former does not automatically make you the latter.

This would perhaps be straightforward enough if it didn’t sometimes feel that this is turned on its head – the more obtuse and difficult to follow a lecturer or speaker is, the more important their work must surely be. Then producing what at least appears to be “good science” (it must be because I can’t understand it) makes you a “good scientist” and, for example, being able to explain your work, maybe even to undergraduates or, worse still, school children, is a mark against your reputation. This backwards logic is unhelpful and should be confronted when found.

But can we really separate the outcome (the science) from the researchers? I doubt it; if other scientists cannot understand our work and, let’s hope, be interested in it, then our results are less likely to spur further research.

So perhaps our respondent above should have replied:

“No, I don’t think so. I couldn’t understand a thing and this is my field. If we can’t understand it how can it be good science?”