HackDay: Data on Acid

Every year the Software Sustainability Institute (SSI) runs a brilliant meeting called the Collaborations Workshop, usually in Oxford. This is an unconference lasting two days. At first glance it doesn’t look like it would be relevant to my research, but I always learn something new, meet interesting people and start, well, collaborations. The latest edition was last week and was the fourth I’ve attended. (Disclaimer: for the last year-and-a-bit I’ve been an SSI fellow, which has been very useful – this is how I managed to train up to be a Software Carpentry instructor. Alas, my tenure has now ended.)

For the last two years the workshop has been followed by a hackday, which I’ve attended. Now I’m not a software developer; I’m a research scientist who uses million-line community-developed codes (like GROMACS and NAMD), but I do write code, often python, to analyse my simulations and to automate my workflows. A hackday, therefore, where many of the participants are research software engineers, pushes me clear out of my comfort zone. I remember last year trying to write python to access GitHub using its API and thinking “I’ve never done anything like this before and I’ve no idea what to do.” This year was no different, except I’d pitched the idea so felt responsible for the success of the project.

The name of the project, Data on Acid, was suggested by Boris Adryan and the team comprised myself, Robert Haines, Alys Brett, Joe Parker and Ian Emsley. The input was data produced by a proof-of-principle project I’ve run to test whether I can predict if individual mutations to S. aureus DHFR cause resistance to trimethoprim. The idea was then to turn the data into abstract forms, either visual or sound, so you can get an intuitive feel for it. Or it could just be aesthetic.

To cut a long story short, we did it, it is up on GitHub and we came third in the competition! In the long term I’d like to develop it further and incorporate it into my volunteer crowd-sourced project, bashthebug, which aims to predict whether bacterial mutations cause antibiotic resistance (when it is funded, that is).
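To make the sonification idea concrete, here is a toy sketch of one way data could be mapped to sound. Everything here (the function name, the linear pitch mapping, the invented scores) is my own illustration, not the actual Data on Acid code:

```python
# Toy data-sonification sketch: map numeric values onto audible pitches.
# Purely illustrative -- the real Data on Acid code is on GitHub.

def value_to_frequency(value, lo, hi, f_min=220.0, f_max=880.0):
    """Linearly map a value in [lo, hi] onto a frequency range in Hz."""
    fraction = (value - lo) / (hi - lo)
    return f_min + fraction * (f_max - f_min)

# Pretend these are per-mutation scores of some kind.
scores = [0.1, 0.4, 0.9, 0.2, 0.7]
freqs = [value_to_frequency(s, min(scores), max(scores)) for s in scores]
# Each score now has a pitch; a synth or MIDI library could play them in turn.
```

The same mapping works just as well for colour or screen position, which is all the visual variants of the idea need.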

Lectures, Clickers and Quizzes

It’s 9.40am. You are sitting in a nice warm lecture theatre. There are no windows. The lecturer is talking, their slides projected onto a big screen. You’re feeling sleepy, but this course doesn’t seem too hard – you can always learn the key concepts from the lecture notes before the exams. Then it’s May, the exams are looming and it is warm and sunny, and you drag yourself into the library, pull out the lecture notes (which seemed so clear in the lecture) and, wait, what is this nonsense? It makes no sense…

There is a trap here; what appeared to make sense in the lecture hasn’t been learnt properly, by which I mean sufficiently embedded in the brain of our student that they can remember and, hopefully, understand the central concepts. Everything in the lecture is set up for you to learn the material there and then. The risk, then, is that it almost-but-doesn’t-quite penetrate the grey matter and so all that knowledge slowly evaporates…

As you can probably tell, I believe you either understand the topic in the lecture itself, or not at all. So what can I, the lecturer, do to help?

I believe encouraging the students to think helps their recall and one way to do this is to get them to answer a question. It doesn’t have to be difficult, but I think you do have to ask it shortly after you’ve explained the concept so it is still fresh in their mind.

To do this I tried using clickers this year. These are small credit-card-sized boxes with buttons on; each student gets one and the lecturer, using special software on their laptop, puts a question on the screen for the students to answer. The results are then displayed as a graph and you can discuss which answer was correct and why.

An obvious barrier to using clickers is that you have to buy them. So in previous years I tried using a website, Socrative, and asking the students to connect to it using their smartphones. But not everyone has a smartphone, which is unfair, and having to get on the wireless network, find the website and so on makes it clunky.

I aimed to have a quiz half-way through each lecture (this is also a change of activity and should therefore help their attention after the quiz) and another at either the end or, occasionally, the beginning. Each quiz was very simple: between 6 and 10 simple statements which they had to decide were true or false. Very occasionally I would try to catch them out to illustrate a common misconception.

Two comments:

Using the clickers to do questions was very useful to assess [our] understanding as we went along.

The clickers are sick.

Some more quantitative feedback:

  • “The clickers were easy to use” (96% agreed)
  • “The quizzes helped me remember the key concepts from each lecture” (95% agreed)
  • “I’d like more lectures to use clickers for quizzes” (90% agreed)

So I’m going to view the clickers as a success.

Software Carpentry Workshop, Oxford, 13-14 January 2015

So how did the workshop go? I thought the second day went a bit better than the first, but, hey, I’m a bit biased. To get a better idea I sent the participants a questionnaire similar to the one I sent after the previous Software Carpentry workshop I organised. Nearly all the participants (95%) agreed with the statement “I enjoyed the Software Carpentry workshop”, which is great, but I guess the real aim is to help people change how they use computers to do research.

I now understand enough to try using the following tools/approaches


Asking “I now understand enough to try using the following tools/approaches” gives a more nuanced view (see the graph on the left). Everyone seemed to understand shell scripting, but we can’t take all the credit as quite a few people would have known bash before. In fact, all the different elements of the syllabus were well understood, which shows the course and materials were doing a good job.

I intend using the tools and methods listed below to help my research

How about “I intend using the tools and methods listed below to help my research”? Now we start to see some differences. Most people intend to use shell scripting and python, maybe fewer will pick up testing and git, and only about half the participants thought they would use SQL. Still, a good result.

Back in October 2012 the first Software Carpentry workshop I organised here in Oxford was hugely popular. We had to turn people away. I wondered if the demand might have reduced in the intervening time as more and more workshops have been run. But 95% of people thought “more workshops like this should be run in Oxford”. So we are some way off saturating the demand.

From some of the comments at the end of day 1 I was a bit concerned about the speed at which we were moving through the material, so I asked whether “the instructors went too fast”. 24% agreed, 52% disagreed and the rest were indifferent. I read that as meaning the speed was about right: any faster and we would have lost more people, any slower and it would have become too boring for the more advanced participants. It was pleasing to see that everyone agreed with the statement “I feel I learnt something useful from the workshop that will help my research”!

Thanks to Kwasi Kwakwa, who volunteered to be the second instructor at short notice. A personal lesson for me is that instructing is exhausting and it would be very difficult (and your teaching would suffer) to do a workshop on your own. Also thanks to the helpers: Michael Morgan and Thomas Smith from CGAT and Jane Charlesworth from the MMM. Finally thanks to the SSI, who not only helped with the admin but have also supported Jane and myself through their fellowship programme this past year.







Software Carpentry Workshop in Oxford, Day 1

Today I’ve been instructing on a Software Carpentry workshop at the Wellcome Trust Centre for Human Genetics in Oxford; it’s the first time I’ve been lead instructor on a bootcamp. Today Kwasi Kwakwa and I covered the shell and basic python; more python, then git and SQL tomorrow. So what went well? I was very pleased to find we had no installation issues, even though everyone had brought their own laptop and so we had a mixture of Macs, Windows machines and the odd Linux machine! I had four USB sticks with the Anaconda and other installers and we didn’t use a single one, so the standard installation instructions must be working.

As is customary, just before they left we asked everyone to write on their post-it notes one good point and one thing that could be improved. Pleasing to see a good collection of positive comments:

Really enjoyed working through the ipython notebooks and being able to see and change the code and add notes in a visually pleasing way.

Well paced and explained from the bottom up, enjoyed it

But of course, it is the comments about things people didn’t like that are the key to making it better.

If I didn’t have some background in the subject I think it would have been too much for me

Can’t see the green brackets on the screen [in ipython]

I was completely lost in python. If you don’t have any previous background it is too much.

It will always be a challenge to cater for a wide range of backgrounds and experiences in these two day intensive courses. That is not to say that we should give up. I hope it will get better as the number of bootcamps increases. That way it will be easier to run bootcamps for the varying levels of experience.

Finally, don’t do what we did and use green and yellow post-it notes: I couldn’t tell them apart standing at the front. Still, everyone drew a sad face or a cross on the yellow ones, which was fun. Also, swap instructors more often than you might think: over an hour is too long. Oh, and bring a whiteboard pen!

A simple tutorial on analysing membrane protein simulations

I’m teaching a short tutorial on how to analyse membrane protein simulations next week at the University of Bristol, as part of a series arranged by CCPBioSim. As it is only 90 minutes long, it covers just two simple tasks, but I show how you can do both with MDAnalysis (a python module) or in Tcl in VMD. Rather than write something and just distribute it to the people who are coming to the course, I’ve put the whole tutorial, including trajectory files and all the example code, here on GitHub. Please feel free to clone it, make changes and send a pull request (or just send me any comments).
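To give a flavour of the sort of analysis involved, here is the bare idea behind one classic membrane measurement, bilayer thickness, in plain Python with made-up coordinates. This sketch is my own illustration, not taken from the tutorial; in practice you would pull the coordinates out of a real trajectory with MDAnalysis or VMD.

```python
# Illustrative sketch: estimate bilayer thickness from the z-coordinates of
# the lipid phosphate atoms. The coordinates below are invented; real ones
# would come from a simulation trajectory.

def bilayer_thickness(phosphate_z):
    """Thickness as the gap between the mean z of the two leaflets."""
    midplane = sum(phosphate_z) / len(phosphate_z)
    upper = [z for z in phosphate_z if z >= midplane]
    lower = [z for z in phosphate_z if z < midplane]
    return sum(upper) / len(upper) - sum(lower) / len(lower)

# Fake phosphate z-coordinates (in Angstroms) for the two leaflets.
z_coords = [19.8, 20.1, 20.3, -19.9, -20.2, -20.0]
thickness = bilayer_thickness(z_coords)  # roughly 40 Angstroms here
```

Repeating the calculation for every frame of a trajectory then gives you thickness as a function of time, which is the pattern most of these analyses follow.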

Trying to stop lectures from being so zzzz…Part 2

Last time, I wrote about the tactics I was planning on trying out in my lecture series this year. Well, the lectures are done, I’ve collected some feedback and so here are the results.

  1. Live polling. Around three-quarters of the students had a smartphone or tablet and knew how to connect it to the WiFi, so there were enough for “think, pair, share” questions. I ran the quizzes using Socrative, typically half-way through the lecture, and the questions were simple; the idea was just to reinforce what I had just gone through, not to challenge them. Overall, 75% of the students agreed that “the quizzes helped me remember the key concepts from each lecture” and only 8% agreed that “the electronic polling is unfair as not everyone has a smartphone”. So, I’d say it went pretty well. There was the odd random problem running a quiz, which is inevitable given the technology involved. Also, the lecture theatres I was in didn’t have dual projection, which would have made it a bit easier. So quizzes are definitely a good idea but I’m not sure how much the technology adds. If I can get my hands on some clickers, I will use these in preference next year, but using smartphones is certainly now possible.
  2. Online reading lists. I explained about the reading list on Mendeley at the start of the first lecture and said we’d have a competition, with a prize for whoever improved the Public Group the most. By the penultimate lecture… no-one had done anything, so I think they were a bit nonplussed. Over half (54%) of the students were not sure whether “it was useful having references in Mendeley”. I still think it is worth doing as, whilst it might not be of immediate use, I believe it is helpful to introduce them early on (this is a first-year course) to both references and reference managers, as by the time they reach the fourth year they will have to be reading the primary literature.
  3. Videos. I showed them Linus Pauling explaining how he discovered the geometry of an alpha-helix and I started and ended the course with a very nice video showing a small protein folding, courtesy of Folding@home. I also introduced them to FoldIt, which is a great game where you get to fold proteins. All of this went down very well and 85% of the students disagreed when I said that “the videos and other material were a waste of time”. This does rely on the lecture theatre having speakers you can plug your laptop into, mind you. Will definitely do this next year.
  4. Stretches. I can still remember how sleepy I felt in many of my lectures after about 40 min. So I got my students to stand up and have a stretch about half-way through, often just before a quiz. This is a no-brainer: 90% of the students agreed that “the half-way break and stretch helped my concentration”. It turns out there is a lot of material about breaks in lectures on the internet. I’m not sure if I will follow the advice of one student who suggested that “when doing the stretch in the middle of the lecture [you should] lead a routine.”

Trying to stop lectures from being so zzzz…

Why are lectures so sleep inducing? I remember well the effort required to keep your eyelids apart after 35-40 minutes. So, now that I am the lecturer, how can I keep my students at least awake, and hopefully interested? I am not going to talk about the most obvious point, which is to be an enthusiastic and engaging lecturer. Instead, I shall briefly list the tactics I am trying this year in my lecture series, which started this week.

  1. Live polling. Each year I ask the students how many have a device (smartphone, tablet, laptop) that could connect to a website in the lecture theatre. The proportion has been growing steadily and is now around 75-80%. It will never reach 100%, but so long as there are enough that everyone is at least sitting by someone who has a device, I think that is good enough. Socrative is a good, clean website that allows one to create and run quizzes, and since last year they have released a newer version in beta. So, I’m going to try running a simple quiz made up of 4-6 true/false questions halfway through each lecture. I could, of course, use clickers, but we only have a few sets and using the students’ own phones could be easier.
  2. Online reading lists. Who ever looked up, let alone read, the papers you usually find on lecture handouts? Part of this, I feel, is the difficulty of finding the paper on the web, especially when you’ve never done this before. So I have made a Mendeley Public Group which contains all the references from my slides. Here one can click on a link and, voila, be taken straight to the paper (subject to paywalls / VPNs etc). One can also add new references, or (in the app anyway) make notes on each paper that everyone else can see. To encourage some of the students to take this up I’m running a competition to see who can improve the Public Group for this lecture series the most.
  3. Videos. So far we’ve looked at a simulation of a protein folding from Folding@home. In a coming lecture I’ve got Linus Pauling discussing how he came up with the idea for alpha helices.
  4. Stretches. These also help break up the lecture. The idea is that just getting the students to stand up, have a stretch and sit down about halfway through will help them maintain their concentration for the full 50 minutes.

I’ll write another post after the course is finished documenting what worked (and perhaps what didn’t).

Software Carpentry Feedback


As well as asking the attendees how they thought the workshop had gone, I sent them a questionnaire before the workshop. The idea was to see what their expectations were and whether the workshop then met them. For example, we asked “How would you describe your expertise in the following tools?” and the results are on the right. Overall most people didn’t feel they knew much about the tools we had identified as being potentially most useful. We also asked “What would you like the workshop to cover?” and the answers indicated these tools were relevant (results not shown).


So, how did the workshop do? Well, 92% of the attendees agreed or strongly agreed with the statement “I enjoyed the Software Carpentry Workshop” and 96% “[felt they] learnt something useful from the workshop that will help my research”. Everyone who had come from an experimental lab thought that “other members of my lab would benefit from a workshop like this”. A good start, but did it improve their understanding? We also asked “I understand enough to try using the following tools” and most people agreed (see left)! Promising, but maybe it was the sugar from the donuts kicking in.

To try and resolve things we then asked “I intend using the tools to help my research” and lo, some of those agrees, unsurprisingly, sneak to the left and join the disagrees (see the graph on the right). I’m happy, and seeing as 92% agreed with “A workshop like this should be run annually in Biochemistry”, maybe I’ll be running another one.
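As an aside, tallying this kind of agree/disagree feedback takes only a few lines of Python. The responses below are invented purely for illustration, not real survey data:

```python
from collections import Counter

# Invented responses to one Likert-style statement (not real survey data).
responses = ["strongly agree", "agree", "agree", "neutral",
             "agree", "disagree", "strongly agree", "agree"]

counts = Counter(responses)
agreed = counts["strongly agree"] + counts["agree"]
percent_agreed = 100.0 * agreed / len(responses)
print(f"{percent_agreed:.0f}% agreed")  # prints "75% agreed" for this data
```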

A few comments:

“The course was very informative and useful for my research! Thanks”

“I now see the value of a more ‘scientific’ approach to programming in science, in terms of version tracking, reproducibility and validity. I try to be thorough in my approach to my research and that should extend to my programming. This workshop has been an excellent first step in that direction.”

“Excellent course, thanks for letting me take part.”