New software: pygsi

Philip Fowler, 31st August 2018

Whenever a paper involving sequencing the genome of bacteria (or other species for that matter) is published, the researcher is obliged to deposit the reads (usually short reads) in either the European Nucleotide Archive (ENA) or the Sequence Read Archive (SRA), along with some metadata. Sounds good, but until recently there was a flaw: whilst one could deposit the short-read files, one could only search the associated metadata. Say you wanted to search the ENA for samples containing MCR-1, an important, recently identified gene that confers colistin resistance. If the gene wasn’t explicitly mentioned in the metadata (and most of the time it wouldn’t have been, as it wouldn’t have been identified yet!), you’d have had to download all the possible short-read files and then trawl through them. In other words, the ENA and SRA were archives: easy to put data into, difficult to search and interrogate.

Zam Iqbal and his group have developed a searchable index of all the bacterial and viral pathogen genetic data in the ENA/SRA as of late 2017. It is called BIGSI; there is a web interface you can try (the resemblance to an early Google is not, I suspect, a coincidence) and a preprint describing it. Doesn’t seem like much, but suddenly we can ask all sorts of interesting questions. Like: how many samples contain MCR-1?

One problem is that when we are looking for a gene we are usually looking for the reference sequence and its associated minor variants (e.g. a couple of SNP differences). With the current BIGSI interface this is hard, since you’d have to systematically give it all possible variants of your base k-mer.

Fortunately, systematically is something computers are good at, so as a hack (because ultimately I imagine something like BIGSI will become a service at the EBI and this sort of functionality will be included), I wrote a Python package, pygsi, that takes a gene, walks along the sequence and asks BIGSI how many times each minor variant occurs. Since each variant requires a web API call, it isn’t rapid, but you can work through a single gene overnight. The package, including a more detailed description and examples, can be downloaded from its GitHub repository.
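
To give a flavour of what “walking along the sequence” involves, here is a minimal sketch of the idea. It is not the pygsi code, and it does not use the real BIGSI API: the function names, the choice to query a single k-mer window around each SNP, and the `query` callable (a stand-in for whatever performs the web API call) are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterator, Tuple

BASES = "ACGT"


def single_snp_variants(sequence: str) -> Iterator[Tuple[int, str, str, str]]:
    """Yield (position, ref_base, alt_base, variant_sequence) for every
    single-base substitution of `sequence` (three per position)."""
    sequence = sequence.upper()
    for pos, ref in enumerate(sequence):
        for alt in BASES:
            if alt != ref:
                yield pos, ref, alt, sequence[:pos] + alt + sequence[pos + 1:]


def kmer_around(sequence: str, pos: int, k: int) -> str:
    """Return a window of length k that contains position `pos`,
    centred on it where the ends of the sequence allow."""
    if k > len(sequence):
        raise ValueError("k must not exceed the sequence length")
    start = min(max(pos - k // 2, 0), len(sequence) - k)
    return sequence[start:start + k]


def count_variants(sequence: str, k: int,
                   query: Callable[[str], int]) -> Dict[str, int]:
    """Ask `query` how often each single-SNP variant occurs.

    `query` is hypothetical: it takes a k-mer and returns a sample count.
    In practice it would wrap one (slow) web API call per variant.
    """
    counts = {}
    for pos, ref, alt, variant in single_snp_variants(sequence):
        label = f"{ref}{pos + 1}{alt}"  # 1-based, e.g. "A1C"
        counts[label] = query(kmer_around(variant, pos, k))
    return counts
```

A gene of length n has 3n single-SNP variants, so a gene of a thousand or so bases already means a few thousand queries, which is why a run can take most of a night.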