New software: pygsi

Philip Fowler, 31st August 2018

Whenever a paper involving sequencing the genome of bacteria (or other species for that matter) is published, the researcher is obliged to deposit the data (usually short reads) in either the European Nucleotide Archive (ENA) or the Sequence Read Archive (SRA), along with some metadata. Sounds good, but until recently there was a flaw: whilst one could deposit the short-read files, one could only search the associated metadata. Say you wanted to search the ENA for samples containing MCR-1, an important recently identified gene that confers colistin resistance. If the gene wasn’t explicitly mentioned in the metadata (and most of the time it wouldn’t have been, as it hadn’t been identified yet!), you’d have had to download all the possible short-read files and then trawl through them. In other words, the ENA and SRA were archives: easy to put data into, difficult to search and interrogate.

Zam Iqbal and his group have developed a searchable index of all the bacterial and viral pathogen genetic data in the ENA/SRA as of late 2017. It is called BIGSI and you can try it here (the resemblance to an early Google is not, I suspect, a coincidence) and you can find the preprint here. Doesn’t seem like much, but suddenly we can ask all sorts of interesting questions. Like: how many samples contain MCR-1?

One problem is that when we are looking for a gene we are usually looking for the reference sequence and its associated minor variants (e.g. a couple of SNP differences). With the current BIGSI interface this is hard, since you’d have to systematically give it all possible variants of your base k-mer.
Fortunately, being systematic is something computers are good at, so as a hack (because ultimately I imagine something like BIGSI will become a service at the EBI and this sort of functionality will be included), I wrote a Python package, pygsi, that takes a gene, walks along the sequence and asks BIGSI how many times each minor variant occurs. Since each variant requires a web API call, it isn’t rapid, but you can work through a single gene overnight. The package, including a more detailed description and examples, can be downloaded from its GitHub repository.
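
To make the "walk along the sequence" idea concrete, here is a minimal sketch of how the single-base variants of a gene could be enumerated as k-mers. It is not pygsi's actual code. The function names (`snp_variant_kmers`, `count_variants`), the default k-mer length of 31 and the `query` callable are all assumptions made for illustration; in real use `query` would wrap whatever call asks BIGSI for the number of samples containing a given k-mer.

```python
from typing import Callable, Dict, Iterator, Tuple

BASES = "ACGT"


def snp_variant_kmers(sequence: str, k: int = 31) -> Iterator[Tuple[int, str, str]]:
    """Yield (position, alt_base, kmer) for every single-base substitution.

    Each k-mer is a window of the sequence containing the mutated position,
    centred on it where possible and shifted inwards at the two ends.
    The default k=31 is an assumption, not a value taken from pygsi.
    """
    sequence = sequence.upper()
    if len(sequence) < k:
        raise ValueError("sequence must be at least k bases long")
    half = k // 2
    for pos, ref in enumerate(sequence):
        # Clamp the window start so the k-mer never runs off either end.
        start = min(max(pos - half, 0), len(sequence) - k)
        window = sequence[start:start + k]
        offset = pos - start
        for alt in BASES:
            if alt == ref:
                continue  # skip the reference base itself
            yield pos, alt, window[:offset] + alt + window[offset + 1:]


def count_variants(
    sequence: str,
    query: Callable[[str], int],
    k: int = 31,
) -> Dict[Tuple[int, str], int]:
    """Ask `query` (a hypothetical stand-in for a BIGSI lookup) about each variant."""
    return {
        (pos, alt): query(kmer)
        for pos, alt, kmer in snp_variant_kmers(sequence, k)
    }


if __name__ == "__main__":
    # Dummy query that "finds" every k-mer once; replace with a real lookup.
    counts = count_variants("ACGTACGTAC", query=lambda kmer: 1, k=5)
    print(len(counts), "variants queried")
```

Every position yields three alternative bases, so a gene of n bases needs 3n lookups under this scheme, which is why one web API call per variant adds up to an overnight job.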