Oral Presentation Australian Society for Microbiology Annual Scientific Meeting 2017

The genomic deluge: making sense of genome sequencing data (#48)

Brian Forde 1,2,3
  1. Australian Infectious Diseases Research Centre, University of Queensland, Brisbane, QLD, Australia
  2. The School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
  3. Australian Centre for Ecogenomics, University of Queensland, Brisbane, QLD, Australia

DNA sequencing technologies have revolutionised genomic research by providing rapid, cost-effective genome-scale sequencing data. In microbiology, genome sequencing has allowed more comprehensive analysis of the structure and content of microbial genomes, provided novel insights into their evolution, and revealed astonishingly complex population and community structures. Sequencing strategies range from short-read, high-throughput methods, best suited to studying variation and population dynamics, to methods that maximise read length, which are essential for producing complete genomes and epigenomes and for resolving structurally complex regions. The low cost of genome sequencing has resulted in an unprecedented volume of sequencing data being deposited in online data archives. However, the difficulty of understanding and interpreting these data can hinder the translation from raw sequence to informative and accurate results. Here I will discuss the analysis, interpretation and applications of long-read genome sequencing data, drawing on examples from my work on multidrug-resistant Escherichia coli and Klebsiella pneumoniae. I will focus on practical aspects of genome assembly and the characterisation of structural variation, with particular regard to the "dynamic genome" and common misassembly errors. Finally, I will outline key shortcomings of long-read sequencing platforms and offer some guidance on current best practice for their application to microbiology.