Bacterial Genomics Tutorial
This is an introductory tutorial for learning computational genomics, mostly on the Linux command line. It is part of the Genome Science course at Massey University, currently taught by Olin Silander.
The tutorial outlines how to analyse real-world next-generation sequencing (NGS) data from a research lab. The final aim is to identify genome variations in evolved lines of wild bacteria that can explain the observed biological phenotypes.
Some of the structure of this tutorial was directly inspired by Sebastian Schmeier, who taught Genome Science from 2016 to 2020.
Contents
- 1. Introduction
- 2. The command line interface
- 3. Tool installation
- 4. Quality control
  - 4.1. Preface
  - 4.2. Overview
  - 4.3. Learning outcomes
  - 4.4. Reminder: the experimental setup
  - 4.5. Structuring your directories
  - 4.6. The short-read Illumina data
  - 4.7. The fastq file format
  - 4.8. The short-read QC process
  - 4.9. Visualising the results of the short-read QC process
  - 4.10. The long-read Oxford Nanopore data
- 5. Genome assembly
- 6. Snakemake - automation and reproducibility
- 7. Read mapping
- 8. Taxonomic investigation
- 9. Variant calling
- 10. Genome annotation
- 11. Variants-of-interest
- 12. Orthology and Phylogeny
- 13. Quick command reference
- 14. Downloads