Should I use short-reads or long-reads?
Over the past year or so, several technology options have become available for microbial-scale Whole Genome Sequencing (WGS). For small-genome WGS, we can now choose between Illumina short-read sequencing and Nanopore long-read sequencing. (There are also other options that we’ll cover elsewhere.) We offer both sequencing methods for prokaryotic and eukaryotic genomes under 50 Mb in length, with a focus on bacterial genomes. So the question becomes: which method should I use for microbial-scale sequencing? Let’s compare and contrast the two methods.
|Feature||Illumina short-read||Nanopore long-read|
|Single Nucleotide Polymorphisms (SNPs)||Yes||Yes|
|Short Indels (insertions/deletions)||Yes||Yes|
|Copy Number Variations (CNVs)||Yes||Yes|
|Q-score (Phred score) > Q30 (99.9% basecall or consensus accuracy)||Yes||Yes|
|Large Structural Variants||No||Yes|
|Long Repeat Sequences||No||Yes|
|Long N50 Length||No||Yes|
|Generate a Reference Genome||No||Yes|
|Cost < 10 Multiplexed Samples||$||$$|
|Cost >= 10 Multiplexed Samples||$||$|
Table 1. A feature comparison of Illumina short-read vs. Nanopore long-read sequencing.
As shown in Table 1, short-read and long-read sequencing technologies share some common features but also differ in their sequencing capabilities. Both technologies can detect SNPs, short indels and CNVs, and both can reach Phred Q-scores > Q30 (99.9% basecall or consensus accuracy). However, long-read technology can sequence genomic regions that are difficult for short-reads to resolve, including long repeat sequences and large structural variants such as translocations, inversions and duplications.
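To make the Q30 threshold concrete, here is a minimal Python sketch of the standard Phred relationship, Q = -10 · log10(P), where P is the probability that a base call is wrong. The function names are our own illustrative choices, not part of any sequencing toolkit.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to the probability of an incorrect call."""
    return 10 ** (-q / 10)


def phred_to_accuracy(q: float) -> float:
    """Convert a Phred quality score to the expected call accuracy (0 to 1)."""
    return 1 - phred_to_error_prob(q)


def accuracy_to_phred(accuracy: float) -> float:
    """Convert an accuracy (0 to 1, exclusive of 1) back to a Phred score."""
    if not 0 <= accuracy < 1:
        raise ValueError("accuracy must be in the range [0, 1)")
    return -10 * math.log10(1 - accuracy)


if __name__ == "__main__":
    for q in (20, 30):
        print(f"Q{q}: about 1 error in {round(1 / phred_to_error_prob(q))} bases")
```

Under this definition, Q30 corresponds to roughly one error per 1,000 bases (99.9% accuracy), and Q20 to roughly one error per 100 bases (99% accuracy).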
Also, in our sequencing runs, we use paired-end short-reads of 2 x 150 bp, which yield up to 300 bp of sequence per DNA fragment. Long-read lengths are highly variable, with N50 values between 25 Kbp and 75 Kbp. Thus, a single long-read spans a considerably greater region of a sample genome than a short-read pair.
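For readers unfamiliar with N50, the sketch below shows one common way to compute it: sort the read lengths from longest to shortest and report the length at which the running total first reaches half of all sequenced bases. The function name and the read lengths in the example are made up for illustration and are not taken from our runs.

```python
def n50(lengths):
    """Return the N50 of a collection of read (or contig) lengths.

    N50 is the length L such that reads of length >= L together
    contain at least half of the total number of bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical long-read lengths in bp (illustrative values only).
    example_reads = [80_000, 60_000, 40_000, 20_000, 10_000, 5_000]
    print(n50(example_reads))
```

Because N50 is weighted by bases rather than by read count, a few very long reads can raise it substantially, which is why it is a more informative summary than the mean for highly variable long-read length distributions.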
Nanopore recently announced the availability of the new R10.4.1 flowcell and the Q20+/Kit14/duplex sequencing chemistry. This combination of flowcell and chemistry allows us to generate reference-grade microbial genomes without the need for short-read/long-read hybrid assembly. We cannot assemble fully finished reference genomes from short-reads alone, so having this option with long-reads alone is a major benefit.
The sequencing costs for Nanopore long-reads are somewhat higher than for Illumina short-reads when multiplexing a small number of samples, say, fewer than 10 samples per run. However, the sequencing costs reach approximate parity if you multiplex 10 or more samples per run. Scaling up the number of samples per run also brings cost benefits in terms of labor, overhead, flowcell and kit usage, and other factors. To determine the most economical sequencing option, the best approach is to contact us to review your research project goals and budget.