Flashcard Fridays – How accurate is next generation sequencing?

As with most unspecific questions, my best answer is “it depends”.* And it depends on a lot of things: the sequencing platform, the experiment design, the lab protocol, the quality of the biological sample, the mood of the lab tech, etc. I could go on and on and on and on and on.

In the last few years, I’ve seen hundreds of NGS samples from dozens of different labs all over the world, and the only thing about the quality and accuracy of next generation sequencing I’m totally sure of is that it varies. Greatly.

Although there are a few well-known problems around (like homopolymer errors, for example), if you do a literature search, you can see that there’s not an overwhelming number of articles dealing with this question. NGS companies are not overly enthusiastic about sharing accuracy information either; they mostly seem to provide actual statistics for only a few example datasets. I have a gut feeling that these “model students” don’t really represent the whole “student population” that well.

So here is a list of recent articles that provide some information about error rates, false positive rates and other accuracy-related measures. Feel free to share any other articles or information in the comment section!

Routine performance and errors of 454 HLA exon sequencing in diagnostics. Niklas et al. 2013

Nice overview of possible causes and different types of errors. They found a 0.18 overall error rate and about 31% of reads contained one or more errors.
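
A per-base error rate and a "fraction of reads with one or more errors" are two different measures, and it can help to see how they relate. Here is a minimal sketch, assuming (unrealistically) that errors are independent and equally likely at every position. The function name and the example numbers are hypothetical and are not taken from the article.

```python
def fraction_of_reads_with_errors(per_base_error_rate: float, read_length: int) -> float:
    """Expected fraction of reads that contain at least one error.

    Assumption: every base has the same error probability and errors are
    independent of each other. Real sequencing errors are neither, so treat
    the result as a rough back-of-the-envelope figure.
    """
    if not 0.0 <= per_base_error_rate <= 1.0:
        raise ValueError("per_base_error_rate must be between 0 and 1")
    if read_length < 0:
        raise ValueError("read_length must be non-negative")
    # Probability that all bases are correct is (1 - p)^L; take the complement.
    return 1.0 - (1.0 - per_base_error_rate) ** read_length


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    for p, length in [(0.001, 100), (0.001, 400), (0.01, 100)]:
        frac = fraction_of_reads_with_errors(p, length)
        print(f"p={p}, read length={length}: {frac:.1%} of reads with >=1 error")
```

Even a small per-base error rate adds up quickly over long reads, which is why both numbers are worth reporting.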

Estimation of sequencing error rates in short reads. Wang et al. 2012

The article presents a method (implemented as an R package) for estimating error rate in short reads.

PCR-Induced Transitions Are the Major Source of Error in Cleaned Ultra-Deep Pyrosequencing Data. Brodin et al. 2013

454 with very low error rates, again.

Direct Comparisons of Illumina vs. Roche 454 Sequencing Technologies on the Same Microbial Community DNA Sample. Luo et al. 2012

Illumina vs. 454

Sequence-specific error profile of Illumina sequencers. Nakamura et al. 2011

Keep in mind that this article is not that recent.

Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and Genome Analyzer systems. Minoche et al. 2011

Not-so-recent Illumina, second round.

Shining a Light on Dark Sequencing: Characterising Errors in Ion Torrent PGM Data. Bragg et al. 2013

Great article with original results and a thorough review about Ion Torrent sequencing problems.

 

* Another valid answer to this particular question would be: “What do you mean by accurate?” (Or by next generation sequencing, for that matter.)