STATISTICAL COMPARISON OF METHODS TO ESTIMATE THE ERROR PROBABILITY IN SHORT-READ ILLUMINA SEQUENCING

Irina Abnizova; Tom Skelly; Fedor Naumenko; Nava Whiteford; Clive Brown; Tony Cox

doi:10.1142/s021972001000463x

Irina Abnizova (Wellcome Sanger Institute), Tom Skelly (Wellcome Sanger Institute), Fedor Naumenko, Nava Whiteford (Oxford Nanopore Technologies, United Kingdom), Clive Brown (Oxford Nanopore Technologies, United Kingdom), Tony Cox (Wellcome Sanger Institute)

Journal of Bioinformatics and Computational Biology

March 5, 2010

Abstract

As at the beginning of the sequencing era, the new generation of short-read sequencing technologies requires both accurate data processing methods and reliable measures of that accuracy. Inspired by the classic Phred method, we generalize its findings in the area of base quality value calibration. We introduce a simple, statistically grounded way to measure the performance of a calibrator and to find an optimal way to assess its reliability. We illustrate the method by assessing the performance of several calibrators (predictors) for Illumina Genome Analyser 2 (GA2) data. The best predictor is chosen by optimizing validity, discriminative ability and discrimination power across the candidate predictors. We applied the method to data from one experimental run on the genome of the phage φX and found the best of ten candidate predictors to be 'Purity', a statistic derived from corrected cluster intensities. The source code for the comparison of the predictors is available from the authors on request.
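
To make the idea of checking a base quality calibrator concrete, the following minimal sketch uses the standard Phred relation Q = -10 log10(p) and compares predicted quality values with the error rate actually observed among base calls that received each value. It is not the authors' code, and treating "validity" as agreement between predicted and observed quality is an assumption made only for illustration. The function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def phred_to_error_prob(q):
    """Standard Phred relation: Q = -10 * log10(p), hence p = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


def error_prob_to_phred(p):
    """Inverse of the Phred relation; p must be strictly positive."""
    return -10.0 * math.log10(p)


def empirical_quality_by_bin(predicted_q, is_error):
    """Group base calls by predicted quality and measure the observed error rate.

    predicted_q: iterable of predicted quality values, one per base call.
    is_error:    iterable of booleans, True where the call disagrees with
                 a known reference (assumed available, e.g. a control genome).
    Returns {q: (n_calls, n_errors, observed_rate, observed_q)}.
    """
    counts = defaultdict(lambda: [0, 0])  # q -> [n_calls, n_errors]
    for q, err in zip(predicted_q, is_error):
        counts[q][0] += 1
        counts[q][1] += int(err)

    table = {}
    for q, (n, e) in sorted(counts.items()):
        rate = e / n
        # With no observed errors the empirical quality is unbounded.
        observed_q = error_prob_to_phred(rate) if rate > 0 else float("inf")
        table[q] = (n, e, rate, observed_q)
    return table


if __name__ == "__main__":
    # Hypothetical toy data: 10 calls predicted at Q10, 100 calls at Q20.
    predicted = [10] * 10 + [20] * 100
    errors = [True] + [False] * 9 + [True] + [False] * 99
    for q, (n, e, rate, obs_q) in empirical_quality_by_bin(predicted, errors).items():
        print(f"predicted Q{q}: {e}/{n} errors, observed Q {obs_q:.1f}")
```

A well-calibrated predictor would show observed quality close to predicted quality in every bin. Bins with few calls give noisy estimates, so a real comparison would need far more data than this toy example.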
