A bioinformatic filter for improved base-call accuracy and polymorphism detection using the Affymetrix GeneChip® whole-genome resequencing platform

Gagan A. Pandya, Michael H. Holmes (J. Craig Venter Institute), Sirisha Sunkara (J. Craig Venter Institute), Andrew B. Sparks (J. Craig Venter Institute), Yun Bai (J. Craig Venter Institute), Kathleen Verratti (J. Craig Venter Institute), Kelly Saeed (J. Craig Venter Institute), Pratap Venepally (J. Craig Venter Institute), Behnam Jarrahi (J. Craig Venter Institute), Robert Fleischmann (J. Craig Venter Institute), Scott N. Peterson (J. Craig Venter Institute)
Nucleic Acids Research
November 15, 2007
Cited by 18. Open Access.

Abstract

DNA resequencing arrays enable rapid acquisition of high-quality sequence data. This technology represents a promising platform for rapid high-resolution genotyping of microorganisms. Traditional array-based resequencing methods have relied on the use of specific PCR-amplified fragments from the query samples as hybridization targets. While this specificity in the target DNA population reduces the potential for artifacts caused by cross-hybridization, the subsampling of the query genome limits the sequence coverage that can be obtained and therefore reduces the technique's resolution as a genotyping method. We have developed and validated an Affymetrix Inc. GeneChip® array-based, whole-genome resequencing platform for Francisella tularensis, the causative agent of tularemia. We developed a set of bioinformatic filters that target systematic base-calling errors arising from two sources: cross-hybridization between the whole-genome sample and the array probes, and deletions in the sample DNA relative to the chip reference sequence. Our approach eliminated 91% of the false-positive single-nucleotide polymorphism calls identified in the SCHU S4 query sample, at the cost of 10.7% of the true positives, yielding a total base-calling accuracy of 99.992%.
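
The three figures reported above (false positives eliminated, true positives lost, and total base-calling accuracy) are ratios of call counts before and after filtering. The sketch below shows one plausible way such ratios could be computed. The function names and every count in it are hypothetical, chosen only so the arithmetic reproduces the reported percentages; they are not the study's data, and the abstract does not state the exact definitions the authors used.

```python
def filter_tradeoff(fp_before, fp_after, tp_before, tp_after):
    """Return (fraction of false positives removed, fraction of true positives lost).

    All arguments are counts of SNP calls before and after filtering.
    """
    fp_removed = (fp_before - fp_after) / fp_before
    tp_lost = (tp_before - tp_after) / tp_before
    return fp_removed, tp_lost


def base_call_accuracy(correct_calls, total_calls):
    """Fraction of all base calls that agree with the known sequence."""
    return correct_calls / total_calls


# Hypothetical counts, NOT taken from the paper.
fp_removed, tp_lost = filter_tradeoff(
    fp_before=1000, fp_after=90,    # 910 of 1000 false positives removed
    tp_before=1000, tp_after=893,   # 107 of 1000 true positives lost
)
accuracy = base_call_accuracy(correct_calls=99_992, total_calls=100_000)

print(f"false positives removed: {fp_removed:.1%}")
print(f"true positives lost:     {tp_lost:.1%}")
print(f"base-call accuracy:      {accuracy:.3%}")
```

Reporting the two fractions side by side makes the trade-off explicit: a filter that removes more false positives generally does so at the cost of discarding some genuine polymorphisms.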

