REAPR: a universal tool for genome assembly evaluation

Martin Hunt (Wellcome Sanger Institute), Taisei Kikuchi (University of Miyazaki), Mandy Sanders (Wellcome Sanger Institute), Chris Newbold (Wellcome Sanger Institute), Matthew Berriman (Wellcome Sanger Institute), Thomas D. Otto (Wellcome Sanger Institute)
Genome Biology
May 27, 2013
Cited by 451
Open Access

Abstract

Methods to reliably assess the accuracy of genome sequence data are lacking. Currently completeness is only described qualitatively and mis-assemblies are overlooked. Here we present REAPR, a tool that precisely identifies errors in genome assemblies without the need for a reference sequence. We have validated REAPR on complete genomes or de novo assemblies from bacteria, malaria and Caenorhabditis elegans, and demonstrate that 86% and 82% of the human and mouse reference genomes are error-free, respectively. When applied to an ongoing genome project, REAPR provides corrected assembly statistics allowing the quantitative comparison of multiple assemblies. REAPR is available at http://www.sanger.ac.uk/resources/software/reapr/.

