Aligning Multiple Genomic Sequences With the Threaded Blockset Aligner

Mathieu Blanchette (Howard Hughes Medical Institute), W. James Kent (University of California, Santa Cruz), Cathy Riemer (Pennsylvania State University), Laura Elnitski (Pennsylvania State University), Arian F. A. Smit (Institute for Systems Biology), Krishna M. Roskin (University of California, Santa Cruz), Robert Baertsch (University of California, Santa Cruz), Kate R. Rosenbloom (University of California, Santa Cruz), Hiram Clawson (University of California, Santa Cruz), Eric D. Green (National Institutes of Health), David Haussler (Howard Hughes Medical Institute), Webb Miller (Pennsylvania State University)
Genome Research
April 1, 2004

Abstract

We define a "threaded blockset," which is a novel generalization of the classic notion of a multiple alignment. A new computer program called TBA (for "threaded blockset aligner") builds a threaded blockset under the assumption that all matching segments occur in the same order and orientation in the given sequences; inversions and duplications are not addressed. TBA is designed to be appropriate for aligning many, but by no means all, megabase-sized regions of multiple mammalian genomes. The output of TBA can be projected onto any genome chosen as a reference, thus guaranteeing that different projections present consistent predictions of which genomic positions are orthologous. This capability is illustrated using a new visualization tool to view TBA-generated alignments of vertebrate Hox clusters from both the mammalian and fish perspectives. Experimental evaluation of alignment quality, using a program that simulates evolutionary change in genomic sequences, indicates that TBA is more accurate than earlier programs. To perform the dynamic-programming alignment step, TBA runs a stand-alone program called MULTIZ, which can be used to align highly rearranged or incompletely sequenced genomes. We describe our use of MULTIZ to produce the whole-genome multiple alignments at the Santa Cruz Genome Browser.
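
The projection idea in the abstract can be illustrated with a toy sketch. This is not TBA's implementation or its file format: the `Block` layout, the function names (`project`, `ortholog`, `column_of`, `position_at`), and the example sequences are all hypothetical, chosen only to show how a single set of blocks can be viewed from any reference and still yield consistent orthology predictions. It assumes, as the abstract does, that matching segments occur in the same order and orientation, so every row is on the forward strand.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy representation: a row is (start, gapped_text) and a
# block maps a species name to its row. All rows in a block have equal length.
Row = Tuple[int, str]
Block = Dict[str, Row]


def project(blocks: List[Block], ref: str) -> List[Block]:
    """Keep the blocks that contain `ref`, ordered by their start in `ref`."""
    kept = [b for b in blocks if ref in b]
    return sorted(kept, key=lambda b: b[ref][0])


def column_of(row: Row, pos: int) -> Optional[int]:
    """Return the alignment column holding sequence position `pos`, if any."""
    start, text = row
    cur = start
    for col, ch in enumerate(text):
        if ch != "-":
            if cur == pos:
                return col
            cur += 1
    return None


def position_at(row: Row, col: int) -> Optional[int]:
    """Return the sequence position in column `col`, or None for a gap."""
    start, text = row
    if text[col] == "-":
        return None
    return start + sum(1 for ch in text[:col] if ch != "-")


def ortholog(blocks: List[Block], src: str, pos: int, dst: str) -> Optional[int]:
    """Map position `pos` of `src` to the aligned position in `dst`, if any."""
    for b in project(blocks, src):
        if dst not in b:
            continue
        col = column_of(b[src], pos)
        if col is not None:
            return position_at(b[dst], col)
    return None


# Each position of each species appears in exactly one block.
blocks: List[Block] = [
    {"human": (0, "ACG-T"), "mouse": (10, "AC-TT")},
    {"mouse": (14, "GGA"), "fugu": (3, "GCA")},
    {"human": (4, "TTGA"), "fugu": (6, "T-GA")},
]
```

Because both views read the same blocks, mapping a human position to mouse and then mapping the result back to human returns the original position. A position aligned to a gap has no predicted ortholog from either side.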
