Ross A. Lippert

Accurate and efficient integration for molecular dynamics simulations at constant temperature and pressure

Ross A. Lippert, Cristian Predescu, Douglas J. Ierardi et al.|The Journal of Chemical Physics|2013

Cited by 223|Open Access

In molecular dynamics simulations, control over temperature and pressure is typically achieved by augmenting the original system with additional dynamical variables to create a thermostat and a barostat, respectively. These variables generally evolve on timescales much longer than those of particle motion, but typical integrator implementations update the additional variables along with the particle positions and momenta at each time step. We present a framework that replaces the traditional integration procedure with separate barostat, thermostat, and Newtonian particle motion updates, allowing thermostat and barostat updates to be applied infrequently. Such infrequent updates provide a particularly substantial performance advantage for simulations parallelized across many computer processors, because thermostat and barostat updates typically require communication among all processors. Infrequent updates can also improve accuracy by alleviating certain sources of error associated with limited-precision arithmetic. In addition, separating the barostat, thermostat, and particle motion update steps reduces certain truncation errors, bringing the time-average pressure closer to its target value. Finally, this framework, which we have implemented on both general-purpose and special-purpose hardware, reduces software complexity and improves software modularity.
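The idea of applying thermostat updates less often than particle updates can be illustrated with a toy sketch. This is not the paper's integrator. It assumes a hypothetical system of independent 1-D harmonic oscillators in reduced units, and it swaps in a simple Berendsen-style velocity rescaling for the extended-system thermostat the abstract describes. The names `run`, `interval`, and `tau` are illustrative only.

```python
import math

def kinetic_temperature(velocities, mass=1.0, k_b=1.0):
    """Instantaneous temperature of 1-D particles: T = m * <v^2> / k_B."""
    return mass * sum(v * v for v in velocities) / (k_b * len(velocities))

def run(n_steps, interval, dt=0.01, target_t=1.0, tau=0.5):
    """Velocity Verlet with a thermostat applied only every `interval` steps."""
    # Hypothetical toy system: independent oscillators with m = k = 1.
    x = [0.1 * i for i in range(1, 9)]
    v = [0.0] * len(x)
    thermostat_calls = 0
    for step in range(1, n_steps + 1):
        # Newtonian particle update (force = -x), done every step.
        v = [vi - 0.5 * dt * xi for vi, xi in zip(v, x)]
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        v = [vi - 0.5 * dt * xi for vi, xi in zip(v, x)]
        if step % interval == 0:
            # Infrequent thermostat update with an effective time step of
            # interval * dt (Berendsen-style rescaling, an assumed stand-in).
            t_now = kinetic_temperature(v)
            if t_now > 0.0:
                arg = 1.0 + (interval * dt / tau) * (target_t / t_now - 1.0)
                scale = math.sqrt(max(arg, 0.0))
                v = [scale * vi for vi in v]
            thermostat_calls += 1
    return x, v, thermostat_calls
```

With `interval=10`, the temperature reduction over all particles (the step that would need global communication in a parallel code) runs one tenth as often as with `interval=1`.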

Algorithmic strategies for the single nucleotide polymorphism haplotype assembly problem

Ross A. Lippert|Briefings in Bioinformatics|2002

Cited by 217|Open Access

With the consensus human genome sequenced and many other sequencing projects at varying stages of completion, greater attention is being paid to the genetic differences among individuals and the abilities of those differences to predict phenotypes. A significant obstacle to such work is the difficulty and expense of determining haplotypes--sets of variants genetically linked because of their proximity on the genome--for large numbers of individuals for use in association studies. This paper presents some algorithmic considerations in a new approach for haplotype determination: inferring haplotypes from localised polymorphism data gathered from short genome 'fragments.' Formalised models of the biological system under consideration are examined, given a variety of assumptions about the goal of the problem and the character of optimal solutions. Some theoretical results and algorithms for handling haplotype assembly given the different models are then sketched. The primary conclusion is that some important simplified variants of the problem yield tractable problems while more general variants tend to be intractable in the worst case.
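One of the tractable simplified variants can be illustrated with a minimal sketch, assuming error-free fragments from a diploid individual. Fragments are written as strings over `0`, `1`, and `-` (SNP not covered), two fragments conflict when they disagree at a SNP both cover, and the fragments are split into two haplotypes by 2-coloring the conflict graph. The encoding and the function names are assumptions made for illustration, not taken from the paper.

```python
from collections import deque

def conflict(f, g):
    """Two fragments conflict if they disagree at a SNP both cover."""
    return any(a != '-' and b != '-' and a != b for a, b in zip(f, g))

def assemble(fragments):
    """Return two haplotype strings, or None if no conflict-free split exists."""
    n = len(fragments)
    side = [None] * n
    for start in range(n):
        if side[start] is not None:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:  # breadth-first 2-coloring of the conflict graph
            i = queue.popleft()
            for j in range(n):
                if j != i and conflict(fragments[i], fragments[j]):
                    if side[j] is None:
                        side[j] = 1 - side[i]
                        queue.append(j)
                    elif side[j] == side[i]:
                        return None  # odd cycle: the data contain errors
    width = len(fragments[0])
    haps = [['-'] * width, ['-'] * width]
    for frag, s in zip(fragments, side):
        for pos, allele in enumerate(frag):
            if allele != '-':
                haps[s][pos] = allele
    return ''.join(haps[0]), ''.join(haps[1])

print(assemble(["01--", "-10-", "10--", "-01-", "--01"]))
```

When the conflict graph is not bipartite, the sketch returns `None`. That is the situation in which the harder variants of the problem arise, where fragments or SNPs must be removed or corrected to restore consistency.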

SNPs Problems, Complexity, and Algorithms

Giuseppe Lancia, Vineet Bafna, Sorin Istrail et al.|Lecture Notes in Computer Science|2001

Cited by 211

Whole-genome shotgun assembly and comparison of human genome assemblies

Sorin Istrail, Granger G. Sutton, Liliana Florea et al.|Proceedings of the National Academy of Sciences|2004

Cited by 184|Open Access

We report a whole-genome shotgun assembly (called WGSA) of the human genome generated at Celera in 2001. The Celera-generated shotgun data set consisted of 27 million sequencing reads organized in pairs by virtue of end-sequencing 2-kbp, 10-kbp, and 50-kbp inserts from shotgun clone libraries. The quality-trimmed reads covered the genome 5.3 times, and the inserts from which pairs of reads were obtained covered the genome 39 times. With the nearly complete human DNA sequence [National Center for Biotechnology Information (NCBI) Build 34] now available, it is possible to directly assess the quality, accuracy, and completeness of WGSA and of the first reconstructions of the human genome reported in two landmark papers in February 2001 [Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., Smith, H. O., Yandell, M., Evans, C. A., Holt, R. A., et al. (2001) Science 291, 1304-1351; International Human Genome Sequencing Consortium (2001) Nature 409, 860-921]. The analysis of WGSA shows 97% order and orientation agreement with NCBI Build 34, where most of the 3% of sequence out of order is due to scaffold placement problems as opposed to assembly errors within the scaffolds themselves. In addition, WGSA fills some of the remaining gaps in NCBI Build 34. The early genome sequences all covered about the same amount of the genome, but they did so in different ways. The Celera results provide more order and orientation, and the consortium sequence provides better coverage of exact and nearly exact repeats.

Optimal Haplotype Block-Free Selection of Tagging SNPs for Genome-Wide Association Studies

Bjarni V. Halldórsson, Vineet Bafna, Ross A. Lippert et al.|Genome Research|2004

Cited by 136|Open Access

It is widely hoped that the study of sequence variation in the human genome will provide a means of elucidating the genetic component of complex diseases and variable drug responses. A major stumbling block to the successful design and execution of genome-wide disease association studies using single-nucleotide polymorphisms (SNPs) and linkage disequilibrium is the enormous number of SNPs in the human genome. This results in unacceptably high costs for exhaustive genotyping and presents a challenging problem of statistical inference. Here, we present a new method for optimally selecting minimum informative subsets of SNPs, also known as "tagging" SNPs, that is efficient for genome-wide selection. We contrast this method to published methods including haplotype block tagging, that is, grouping SNPs into segments of low haplotype diversity and typing a subset of the SNPs that can discriminate all common haplotypes within the blocks. Because our method does not rely on a predefined haplotype block structure and makes use of the weaker correlations that occur across neighboring blocks, it can be effectively applied across chromosomal regions with both high and low local linkage disequilibrium. We show that the number of tagging SNPs selected is substantially smaller than previously reported using block-based approaches and that selecting tagging SNPs optimally can result in a two- to threefold savings over selecting random SNPs.


Top publications by citations