
Ingileif B. Hallgrímsdóttir

California Institute of Technology

ORCID: 0000-0002-4710-0047

Publishes on Genetic Associations and Epidemiology, Genetic Mapping and Diversity in Plants and Animals, Bioinformatics and Genomic Networks. 31 papers and 17.6k citations.

31 Publications
17.6k Total Citations

Top publications by citations

Algebraic Statistics for Computational Biology
Lior Pachter et al.|Cambridge University Press eBooks|2005
Cited by 509

The quantitative analysis of biological sequence data is based on methods from statistics coupled with efficient algorithms from computer science. Algebra provides a framework for unifying many of the seemingly disparate techniques used by computational biologists. This book, first published in 2005, offers an introduction to this mathematical framework and describes tools from computational algebra for designing new algorithms for exact, accurate results. These algorithms can be applied to biological problems such as aligning genomes, finding genes and constructing phylogenies. The first part of this book consists of four chapters on the themes of Statistics, Computation, Algebra and Biology, offering speedy, self-contained introductions to the emerging field of algebraic statistics and its applications to genomics. In the second part, the four themes are combined and developed to tackle real problems in computational genomics. As the first book in this exciting and dynamic area, it will be welcomed as a text for self-study or for advanced undergraduate and beginning graduate courses.

Placenta and appetite genes GDF15 and IGFBP7 are associated with hyperemesis gravidarum
Marlena S. Fejzo, Olga V. Sazonova, J. Fah Sathirapongsasuti et al.|Nature Communications|2018
Cited by 184|Open Access

Hyperemesis gravidarum (HG), severe nausea and vomiting of pregnancy, occurs in 0.3–2% of pregnancies and is associated with maternal and fetal morbidity. The cause of HG remains unknown, but familial aggregation and results of twin studies suggest that understanding the genetic contribution is essential for comprehending the disease etiology. Here, we conduct a genome-wide association study (GWAS) for binary (HG) and ordinal (severity of nausea and vomiting) phenotypes of pregnancy complications. Two loci, chr19p13.11 and chr4q12, are genome-wide significant (p < 5 × 10⁻⁸) in both association scans and are replicated in an independent cohort. The genes implicated at these two loci are GDF15 and IGFBP7, respectively, both known to be involved in placentation, appetite, and cachexia. While proving the causal roles of GDF15 and IGFBP7 in nausea and vomiting of pregnancy requires further study, this GWAS provides insights into the genetic risk factors contributing to the disease.

Human metabolic profiles are stably controlled by genetic and environmental variation
George Nicholson, Mattias Rantalainen, Anthony D. Maher et al.|Molecular Systems Biology|2011
Cited by 180|Open Access

¹H Nuclear Magnetic Resonance spectroscopy (¹H NMR) is increasingly used to measure metabolite concentrations in sets of biological samples for top-down systems biology and molecular epidemiology. For such purposes, knowledge of the sources of human variation in metabolite concentrations is valuable, but currently sparse. We conducted and analysed a study to create such a resource. In our unique design, identical and non-identical twin pairs donated plasma and urine samples longitudinally. We acquired ¹H NMR spectra on the samples, and statistically decomposed variation in metabolite concentration into familial (genetic and common-environmental), individual-environmental, and longitudinally unstable components. We estimate that stable variation, comprising familial and individual-environmental factors, accounts on average for 60% (plasma) and 47% (urine) of biological variation in ¹H NMR-detectable metabolite concentrations. Clinically predictive metabolic variation is likely nested within this stable component, so our results have implications for the effective design of biomarker-discovery studies. We provide a power-calculation method which reveals that sample sizes of a few thousand should offer sufficient statistical precision to detect ¹H NMR-based biomarkers quantifying predisposition to disease.

Association mapping from sequencing reads using k-mers
Cited by 144|Open Access

Genome-wide association studies (GWAS) rely on microarrays, or more recently mapping of sequencing reads, to genotype individuals. The reliance on prior sequencing of a reference genome limits the scope of association studies, and also precludes mapping associations outside of the reference. We present an alignment-free method for association studies of categorical phenotypes based on counting k-mers in whole-genome sequencing reads, testing for associations directly between k-mers and the trait of interest, and local assembly of the statistically significant k-mers to identify sequence differences. An analysis of the 1000 Genomes data shows that sequences identified by our method largely agree with results obtained using the standard approach. However, unlike standard GWAS, our method identifies associations with structural variations and sites not present in the reference genome. We also demonstrate that population stratification can be inferred from k-mers. Finally, application to an E. coli dataset on ampicillin resistance validates the approach.
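
The first two steps described in this abstract (counting k-mers in reads, then testing each k-mer against a categorical trait) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `count_kmers`, `poisson_lr_test` and `associate` are hypothetical, the test is a generic two-group Poisson likelihood-ratio test chosen here for illustration, and the sketch omits reverse-complement handling, multiple-testing correction, stratification adjustment and the local assembly step.

```python
import math
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring across a list of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def poisson_lr_test(c1, c2, n1, n2):
    """Likelihood-ratio test that a k-mer occurs at the same rate in two groups.

    c1, c2: counts of the k-mer in group 1 and group 2.
    n1, n2: total k-mer counts in each group (a stand-in for coverage).
    Returns (statistic, p_value); the p-value uses a chi-square
    approximation with one degree of freedom.
    """
    total = c1 + c2
    if total == 0:
        return 0.0, 1.0
    pooled = total / (n1 + n2)  # shared rate under the null hypothesis
    stat = 0.0
    for c, n in ((c1, n1), (c2, n2)):
        if c > 0:  # a zero count contributes nothing (0 * log 0 = 0)
            stat += 2.0 * c * math.log((c / n) / pooled)
    stat = max(stat, 0.0)  # guard against tiny negative rounding error
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square(1) upper tail
    return stat, p_value


def associate(case_reads, control_reads, k):
    """Test every observed k-mer for a rate difference between cases and controls."""
    case = count_kmers(case_reads, k)
    ctrl = count_kmers(control_reads, k)
    n1, n2 = sum(case.values()), sum(ctrl.values())
    return {
        kmer: poisson_lr_test(case[kmer], ctrl[kmer], n1, n2)
        for kmer in set(case) | set(ctrl)
    }
```

Calling `associate(case_reads, control_reads, k)` returns a dictionary mapping each k-mer to its statistic and p-value. In a real analysis the significant k-mers would then be assembled locally to recover the underlying sequence differences, as the abstract describes.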