Christopher Chang

Second-generation PLINK: rising to the challenge of larger and richer datasets

Christopher Chang, Carson C. Chow, Laurent CAM Tellier et al.|GigaScience|2015

Cited by 13.8kOpen Access

BACKGROUND: PLINK 1 is a widely used open-source C/C++ toolset for genome-wide association studies (GWAS) and research in population genetics. However, the steady accumulation of data from imputation and whole-genome sequencing studies has exposed a strong need for faster and scalable implementations of key functions, such as logistic regression, linkage disequilibrium estimation, and genomic distance evaluation. In addition, GWAS and population-genetic data now frequently contain genotype likelihoods, phase information, and/or multiallelic variants, none of which can be represented by PLINK 1's primary data format. FINDINGS: To address these issues, we are developing a second-generation codebase for PLINK. The first major release from this codebase, PLINK 1.9, introduces extensive use of bit-level parallelism, [Formula: see text]-time/constant-space Hardy-Weinberg equilibrium and Fisher's exact tests, and many other algorithmic improvements. In combination, these changes accelerate most operations by 1-4 orders of magnitude, and allow the program to handle datasets too large to fit in RAM. We have also developed an extension to the data format which adds low-overhead support for genotype likelihoods, phase, multiallelic variants, and reference vs. alternate alleles, which is the basis of our planned second release (PLINK 2.0). CONCLUSIONS: The second-generation versions of PLINK will offer dramatic improvements in performance and compatibility. For the first time, users without access to high-end computing resources can perform several essential analyses of the feature-rich and very large genetic datasets coming into use.

Sensitive and specific multi-cancer detection and localization using methylation signatures in cell-free DNA

M.C. Liu, Geoffrey R. Oxnard, Eric A. Klein et al.|Annals of Oncology|2020

Cited by 1.5kOpen Access

Osteonecrosis: Current perspectives on pathogenesis and treatment

Christopher Chang, Adam Greenspan, M. Erick Gershwin|Seminars in Arthritis and Rheumatism|1993

Cited by 199

Identification of individuals by trait prediction using whole-genome sequencing data

Christoph Lippert, Riccardo Sabatini, M. Cyrus Maher et al.|Proceedings of the National Academy of Sciences|2017

Cited by 159Open Access

Prediction of human physical traits and demographic information from genomic data challenges privacy and data deidentification in personalized medicine. To explore the current capabilities of phenotype-based genomic identification, we applied whole-genome sequencing, detailed phenotyping, and statistical modeling to predict biometric traits in a cohort of 1,061 participants of diverse ancestry. Individually, for a large fraction of the traits, their predictive accuracy beyond ancestry and demographic information is limited. However, we have developed a maximum entropy algorithm that integrates multiple predictions to determine which genomic samples and phenotype measurements originate from the same person. Using this algorithm, we have reidentified an average of >8 of 10 held-out individuals in an ethnically mixed cohort and an average of 5 of either 10 African Americans or 10 Europeans. This work challenges current conceptions of personal privacy and may have far-reaching ethical and legal implications.

Medical relevance of protein-truncating variants across 337,205 individuals in the UK Biobank study

Christopher DeBoever, Yosuke Tanigawa, Maléne E. Lindholm et al.|Nature Communications|2018

Cited by 139Open Access

Protein-truncating variants can have profound effects on gene function and are critical for clinical genome interpretation and generating therapeutic hypotheses, but their relevance to medical phenotypes has not been systematically assessed. Here, we characterize the effect of 18,228 protein-truncating variants across 135 phenotypes from the UK Biobank and find 27 associations between medical phenotypes and protein-truncating variants in genes outside the major histocompatibility complex. We perform phenome-wide analyses and directly measure the effect in homozygous carriers, commonly referred to as "human knockouts," across medical phenotypes for genes implicated as being protective against disease or associated with at least one phenotype in our study. We find several genes with strong pleiotropic or non-additive effects. Our results illustrate the importance of protein-truncating variants in a variety of diseases.

Christopher Chang

Is this you? Claim your profile.

Top publicationsby citations