Evan Koch

Harvard University

ORCID: 0000-0002-1124-4559

Publishes on Genetic Associations and Epidemiology, Evolution and Genetic Dynamics, Genetic Mapping and Diversity in Plants and Animals. 52 papers and 789 citations.

Top publications by citations

Population-specific causal disease effect sizes in functionally important regions impacted by selection
Huwenbo Shi, Steven Gazal, Masahiro Kanai et al.|Nature Communications|2021
Cited by 137 | Open Access

Many diseases exhibit population-specific causal effect sizes with trans-ethnic genetic correlations significantly less than 1, limiting trans-ethnic polygenic risk prediction. We develop a new method, S-LDXR, for stratifying squared trans-ethnic genetic correlation across genomic annotations, and apply S-LDXR to genome-wide summary statistics for 31 diseases and complex traits in East Asians (average N = 90K) and Europeans (average N = 267K) with an average trans-ethnic genetic correlation of 0.85. We determine that squared trans-ethnic genetic correlation is 0.82× (s.e. 0.01) depleted in the top quintile of background selection statistic, implying more population-specific causal effect sizes. Accordingly, causal effect sizes are more population-specific in functionally important regions, including conserved and regulatory regions. In regions surrounding specifically expressed genes, causal effect sizes are most population-specific for skin and immune genes, and least population-specific for brain genes. Our results could potentially be explained by stronger gene-environment interaction at loci impacted by selection, particularly positive selection.

De Novo Mutation Rate Estimation in Wolves of Known Pedigree
Evan Koch, Rena M. Schweizer, Teia M. Schweizer et al.|Molecular Biology and Evolution|2019
Cited by 76 | Open Access

Knowledge of mutation rates is crucial for calibrating population genetics models of demographic history in units of years. However, mutation rates remain challenging to estimate because of the need to identify extremely rare events. We estimated the nuclear mutation rate in wolves by identifying de novo mutations in a pedigree of seven wolves. Putative de novo mutations were discovered by whole-genome sequencing and were verified by Sanger sequencing of parents and offspring. Using stringent filters and an estimate of the false negative rate in the remaining observable genome, we obtain an estimate of ∼4.5 × 10⁻⁹ per base pair per generation and provide conservative bounds between 2.6 × 10⁻⁹ and 7.1 × 10⁻⁹. Although our estimate is consistent with recent mutation rate estimates from ancient DNA (4.0 × 10⁻⁹ and 3.0–4.5 × 10⁻⁹), it suggests a wider possible range. We also examined the consequences of our rate and the accompanying interval for dating several critical events in canid demographic history. For example, applying our full range of rates to coalescent models of dog and wolf demographic history implies a wide set of possible divergence times between the ancestral populations of dogs and extant Eurasian wolves (16,000–64,000 years ago) although our point estimate indicates a date between 25,000 and 33,000 years ago. Aside from one study in mice, ours provides the only direct mammalian mutation rate outside of primates and is likely to be vital to future investigations of mutation rate evolution.

Population sequencing data reveal a compendium of mutational processes in the human germ line
Cited by 73 | Open Access

Biological mechanisms underlying human germline mutations remain largely unknown. We statistically decompose variation in the rate and spectra of mutations along the genome using volume-regularized nonnegative matrix factorization. The analysis of a sequencing dataset (TOPMed) reveals nine processes that explain the variation in mutation properties between loci. We provide a biological interpretation for seven of these processes. We associate one process with bulky DNA lesions that are resolved asymmetrically with respect to transcription and replication. Two processes track direction of replication fork and replication timing, respectively. We identify a mutagenic effect of active demethylation primarily acting in regulatory regions and a mutagenic effect of long interspersed nuclear elements. We localize a mutagenic process specific to oocytes from population sequencing data. This process appears transcriptionally asymmetric.

Long Range Linkage Disequilibrium across the Human Genome
Cited by 70 | Open Access

Long-range linkage disequilibria (LRLD) between sites that are widely separated on chromosomes may suggest that population admixture, epistatic selection, or other evolutionary forces are at work. We quantified patterns of LRLD on a chromosome-wide level in the YRI population of the HapMap dataset of single nucleotide polymorphisms (SNPs). We calculated the disequilibrium between all pairs of SNPs on each chromosome (a total of >2×10¹¹ values) and evaluated significance of overall disequilibrium using randomization. The results show an excess of associations between pairs of distant sites (separated by >0.25 cM) on all of the 22 autosomes. We discuss possible explanations for this observation.