R

Ruoyun Hui

Turing Institute

ORCID: 0000-0002-5689-7131

Publishes on Forensic and Genetic Research, Forensic Anthropology and Bioarchaeology Studies, Yersinia bacterium, plague, ectoparasites research. 25 papers and 1.8k citations.

25Publications
1.8kTotal Citations

Is this you? Claim your profile.

Add your photo, update your bio, and get notified when your ranking changes.

Top publicationsby citations

Insights into human genetic variation and population history from 929 diverse genomes
Cited by 1.1kOpen Access

Genome sequences from diverse human groups are needed to understand the structure of genetic variation in our species and the history of, and relationships between, different populations. We present 929 high-coverage genome sequences from 54 diverse human populations, 26 of which are physically phased using linked-read sequencing. Analyses of these genomes reveal an excess of previously undocumented common genetic variation private to southern Africa, central Africa, Oceania, and the Americas, but an absence of such variants fixed between major geographical regions. We also find deep and gradual population separations within Africa, contrasting population size histories between hunter-gatherer and agriculturalist groups in the past 10,000 years, and a contrast between single Neanderthal but multiple Denisovan source populations contributing to present-day human populations.

Insights into human genetic variation and population history from 929 diverse genomes
Anders Bergström, Shane McCarthy, Ruoyun Hui et al.|bioRxiv (Cold Spring Harbor Laboratory)|2019
Cited by 156Open Access

Abstract Genome sequences from diverse human groups are needed to understand the structure of genetic variation in our species and the history of, and relationships between, different populations. We present 929 high-coverage genome sequences from 54 diverse human populations, 26 of which are physically phased using linked-read sequencing. Analyses of these genomes reveal an excess of previously undocumented private genetic variation in southern and central Africa and in Oceania and the Americas, but an absence of fixed, private variants between major geographical regions. We also find deep and gradual population separations within Africa, contrasting population size histories between hunter-gatherer and agriculturalist groups in the last 10,000 years, a potentially major population growth episode after the peopling of the Americas, and a contrast between single Neanderthal but multiple Denisovan source populations contributing to present-day human populations. We also demonstrate benefits to the study of population relationships of genome sequences over ascertained array genotypes. These genome sequences are freely available as a resource with no access or analysis restrictions.

Detecting archaic introgression using an unadmixed outgroup
Laurits Skov, Ruoyun Hui, Vladimir Shchur et al.|PLoS Genetics|2018
Cited by 132Open Access

Human populations outside of Africa have experienced at least two bouts of introgression from archaic humans, from Neanderthals and Denisovans. In Papuans there is prior evidence of both these introgressions. Here we present a new approach to detect segments of individual genomes of archaic origin without using an archaic reference genome. The approach is based on a hidden Markov model that identifies genomic regions with a high density of single nucleotide variants (SNVs) not seen in unadmixed populations. We show using simulations that this provides a powerful approach to identifying segments of archaic introgression with a low rate of false detection, given data from a suitable outgroup population is available, without the archaic introgression but containing a majority of the variation that arose since initial separation from the archaic lineage. Furthermore our approach is able to infer admixture proportions and the times both of admixture and of initial divergence between the human and archaic populations. We apply the model to detect archaic introgression in 89 Papuans and show how the identified segments can be assigned to likely Neanderthal or Denisovan origin. We report more Denisovan admixture than previous studies and find a shift in size distribution of fragments of Neanderthal and Denisovan origin that is compatible with a difference in admixture time. Furthermore, we identify small amounts of Denisova ancestry in South East Asians and South Asians.

Evaluating genotype imputation pipeline for ultra-low coverage ancient genomes
Ruoyun Hui, Eugenia D’Atanasio, Lara M. Cassidy et al.|Scientific Reports|2020
Cited by 123Open Access

Although ancient DNA data have become increasingly more important in studies about past populations, it is often not feasible or practical to obtain high coverage genomes from poorly preserved samples. While methods of accurate genotype imputation from > 1 × coverage data have recently become a routine, a large proportion of ancient samples remain unusable for downstream analyses due to their low coverage. Here, we evaluate a two-step pipeline for the imputation of common variants in ancient genomes at 0.05-1 × coverage. We use the genotype likelihood input mode in Beagle and filter for confident genotypes as the input to impute missing genotypes. This procedure, when tested on ancient genomes, outperforms a single-step imputation from genotype likelihoods, suggesting that current genotype callers do not fully account for errors in ancient sequences and additional quality controls can be beneficial. We compared the effect of various genotype likelihood calling methods, post-calling, pre-imputation and post-imputation filters, different reference panels, as well as different imputation tools. In a Neolithic Hungarian genome, we obtain ~ 90% imputation accuracy for heterozygous common variants at coverage 0.05 × and > 97% accuracy at coverage 0.5 ×. We show that imputation can mitigate, though not eliminate reference bias in ultra-low coverage ancient genomes.