
Matthew Halvorsen

University of North Carolina at Chapel Hill

ORCID: 0000-0002-6707-2418

Publishes on Obsessive-Compulsive Spectrum Disorders, Autism Spectrum Disorder Research, Genetic Associations and Epidemiology. 118 papers and 2k citations.



Top publications by citations

Disease-Associated Mutations That Alter the RNA Structural Ensemble
Cited by 341|Open Access

Genome-wide association studies (GWAS) often identify disease-associated mutations in intergenic and non-coding regions of the genome. Given the high percentage of the human genome that is transcribed, we postulate that for some observed associations the disease phenotype is caused by a structural rearrangement in a regulatory region of the RNA transcript. To identify such mutations, we have performed a genome-wide analysis of all known disease-associated Single Nucleotide Polymorphisms (SNPs) from the Human Gene Mutation Database (HGMD) that map to the untranslated regions (UTRs) of a gene. Rather than using minimum free energy approaches (e.g. mFold), we use a partition function calculation that takes into consideration the ensemble of possible RNA conformations for a given sequence. We identified in the human genome disease-associated SNPs that significantly alter the global conformation of the UTR to which they map. For six disease-states (Hyperferritinemia Cataract Syndrome, beta-Thalassemia, Cartilage-Hair Hypoplasia, Retinoblastoma, Chronic Obstructive Pulmonary Disease (COPD), and Hypertension), we identified multiple SNPs in UTRs that alter the mRNA structural ensemble of the associated genes. Using a Boltzmann sampling procedure for sub-optimal RNA structures, we are able to characterize and visualize the nature of the conformational changes induced by the disease-associated mutations in the structural ensemble. We observe in several cases (specifically the 5' UTRs of FTL and RB1) SNP-induced conformational changes analogous to those observed in bacterial regulatory Riboswitches when specific ligands bind. We propose that the UTR and SNP combinations we identify constitute a "RiboSNitch," that is a regulatory RNA in which a specific SNP has a structural consequence that results in a disease phenotype. Our SNPfold algorithm can help identify RiboSNitches by leveraging GWAS data and an analysis of the mRNA structural ensemble.

Annotating pathogenic non-coding variants in genic regions
Sahar Gelfman, Quanli Wang, K. Melodi McSweeney et al.|Nature Communications|2017
Cited by 155|Open Access

Identifying the underlying causes of disease requires accurate interpretation of genetic variants. Current methods ineffectively capture pathogenic non-coding variants in genic regions, resulting in overlooking synonymous and intronic variants when searching for disease risk. Here we present the Transcript-inferred Pathogenicity (TraP) score, which uses sequence context alterations to reliably identify non-coding variation that causes disease. High TraP scores single out extremely rare variants with lower minor allele frequencies than missense variants. TraP accurately distinguishes known pathogenic and benign variants in synonymous (AUC = 0.88) and intronic (AUC = 0.83) public datasets, dismissing benign variants with exceptionally high specificity. TraP analysis of 843 exomes from epilepsy family trios identifies synonymous variants in known epilepsy genes, thus pinpointing risk factors of disease from non-coding sequence data. TraP outperforms leading methods in identifying non-coding variants that are pathogenic and is therefore a valuable tool for use in gene discovery and the interpretation of personal genomes.

While non-coding synonymous and intronic variants are often not under strong selective constraint, they can be pathogenic through affecting splicing or transcription. Here, the authors develop a score that uses sequence context alterations to predict pathogenicity of synonymous and non-coding genetic variants, and provide a web server of pre-computed scores.

The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity
Slavé Petrovski, Ayal B. Gussow, Quanli Wang et al.|PLoS Genetics|2015
Cited by 151|Open Access

Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene's proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene's regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen's Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. 
In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding importance, ncCADD and ncGWAVA, and find both scores are significantly predictive of human dosage sensitive genes and appear to carry information beyond conservation, as assessed by ncGERP. These results highlight that the intolerance of noncoding sequence stretches in the human genome can provide a critical complementary tool to other genome annotation approaches to help identify the parts of the human genome increasingly likely to harbor mutations that influence risk of disease.

Examination of the shared genetic basis of anorexia nervosa and obsessive–compulsive disorder
Zeynep Yılmaz, Matthew Halvorsen, Julien Bryois et al.|Molecular Psychiatry|2018
Cited by 135|Open Access

Anorexia nervosa (AN) and obsessive–compulsive disorder (OCD) are often comorbid and likely to share genetic risk factors. Hence, we examine their shared genetic background using a cross-disorder GWAS meta-analysis of 3495 AN cases, 2688 OCD cases, and 18,013 controls. We confirmed a high genetic correlation between AN and OCD (rg = 0.49 ± 0.13, p = 9.07 × 10⁻⁷) and a sizable SNP heritability (SNP h² = 0.21 ± 0.02) for the cross-disorder phenotype. Although no individual loci reached genome-wide significance, the cross-disorder phenotype showed strong positive genetic correlations with other psychiatric phenotypes (e.g., rg = 0.36 with bipolar disorder and 0.34 with neuroticism) and negative genetic correlations with metabolic phenotypes (e.g., rg = −0.25 with body mass index and −0.20 with triglycerides). Follow-up analyses revealed that although AN and OCD overlap heavily in their shared risk with other psychiatric phenotypes, the relationship with metabolic and anthropometric traits is markedly stronger for AN than for OCD. We further tested whether shared genetic risk for AN/OCD was associated with particular tissue or cell-type gene expression patterns and found that the basal ganglia and medium spiny neurons were most enriched for AN–OCD risk, consistent with neurobiological findings for both disorders. Our results confirm and extend genetic epidemiological findings of shared risk between AN and OCD and suggest that larger GWASs are warranted.