Genomic data in the All of Us Research Program
Comprehensively mapping the genetic basis of human disease across diverse individuals is a long-standing goal for the field of human genetics [1–4]. The All of Us Research Program is a longitudinal cohort study aiming to enrol a diverse group of at least one million individuals across the USA to accelerate biomedical research and improve human health [5,6]. Here we describe the programme’s genomics data release of 245,388 clinical-grade genome sequences. This resource is unique in its diversity as 77% of participants are from communities that are historically under-represented in biomedical research and 46% are individuals from under-represented racial and ethnic minorities. All of Us identified more than 1 billion genetic variants, including more than 275 million previously unreported genetic variants, more than 3.9 million of which had coding consequences. Leveraging linkage between genomic data and the longitudinal electronic health record, we evaluated 3,724 genetic variants associated with 117 diseases and found high replication rates across both participants of European ancestry and participants of African ancestry. Summary-level data are publicly available, and individual-level data can be accessed by researchers through the All of Us Researcher Workbench using a unique data passport model with a median time from initial researcher registration to data access of 29 hours. We anticipate that this diverse dataset will advance the promise of genomic medicine for all.

Centers for Mendelian Genomics: A decade of facilitating gene discovery

Variant-level matching for diagnosis and discovery: Challenges and opportunities
Here we describe MyGene2, Geno2MP, VariantMatcher, and Franklin, databases that provide variant-level information and phenotypic features to researchers, clinicians, healthcare providers and patients. Following in the footsteps of the Matchmaker Exchange project, which connects exome, genome, and phenotype databases at the gene level, these databases share the goal of connecting to one another using Data Connect, a standard for discovery and search of biomedical data from the Global Alliance for Genomics and Health (GA4GH).

PhenoDB, GeneMatcher and VariantMatcher, tools for analysis and sharing of sequence data
Elizabeth Wohler, Renan Paulo Martin, Sean Griffith et al.|Orphanet Journal of Rare Diseases|2021
BACKGROUND: With the advent of whole exome (ES) and genome sequencing (GS) as tools for disease gene discovery, rare variant filtering, prioritization and data sharing have become essential components of the search for disease genes and variants potentially contributing to disease phenotypes. The computational storage, data manipulation, and bioinformatic interpretation of thousands to millions of variants identified in ES and GS, respectively, is a challenging task. To aid in that endeavor, we constructed PhenoDB, GeneMatcher and VariantMatcher.
RESULTS: PhenoDB is an accessible, freely available, web-based platform that allows users to store, share, analyze and interpret their patients' phenotypes and variants from ES/GS data. GeneMatcher is accessible to all stakeholders as a web-based tool developed to connect individuals (researchers, clinicians, health care providers and patients) around the globe with interest in the same gene(s), variant(s) or phenotype(s). Finally, VariantMatcher was developed to enable public sharing of variant-level data and phenotypic information from individuals sequenced as part of multiple disease gene discovery projects. Here we provide updates on PhenoDB and GeneMatcher applications and implementation and introduce VariantMatcher.
CONCLUSION: Each of these tools has facilitated worldwide data sharing and data analysis and improved our ability to connect genes to phenotypic traits. Further development of these platforms will expand variant analysis, interpretation, novel disease-gene discovery and facilitate functional annotation of the human genome for clinical genomics implementation and the precision medicine initiative.

A Two-Scale Solution Algorithm for the Elastic Wave Equation
Tetyana Vdovina, Susan E. Minkoff, Sean Griffith|SIAM Journal on Scientific Computing|2009
Operator-based upscaling is a two-scale algorithm that speeds up the solution of the wave equation by producing a coarse-grid solution which incorporates much of the local fine-scale solution information. We present the first implementation of operator upscaling for the elastic wave equation. By using the velocity-displacement formulation of the three-dimensional elastic wave equation, basis functions that are linear in all three directions, and mass lumping, the subgrid solve (the first stage of the two-stage algorithm) reduces to solving explicit difference equations. At the second stage of the algorithm, we upscale both velocity and displacement by using local subgrid information to formulate the coarse-grid problem. The coarse-grid system matrix is independent of time, sparse, and banded. This paper explores both serial and parallel implementations of the method. The main simplifying assumption of the method (special zero boundary conditions imposed on coarse blocks in the first stage of the algorithm) leads to an easily parallelizable algorithm because very little communication is required between processors. In fact, for this upscaling implementation, calculation of the load vector for the coarse solve dominates the cost of a time step. We show that, for a homogeneous medium, convergence is second-order in space and time so long as both the coarse and fine grids are simultaneously refined. A series of heterogeneous-medium numerical experiments demonstrates that the upscaled solution captures the fine-scale fluctuations in the input parameters accurately. Most notable for use in a seismic inversion algorithm, the upscaling algorithm accurately locates the depth of reflectors (interface changes).
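
The remark that linear basis functions plus mass lumping reduce the subgrid solve to explicit difference equations can be illustrated in a much simpler setting. The sketch below is not the paper's method: it is a one-dimensional scalar wave equation on a single grid, with no upscaling and no velocity-displacement splitting, and the function names `assemble_linear_fe` and `solve_standing_wave` are hypothetical. It assumes a uniform mesh on [0, 1], zero boundary values, row-sum lumping, and leapfrog time stepping.

```python
import numpy as np


def assemble_linear_fe(n_cells, h):
    """Assemble 1D mass and stiffness matrices for linear (hat) basis functions."""
    n_nodes = n_cells + 1
    mass = np.zeros((n_nodes, n_nodes))
    stiff = np.zeros((n_nodes, n_nodes))
    m_loc = (h / 6.0) * np.array([[2.0, 1.0], [1.0, 2.0]])    # consistent element mass
    k_loc = (1.0 / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])  # element stiffness
    for e in range(n_cells):
        mass[e:e + 2, e:e + 2] += m_loc
        stiff[e:e + 2, e:e + 2] += k_loc
    return mass, stiff


def solve_standing_wave(n_cells, t_end=0.5, c=1.0, cfl=0.5):
    """Leapfrog stepping of u_tt = c^2 u_xx on [0, 1] with u = 0 at both ends.

    Returns the max nodal error against the exact solution sin(pi x) cos(pi c t).
    """
    h = 1.0 / n_cells
    x = np.linspace(0.0, 1.0, n_cells + 1)
    mass, stiff = assemble_linear_fe(n_cells, h)
    # Row-sum lumping: the mass matrix becomes diagonal, so each update is an
    # explicit difference formula and no linear system is solved.
    lumped = mass.sum(axis=1)

    def accel(u):
        a = -c**2 * (stiff @ u) / lumped
        a[0] = a[-1] = 0.0  # boundary nodes stay fixed at zero
        return a

    dt = cfl * h / c
    n_steps = int(round(t_end / dt))
    u_prev = np.sin(np.pi * x)
    u_prev[0] = u_prev[-1] = 0.0
    u = u_prev + 0.5 * dt**2 * accel(u_prev)  # first step, zero initial velocity
    for _ in range(n_steps - 1):
        u, u_prev = 2.0 * u - u_prev + dt**2 * accel(u), u
    exact = np.sin(np.pi * x) * np.cos(np.pi * c * n_steps * dt)
    return np.max(np.abs(u - exact))


if __name__ == "__main__":
    for n in (50, 100):
        print(n, solve_standing_wave(n))
```

At interior nodes the lumped mass equals the mesh size, so the update coincides with the standard three-point central difference. Halving the mesh size and the time step together should shrink the reported error by roughly a factor of four, which is the kind of second-order behaviour the abstract describes for a homogeneous medium.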