Mapping the human genetic architecture of COVID-19
Abstract: The genetic make-up of an individual contributes to the susceptibility and response to viral infection. Although environmental, clinical and social factors have a role in the chance of exposure to SARS-CoV-2 and the severity of COVID-19 [1,2], host genetics may also be important. Identifying host-specific genetic factors may reveal biological mechanisms of therapeutic relevance and clarify causal relationships of modifiable environmental risk factors for SARS-CoV-2 infection and outcomes. We formed a global network of researchers to investigate the role of human genetics in SARS-CoV-2 infection and COVID-19 severity. Here we describe the results of three genome-wide association meta-analyses that consist of up to 49,562 patients with COVID-19 from 46 studies across 19 countries. We report 13 genome-wide significant loci that are associated with SARS-CoV-2 infection or severe manifestations of COVID-19. Several of these loci correspond to previously documented associations to lung or autoimmune and inflammatory diseases [3–7]. They also represent potentially actionable mechanisms in response to infection. Mendelian randomization analyses support a causal role for smoking and body-mass index for severe COVID-19 although not for type II diabetes. The identification of novel host genetic factors associated with COVID-19 was made possible by the community of human genetics researchers coming together to prioritize the sharing of data, results, resources and analytical frameworks. This working model of international collaboration underscores what is possible for future genetic discoveries in emerging pandemics, or indeed for any complex human disease.
Processing of big heterogeneous genomic datasets for tertiary analysis of Next Generation Sequencing data
MOTIVATION: We previously proposed a paradigm shift in genomic data management, based on the Genomic Data Model (GDM) for mediating existing data formats and on the GenoMetric Query Language (GMQL) for supporting, at a high level of abstraction, data extraction and the most common data-driven computations required by tertiary data analysis of Next Generation Sequencing datasets. Here, we present a new GMQL-based system with enhanced accessibility, portability, scalability and performance.
RESULTS: The new system has a well-designed modular architecture featuring: (i) an intermediate representation supporting many different implementations (including Spark, Flink and SciDB); (ii) a high-level technology-independent repository abstraction, supporting different repository technologies (e.g., local file system, Hadoop File System, database or others); (iii) several system interfaces, including a user-friendly Web-based interface, a Web Service interface, and a programmatic interface for the Python language. Biological use case examples, using public ENCODE, Roadmap Epigenomics and TCGA datasets, demonstrate the relevance of our work.
AVAILABILITY AND IMPLEMENTATION: The GMQL system is freely available for non-commercial use as an open source project at: http://www.bioinformatics.deib.polimi.it/GMQLsystem/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
GenoSurf: metadata driven semantic search system for integrated genomic datasets
Many valuable resources developed by world-wide research institutions and consortia describe genomic datasets that are both open and available for secondary research, but their metadata search interfaces are heterogeneous, not interoperable and sometimes very limited in their capabilities. We implemented GenoSurf, a multi-ontology semantic search system providing access to a consolidated collection of metadata attributes found in the most relevant genomic datasets; values of 10 attributes are semantically enriched by making use of the most suited available ontologies. The user of GenoSurf provides the search terms as input, sets the desired level of ontological enrichment and obtains as output the identity of matching data files at the various sources. Search is facilitated by drop-down lists of matching values; aggregate counts describing the resulting files are updated in real time as search terms are progressively added. In addition to the consolidated attributes, users can perform keyword-based searches on the original (raw) metadata, which are also imported; GenoSurf supports the interplay of attribute-based and keyword-based search through well-defined interfaces. Currently, GenoSurf integrates about 40 million metadata of several major valuable data sources, including three providers of clinical and experimental data (TCGA, ENCODE and Roadmap Epigenomics) and two sources of annotation data (GENCODE and RefSeq); it can be used as a standalone resource for targeting the genomic datasets at their original sources (identified with their accession IDs and URLs), or as part of an integrated query answering system for performing complex queries over genomic regions and metadata.
ViruSurf: an integrated database to investigate viral sequences
ViruSurf, available at http://gmql.eu/virusurf/, is a large public database of viral sequences and integrated and curated metadata from heterogeneous sources (RefSeq, GenBank, COG-UK and NMDC); it also exposes computed nucleotide and amino acid variants, called from original sequences. A GISAID-specific ViruSurf database, available at http://gmql.eu/virusurf_gisaid/, offers a subset of these functionalities. Given the current pandemic outbreak, SARS-CoV-2 data are collected from the four sources; ViruSurf also contains other virus species harmful to humans, including SARS-CoV, MERS-CoV, Ebola and Dengue. The database is centered on sequences, described along their biological, technological and organizational dimensions. In addition, the analytical dimension characterizes each sequence in terms of its annotations and variants. The web interface enables expressing complex search queries in a simple way; arbitrary search queries can freely combine conditions on attributes from the four dimensions, extracting the resulting sequences. Several example queries on the database confirm and possibly improve results from recent research papers; results can be recomputed over time and upon selected populations. Effective search over large and curated sequence data may enable faster responses to future threats that could arise from new viruses.
Association of COVID-19 Vaccinations With Intensive Care Unit Admissions and Outcome of Critically Ill Patients With COVID-19 Pneumonia in Lombardy, Italy
Importance: Data on the association of COVID-19 vaccination with intensive care unit (ICU) admission and outcomes of patients with SARS-CoV-2-related pneumonia are scarce.
Objective: To evaluate whether COVID-19 vaccination is associated with preventing ICU admission for COVID-19 pneumonia and to compare baseline characteristics and outcomes of vaccinated and unvaccinated patients admitted to an ICU.
Design, Setting, and Participants: This retrospective cohort study on regional data sets reports (1) the daily number of administered vaccines and (2) data on all consecutive patients admitted to an ICU in Lombardy, Italy, from August 1 to December 15, 2021 (Delta variant predominant). Vaccinated patients received either mRNA vaccines (BNT162b2 or mRNA-1273) or adenoviral vector vaccines (ChAdOx1-S or Ad26.COV2). Incidence rate ratios (IRRs) were computed from August 1, 2021, to January 31, 2022; ICU and baseline characteristics and outcomes of vaccinated and unvaccinated patients admitted to an ICU were analyzed from August 1 to December 15, 2021.
Exposures: COVID-19 vaccination status (no vaccination, mRNA vaccine, adenoviral vector vaccine).
Main Outcomes and Measures: The IRR of ICU admission was evaluated, comparing vaccinated people with unvaccinated people, adjusted for age and sex. The baseline characteristics at ICU admission of vaccinated and unvaccinated patients were investigated. The association between vaccination status at ICU admission and mortality at ICU and hospital discharge was also studied, adjusting for possible confounders.
Results: Among the 10 107 674 inhabitants of Lombardy, Italy, at the time of this study, the median [IQR] age was 48 [28-64] years and 5 154 914 (51.0%) were female. Of the 7 863 417 individuals who were vaccinated (median [IQR] age: 53 [33-68] years; 4 010 343 [51.0%] female), 6 251 417 (79.5%) received an mRNA vaccine, 550 439 (7.0%) received an adenoviral vector vaccine, and 1 061 561 (13.5%) received a mix of vaccines; 4 497 875 (57.2%) were boosted. Compared with unvaccinated people, the IRR of individuals who received an mRNA vaccine within 120 days from the last dose was 0.03 (95% CI, 0.03-0.04; P < .001), whereas the IRR of individuals who received an adenoviral vector vaccine after 120 days was 0.21 (95% CI, 0.19-0.24; P < .001). There were 553 patients admitted to an ICU for COVID-19 pneumonia during the study period: 139 patients (25.1%) were vaccinated and 414 (74.9%) were unvaccinated. Compared with unvaccinated patients, vaccinated patients were older (median [IQR]: 72 [66-76] vs 60 [51-69] years; P < .001), more often male (110 patients [79.1%] vs 252 patients [60.9%]; P < .001), had more comorbidities (median [IQR]: 2 [1-3] vs 0 [0-1] comorbidities; P < .001) and had a higher ratio of arterial partial pressure of oxygen (PaO2) to fraction of inspired oxygen (FiO2) at ICU admission (median [IQR]: 138 [100-180] vs 120 [90-158] mm Hg; P = .007). Factors associated with ICU and hospital mortality were older age, premorbid heart disease, lower PaO2/FiO2 at ICU admission, and female sex (this factor only for ICU mortality). ICU and hospital mortality were similar between vaccinated and unvaccinated patients.
Conclusions and Relevance: In this cohort study, mRNA and adenoviral vector vaccines were associated with significantly lower risk of ICU admission for COVID-19 pneumonia. ICU and hospital mortality were not associated with vaccination status. These findings suggest a substantial reduction of the risk of developing COVID-19-related severe acute respiratory failure requiring ICU admission among vaccinated people.
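The incidence rate ratios reported in the abstract above compare the rate of ICU admission in vaccinated people with the rate in unvaccinated people. As a rough illustration only, the sketch below computes a crude (unadjusted) IRR with a Wald confidence interval on the log scale. It is not the study's method: the study adjusted for age and sex, whereas this sketch does not, and the function name `crude_irr` and all counts and person-time values are hypothetical.

```python
import math


def crude_irr(events_exposed, time_exposed, events_unexposed, time_unexposed, z=1.96):
    """Return (irr, lower, upper) for a crude incidence rate ratio.

    Assumption: a simple Wald interval on the log scale, with
    SE(log IRR) = sqrt(1/events_exposed + 1/events_unexposed).
    No adjustment for confounders such as age or sex is made.
    """
    if min(events_exposed, events_unexposed) <= 0:
        raise ValueError("event counts must be positive")
    if min(time_exposed, time_unexposed) <= 0:
        raise ValueError("person-time must be positive")

    # Incidence rate = number of events / person-time at risk.
    rate_exposed = events_exposed / time_exposed
    rate_unexposed = events_unexposed / time_unexposed
    irr = rate_exposed / rate_unexposed

    # Confidence limits are computed on the log scale, then exponentiated.
    se_log = math.sqrt(1 / events_exposed + 1 / events_unexposed)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Hypothetical inputs, NOT taken from the study: 10 events over 1000
# person-days in the exposed group, 20 events over 1000 in the unexposed.
irr, lower, upper = crude_irr(10, 1000.0, 20, 1000.0)
print(f"IRR = {irr:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
```

An IRR below 1 indicates a lower admission rate in the exposed (here, vaccinated) group; an interval that excludes 1 corresponds to a statistically significant difference at the chosen level.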