Lawrence Berkeley National Laboratory
ORCID: 0000-0002-5898-5848
Publishes on Genomics and Chromatin Dynamics, Genomics and Phylogenetic Studies, and RNA and protein synthesis mechanisms. 9 papers and 16.2k citations.
Zebrafish, a popular organism for studying embryonic development and for modeling human diseases, has so far lacked a systematic functional annotation program akin to those in other animal models. To address this, we formed the international DANIO-CODE consortium and created a central repository to store and process zebrafish developmental functional genomic data. Our data coordination center (https://danio-code.zfin.org) combines a total of 1,802 sets of unpublished and re-analyzed published genomic data, which we used to improve existing annotations and to show their utility in experimental design. We identified over 140,000 cis-regulatory elements throughout development, including classes with distinct features dependent on their activity in time and space. We delineated the distinct distance topology and chromatin features between regulatory elements active during zygotic genome activation and those active during organogenesis. Finally, we matched regulatory elements and epigenomic landscapes between zebrafish and mouse and predicted functional relationships between them beyond sequence similarity, thus extending the utility of zebrafish developmental genomics to mammals.
The gut microbiome produces vitamins, nutrients, and neurotransmitters, helps to modulate the host immune system, and plays a major role in the metabolism of many exogenous compounds, including drugs and chemical toxicants. However, the extent to which specific microbial species or communities modulate hazard upon exposure to chemicals remains largely opaque. Focusing on the effects of collateral dietary exposure to the widely used herbicide atrazine, we applied integrated omics and phenotypic screening to assess the role of the gut microbiome in modulating host resilience in Drosophila melanogaster. Transcriptional and metabolic responses to these compounds are sex-specific and depend strongly on the presence of the commensal microbiome. Sequencing the genomes of all abundant microbes in the fly gut revealed an enzymatic pathway responsible for atrazine detoxification unique to Acetobacter tropicalis. We find that Acetobacter tropicalis alone, in gnotobiotic animals, is sufficient to rescue increased atrazine toxicity to wild-type, conventionally reared levels. This work points toward the derivation of biotic strategies to improve host resilience to environmental chemical exposures, and illustrates the power of integrative omics to identify pathways responsible for adverse health outcomes.
Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of transcription start sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probe these unassigned TSSs and show that, in all species studied, a significant fraction of CAGE peaks initiate at microsatellites, also called short tandem repeats (STRs). To confirm this transcription, we develop Cap Trap RNA-seq, a technology which combines cap trapping and long read MinION sequencing. We train sequence-based deep learning models able to predict CAGE signal at STRs with high accuracy. These models unveil the importance of STR surrounding sequences not only to distinguish STR classes, but also to predict the level of transcription initiation. Importantly, genetic variants linked to human diseases are preferentially found at STRs with high transcription initiation level, supporting the biological and clinical relevance of transcription initiation at STRs. Together, our results extend the repertoire of non-coding transcription associated with DNA tandem repeats and complexify STR polymorphism.
Abstract

Background: Rapid, reagent-free, pathogen-agnostic diagnostics that can be performed at the point of need are vital for preparedness against future outbreaks. Yet many current strategies (polymerase chain reaction, lateral flow immunoassays) are pathogen-specific and require reagents, whereas others, such as sequencing-based methods, are agnostic but not (as yet) suited to use at the point of need. Herein, we present hyperspectral sensing as an opportunity to overcome these barriers, realizing truly agnostic reagent-free diagnostics. This approach can identify both pathogen and host signatures in complex clinical samples, without complex logistical considerations. The spectral signature of biomolecules across multiple wavelength regimes provides rich biochemical information, which, coupled with machine learning, can facilitate expedited diagnosis of disease states, the feasibility of which is demonstrated here.

Innovation: First, we present ProSpectral™ V1, a novel, miniaturized (∼8 lbs) hyperspectral platform with ultra-high spectral resolution (2-5 nm full width at half maximum, FWHM) that incorporates two mini-spectrometers (visual and near-infrared). This engineering innovation has enabled reagent-free biosensing for the first time. To enable expedient outcomes, we developed state-of-the-art machine learning algorithms for near real-time analysis of multi-wavelength spectral signatures in complex samples. Taken together, these innovations enable near-field ready, reagent-free, expedient agnostic diagnostics in complex clinical samples. Herein, we demonstrate the feasibility of combining ProSpectral™ V1 with machine learning to accurately identify SARS-CoV-2 infection status in double-blinded saliva samples in real time (3 seconds/measurement). The infection status of the samples was validated with the CDC-approved polymerase chain reaction (PCR) test. We report accuracies comparable to first-in-class PCR tests. Further, we provide preliminary support that this signal is specific to SARS-CoV-2 and not associated with other respiratory conditions.

Interpretation: Preparedness against unanticipated pathogens and democratization of diagnostics require moving away from technologies that demand specific reagents and relying instead on intrinsic biochemical properties that can, theoretically, inform on all pathologies. Integration of hyperspectral sensors and in-line machine learning analytics, as reported here, shows the feasibility of such diagnostics. If realized to its full potential, the ProSpectral™ V1 platform can enable agnostic diagnostics, thereby improving situational awareness and decision-making at the point of need, especially in resource-limited settings, and enabling the distribution of newly developed tests for emerging pathogens with only a simple software update.

Funding: The U.S. Department of Energy, the Defense Threat Reduction Agency, Lawrence Berkeley National Laboratory, Los Alamos National Laboratory, and Pattern Computer Inc.

Research in context

Evidence before this study: Our inability to quickly and effectively deploy and use reliable diagnostics at the point of need is a major limitation in our arsenal against infectious diseases. We searched PubMed and Google Scholar for articles published in English before May 2024 that apply hyperspectral sensing technologies to pathogen detection, using the terms “hyperspectral”, “pathogens”, and “COVID-19”. Various factors such as speed, sensitivity, availability of reagents, deployability, requirements (expertise, resources), and others determine our choice of diagnostic. Today, diagnosis of infection remains largely pathogen-specific, requiring ligands specific to the target of interest. Indeed, polymerase chain reaction (PCR)-based methods, the gold-standard technology to diagnose COVID-19, are pathogen-specific and have to be re-evaluated with the emergence of new variants. Lateral flow immunoassays, while readily deployable, are associated with lower sensitivity and specificity, and require the development of ligands, which can be time-consuming when addressing unanticipated or new threats. Select pathogen-agnostic methods such as sequencing are evolving and becoming more feasible, but still require sample processing, reagents, cold-chain, and expert handlers, and hence are not (as yet) available for routine point-of-care use. In contrast, the characterization of biochemical signatures across multiple spectral regimes (hyperspectral) can facilitate reagent-free agnostic diagnostics. Yet many spectroscopic methods are limited to narrow wavelength ranges, are too large for use in the point-of-care setting, or require complex and time-consuming analytics.

Added value of this study: This manuscript presents a paradigm-shifting miniaturized hyperspectral sensor with embedded machine learning-enabled analytics that can overcome the above limitations, making reagent-free agnostic diagnostics achievable. To our knowledge, this establishes the fastest hyperspectral diagnostic platform (3 seconds/measurement), with no preprocessing and in a small form factor, and executable with liquid (clinical) samples, without ligands or reagents. Our data demonstrate that the sensitivity of this assay is comparable to gold-standard PCR-based assays, and that the signatures are specific to COVID-19 and not associated with influenza and other respiratory pathogens, establishing the truly agnostic nature of the platform. The sensor consists of two embedded spectrometers that together span 400-1700 nm, a range that covers spectral patterns associated with relevant biological moieties. With appropriate data processing, we demonstrate balanced accuracies between 0.97 and 1.0 under 10-fold cross-validation (depending on the ML/AI algorithm used for prediction).
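For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a balanced accuracy (the mean of per-class recall) can be computed under stratified 10-fold cross-validation. It is not the authors' pipeline: the nearest-centroid classifier, the function names, and the synthetic three-channel "spectra" are all assumptions chosen only to keep the example self-contained.

```python
import random


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; for two classes, (sensitivity + specificity) / 2."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, label in enumerate(y_true) if label == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


def stratified_folds(y, k=10, seed=0):
    """Split sample indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for c in sorted(set(y)):
        idx = [i for i, label in enumerate(y) if label == c]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal each class round-robin across folds
    return folds


def fit_nearest_centroid(X_train, y_train):
    """Hypothetical stand-in classifier: predict the class with the closest mean vector."""
    centroids = {}
    for c in set(y_train):
        rows = [x for x, label in zip(X_train, y_train) if label == c]
        centroids[c] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return predict


def cross_validated_balanced_accuracy(X, y, fit, k=10, seed=0):
    """Train on k-1 folds, score the held-out fold, and return one score per fold."""
    scores = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        y_pred = [predict(X[i]) for i in test_idx]
        scores.append(balanced_accuracy([y[i] for i in test_idx], y_pred))
    return scores


# Synthetic, well-separated data (NOT real spectra): 20 samples per class.
_rng = random.Random(0)
labels = [0] * 20 + [1] * 20
samples = [[5.0 * label + _rng.uniform(-0.5, 0.5) for _ in range(3)] for label in labels]
fold_scores = cross_validated_balanced_accuracy(samples, labels, fit_nearest_centroid)
print(min(fold_scores), max(fold_scores))
```

Balanced accuracy is often preferred over plain accuracy when infected and uninfected samples are unequally represented, because each class contributes equally to the score regardless of its size.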
Implications of all the available evidence: With the optimization of algorithms and analytical methods and the development of appropriate spectral databases, the ProSpectral™ hyperspectral diagnostics platform can be a flexible tool for rapid, reagent-free pathogen-agnostic detection/diagnosis of disease at the point of need, which can be a disruptive force in our preparedness to counter emerging diseases and threats.