Emma J. Cooke

Innovate UK

ORCID: 0000-0002-7894-8112

Publishes on Gene expression and cancer classification, Genomics and Phylogenetic Studies, Plant Molecular Biology Research. 9 papers and 933 citations.

9 Publications
933 Total Citations

Top publications by citations

Arabidopsis Defense against Botrytis cinerea: Chronology and Regulation Deciphered by High-Resolution Temporal Transcriptomic Analysis
Cited by 355 | Open Access

Transcriptional reprogramming forms a major part of a plant's response to pathogen infection. Many individual components and pathways operating during plant defense have been identified, but our knowledge of how these different components interact is still rudimentary. We generated a high-resolution time series of gene expression profiles from a single Arabidopsis thaliana leaf during infection by the necrotrophic fungal pathogen Botrytis cinerea. Approximately one-third of the Arabidopsis genome is differentially expressed during the first 48 h after infection, with the majority of changes in gene expression occurring before significant lesion development. We used computational tools to obtain a detailed chronology of the defense response against B. cinerea, highlighting the times at which signaling and metabolic processes change, and identify transcription factor families operating at different times after infection. Motif enrichment and network inference predicted regulatory interactions, and testing of one such prediction identified a role for TGA3 in defense against necrotrophic pathogens. These data provide an unprecedented level of detail about transcriptional changes during a defense response and are suited to systems biology analyses to generate predictive models of the gene regulatory networks mediating the Arabidopsis response to B. cinerea.

A local regulatory network around three NAC transcription factors in stress responses and senescence in Arabidopsis leaves
Richard Hickman, Claire Hill, Christopher A. Penfold et al. | The Plant Journal | 2013
Cited by 210 | Open Access

A model is presented describing the gene regulatory network surrounding three similar NAC transcription factors that have roles in Arabidopsis leaf senescence and stress responses. ANAC019, ANAC055 and ANAC072 belong to the same clade of NAC domain genes and have overlapping expression patterns. A combination of promoter DNA/protein interactions identified using yeast 1-hybrid analysis and modelling using gene expression time course data has been applied to predict the regulatory network upstream of these genes. Similarities and divergence in regulation during a variety of stress responses are predicted by different combinations of upstream transcription factors binding and also by the modelling. Mutant analysis with potential upstream genes was used to test and confirm some of the predicted interactions. Gene expression analysis in mutants of ANAC019 and ANAC055 at different times during leaf senescence has revealed a distinctly different role for each of these genes. Yeast 1-hybrid analysis is shown to be a valuable tool that can distinguish clades of binding proteins and be used to test and quantify protein binding to predicted promoter motifs.

Rfam 15: RNA families database in 2025
Nancy Ontiveros‐Palacios, Emma J. Cooke, Eric P. Nawrocki et al. | Nucleic Acids Research | 2024
Cited by 151 | Open Access

The Rfam database, a widely used repository of non-coding RNA families, has undergone significant updates in release 15.0. This paper introduces major improvements, including the expansion of Rfamseq to 26 106 genomes, a 76% increase, incorporating the latest UniProt reference proteomes and additional viral genomes. Sixty-five RNA families were enhanced using experimentally determined 3D structures, improving the accuracy of consensus secondary structures and annotations. R-scape covariation analysis was used to refine structural predictions in 26 families. Gene Ontology (GO) and Sequence Ontology annotations were comprehensively updated, increasing GO term coverage to 75% of families. The release adds 14 new Hepatitis C Virus RNA families and completes microRNA family synchronization with miRBase, resulting in 1603 microRNA families. New data types, including FULL alignments, have been implemented. Integration with APICURON for improved curator attribution and multiple website enhancements further improve user experience. These updates significantly expand Rfam's coverage and improve annotation quality, reinforcing its critical role in RNA research, genome annotation and the development of machine learning models. Rfam is freely available at https://rfam.org.

A Robust Bayesian Two-Sample Test for Detecting Intervals of Differential Gene Expression in Microarray Time Series
Oliver Stegle, Katherine Denby, Emma J. Cooke et al. | Journal of Computational Biology | 2010
Cited by 100

Understanding the regulatory mechanisms that are responsible for an organism's response to environmental change is an important issue in molecular biology. A first and important step towards this goal is to detect genes whose expression levels are affected by altered external conditions. A range of methods to test for differential gene expression, both in static as well as in time-course experiments, have been proposed. While these tests answer the question whether a gene is differentially expressed, they do not explicitly address the question when a gene is differentially expressed, although this information may provide insights into the course and causal structure of regulatory programs. In this article, we propose a two-sample test for identifying intervals of differential gene expression in microarray time series. Our approach is based on Gaussian process regression, can deal with arbitrary numbers of replicates, and is robust with respect to outliers. We apply our algorithm to study the response of Arabidopsis thaliana genes to an infection by a fungal pathogen using a microarray time series dataset covering 30,336 gene probes at 24 observed time points. In classification experiments, our test compares favorably with existing methods and provides additional insights into time-dependent differential expression.

Bayesian hierarchical clustering for microarray time series data with replicates and outlier measurements
Emma J. Cooke, Richard S. Savage, Paul Kirk et al. | BMC Bioinformatics | 2011
Cited by 71 | Open Access

BACKGROUND: Post-genomic molecular biology has resulted in an explosion of data, providing measurements for large numbers of genes, proteins and metabolites. Time series experiments have become increasingly common, necessitating the development of novel analysis tools that capture the resulting data structure. Outlier measurements at one or more time points present a significant challenge, while potentially valuable replicate information is often ignored by existing techniques. RESULTS: We present a generative model-based Bayesian hierarchical clustering algorithm for microarray time series that employs Gaussian process regression to capture the structure of the data. By using a mixture model likelihood, our method permits a small proportion of the data to be modelled as outlier measurements, and adopts an empirical Bayes approach which uses replicate observations to inform a prior distribution of the noise variance. The method automatically learns the optimum number of clusters and can incorporate non-uniformly sampled time points. Using a wide variety of experimental data sets, we show that our algorithm consistently yields higher quality and more biologically meaningful clusters than current state-of-the-art methodologies. We highlight the importance of modelling outlier values by demonstrating that noisy genes can be grouped with other genes of similar biological function. We demonstrate the importance of including replicate information, which we find enables the discrimination of additional distinct expression profiles. CONCLUSIONS: By incorporating outlier measurements and replicate values, this clustering algorithm for time series microarray data provides a step towards a better treatment of the noise inherent in measurements from high-throughput genomic technologies. 
Timeseries BHC is available as part of the R package 'BHC' (version 1.5), which is available for download from Bioconductor (version 2.9 and above) via http://www.bioconductor.org/packages/release/bioc/html/BHC.html?pagewanted=all.