Prevalence of quadruplexes in the human genome
Guanine-rich DNA sequences of a particular form have the ability to fold into four-stranded structures called G-quadruplexes. In this paper, we present a working rule to predict which primary sequences can form this structure, and describe a search algorithm to identify such sequences in genomic DNA. We count the number of quadruplexes found in the human genome and compare that with the figure predicted by modelling DNA as a Bernoulli stream or as a Markov chain, using windows of various sizes. We demonstrate that the distribution of loop lengths is significantly different from what would be expected in a random case, providing an indication of the number of potentially relevant quadruplex-forming sequences. In particular, we show that there is a significant repression of quadruplexes in the coding strand of exonic regions, which suggests that quadruplex-forming patterns are disfavoured in sequences that will form RNA.
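The abstract above mentions a working rule and a search algorithm without spelling either out. As a rough illustration only, the sketch below assumes a widely used folding rule (four runs of at least three guanines separated by three loops of 1 to 7 bases) and scans both strands with a regular expression. The pattern, the loop limits, and the names `G4_PATTERN`, `reverse_complement`, and `find_g4_motifs` are assumptions for this example, not details taken from the abstract. The test sequence is the c-kit promoter sequence quoted in the Rankin et al. entry of this list.

```python
import re

# Assumed folding rule (not stated in the abstract): four runs of >= 3 guanines
# separated by three loops of 1-7 bases of any kind.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_g4_motifs(seq):
    """Return sorted (start, end, strand) tuples for non-overlapping matches.

    Coordinates are 0-based, end-exclusive, and always refer to the given
    strand; "-" marks a motif whose G-runs lie on the complementary strand.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), m.end(), "+") for m in G4_PATTERN.finditer(seq)]
    # Search the reverse complement and map coordinates back to the input.
    for m in G4_PATTERN.finditer(reverse_complement(seq)):
        hits.append((n - m.end(), n - m.start(), "-"))
    return sorted(hits)


if __name__ == "__main__":
    c_kit = "AGGGAGGGCGCTGGGAGGAGGG"
    print(find_g4_motifs(c_kit))  # [(1, 22, '+')]
```

Because `finditer` reports non-overlapping matches, a long G-rich stretch is counted once per scan rather than once per possible folding, which is one simple way to avoid inflating genome-wide counts; other counting conventions are possible.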
G-quadruplexes in promoters throughout the human genome
Certain G-rich DNA sequences readily form four-stranded structures called G-quadruplexes. These sequence motifs are located in telomeres as a repeated unit, and elsewhere in the genome, where their function is currently unknown. It has been proposed that G-quadruplexes may be directly involved in gene regulation at the level of transcription. In support of this hypothesis, we show that the promoter regions (1 kb upstream of the transcription start site, TSS) of genes are significantly enriched in quadruplex motifs relative to the rest of the genome, with >40% of human gene promoters containing one or more quadruplex motif. Furthermore, these promoter quadruplexes strongly associate with nuclease hypersensitive sites identified throughout the genome via biochemical measurement. Regions of the human genome that are both nuclease hypersensitive and within promoters show a remarkable (230-fold) enrichment of quadruplex elements, compared to the rest of the genome. These quadruplex motifs identified in promoter regions also show an interesting structural bias towards more stable forms. These observations support the proposal that promoter G-quadruplexes are directly involved in the regulation of gene expression.
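The enrichment figures in the abstract above compare motif density inside a region class with motif density in the rest of the genome. A minimal sketch of that ratio follows, assuming density is simply motifs per base pair; the function name `fold_enrichment` and the counts in the usage example are invented for illustration and are not data from the paper.

```python
def fold_enrichment(n_in, bp_in, n_out, bp_out):
    """Ratio of motif density inside a region set to density outside it.

    n_in, n_out: motif counts inside and outside the regions of interest.
    bp_in, bp_out: total base pairs inside and outside those regions.
    """
    if bp_in <= 0 or bp_out <= 0:
        raise ValueError("region sizes must be positive")
    if n_out <= 0:
        raise ValueError("outside count must be positive to form a ratio")
    # (n_in / bp_in) / (n_out / bp_out), written with a single division
    return (n_in * bp_out) / (n_out * bp_in)


if __name__ == "__main__":
    # Toy numbers only: 10 motifs in 1 kb of promoter sequence versus
    # 10 motifs in 10 kb of other sequence.
    print(fold_enrichment(10, 1_000, 10, 10_000))  # 10.0
```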
An RNA G-quadruplex in the 5′ UTR of the NRAS proto-oncogene modulates translation
Sunita Kumari, Anthony Bugaut, J Huppert et al.|Nature Chemical Biology|2007
Putative DNA Quadruplex Formation within the Human c-kit Oncogene
Sarah Rankin, Anthony P. Reszka, J Huppert et al.|Journal of the American Chemical Society|2005
The DNA sequence, d(AGGGAGGGCGCTGGGAGGAGGG), occurs within the promoter region of the c-kit oncogene. We show here, using a combination of NMR, circular dichroism, and melting temperature measurements, that this sequence forms a four-stranded quadruplex structure under physiological conditions. Variations in the sequences that intervene between the guanine tracts have been examined, and surprisingly, none of these modified sequences forms a quadruplex arrangement under these conditions. This suggests that the occurrence of quadruplex-forming sequences within the human and other genomes is less than was hitherto expected. The c-kit quadruplex may be a new target for therapeutic intervention in cancers where there is elevated expression of the c-kit gene.
Four-stranded nucleic acids: structure, function and targeting of G-quadruplexes
J Huppert|Chemical Society Reviews|2008
There are many structures that can be adopted by nucleic acids other than the famous Watson-Crick duplex form. This tutorial review describes the guanine-rich G-quadruplex structure, highlighting the chemical interactions governing its formation, and the topological variants that exist. The methods that are used to study G-quadruplex structures are described, with examples of the information that may be derived from these different methods. Next, the proposed biological functions of G-quadruplexes are discussed, highlighting especially their presence in telomeric regions and gene promoters. G-quadruplex structures are the subject of considerable interest for the development of small-molecule ligands, and are also the targets of a wide variety of natural proteins.