An analysis of 5'-noncoding sequences from 699 vertebrate messenger RNAs
Marilyn Kozak | Nucleic Acids Research | 1987
5'-Noncoding sequences have been compiled from 699 vertebrate mRNAs. (GCC)GCC[A/G]CCATGG emerges as the consensus sequence for initiation of translation in vertebrates. The most highly conserved position in that motif is the purine in position -3 (three nucleotides upstream from the ATG codon); 97% of vertebrate mRNAs have a purine, most often A, in that position. The periodical occurrence of G (in positions -3, -6, -9) is discussed. Upstream ATG codons occur in fewer than 10% of vertebrate mRNAs-at-large; a notable exception are oncogene transcripts, two-thirds of which have ATG codons preceding the start of the major open reading frame. The leader sequences of most vertebrate mRNAs fall in the size range of 20 to 100 nucleotides. The significance of shorter and longer 5'-noncoding sequences is discussed.
Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes
Compilation and analysis of sequences upstream from the translational start site in eukaryotic mRNAs
Marilyn Kozak | Nucleic Acids Research | 1984
5'-Noncoding sequences have been tabulated for 211 messenger RNAs from higher eukaryotic cells. The 5'-proximal AUG triplet serves as the initiator codon in 95% of the mRNAs examined. The most conspicuous conserved feature is the presence of a purine (most often A) three nucleotides upstream from the AUG initiator codon; only 6 of the mRNAs in the survey have a pyrimidine in that position. There is a predominance of C in positions -1, -2, -4 and -5, just upstream from the initiator codon. The sequence CC[A/G]CCAUG(G) thus emerges as a consensus sequence for eukaryotic initiation sites. The extent to which the ribosome binding site in a given mRNA matches the -1 to -5 consensus sequence varies: more than half of the mRNAs in the tabulation have 3 or 4 nucleotides in common with the CCACC consensus, but only ten mRNAs conform perfectly.
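The match count described in this abstract (how many of positions -5 to -1 agree with CCACC) can be illustrated with a minimal Python sketch. The function name `upstream_matches`, its arguments, and the choice to score against the literal string CCACC are assumptions made for illustration; they are not taken from the paper.

```python
# Hypothetical helper, not from the paper: count agreement with the
# CCACC consensus at positions -5..-1 relative to the AUG codon.
CONSENSUS = "CCACC"  # positions -5, -4, -3, -2, -1


def upstream_matches(mrna: str, aug_index: int) -> int:
    """Return how many of positions -5..-1 match CCACC.

    mrna      -- RNA or DNA sequence (T is treated as U)
    aug_index -- 0-based index of the A in the AUG of interest
    """
    seq = mrna.upper().replace("T", "U")
    if aug_index < len(CONSENSUS):
        raise ValueError("need at least 5 nucleotides upstream of the AUG")
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("aug_index does not point at an AUG codon")
    # The five nucleotides immediately upstream of the AUG.
    window = seq[aug_index - len(CONSENSUS):aug_index]
    return sum(a == b for a, b in zip(window, CONSENSUS))
```

Under these assumptions, `upstream_matches("GGCCACCAUGG", 7)` scores a perfect match, while a site with 3 or 4 matches would fall into the majority class the abstract describes.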
Initiation of translation in prokaryotes and eukaryotes
Possible role of flanking nucleotides in recognition of the AUG initiator codon by eukaryotic ribosomes
Marilyn Kozak | Nucleic Acids Research | 1981
Sequences flanking the initiator codon in eukaryotic mRNAs are not random. Out of 153 messages examined, 151 have either a purine in position -3, or a G in position +4, or both. Thus, [A/G]XXAUGG emerges as the favored sequence for eukaryotic initiation sites. Nucleotides flanking nonfunctional AUG triplets, which occur in the 5'-noncoding region of a few eukaryotic messages, are different from those found at most functional sites. Whereas most authentic initiator codons are preceded by a purine (usually A) in position -3, most nonfunctional AUGs have a pyrimidine in that position. The observed asymmetry suggests that purines in positions -3 and +4 might facilitate recognition of the AUG codon during formation of initiation complexes. To test this idea, in vitro binding studies were carried out with 32P-labeled oligonucleotides. Binding of AUG-containing oligonucleotides to wheat germ ribosomes was significantly enhanced by placing a purine in position -3 or +4. The scanning model, which postulates that 40S ribosomal subunits attach at the 5'-end of a message and migrate down to the AUG codon, is discussed in light of these new observations. A modified version of the scanning mechanism is proposed.
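The [A/G]XXAUGG rule in this abstract can be checked for a given AUG with a short Python sketch. It assumes the usual numbering in which the A of AUG is +1 and the nucleotide just upstream is -1 (there is no position 0); the function name `flanking_context` and the returned keys are hypothetical, chosen only for illustration.

```python
# Hypothetical helper, not from the paper: report whether an AUG has a
# purine at -3 and/or a G at +4, the two features named in the abstract.
PURINES = {"A", "G"}


def flanking_context(mrna: str, aug_index: int) -> dict:
    """Describe positions -3 and +4 around the AUG at aug_index (0-based)."""
    seq = mrna.upper().replace("T", "U")
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("aug_index does not point at an AUG codon")
    if aug_index < 3 or aug_index + 3 >= len(seq):
        raise ValueError("need 3 nucleotides upstream and 1 downstream")
    minus3 = seq[aug_index - 3]  # A of AUG is +1, so -3 is three back
    plus4 = seq[aug_index + 3]   # first nucleotide after the AUG
    purine = minus3 in PURINES
    g_plus4 = plus4 == "G"
    return {
        "purine_at_minus3": purine,
        "g_at_plus4": g_plus4,
        # 151 of the 153 surveyed messages had at least one feature.
        "has_either": purine or g_plus4,
    }
```

A site such as ACCAUGG satisfies both conditions, whereas UCCAUGC (pyrimidine at -3, C at +4) satisfies neither, resembling the nonfunctional upstream AUGs the abstract describes.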