Spliced leader RNA trans-splicing in dinoflagellates
Huan Zhang, Yubo Hou, Lilibeth Miranda et al.|Proceedings of the National Academy of Sciences|2007
Through the analysis of hundreds of full-length cDNAs from fifteen species representing all major orders of dinoflagellates, we demonstrate that nuclear-encoded mRNAs in all species, from ancestral to derived lineages, are trans-spliced with the addition of the 22-nt conserved spliced leader (SL), DCCGUAGCCAUUUUGGCUCAAG (D = U, A, or G), to the 5' end. SL trans-splicing has been documented in a limited but diverse number of eukaryotes, in which this process makes it possible to translate polycistronically transcribed nuclear genes. In SL trans-splicing, SL-donor transcripts (SL RNAs) contain two functional domains: an exon that provides the SL for mRNA and an intron that contains a spliceosomal (Sm) binding site. In dinoflagellates, SL RNAs are unusually short at 50-60 nt, with a conserved Sm binding motif (AUUUUGG) located in the SL (exon) rather than the intron. The initiation nucleotide is predominantly U or A, an unusual feature that may affect capping, and hence the translation and stability of the recipient mRNA. The core SL element was found in mRNAs coding for a diverse array of proteins. Among the transcripts characterized were three homologs of Sm-complex subunits, indicating that the role of the Sm binding site is conserved, even if the location on the SL is not. Because association with an Sm-complex often signals nuclear import for U-rich small nuclear RNAs, it is unclear how this Sm binding site remains on mature mRNAs without impeding cytosolic localization or translation of the latter. The sequences reported in this paper have been deposited in the GenBank database (accession nos. AF512889, DQ864761-DQ864971, DQ867053-DQ867070, DQ884413-DQ884451, EF133854-EF133905, EF133961-EF134003, EF134083-EF134402, EF141835, and EF143070-EF143105).
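As an illustration of the SL consensus quoted in this abstract, the following minimal Python sketch searches a cDNA sequence for the 22-nt leader, treating D as the IUPAC code for U (T in DNA), A, or G. The function name and the example cDNA are hypothetical and are not taken from the paper; the downstream bases in the example are invented.

```python
import re

# 22-nt dinoflagellate spliced leader from the abstract, written as DNA (U -> T).
# The leading D is the IUPAC degenerate code for A, G, or T.
SL_PATTERN = re.compile(r"[AGT]CCGTAGCCATTTTGGCTCAAG")

# Sm binding motif AUUUUGG from the abstract, written as DNA.
SM_MOTIF = "ATTTTGG"


def find_spliced_leader(cdna: str):
    """Return (start, matched SL) for the first SL hit in a cDNA, or None."""
    sequence = cdna.upper().replace("U", "T")  # accept RNA or DNA input
    match = SL_PATTERN.search(sequence)
    if match is None:
        return None
    return match.start(), match.group()


# Hypothetical cDNA: the SL followed by an invented downstream sequence.
example_cdna = "TCCGTAGCCATTTTGGCTCAAG" + "ATGGCTAAGGTT"
hit = find_spliced_leader(example_cdna)
if hit is not None:
    start, leader = hit
    print(f"SL found at position {start}: {leader}")
    print(f"Sm motif offset within SL: {leader.index(SM_MOTIF)}")
```

A plain regular expression is enough here because only the first position is degenerate; a search that tolerates mismatches would need a different approach.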
Serious Overestimation in Quantitative PCR by Circular (Supercoiled) Plasmid Standard: Microalgal pcna as the Model Gene
Quantitative real-time PCR (qPCR) has become a gold standard for the quantification of nucleic acids and microorganism abundances, in which plasmid DNA carrying the target gene is most commonly used as the standard. A recent study showed that the supercoiled circular conformation of DNA appeared to suppress PCR amplification. However, the extent to which different structural types of DNA (circular versus linear) used as the standard may affect the quantification accuracy has not been evaluated. In this study, we quantitatively compared qPCR accuracies based on circular plasmid (mostly in supercoiled form) and linear DNA standards (linearized plasmid DNA or PCR amplicons), using the proliferating cell nuclear antigen gene (pcna), a ubiquitous eukaryotic gene, in five marine microalgae as a model gene. We observed that PCR using circular plasmids as template gave threshold cycle numbers 2.65-4.38 cycles higher than did equimolar linear standards. While the documented genome sequence of the diatom Thalassiosira pseudonana shows a single copy of pcna, qPCR using the circular plasmid as standard yielded an estimate of 7.77 copies of pcna per genome whereas that using the linear standard gave 1.02 copies per genome. We conclude that circular plasmid DNA is unsuitable as a standard, and linear DNA should be used instead, in absolute qPCR. The serious overestimation by the circular plasmid standard is likely due to the undetected lower efficiency of its amplification in the early stage of PCR when the supercoiled plasmid is the dominant template.
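To see how a delayed threshold cycle in the standard turns into an overestimate, the sketch below converts a Ct shift into a fold error. It assumes the standard curve is shifted in parallel and that amplification is perfect doubling (efficiency = 1.0) unless stated otherwise; neither assumption comes from the abstract, and the function name is hypothetical.

```python
import math


def fold_overestimate(delta_ct: float, efficiency: float = 1.0) -> float:
    """Fold error implied by a Ct shift of the standard.

    efficiency = 1.0 means perfect doubling per cycle (an assumption here).
    """
    return (1.0 + efficiency) ** delta_ct


# Ct shifts reported in the abstract for circular versus linear standards.
low = fold_overestimate(2.65)
high = fold_overestimate(4.38)

# Copy-number estimates reported for Thalassiosira pseudonana pcna.
observed_ratio = 7.77 / 1.02
implied_delta_ct = math.log2(observed_ratio)  # Ct shift matching that ratio

print(f"Fold error for a 2.65-cycle shift: {low:.2f}")
print(f"Fold error for a 4.38-cycle shift: {high:.2f}")
print(f"Observed circular/linear ratio: {observed_ratio:.2f}")
print(f"Ct shift implied by that ratio: {implied_delta_ct:.2f}")
```

Under these assumptions, the ratio of the two copy-number estimates corresponds to a Ct shift that falls inside the reported 2.65-4.38 range.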
Distinct Gene Number-Genome Size Relationships for Eukaryotes and Non-Eukaryotes: Gene Content Estimation for Dinoflagellate Genomes
The ability to predict gene content is highly desirable for characterization of not-yet-sequenced genomes like those of dinoflagellates. Using data from completely sequenced and annotated genomes from phylogenetically diverse lineages, we investigated the relationship between gene content and genome size using regression analyses. Distinct relationships between log10-transformed protein-coding gene number (Y') and log10-transformed genome size (X', genome size in kbp) were found for eukaryotes and non-eukaryotes. Eukaryotes best fit a logarithmic model, Y' = ln(-46.200 + 22.678X'), whereas non-eukaryotes best fit a linear model, Y' = 0.045 + 0.977X', both with high significance (p < 0.001, R^2 > 0.91). Total gene number shows similar trends in both groups to their respective protein-coding regressions. The distinct correlations reflect lower and decreasing gene-coding percentages as genome size increases in eukaryotes (82%-1%) compared to higher and relatively stable percentages in prokaryotes and viruses (97%-47%). The eukaryotic regression models project that the smallest dinoflagellate genome (3 x 10^6 kbp) contains 38,188 protein-coding (40,086 total) genes and the largest (245 x 10^6 kbp) 87,688 protein-coding (92,013 total) genes, corresponding to 1.8% and 0.05% gene-coding percentages. These estimates likely do not represent extraordinarily high functional diversity of the encoded proteome but rather highly redundant genomes, as evidenced by high gene copy numbers documented for various dinoflagellate species.
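The two regressions in this abstract can be evaluated directly. The sketch below is a minimal Python version that uses the rounded coefficients printed in the abstract, so its output should land near, but not exactly on, the projections quoted there. The function names and the 4,600-kbp example genome are hypothetical.

```python
import math


def eukaryote_protein_genes(genome_kbp: float) -> float:
    """Logarithmic model: Y' = ln(-46.200 + 22.678 X'), X' = log10(size in kbp)."""
    x = math.log10(genome_kbp)
    argument = -46.200 + 22.678 * x
    if argument <= 0:
        raise ValueError("genome too small for the logarithmic model")
    y = math.log(argument)  # natural logarithm, as written in the abstract
    return 10 ** y          # back-transform from log10(gene number)


def non_eukaryote_protein_genes(genome_kbp: float) -> float:
    """Linear model: Y' = 0.045 + 0.977 X', X' = log10(size in kbp)."""
    x = math.log10(genome_kbp)
    return 10 ** (0.045 + 0.977 * x)


# Smallest and largest dinoflagellate genome sizes quoted in the abstract.
smallest = eukaryote_protein_genes(3e6)
largest = eukaryote_protein_genes(245e6)

# Hypothetical bacterial-sized genome, chosen only for illustration.
bacterial = non_eukaryote_protein_genes(4600)

print(f"Smallest dinoflagellate genome: about {smallest:,.0f} protein-coding genes")
print(f"Largest dinoflagellate genome: about {largest:,.0f} protein-coding genes")
print(f"4,600-kbp non-eukaryote genome: about {bacterial:,.0f} protein-coding genes")
```

The logarithmic model is only defined where -46.200 + 22.678X' is positive, which is why the eukaryote function rejects very small genome sizes.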
High-Level Diversity of Dinoflagellates in the Natural Environment, Revealed by Assessment of Mitochondrial cox1 and cob Genes for Dinoflagellate DNA Barcoding
Senjie Lin, Huan Zhang, Yubo Hou et al.|Applied and Environmental Microbiology|2008
DNA barcoding is a diagnostic technique for species identification using a short, standardized DNA fragment. An effective DNA barcoding marker would be very helpful for unraveling the poorly understood species diversity of dinoflagellates in the natural environment. In this study, the potential utility for DNA barcoding of mitochondrial cytochrome c oxidase 1 (cox1) and cytochrome b (cob) was assessed. Among several primer sets examined, the one amplifying a 385-bp cob fragment was most effective for dinoflagellates. This short cob fragment is easy to sequence and yet possesses reasonable taxon resolution. While the lack of a uniform gap between interspecific and intraspecific distances poses difficulties in establishing a phylum-wide species-discriminating distance threshold, the variability of cob allows recognition of species within particular lineages. The potential of this cob fragment as a dinoflagellate species marker was further tested by applying it to an analysis of the dinoflagellate assemblages in Long Island Sound (LIS) and Mirror Lake in Connecticut. In LIS, a highly diverse assemblage of dinoflagellates was detected. Some taxa can be identified to the species level and some to the genus level, including a taxon distinctly related to the bipolar species Polarella glacialis, whereas a large number of others cannot be clearly identified because of the inadequate database. In Mirror Lake, a Ceratium species and an unresolved taxon were detected, exhibiting a temporal transition from one to the other. We demonstrate that this 385-bp cob fragment is promising for lineage-wise dinoflagellate species identification, given an adequate database.
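The gap between interspecific and intraspecific distances mentioned in this abstract can be illustrated with a small Python sketch. The abstract does not say which distance measure was used, so the uncorrected p-distance here is an assumption, and the species labels and 10-bp sequences are invented toy data rather than real cob fragments.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            differences += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def barcoding_gap(records):
    """Return (largest intraspecific, smallest interspecific) distance.

    records is a list of (species_label, aligned_sequence) pairs.
    Either value is None if no pair of that kind exists.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return (max(intra) if intra else None, min(inter) if inter else None)


# Invented toy alignment, for illustration only.
toy_records = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACGTTC"),
]
max_intra, min_inter = barcoding_gap(toy_records)
print(f"Largest intraspecific distance: {max_intra}")
print(f"Smallest interspecific distance: {min_inter}")
```

A usable species threshold exists only when the smallest interspecific distance exceeds the largest intraspecific one; the abstract reports that this does not hold uniformly across the phylum.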
Development of a Dinoflagellate-Oriented PCR Primer Set Leads to Detection of Picoplanktonic Dinoflagellates from Long Island Sound
Senjie Lin, Huan Zhang, Yubo Hou et al.|Applied and Environmental Microbiology|2006
We developed dinoflagellate-specific 18S rRNA gene primers. PCR amplification using these oligonucleotides for a picoplanktonic DNA sample from Long Island Sound yielded 24 clones, and all but one of these clones were dinoflagellates primarily belonging to undescribed and Amoebophrya-like lineages. These results highlight the need for a systematic investigation of picodinoflagellate diversity in both coastal and oceanic ecosystems.