Research Article

Non-Coding RNA Prediction and Verification in Saccharomyces cerevisiae

  • Laura A. Kavanaugh,

    Affiliation: Department of Molecular Genetics and Microbiology, Institute for Genome Sciences and Policy, Duke University Medical Center, Durham, North Carolina, United States of America

  • Fred S. Dietrich

    fred.dietrich@duke.edu

    Affiliation: Department of Molecular Genetics and Microbiology, Institute for Genome Sciences and Policy, Duke University Medical Center, Durham, North Carolina, United States of America

  • Published: January 02, 2009
  • DOI: 10.1371/journal.pgen.1000321

Reader Comments (1)


Higher Ordered Structure is a Genome-Wide Proclivity

Posted by forsdyke on 03 Jan 2009 at 20:25 GMT

It is good to see a paper on RNA secondary structure acknowledging (i) the pioneering studies of Maizel and coworkers, and (ii) the mistaken disparagement of (i) by Rivas and Eddy. However, several aspects of this paper are questionable:

1. The authors employ the Rivas-Eddy method, rather than that of Maizel, to calculate Z-scores. The key difference between the two methods is that Maizel compares the folding of a natural sequence with the mean of many shuffled versions of the same sequence (mononucleotide shuffling), whereas Eddy (like Workman and Krogh) shuffles but retains the original dinucleotide composition. As I have elaborated [1], a sequence contributes to structure by virtue of both its sequence and its base composition. By retaining dinucleotide composition, the Eddy method retains a considerable part of the original base order that contributed to the structure. Thus, Eddy method Z-scores discard important folding information. This can lead to false conclusions (i.e. “poor thermodynamic footprints”).

2. The authors claim that “structural thermodynamic stability is an effective tool for predicting ncRNA genes.” Yet, as controls, they employ mainly randomly shuffled sequences (Table 3). Only 1800 bases of the 151161 total bases were not shuffled. To obtain these 1800 bases, six intergenic sequences were somehow selected from the numerous intergenic sequences in the yeast genome. Of these six, one turned out to be a “false positive” in that it produced a Z-score less than -3.5. It appears the authors do not recognize that the proclivity for significant secondary and higher ordered structures is a fundamental feature of biological nucleic acids [2]. When this structure is less developed in a region, it is likely that the sequence has responded to some pressure that interferes with structure development. Thus, structure is the default condition. The authors' discovery of regions with more structure than the controls (where base order has been partially randomized) further documents the universality of structure proclivity.

3. The premise that there is a triad of (i) protein-encoding genes (transcribed as mRNA), (ii) ncRNA genes (transcribed as tRNAs, rRNAs, snoRNAs, etc.) and (iii) something non-genic (non-transcribed), may be false. As elaborated elsewhere [3], all genomic DNA is potentially transcribable. If “genes” are defined as segments of DNA that are transcribed under some circumstance, then the entire genome is genic.
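
To make the comparison in point 1 concrete, here is a minimal Python sketch of a Z-score computed against mononucleotide-shuffled controls: the folding score of the natural sequence minus the mean score of the shuffled versions, divided by their standard deviation. It is an illustration under stated assumptions, not the procedure of the paper or of any of the cited groups. The names `toy_energy`, `mononucleotide_shuffle` and `z_score` are hypothetical, and `toy_energy` is only a stand-in that counts nested base pairs; a real analysis would use a thermodynamic folding program in its place. A dinucleotide-preserving shuffle could be passed in through the `shuffle` parameter to compare the two approaches.

```python
import random
import statistics


def mononucleotide_shuffle(seq: str, rng: random.Random) -> str:
    """Permute the bases: base composition is kept, base order is destroyed."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def toy_energy(seq: str) -> float:
    """ASSUMPTION: a toy stand-in for folding energy, not a thermodynamic model.

    Returns minus the maximum number of nested A-U, G-C or G-U pairs
    (Nussinov-style dynamic programming), requiring at least 3 unpaired
    bases inside every hairpin loop.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0.0
    best = [[0] * n for _ in range(n)]
    for span in range(4, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j is unpaired
            for k in range(i, j - 3):  # case: position j pairs with k
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return -float(best[0][n - 1])


def z_score(seq, energy=toy_energy, shuffle=mononucleotide_shuffle,
            n_shuffles=100, seed=0):
    """(native score - mean of shuffled scores) / std. dev. of shuffled scores."""
    rng = random.Random(seed)
    native = energy(seq)
    shuffled = [energy(shuffle(seq, rng)) for _ in range(n_shuffles)]
    spread = statistics.stdev(shuffled)
    if spread == 0:
        return 0.0  # all controls scored the same; no basis for a Z-score
    return (native - statistics.mean(shuffled)) / spread


if __name__ == "__main__":
    # A short hairpin-like sequence; more negative Z means more structure
    # than expected from base composition alone.
    print(z_score("GGGAAACCC"))
```

Because the shuffle function is a parameter, the same `z_score` call can be repeated with a different control scheme, which isolates the effect of the choice of control on the resulting Z-score.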

References

1. Forsdyke DR (2007) Calculation of folding energies of single-stranded nucleic acid sequences: conceptual issues. J. Theor. Biol. 248:745-753.
2. Forsdyke DR (2006) Evolutionary Bioinformatics. Springer, New York.
3. Forsdyke DR, Madill CA, Smith SD (2002) Immunity as a function of the unicellular state: implications of emerging genomic data. Trends Immunol. 23:575-579.