[2005/09/08] RNA Research Uncovers a Previously Ignored Universe of Genetic Information
A slow revolution is occurring in the study of genetic information.
Until recently, the only interesting items in DNA sequences were the genes
– the genetic codes for proteins. Since these usually represented
only a small fraction of an organism’s genome, it was assumed the rest of
the material was “junk DNA” – sequences that were either mutated leftovers
of real genes (pseudogenes), spacers (introns), nonsense strands, or
regions that merely provided structural support for the more important
genes. Indications that something was wrong with this
picture have arisen over the last few years. For one, geneticists
were surprised to count only about 30,000 protein-coding genes in the
human genome; more recent counts have dropped the number to 25,000.
How could such a complex organism as a human being arise from such a small
library of genetic information? Another clue was the mismatch
between messenger RNAs and proteins. Messenger RNA (mRNA) is the
transcript of the DNA template that carries the genetic information
outside the nucleus of eukaryotic cells to a ribosome, where it is
translated into the amino acid language of proteins. Scientists
found that many mRNAs never got that far. Were they simply
disassembled and recycled? A third clue was the discovery of vast
quantities of small RNAs in the cell.
Some were found to apparently regulate the expression of genes; what did
the others do? Additionally, the mystery of introns,
viewed as useless nonsense strands of DNA cut out of genes by spliceosomes,
deepened when some were shown to be remarkably conserved
between primitive and advanced organisms, suggesting they had a
function. Is it possible scientists have vastly underestimated the
amount of information in the cell, like walking into a forest and assuming
the only living things there are the trees? Perhaps a kind of “gene
chauvinism” has masked the reality of a much higher order of
complexity. The cover story of the Sept. 2 issue of
Science, “Mapping RNA Form and Function,” explored this
question. Of the 18 articles about RNA and its functional role in
the cell, here are a few glimpses of the emerging picture that is putting
to rest the old notion – that biological information is comprised only of
genes and proteins.
Parallel universe: Guy Riddihough, in the introductory
article,1 ventured into the “forest of RNA dark matter” and
found a wonderland:
For a long time, RNA has lived in the
shadow of its more famous chemical cousin DNA and of the
proteins that supposedly took over RNA’s functions in
the transition from the ‘RNA world’ to the modern one. The shadow cast has been so deep
that a whole universe (or so it seems) of RNA—predominantly of
the noncoding variety—has remained hidden from view, until
recently.... The discovery that much of the mammalian genome is
transcribed, in some places without gaps (so-called
transcriptional “forests”), shines a bright light on this
embarrassing plenitude: an order of magnitude more transcripts
than genes.... Many of these noncoding RNAs ... are
conserved across species, yet their functions (if any)
are largely unknown.... (Emphasis added in all
quotes.)
As if that were not enough, he noted that “even the
coding and base-pairing capacity of RNA can be altered–by RNA editing, in
which bases in the RNA are changed on the fly.” It appears there
is much more life in the forest than just the trees.
Hidden infrastructure: Matthew W. Vaughn and Rob
Martienssen2 discussed the probability that vast numbers of
small RNAs (sRNA) may be essential for regulation of genes. Some
of these micro-RNAs (miRNAs) and small interfering RNAs (siRNAs) have
already been identified in gene regulation, but many more remain to be
studied. In one plant, 1.5 million sRNAs composed of 75,000 unique
sequences were recently found, suggesting that “many more genes
may be under the control of sRNAs than had been previously
imagined.” These noncoding RNAs, usually 20-something bases
long, keep a bag of tricks up their sleeves:
They can direct cleavage of other
transcripts and can also promote second-strand synthesis by
RNA-dependent RNA polymerase (RdRP), resulting in dsRNAs
[double-stranded RNA]. In addition, siRNAs are implicated in
recruiting heterochromatic modifications that result in
transcriptional silencing.
The authors mentioned
several ways in which these sRNAs had escaped detection due to the
methods used.
Pseudo – Not: Vaughn and Martienssen also noted the
relationship of sRNAs to pseudogenes. Once thought to be mutated
relics of true genes because they often contain premature stop codons,
pseudogenes might be sources for siRNAs that regulate the true genes
they resemble: “they could act transitively on transcripts from
paralogous protein-coding genes by promoting cleavage or interfering
with translation,” they continued. “More than half of the
pseudogene sRNAs matched sequences elsewhere in the genome, indicating
that this may be the case and suggesting a mechanism for coordinated
trans-acting regulation of closely related members of gene
families.”
What Are They There For? Now that we know large numbers
of small RNAs exist, what do they do? John S. Mattick3
suggested that they are not “transcriptional noise,” but rather
“constitute a critical hidden layer of gene regulation in complex
organisms, the understanding of which requires new approaches in
functional genomics.” This will be a big task, he warned. One
study of one such small RNA found it acting as a scaffold for the
assembly of protein complexes and for coordinating nuclear traffic,
helping localize gene products to their correct subcellular
compartments. This one case reveals “a new dimension of
organizational control in cell biology and development,” and
“illustrates the magnitude of the task that is in front of us,
which may be an equal or greater challenge than that we already
face in working out the biochemical function and biological
role of all of the known and predicted proteins and their
isoforms.” Since cataloging the human proteome is the next
daunting task after deciphering the genome, this statement should put
geneticists on notice.
New Glasses Needed: One assumption guiding previous
research was that if a sequence was “evolutionarily conserved” (i.e.,
largely unchanged from primitive to advanced organisms), this indicated
it was probably functional. Mattick cast doubt on that assumption:
“Notably, evolutionary conservation may not be a reliable signature of
functional ncRNAs” [non-coding RNAs]. The conserved ones may act
on many substrates, he noted, but non-conserved ones may have few
substrates and so be freer to vary. Many ncRNAs, Mattick thinks, may be
“evolving quickly” and escaping detection by methods that look for
sequence conservation. Here is another indication that
“junk DNA” actually represents information we haven’t yet decoded:
It is also clear that the majority of
the genomes of animals is indeed transcribed, which
suggests that these genomes are either replete with
largely useless transcription or that these noncoding RNA
sequences are fulfilling a wide range of unexpected functions in
eukaryotic biology. These sequences include introns (Fig.
1), which account for at least 30% of the human genome but have
been largely overlooked because they have been assumed
to be simply degraded after splicing. However, it has
been shown that many miRNAs and all known small nucleolar
RNAs in animals are sourced from introns (of both
protein-coding and noncoding transcripts), and it is simply not
known what proportion of the transcribed introns are
subsequently processed into smaller functional
RNAs. It is possible, and logically plausible, that
these sequences are also a major source of regulatory RNAs in
complex organisms.
That higher animals should
run on “complex genetic programming” should “come as no surprise,” he
concluded. It means, though, that “we may have seriously
misunderstood the nature of genetic programming in the higher
organisms by assuming that most genetic information is expressed
as and transacted by proteins.” Truly we have embarked on a long
road.
Mt. Improbable Looms Higher: Jean-Michel
Claverie4 echoed Mattick’s estimation of the task, saying it
is “only recently that the sheer scale of the phenomenon” of functional
non-coding RNA has been realized. He pointed to research on the
mouse genome showing that half its “transcriptome” (the corpus of RNA
transcribed from DNA) consists of non-coding RNA (ncRNA). He found
a eureka moment: “These results provide a solution to the discrepancy
between the number of (protein-coding) genes and the number of
transcripts,” he wrote. Missing them has been an artifact of our
methods. “Noncoding transcripts originating from intergenic
regions, introns, or antisense strands have probably been right before
our eyes for 8 years without having been discovered!”
Prokaryotes Say Me, Too: Claverie doubted that the
discovery of functional ncRNA is limited to eukaryotes: “The notion that
transcription is limited to protein-coding genes is also being
challenged in microbial systems.” He pointed to E. coli
which contains many transcripts from intergenic sequences and antisense
strands (i.e., transcribed from the opposite strand of DNA). His
ending paragraph should humble Watson and Crick, who thought they had it
figured out 50 years ago:
The intergenic, intronic, and
antisense transcribed sequences that were once deemed
artifactual are now a testimony to our collective refusal to
depart from an oversimplified gene model. But what if
transcription is even more complex? Could it, for instance,
lead to mRNAs generated from two different chromosomes (Fig.
1)? A year ago, we would have immediately
suspected such sequences as further artifacts arising from
large-scale cDNA [complementary DNA, a DNA copy synthesized from an
mRNA template] sequencing programs. But now? Perhaps it’s
time to go back to the cDNA sequence databases and reevaluate the
numerous unexpected objects they contain. Transcription will
never be simple again, but how complex will it get?
The Life and Times of mRNA: Melissa Moore5
provided a more whimsical view of the actors in the genetic play.
Dismissing the simplistic “short obituary” of RNAs as simply “central
conduits in the flow of information from DNA to protein,” she
wrote, “this dry and simplistic description captures nothing of
the intricacies, intrigues, and vicissitudes defining the life
history of even the most mundane mRNA. In addition, of course,
some mRNAs lead lives that, if not quite meriting an unauthorized
biography, certainly have enough twists and turns to warrant a more
detailed nucleic acid interest story.” She offered a précis
for her novel, giving us a glimpse into the frenzy of activity in the
life of mRNA:
We will follow the lives of eukaryotic
mRNAs from the point at which they are birthed from the nucleus
until they are done in by gangs of exonucleases lying in wait
in dark recesses of the cytoplasm. Along the way, mRNAs may be
shuttled to and from or anchored at specific subcellular
locations, be temporarily withheld from the translation
apparatus, have their 3' ends trimmed and extended,
fraternize with like-minded mRNAs encoding proteins of related
function, and be scrutinized by the quality-control police.
It turns out the mRNA is not just a carrier of
information, but a “posttranscriptional operon” with many roles in the
cell. For instance, some RNAs bind with proteins to form messenger
ribonucleoprotein particles (mRNPs): “Individual mRNP components can be
thought of as adaptors that allow mRNAs to interface with
the numerous intracellular machineries mediating their
subcellular localization, translation, and decay, as well as the
various signal transduction systems.” For a sampler, Moore
listed a “cheat sheet” of 11 such mRNPs and their functions. Her
article gave some up-close-and-personal vignettes of some of the
players, personifying their birth, baptism (entry into the
“transcriptionally active pool”), examination, recruitment, retirement,
dispatch and burial.
Space does not permit delving into the
other 13 articles that describe such things as RNA’s role in the ribosome,
how RNA is recycled, and other interesting topics.6 These
samples should suffice to show that the information content of the cell
has probably been vastly oversimplified before now. Remarkably, some
researchers are looking at this new universe of RNA regulation and seeing
an evolutionary path leading back into the fog of prehistory. Since
the leading origin-of-life theory is the so-called “RNA World” scenario,
some are speculating about whether today’s small RNAs are relics of a lost
world in which early RNAs shared the roles of genetic storage and
catalysis. Readers are referred to earlier entries on RNA and the
origin of life for further study.
Addendum: Genes themselves, too, may
contain much more information than previously realized. Several
articles recently hinted at how genetic information could vastly outstrip
the mere gene count. One mechanism of compressing information on DNA
is alternative splicing: the spliceosome, after removing the introns,
apparently can rearrange the exons into multiple products in some cases,
something like the way kids take Lego blocks and make a variety of
machines out of them. Another possibility for information storage is
the overlooked opposite DNA strand, or “antisense” strand. Even
though it represents a “photographic negative” of the normal strand, it
can apparently be transcribed as well, yielding additional, different
products. These and other mechanisms, such as frame-shifted
transcription, the histone code, or the ability of mRNAs to join
transcripts from different chromosomes, suggest that the information coded
in genes is just the tip of a very large info-berg.
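Two of the mechanisms in this addendum can be made concrete with a small Python sketch. It is only an illustration under simplifying assumptions, not a model of what the cell actually does: the function names and the toy sequences are hypothetical, the antisense strand is treated as a plain reverse complement, and alternative splicing is reduced to skipping internal exons while keeping their order.

```python
from itertools import combinations

# Watson-Crick pairing table for DNA (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the opposite ("antisense") strand, read in its own 5'->3' direction."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def exon_skipping_isoforms(exons: list[str]) -> list[str]:
    """List the transcripts obtainable by skipping internal exons.

    Assumption: the first and last exons are always kept and exon order
    is preserved. Real splicing has more modes than this.
    """
    first, *internal, last = exons
    isoforms = []
    for kept_count in range(len(internal) + 1):
        for kept in combinations(internal, kept_count):
            isoforms.append("".join([first, *kept, last]))
    return isoforms


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    for isoform in exon_skipping_isoforms(["AA", "CC", "GG", "TT"]):
        print(isoform)
```

Even under these restrictions, a gene with n internal exons allows 2^n distinct products, which gives a rough sense of how the information content could outstrip the gene count.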
1. Guy Riddihough, “In the Forests of RNA Dark Matter,” Science, Vol 309, Issue 5740, 1507, 2 September 2005, [DOI: 10.1126/science.309.5740.1507].
2. Vaughn and Martienssen, “It’s a Small RNA World, After All,” Science, Vol 309, Issue 5740, 1525-1526, 2 September 2005, [DOI: 10.1126/science.1117805].
3. John S. Mattick, “The Functional Genomics of Noncoding RNA,” Science, Vol 309, Issue 5740, 1527-1528, 2 September 2005, [DOI: 10.1126/science.1117806].
4. Jean-Michel Claverie, “Fewer Genes, More Noncoding RNA,” Science, Vol 309, Issue 5740, 1529-1530, 2 September 2005, [DOI: 10.1126/science.1116800].
5. Melissa J. Moore, “From Birth to Death: The Complex Lives of Eukaryotic mRNAs,” Science, Vol 309, Issue 5740, 1514-1518, 2 September 2005, [DOI: 10.1126/science.1111443].
6. For popular reports on these subjects, see EurekAlert #1, EurekAlert #2, EurekAlert #3 (the “software of life”), and a press release from U of Delaware.
Which theory – intelligent design or
Darwinism – would have predicted this complexity? Is there any
hint of an evolutionary sequence leading up to this highly-coordinated,
quality-controlled, information-rich system? (Recall that RNA does not form readily in water, and is highly unstable;
its presence in the cell is only made possible by stringent programmed
operations with quality control.) The gap between a mythical “RNA
World” and the living world of real functioning RNA in the cell could
never have been wider. As the cloud cover lifts, the summit of Mt.
Improbable stretches higher into the sky. Darwinism had
enough trouble explaining the 4-letter (G,C,A,T) triplet-codon genetic
code. Simple Watson-Crick base pairing, the old one-gene
one-enzyme principle, and the so-called “Central Dogma” of genetics were
taught as The Big Picture till we knew better. Now that junk DNA
is out,
the whole cellular information flowchart appears as complex as that of a
well-run city, where each employee has a role. Each
information-rich molecule is born, lives an active life and is retired,
as Moore personified it. It’s time for the Darwin Party to let go
of the steering wheel and let the Intelligent Design community drive
science out of the naturalistic rut it’s in. Knowing how to read
the signs of intelligent causation, they can help get science back onto
the freeway of enriched understanding.