Notes on the Evidence for Extensive RNA Editing in Humans

The common notion running through molecular biology is that the information present in DNA is transferred to RNA and then to protein.

Back in 2010, researchers made a potentially ground-breaking observation.

They found that within any given individual, there are tens of thousands of places where transcribed RNA does not match the template DNA from which it is derived.

This Phenomenon is Known as RNA Editing

In humans, RNA editing is generally thought to be limited to conversion of adenosine to inosine (which sequencers read as guanosine) and, occasionally, of cytidine to uridine.

However, these authors reported something new. They found that any type of base can be converted to any other type of base. If their observations are correct, these findings represent a fundamental change in how we view the process of gene regulation.

The Study

The authors of this study sequenced the mRNA expressed by an individual (or rather, cDNA from a cell line derived from the individual). They then obtained DNA sequences from the same individual and compared the two.

Any difference between the RNA and DNA was taken as an indication of RNA editing. However, because it is impossible to sequence an entire mRNA or genome in a single pass, the researchers used short reads (of 50 bases) from the mRNA of the individual and reads of various lengths from the DNA of the individual.

They then matched these sequencing reads to the genome (or transcriptome) to see where they came from.
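
The comparison described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the authors' pipeline: the function name rna_dna_differences, the min_reads threshold and the toy sequences are all hypothetical, and the reads are assumed to be already placed on the genome without gaps.

```python
from collections import Counter

def rna_dna_differences(genome, rna_reads, min_reads=2):
    """Return {position: (dna_base, rna_base)} where the RNA consensus differs from the DNA.

    genome    -- DNA sequence of the individual (toy example, a plain string)
    rna_reads -- list of (start, sequence) pairs, already placed on the genome
    min_reads -- hypothetical threshold: reads that must support the RNA base
    """
    # Pile up the RNA bases observed at each genomic position.
    pileup = {}
    for start, seq in rna_reads:
        for offset, base in enumerate(seq):
            pileup.setdefault(start + offset, Counter())[base] += 1

    differences = {}
    for pos, counts in sorted(pileup.items()):
        base, n = counts.most_common(1)[0]  # most frequent RNA base at this position
        if n >= min_reads and base != genome[pos]:
            differences[pos] = (genome[pos], base)
    return differences

# Toy data: three overlapping RNA reads that all show G where the DNA has A.
genome = "ACGTACGTAC"
reads = [(0, "ACGTG"), (2, "GTGCG"), (4, "GCGTA")]
print(rna_dna_differences(genome, reads))
```

Everything in this sketch depends on the reads having been placed at the right position in the first place, which is exactly where the difficulties discussed next arise.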

Complications

However, there are several complications with this method: deciding where in the genome a read most likely came from involves several assumptions, including how insertions, deletions and possible sequencing errors are weighted.

These sorts of mapping issues are well understood and have been widely discussed in the literature on SNP calling from sequencing data, which is another situation where the researcher is looking for a difference between a sequencing read and the genome.

A naïve SNP caller that just looks for differences between aligned reads and a genome will output tens (or probably hundreds) of thousands of false-positive SNPs which must be filtered out by various criteria.
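
To make the point concrete, here is a minimal sketch of such a caller together with two common kinds of filter. It is an assumption-laden toy, not any published tool: the function call_variants, the parameters min_depth and min_fraction, and the example reads are all invented for illustration.

```python
from collections import Counter

def call_variants(reference, reads, min_depth=1, min_fraction=0.0):
    """Report (position, ref_base, alt_base) for mismatches between reads and reference.

    With the default thresholds this is the naive caller: every mismatch is reported.
    min_depth and min_fraction are hypothetical filters on read coverage and on the
    fraction of reads supporting the non-reference base.
    """
    pileup = {}
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pileup.setdefault(start + offset, Counter())[base] += 1

    calls = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        depth = sum(counts.values())
        for base, n in counts.items():
            if base != reference[pos] and depth >= min_depth and n / depth >= min_fraction:
                calls.append((pos, reference[pos], base))
    return calls

reference = "ACGTACGTAC"
reads = [
    (0, "ACGTGCGT"),  # G at position 4
    (0, "ACGTGCGT"),  # G at position 4
    (0, "ACTTGCGT"),  # G at position 4, plus a simulated sequencing error at position 2
    (2, "GTGCGTAA"),  # G at position 4, plus a simulated error at position 9 (covered once)
]

naive = call_variants(reference, reads)
filtered = call_variants(reference, reads, min_depth=3, min_fraction=0.5)
print(naive)     # includes the two simulated errors
print(filtered)  # only the well-supported difference at position 4
```

Filters like these deal with random sequencing errors, but they do nothing about reads that are consistently placed in the wrong location.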

Mismapping of reads between paralogous regions (duplicated stretches of the genome with nearly identical sequence) can therefore lead to false signals of RNA editing, and these false signals can even be replicated in follow-up experiments like those done by Li et al. (2011).

This is because the two forms of RNA and protein are indeed present in the cell, giving the illusion of RNA editing.

However, the two forms of RNA and protein do not come from the same DNA sequence, and thus are not evidence of RNA editing.
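
A toy example may help show how this happens. In the sketch below, the sequences gene_a and gene_b, the function best_hit and the ungapped scoring are all hypothetical; the scenario assumed is one where the aligner only knows about one copy of a duplicated gene.

```python
def best_hit(read, candidates):
    """Return (name, offset, mismatches) for the ungapped placement with fewest mismatches.

    candidates -- dict mapping a sequence name to its DNA sequence
    """
    best = None
    for name, seq in candidates.items():
        for offset in range(len(seq) - len(read) + 1):
            window = seq[offset:offset + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if best is None or mismatches < best[2]:
                best = (name, offset, mismatches)
    return best

# Two hypothetical paralogs that differ at a single position (index 11: A versus G).
gene_a = "ATGGCCATTGTAATGGGCCGC"
gene_b = "ATGGCCATTGTGATGGGCCGC"

# A read faithfully transcribed from gene_b, with no editing at all.
read = gene_b[6:16]

# If only gene_a is available to map against, the read lands there with one
# mismatch, which looks like an A-to-G RNA-DNA difference.
only_a = best_hit(read, {"gene_a": gene_a})

# With both paralogs available, the read matches gene_b exactly.
both = best_hit(read, {"gene_a": gene_a, "gene_b": gene_b})
print(only_a, both)
```

Both versions of the transcript would be detectable in the cell in this scenario, so follow-up validation of the RNA alone would not distinguish it from genuine editing.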

In addition, mapping biases around splice sites (and around other insertions or deletions relative to the genome) can cause mismapping and false inference of RNA editing.

In conclusion, while RNA editing is a potentially important phenomenon in humans, there is significant scope for further research, and skepticism of the studies carried out so far seems warranted.