Question: RNA-editing pipeline for miRNAs
3.1 years ago by
sombrajo920 wrote:


This is my very first post and I'm also new to bioinformatic analysis, so forgive me if something sounds off. I'll edit this post to add more data if it turns out I've left out something relevant.

I'm analyzing single-end (SE) miRNA-seq data. I ran miRDeep2 and then performed a DESeq analysis of both the known miRNAs and the predicted novel miRNAs.

I'm now interested in identifying whether some of these miRNAs have been edited post-transcriptionally (most likely by substitutional A-to-I RNA editing), and in distinguishing such edits from SNPs.

I found a pipeline that seems to have been used and cited by quite a few publications. The protocol is explained step by step here:

I'd like to ask whether anyone here has used it, or knows of a better or more up-to-date protocol. The trimming and mapping steps seem to work fine, but then I proceed to step 3:

Mapping the mismatches against the pre-miRNA sequences: The purpose of this step is to move from reads aligned against the genome (the end-point of the previous step) to counts of each of the four possible nucleotides at each position along the pre-miRNA sequence, for all the pre-miRNAs. Performing this transformation will allow us to focus our analysis only on bona fide miRNA and to use, in the following step, binomial statistics in order to detect significant modifications inside them...

... The main output of this script is a text file (termed 'main_output.txt') containing the counts of each of the four possible nucleotides at each position along all the pre-miRNA sequences...
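To make the expected result concrete, here is a minimal Python sketch of what this step is described as doing: tallying the four nucleotides at each pre-miRNA position, then applying a binomial test to the mismatches. It is not the protocol's script. The function names, the `(offset, read)` input format, and the `error_rate` value of 0.001 are all assumptions made for illustration, and sequences are assumed to be in the DNA alphabet (T rather than U). A-to-I editing would show up in such a table as G counts at positions where the reference base is A.

```python
from math import comb

NUCLEOTIDES = "ACGT"


def count_nucleotides(pre_mirna, alignments):
    """Count A/C/G/T observed at each position of one pre-miRNA.

    `alignments` is a hypothetical input format: an iterable of
    (offset, read_seq) pairs, where offset is the 0-based start of an
    ungapped read on the pre-miRNA sequence.
    """
    counts = [dict.fromkeys(NUCLEOTIDES, 0) for _ in pre_mirna]
    for offset, read in alignments:
        for i, base in enumerate(read):
            pos = offset + i
            # Ignore bases that fall outside the pre-miRNA or are not A/C/G/T.
            if 0 <= pos < len(pre_mirna) and base in NUCLEOTIDES:
                counts[pos][base] += 1
    return counts


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def mismatch_pvalues(pre_mirna, counts, error_rate=0.001):
    """For each observed mismatch, return
    (position, reference base, observed base, mismatch count, depth, p-value).

    The p-value is the chance of seeing at least that many mismatches from
    sequencing error alone; error_rate is an assumed placeholder value.
    """
    results = []
    for pos, (ref, c) in enumerate(zip(pre_mirna, counts)):
        depth = sum(c.values())
        for base in NUCLEOTIDES:
            if base != ref and c[base] > 0:
                p_value = binomial_tail(c[base], depth, error_rate)
                results.append((pos, ref, base, c[base], depth, p_value))
    return results


if __name__ == "__main__":
    pre = "ACGTA"
    reads = [(0, "ACG"), (1, "CGT"), (0, "AGG")]
    table = count_nucleotides(pre, reads)
    for pos, row in enumerate(table):
        print(pos, pre[pos], row)
    print(mismatch_pvalues(pre, table))
```

In a sketch like this, a table that is zero everywhere means no read overlapped any pre-miRNA position, which would point at the alignment or coordinate inputs to the step rather than at the statistics that follow it.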

After this step I get an output file that I don't fully understand, but I don't think it looks right: basically, the last section of the file is all 0s, and there are 0s throughout the file for the positions of each miRNA. It looks like this:

I thought the problem might be the formatting of my input data (.fq files), so I also tried the data the authors used for their own analysis, and I still get the same result. If anyone has followed this pipeline, I would gladly accept some guidance. I'm a bit lost here.

Thank you in advance,


