Question: Identifying 3' end extended small ncRNAs from small RNA-seq data
4.7 years ago by
sethugunja60 wrote:


Recently, we sequenced our RNA samples, ranging from 70 to 200 nt, on the Illumina HiSeq platform. Here are the details:

Type of seq: Small RNA seq (size 70 - 200 nt)

Seq platform: Illumina Hiseq 2000

Read length: 50, Single end

Conditions: Normal (3 replicates) vs Patient (3 replicates)

Reads: ~17 million reads in each replicate

Aim: To identify 3' end extended sequences (poly(A) tails) in snoRNAs (immature snoRNAs)

Is there a particular pipeline or tool to find them?

Any other suggestions are welcome; please let me know.



modified 4.7 years ago • written 4.7 years ago by sethugunja60

Hopefully others will reply with a premade tool, but I would think the general idea would be to first perform mapping as normal, then take the unmapped reads and split them to allow anchoring. You'd then try to map the anchors. For a read anchored at the 3' end of a snoRNA, the 3' extension would be the remaining, unanchored sequence of that read. This is sort of how TopHat works, though it would make more sense to simply write a custom pipeline than to modify TopHat.

written 4.7 years ago by Devon Ryan89k

Thank you for your prompt reply.

Can you please explain how to split and anchor them?


written 4.7 years ago by sethugunja60

Whatever program you write/find would take each read and segment it into non-overlapping stretches (of maybe 15-20 bases each). In this context, anchoring would be performed by simply mapping these segments to either the genome or a library of small RNAs (this is probably more efficient).
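To make the splitting step concrete, here is a minimal Python sketch of cutting a read into non-overlapping segments. The function name `split_read`, the 17-base default and the 15-base minimum are illustrative assumptions, not part of any existing tool; each segment keeps its offset in the read so that, after mapping the segments with an aligner of your choice, you can tell which part of the read anchored and which part is left over.

```python
def split_read(seq, seg_len=17, min_len=15):
    """Cut a read into consecutive, non-overlapping segments.

    Returns a list of (offset, segment) pairs. A trailing piece shorter
    than min_len is dropped, since very short anchors map ambiguously.
    seg_len and min_len are illustrative defaults (assumptions).
    """
    segments = []
    for start in range(0, len(seq), seg_len):
        piece = seq[start:start + seg_len]
        if len(piece) >= min_len:
            segments.append((start, piece))
    return segments


def segments_to_fasta(read_id, seq, seg_len=17, min_len=15):
    """Format the segments of one read as FASTA records.

    The record name encodes the read ID and the segment offset, so mapped
    anchors can be traced back to their position in the original read.
    """
    lines = []
    for offset, piece in split_read(seq, seg_len, min_len):
        lines.append(f">{read_id}|offset={offset}")
        lines.append(piece)
    return "\n".join(lines)


if __name__ == "__main__":
    # A made-up 50 nt read: 38 nt of arbitrary sequence plus 12 As.
    read = "GCTAGCTAGGATCCATGCATGCCGTTAGCAGTCAGGTA" + "A" * 12
    print(segments_to_fasta("read1", read))
```

With a 50 nt read and these defaults you get three segments (17, 17 and 16 bases). If the first segment or two anchor to a snoRNA and the last does not, the unanchored tail is the candidate 3' extension.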

written 4.7 years ago by Devon Ryan89k
4.7 years ago by
Chirag Nepal2.2k wrote:

You should start by mapping to the genome, as Devon suggested. Then look at how reads map to the snoRNAs, if that is your interest.


In my opinion, since your library size is 70-200 nt, you have size-selected to enrich for snoRNAs and snRNAs, but not miRNAs or other small RNAs (like the ones derived from the ends of snoRNAs). I am not sure you can find extended poly(A) tails in snoRNAs, simply because of how snoRNAs are processed. snoRNAs are generally encoded in introns and excised by the spliceosome, so there might not be a poly(A) tail specific to the snoRNA, whereas the poly(A) tail of the snoRNA host gene will be downstream of the host gene's 3' UTR.


written 4.7 years ago by Chirag Nepal2.2k

Hi Chirag,

We know that there are poly(A) tails on immature snoRNAs in the patient samples because we have measured the abundance of polyadenylated snoRNAs and of total snoRNAs. The abundance of polyadenylated snoRNAs was considerably higher in Patient vs Normal (qPCR data). This suggests that the proportion of immature snoRNAs is high in the patient. So we did the sequencing to identify the sequence and length of the poly(A) tails on particular snoRNAs.

Here's the picture showing the maturation of H/ACA box snoRNAs.

modified 4.7 years ago by Devon Ryan89k • written 4.7 years ago by sethugunja60

I've inlined and shrunk the image. I suspect that most of us have institutional access.

written 4.7 years ago by Devon Ryan89k
4.7 years ago by
Ido Tamir5.0k wrote:

I would use an aligner that allows incomplete alignments, like bowtie2 in local mode or bwa mem. Then filter by minimum alignment length, 5' or 3' soft-clipped sequences, etc. This, however, will require multiple runs to get the alignment parameters right. A training dataset would be very valuable for reducing the false negative rate without becoming overly sensitive. Take some known small RNAs, clip or extend them a little bit (5' and 3'), align this dataset to the genome or a non-coding RNA file, and check whether you can recover all of them.
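As a rough illustration of the soft-clip filtering step, the Python sketch below pulls the 3' soft-clipped bases out of a SAM record and checks whether they look like a poly(A) tail. It only uses the standard SEQ, CIGAR and FLAG columns of the SAM format. The function names and the thresholds (at least 5 bases, at least 80% A) are assumptions for illustration, and it assumes the reads are in the same sense as the RNA, which you should confirm for your library prep.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def three_prime_softclip(seq, cigar, flag):
    """Return the soft-clipped bases at the 3' end of the original read.

    seq, cigar and flag are the SEQ, CIGAR and FLAG columns of a SAM line.
    SAM stores reverse-strand alignments reverse-complemented, so for those
    the read's 3' end is the leading soft clip, which is reverse-complemented
    back to the read's own orientation here.
    """
    # Hard clips are not present in SEQ, so ignore them.
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar) if op != "H"]
    if not ops:  # unmapped read (CIGAR "*") or empty CIGAR
        return ""
    if flag & 16:  # 0x10: read aligned to the reverse strand
        n, op = ops[0]
        if op != "S" or n == 0:
            return ""
        return seq[:n].translate(COMPLEMENT)[::-1]
    n, op = ops[-1]
    if op != "S" or n == 0:
        return ""
    return seq[-n:]


def looks_like_polya(clip, min_len=5, min_frac_a=0.8):
    """Heuristic poly(A) check; both thresholds are illustrative assumptions."""
    return (bool(clip) and len(clip) >= min_len
            and clip.count("A") / len(clip) >= min_frac_a)


if __name__ == "__main__":
    # Forward-strand example: 20 aligned bases followed by 8 clipped As.
    clip = three_prime_softclip("ACGTACGTACGTACGTACGT" + "A" * 8, "20M8S", 0)
    print(clip, looks_like_polya(clip))
```

Running the same two functions on a training set of known small RNAs with artificial 3' extensions, as suggested above, is one way to check whether the chosen thresholds and alignment parameters recover what you expect.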

written 4.7 years ago by Ido Tamir5.0k

Hi Ido,

As I am not from a bioinformatics background, I could not fully understand. Could you please take the time to explain this in more detail?



written 4.7 years ago by sethugunja60