Question: Extracting reads belonging to different RNA families from Rfam
2.1 years ago by
toralmanvar840 wrote:

I have Illumina 1 x 50 bp data from a small RNA library. I am interested in identifying known and novel miRNAs in my plant, for which no reference genome is available. First, I want to count the reads aligning to non-coding RNAs such as tRNA, rRNA, snoRNA, snRNA, siRNA, etc., excluding miRNAs. After removing those reads, I would like to align the remaining reads against miRBase using blastn or bowtie to identify known miRNAs. The problem is that I want to use Rfam to detect reads of the different RNA classes, so I concatenated the FASTA files of 2686 families (2,487,655 sequences), but the Rfam v13 sequence headers do not carry proper information about the class they represent. So my question is: how can I count and extract the reads mapping to the different classes of RNA using Rfam?
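One way the header problem might be approached is to tag each sequence with its family accession before concatenating, then translate accessions into classes when counting hits. The sketch below is only an illustration under several assumptions: that the per-family FASTA files can be identified by an accession such as RF00005, that a table mapping accessions to an Rfam-style type string (for example "Gene; tRNA;") is available, and that alignment output has already been reduced to (read, subject) pairs. The function names and the "|" separator are hypothetical choices, not part of Rfam or of any aligner.

```python
from collections import Counter


def tag_fasta_headers(lines, family_acc):
    """Prefix every FASTA header with the family accession, e.g. '>RF00005|orig_id'."""
    for line in lines:
        if line.startswith(">"):
            yield ">" + family_acc + "|" + line[1:]
        else:
            yield line


def coarse_class(rfam_type):
    """Reduce a type string such as 'Gene; snRNA; snoRNA; CD-box;' to one label."""
    parts = [p.strip() for p in rfam_type.split(";") if p.strip()]
    # snoRNA is checked before snRNA because snoRNA families may list both terms.
    for label in ("miRNA", "tRNA", "rRNA", "snoRNA", "snRNA"):
        if label in parts:
            return label
    return "other"


def count_reads_per_class(hits, family_to_type):
    """Count each read once, using the first hit seen for that read.

    hits: iterable of (read_id, subject_id) pairs, where subject_id is a
    tagged header id such as 'RF00005|X1/1-70'.
    """
    seen = set()
    counts = Counter()
    for read_id, subject_id in hits:
        if read_id in seen:
            continue
        seen.add(read_id)
        family_acc = subject_id.split("|", 1)[0]
        counts[coarse_class(family_to_type.get(family_acc, ""))] += 1
    return counts


# Illustrative data only; a real table would cover every family used.
family_to_type = {"RF00005": "Gene; tRNA;", "RF00001": "Gene; rRNA;"}
hits = [("read1", "RF00005|X1/1-70"), ("read2", "RF00001|Y1/1-119")]
print(count_reads_per_class(hits, family_to_type))
```

Because the tag contains no whitespace, it should survive aligners that truncate reference names at the first space. Reads falling into the "miRNA" class can then be set aside rather than removed before the miRBase step.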


Hello @toralmanvar

I don't know about the Rfam issue, but be careful about choosing blastn in your case. BLAST is a heuristic, originally developed for quick database searches. It is not guaranteed to find all occurrences of a short sequence like a miRNA; it will miss ~40% of possible hits when dealing with sequences of 20 bp. Also, it produces local alignments rather than end-to-end global matches. So I would not recommend it, but here are a few suggestions:

  • Reduce the word size to about half the query length.
  • Increase the E-value threshold to see more hits. A short query is more likely to occur by chance in the database, so even a perfect match can have low statistical significance and may not be reported. Increasing the E-value threshold lets you look farther down the hit list and see matches that would normally be discarded because of low statistical significance (see this link).
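The two suggestions above could be turned into a blastn command line roughly as follows. This is a minimal sketch, not a recommended setting: the file and database names are placeholders, the E-value of 1000 is an arbitrary permissive choice, and the helper function is hypothetical. It only builds the argument list and does not run BLAST. The `-task blastn-short`, `-word_size`, `-evalue` and `-outfmt` options are standard BLAST+ blastn options; the floor of 4 reflects the smallest word size blastn accepts.

```python
def short_query_blastn_cmd(query_fasta, db, query_len, evalue=1000):
    """Build a blastn argument list tuned for short queries such as miRNAs."""
    # Word size of about half the query length, but never below blastn's minimum of 4.
    word_size = max(4, query_len // 2)
    return [
        "blastn",
        "-task", "blastn-short",      # preset intended for short queries
        "-word_size", str(word_size),
        "-evalue", str(evalue),       # permissive threshold so short hits are reported
        "-query", query_fasta,
        "-db", db,
        "-outfmt", "6",               # tabular output, one hit per line
    ]


# Placeholder names; a 22 nt query gives a word size of 11.
cmd = short_query_blastn_cmd("reads.fa", "mirbase_db", query_len=22)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run` once the query file and database actually exist.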

Also, read this paper.

modified 2.1 years ago • written 2.1 years ago by naive_user70

Actually I want to classify the ncRNAs like the table shown here.

written 2.1 years ago by toralmanvar840