Question: FeatureCounts for miRNAseq
8 days ago, Muc09 wrote:

I've looked for an answer in similar questions, but I couldn't find what I suspect is the problem. I have 9 files of small RNA-seq reads from human. I aligned them with Bowtie2 and got a .sam file for each sample. Now I am counting with featureCounts, and the results show 0% assigned reads for all the files. For the alignment I used the Ensembl human genome (GRCh38). My .gff3 file is from miRBase v22 (hsa.gff3). I ran featureCounts with the following command:

featureCounts -t miRNA -g 'Name' -a /path/to/hsa.gff3 -o /path_to_all/*.sam

Similar output for all 9 samples:
|| Process SAM file sample_name.sam...
|| Single-end reads are included.
|| Total alignments : 58350348
|| Successfully assigned alignments : 26377 (0.0%)
|| Running time : 4.79 minutes


I'm guessing your question is: why do you have ~58M reads aligned but only ~26k reads assigned to features?

The answer is probably your gff file combined with your sequencing strategy. Try using a more "generic" gff file to see what's been sequenced/aligned.
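A quick way to see what an annotation file actually offers is to tally its feature types (column 3) and attribute keys (column 9), then check that the values passed to `-t` and `-g` occur in it. The following is only a minimal sketch: the function name `summarize_gff3` and the example records are made up for illustration and are not copied from hsa.gff3.

```python
from collections import Counter


def summarize_gff3(lines):
    """Tally feature types (column 3) and attribute keys (column 9) of GFF3 lines."""
    feature_types = Counter()
    attribute_keys = Counter()
    for line in lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        feature_types[fields[2]] += 1
        # GFF3 attributes are key=value pairs separated by semicolons.
        for pair in fields[8].split(";"):
            if "=" in pair:
                key, _value = pair.split("=", 1)
                attribute_keys[key.strip()] += 1
    return feature_types, attribute_keys


# Illustrative records only (hypothetical IDs, names, and coordinates).
EXAMPLE_GFF3 = [
    "##gff-version 3",
    "chr1\t.\tmiRNA_primary_transcript\t100\t180\t.\t+\t.\tID=MI_example;Name=example-mir-1",
    "chr1\t.\tmiRNA\t110\t131\t.\t+\t.\tID=MIMAT_example;Name=example-miR-1-5p;Derives_from=MI_example",
]

if __name__ == "__main__":
    types, keys = summarize_gff3(EXAMPLE_GFF3)
    print("feature types:", dict(types))
    print("attribute keys:", dict(keys))
```

To run it on a real file, pass an open file handle instead of the example list, e.g. `summarize_gff3(open("/path/to/hsa.gff3"))`.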

But another issue I see is that you're using Bowtie2, which is not splice-aware. I'm not sure about small RNAs, but you're better off using STAR or another splice-aware aligner.

written 8 days ago by Amar (620)

Actually, for microRNAs you want to align without gaps, since you are looking at small reads, typically 20-30 bp. So Bowtie v1.x would be a better choice of aligner.

written 8 days ago by genomax (75k)

Thanks for the correction. What's the difference between micro and small RNA? Or are they different terms for the same thing?

written 8 days ago by Amar (620)

Small RNAs are a superset covering all RNAs under 200 nt, whereas miRNAs are much smaller (~22 nt) and thus require ungapped alignment to detect.

written 7 days ago by genomax (75k)

OK, thanks for the advice. I'll try using Homo_sapiens.GRCh38.98.gtf to check in more detail. Also, the main reason for choosing Bowtie2 came from this publication: doi/10.1261/rna.055509.115. I also tried different tools and aligners and got better results with Bowtie2, as the publication suggests.

written 8 days ago by Muc09 (0)

A couple of things to check in this situation:

What % of reads are uniquely mapped? Remember that featureCounts doesn't count secondary mappings or multimappings. Is your sequencing paired or single? Bowtie has issues with paired-end data when the two ends overlap too much, which would almost always be the case with miRNA-seq.
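If you want a rough breakdown straight from the .sam file, something like the sketch below tallies records by SAM flag and by the optional `NH` tag. It is only an illustration: the function names and the example records are hypothetical, and because the `NH` tag is optional and not every aligner writes it, records without it land in a catch-all category rather than being called unique.

```python
from collections import Counter

UNMAPPED = 0x4         # SAM flag bit: segment unmapped
SECONDARY = 0x100      # SAM flag bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary alignment


def classify_sam_record(line):
    """Classify one SAM alignment line by its flag and optional NH tag."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & UNMAPPED:
        return "unmapped"
    if flag & (SECONDARY | SUPPLEMENTARY):
        return "secondary_or_supplementary"
    # Optional tags start at the 12th column; NH is the number of reported hits.
    for tag in fields[11:]:
        if tag.startswith("NH:i:") and int(tag[5:]) > 1:
            return "multi_mapped"
    return "primary_single_hit_or_no_NH"


def tally_sam(lines):
    """Count records per category, skipping '@' header lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("@") or not line.strip():
            continue
        counts[classify_sam_record(line)] += 1
    return counts


# Hypothetical records for illustration only.
EXAMPLE_SAM = [
    "@HD\tVN:1.6",
    "r1\t0\t1\t100\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r3\t256\t1\t200\t1\t4M\t*\t0\t0\tACGT\tIIII",
    "r4\t0\t1\t300\t1\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:3",
]

if __name__ == "__main__":
    for category, n in sorted(tally_sam(EXAMPLE_SAM).items()):
        print(category, n)
```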

Finally, it's possible that your reads are running over the end of the miRNA annotation, in which case featureCounts will ignore them. I think there is a setting to tell it not to do this.
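To make the "running over the end" case concrete, here is a small sketch that computes how many bases of a read fall inside a feature, using 1-based inclusive coordinates as in SAM and GFF3. The function names and the coordinates are made up for illustration. As far as I know the featureCounts options that control this are `--minOverlap` and `--fracOverlap`, but check the manual for your version for their exact behaviour and defaults.

```python
def overlap_bases(read_start, read_end, feat_start, feat_end):
    """Number of bases shared by two 1-based inclusive intervals."""
    return max(0, min(read_end, feat_end) - max(read_start, feat_start) + 1)


def describe_overlap(read_start, read_end, feat_start, feat_end):
    """Summarize how a read sits relative to a feature."""
    shared = overlap_bases(read_start, read_end, feat_start, feat_end)
    read_length = read_end - read_start + 1
    return {
        "overlap": shared,
        "fraction_of_read": shared / read_length,
        # True only if no read base lies outside the feature.
        "read_within_feature": read_start >= feat_start and read_end <= feat_end,
    }


if __name__ == "__main__":
    # Hypothetical 22 nt feature at 100-121 and a 25 nt read that overhangs both ends.
    print(describe_overlap(98, 122, 100, 121))
```

A read that overhangs by a few bases on each side still shares most of its length with the feature, so whether it is counted depends on how strict the overlap requirement is.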

modified 7 days ago • written 7 days ago by i.sudbery (6.3k)

My library is single-end. I already tried Amar's advice (a "generic" gff file), and I figured out that a lot of my reads come from snoRNA, so that's the main reason for the low % when I use hsa.gff3. Anyway, thank you for your answers.
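For anyone wanting to repeat this check, per-gene counts can be summed by biotype using the `gene_biotype` attribute on the `gene` lines of an Ensembl GTF. This is a rough sketch rather than the exact procedure used here: the function names, gene IDs, and counts are all hypothetical, and the per-gene counts are assumed to have been loaded into a dict already.

```python
import re
from collections import Counter

# GTF attributes look like: key "value"; key "value";
_GTF_ATTRIBUTE = re.compile(r'(\S+) "([^"]*)"')


def gene_biotypes(gtf_lines):
    """Map gene_id -> gene_biotype using the 'gene' records of a GTF."""
    biotypes = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue
        attributes = dict(_GTF_ATTRIBUTE.findall(fields[8]))
        if "gene_id" in attributes:
            biotypes[attributes["gene_id"]] = attributes.get("gene_biotype", "unknown")
    return biotypes


def counts_by_biotype(gene_counts, biotypes):
    """Sum per-gene read counts into per-biotype totals."""
    totals = Counter()
    for gene_id, count in gene_counts.items():
        totals[biotypes.get(gene_id, "unknown")] += count
    return totals


# Hypothetical GTF records and counts for illustration only.
EXAMPLE_GTF = [
    "#!genome-build example",
    '1\tensembl\tgene\t100\t200\t.\t+\t.\tgene_id "G1"; gene_version "1"; gene_biotype "snoRNA";',
    '1\tensembl\tgene\t500\t580\t.\t-\t.\tgene_id "G2"; gene_version "1"; gene_biotype "miRNA";',
    '1\tensembl\texon\t500\t580\t.\t-\t.\tgene_id "G2"; gene_biotype "miRNA";',
]
EXAMPLE_COUNTS = {"G1": 10, "G2": 5, "G3": 1}

if __name__ == "__main__":
    totals = counts_by_biotype(EXAMPLE_COUNTS, gene_biotypes(EXAMPLE_GTF))
    for biotype, n in totals.most_common():
        print(biotype, n)
```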

written 7 days ago by Muc09 (0)

It's completely normal for a large fraction of your reads to be snoRNA, snRNA, or other categories of small RNA, but I still wouldn't expect the amount mapping to miRNA to be that small.

Did you trim the reads before mapping? What was the post-trimming size distribution (as measured by FastQC)?
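If you don't have a FastQC report to hand, the length distribution can also be tallied directly from the trimmed FASTQ. The sketch below assumes plain four-line FASTQ records (no wrapped sequences); the function names, the 20-24 nt window, and the example reads are illustrative choices, not values from this thread.

```python
from collections import Counter


def read_length_histogram(fastq_lines):
    """Count reads per length, assuming strict 4-line FASTQ records."""
    lengths = Counter()
    for index, line in enumerate(fastq_lines):
        # The sequence is the second line of every 4-line record.
        if index % 4 == 1:
            lengths[len(line.strip())] += 1
    return lengths


def fraction_in_range(histogram, shortest, longest):
    """Fraction of reads whose length lies in [shortest, longest]."""
    total = sum(histogram.values())
    if total == 0:
        return 0.0
    in_range = sum(n for length, n in histogram.items() if shortest <= length <= longest)
    return in_range / total


# Hypothetical reads for illustration only: lengths 22, 22, 30 and 5.
EXAMPLE_FASTQ = []
for name, length in [("r1", 22), ("r2", 22), ("r3", 30), ("r4", 5)]:
    EXAMPLE_FASTQ += ["@" + name, "A" * length, "+", "I" * length]

if __name__ == "__main__":
    histogram = read_length_histogram(EXAMPLE_FASTQ)
    for length in sorted(histogram):
        print(length, histogram[length])
    print("fraction 20-24 nt:", fraction_in_range(histogram, 20, 24))
```

A peak around 22 nt after trimming would be consistent with miRNA-sized inserts, while a peak at the full read length would suggest the adapters were not removed.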

written 6 days ago by i.sudbery (6.3k)