Question: FeatureCounts for miRNAseq
9 months ago by
Muc0910 wrote:

I've been looking for an answer in similar questions, but I can't find what I suspect is the problem. I have 9 files of small RNA-seq reads from human. I aligned them with Bowtie2 and got a .sam file for each sample. Now I am counting with featureCounts. The results show 0% assigned reads for all the files. For alignment I used the Ensembl human genome (GRCh38). My .gff3 file is from miRBase v22 (hsa.gff3). I ran featureCounts with the following command:

featureCounts -t miRNA -g 'Name' -a /path/to/hsa.gff3 -o /path_to_all/*.sam

Similar Output for 9 samples:
|| Process SAM file sample_name.sam...
|| Single-end reads are included.
|| Total alignments : 58350348
|| Successfully assigned alignments : 26377 (0.0%)
|| Running time : 4.79 minutes

written 9 months ago by Muc0910

I'm guessing your question is: why do you have ~58M alignments but only ~26k of them assigned to features?

The answer is probably your GFF file combined with your sequencing strategy: a miRNA-only annotation cannot account for reads that come from other RNA classes. Try using a more "generic" GFF file to see what's been sequenced/aligned.

But another issue I see is that you're using Bowtie2, which is not splice-aware. I'm not sure about small RNAs, but you're better off using STAR or another splice-aware aligner.

written 9 months ago by Mark800

Actually, for microRNAs you want to align without gaps, since you are looking at small reads, typically 20-30 bp. So bowtie v1.x would be a better choice of aligner.

written 9 months ago by genomax89k

Thanks for the correction. What's the difference between micro and small RNA? Or are they different terms for the same thing?

written 9 months ago by Mark800

Small RNAs are a superset covering all short RNAs (<200 nt), whereas miRNAs are much smaller (~22 nt) and thus require ungapped alignment to detect.

written 9 months ago by genomax89k

OK, thanks for the advice. I'll try using Homo_sapiens.GRCh38.98.gtf to check in more detail. Also, my main reason for choosing Bowtie2 was this publication: doi/10.1261/rna.055509.115. I also tried different tools and aligners and got better results with Bowtie2, as the publication suggests.

written 9 months ago by Muc0910

Couple of things to check in this situation:

What % of reads are uniquely mapped? Remember that featureCounts by default doesn't count secondary mappings or multi-mapping reads. Is your sequencing paired-end or single-end? Bowtie has issues with paired-end data when the two ends overlap too much, which would almost always be the case with miRNA-seq.
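One way to see where the unassigned reads end up is to look at the `.summary` file that featureCounts writes next to its count table. Below is a minimal Python sketch, assuming the usual tab-separated layout (a `Status` header row followed by one row per category); the numbers in the example are made up for illustration and are not from this thread. Note that featureCounts relies on the NH tag in the alignment file to recognise multi-mapping reads, so the `Unassigned_MultiMapping` row is only informative if the aligner writes that tag.

```python
import io


def read_summary(handle):
    """Return {status: count} for the first sample column of a featureCounts .summary file."""
    counts = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or fields[0] == "Status":
            continue  # skip the header row and blank lines
        counts[fields[0]] = int(fields[1])
    return counts


def fractions(counts):
    """Convert raw counts to fractions of all alignments, dropping empty categories."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items() if v > 0} if total else {}


# Made-up numbers, for illustration only (not the poster's data).
example = io.StringIO(
    "Status\tsample.sam\n"
    "Assigned\t100\n"
    "Unassigned_MultiMapping\t300\n"
    "Unassigned_NoFeatures\t600\n"
)
summary_fractions = fractions(read_summary(example))
for status, frac in sorted(summary_fractions.items(), key=lambda kv: -kv[1]):
    print(f"{status}\t{frac:.1%}")
```

If most alignments fall under `Unassigned_NoFeatures`, the annotation is the likely culprit rather than the counting settings. To use it on a real run, replace the `io.StringIO` example with `open(...)` on the actual summary file.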

Finally, it's possible that your reads are running over the end of the miRNA annotation, in which case featureCounts will ignore them. I think there is a setting to tell it not to do this.
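The settings to look at here are probably featureCounts' `--minOverlap` and `--fracOverlap` options, which control how much of a read has to overlap a feature; as far as I recall the default is a minimum of 1 overlapping base, so it is worth checking what was actually used. The Python sketch below only mimics that kind of test to show how an overhanging read behaves; it is not the tool's implementation, and the coordinates are hypothetical.

```python
def overlap_bases(read_start, read_end, feat_start, feat_end):
    """Number of shared bases between two 1-based, inclusive intervals."""
    return max(0, min(read_end, feat_end) - max(read_start, feat_start) + 1)


def passes(read, feat, min_overlap=1, frac_overlap=0.0):
    """True if the read overlaps the feature by enough bases and a large enough fraction of the read."""
    n = overlap_bases(read[0], read[1], feat[0], feat[1])
    read_len = read[1] - read[0] + 1
    return n >= min_overlap and n / read_len >= frac_overlap


mirna = (1000, 1021)  # hypothetical 22-nt mature miRNA annotation
read = (995, 1024)    # hypothetical 30-nt read overhanging both ends

print(overlap_bases(*read, *mirna))            # 22 shared bases
print(passes(read, mirna))                     # lenient thresholds: counted
print(passes(read, mirna, frac_overlap=0.9))   # strict fraction: not counted
```

With lenient thresholds the overhanging read is still counted; it only drops out once a high overlap fraction is demanded, which is why untrimmed adapters matter more under strict settings.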

modified 9 months ago • written 9 months ago by i.sudbery9.1k

My library is single-end. I already tried the advice from Amar (the "generic gff file"), and I figured out that a lot of my reads come from snoRNA, so that's the main reason for the low % when I use hsa.gff3. Anyway, thank you for your answers.
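For anyone wanting to repeat this kind of check, here is a rough Python sketch that sums a featureCounts table by biotype. It assumes the counts were produced at gene level against an Ensembl GTF (which carries `gene_id` and `gene_biotype` attributes) and that the count table has the standard layout with the first sample in the seventh column. The gene IDs and numbers below are invented for illustration.

```python
import io
import re
from collections import Counter


def biotypes_from_gtf(handle):
    """Map gene_id -> gene_biotype using the 'gene' rows of an Ensembl-style GTF."""
    mapping = {}
    for line in handle:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue
        gene_id = re.search(r'gene_id "([^"]+)"', fields[8])
        biotype = re.search(r'gene_biotype "([^"]+)"', fields[8])
        if gene_id and biotype:
            mapping[gene_id.group(1)] = biotype.group(1)
    return mapping


def counts_by_biotype(counts_handle, mapping):
    """Sum the first sample column of a featureCounts table per biotype."""
    totals = Counter()
    for line in counts_handle:
        if line.startswith("#") or line.startswith("Geneid"):
            continue  # skip the program comment line and the header
        fields = line.rstrip("\n").split("\t")
        totals[mapping.get(fields[0], "unknown")] += int(fields[6])
    return totals


# Invented example data; G1 and G2 are hypothetical gene IDs.
gtf = io.StringIO(
    '1\tensembl\tgene\t100\t200\t.\t+\t.\tgene_id "G1"; gene_biotype "snoRNA";\n'
    '1\tensembl\tgene\t500\t580\t.\t-\t.\tgene_id "G2"; gene_biotype "miRNA";\n'
)
counts = io.StringIO(
    "# Program:featureCounts\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tsample.sam\n"
    "G1\t1\t100\t200\t+\t101\t90\n"
    "G2\t1\t500\t580\t-\t81\t10\n"
)
totals = counts_by_biotype(counts, biotypes_from_gtf(gtf))
for biotype, n in totals.most_common():
    print(biotype, n)
```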

written 9 months ago by Muc0910

It's completely normal for a large fraction of your reads to be snoRNA, snRNA or other categories of small RNA, but I still wouldn't expect the amount mapping to miRNA to be that small.

Did you trim the reads before mapping? What was the post-trimming size distribution (as measured by FastQC)?
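If FastQC isn't to hand, the size distribution can also be tallied directly. This is a minimal Python sketch, assuming an uncompressed FASTQ with strict four-line records; the reads in the example are synthetic. For a well-trimmed miRNA library you would hope to see a peak around 22 nt.

```python
import io
from collections import Counter


def length_distribution(handle):
    """Count read lengths in a FASTQ stream, assuming exactly four lines per record."""
    lengths = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the sequence is the second line of each record
            lengths[len(line.strip())] += 1
    return lengths


# Synthetic reads, for illustration only.
fastq = io.StringIO(
    "@r1\n" + "A" * 22 + "\n+\n" + "I" * 22 + "\n"
    "@r2\n" + "C" * 22 + "\n+\n" + "I" * 22 + "\n"
    "@r3\n" + "G" * 30 + "\n+\n" + "I" * 30 + "\n"
)
dist = length_distribution(fastq)
for length in sorted(dist):
    print(length, dist[length])
```

A distribution piled up at the full sequencing read length would suggest the adapters were never trimmed.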

written 9 months ago by i.sudbery9.1k