featureCounts with 3' tag sequencing
5.1 years ago
luca ▴ 70

Hi everyone, I performed an RNA-seq experiment on mice using 3' tag sequencing. I mapped the reads to the mouse genome using STAR (on average >70% of reads mapped uniquely) and I wanted to get the raw counts with featureCounts. The command I am using is this:

featureCounts -a Mus_musculus.GRCm38.95.gtf -t exon -g gene_id --primary -T 16 -o counts_w_extraAttributes-primary.txt E12.5/Aligned.sortedByCoord.out.bam E14.5/Aligned.sortedByCoord.out.bam E18.5/Aligned.sortedByCoord.out.bam...

The output from featureCounts looks strange (to me at least): the "Successfully assigned alignments" figure is, on average, 40%. That seems quite low to me, so I was wondering whether I am doing something incorrectly?
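One way to see where the unassigned alignments end up is to look at the `.summary` file that featureCounts writes next to its output table. The sketch below assumes the usual tab-delimited layout (a `Status` column followed by one column per BAM file) and that the percentage is taken over all alignments in that column; `summarize_fc` is a hypothetical helper name, not part of featureCounts.

```shell
# Print each non-zero status category as a percentage of all alignments,
# one line per sample and category.
# Usage: summarize_fc counts_w_extraAttributes-primary.txt.summary
summarize_fc() {
  awk -F'\t' '
    # Header row: remember the sample (BAM) name of each column.
    NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; ncol = NF; next }
    # Data rows: store every count and accumulate a per-sample total.
    {
      status[NR] = $1
      for (i = 2; i <= NF; i++) { count[NR, i] = $i; total[i] += $i }
      nrow = NR
    }
    END {
      for (i = 2; i <= ncol; i++)
        for (r = 2; r <= nrow; r++)
          if (count[r, i] > 0)
            printf "%s\t%s\t%.1f%%\n", name[i], status[r], 100 * count[r, i] / total[i]
    }
  ' "$1"
}
```

A large `Unassigned_NoFeatures` share would point towards annotation or strandedness, while a large `Unassigned_MultiMapping` share would point towards multi-mapped reads.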

Thanks for your helpful replies, Best Luca

RNA-Seq alignment • 1.5k views

Have you tried adding the -M option to see how the counts change? Also keep in mind that even though STAR may have mapped a certain % of reads, reads are not counted unless a feature is defined for the region they map to. If 3'-tag sequencing captures a specific strand (top/bottom), then you should specify that as well (-s option). By default featureCounts treats data as unstranded (-s 0).

Edit: I am going to edit this post since I have hit my post limit for the day.

If your kit was stranded then definitely use the right -s option (sounds like -s 1 is that option).
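If the kit's strandedness is unclear, a quick check is to run the same command with each -s value and compare the assigned fraction. The sketch below only prints the three commands (a dry run) so they can be inspected first; `strand_scan_cmds` and the `counts_s0.txt` / `counts_s1.txt` / `counts_s2.txt` output names are made up for illustration, and the other flags are copied from the command in the question.

```shell
# Print one featureCounts command per strandedness setting
# (0 = unstranded, 1 = stranded, 2 = reversely stranded).
# Nothing is executed here; pipe the output to `sh` to run the commands.
# Assumes file paths contain no spaces.
# Usage: strand_scan_cmds annotation.gtf sample.bam [more.bam ...]
strand_scan_cmds() {
  gtf=$1
  shift
  for s in 0 1 2; do
    printf 'featureCounts -a %s -t exon -g gene_id --primary -s %s -T 16 -o counts_s%s.txt %s\n' \
      "$gtf" "$s" "$s" "$*"
  done
}
```

For example, `strand_scan_cmds Mus_musculus.GRCm38.95.gtf E12.5/Aligned.sortedByCoord.out.bam | sh` would run all three settings on one sample, after which the three `.summary` files can be compared.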


Dear genomax, Thanks for your reply. I tried adding the -M option, and the % of successfully assigned alignments increases on average by 15-20%. I had not specified any strand with the -s option, but the kit is strand-specific. I checked, and the best results are with -s 1. Do you think I should include -M and also count the multi-mapping reads?


Thanks genomax! Regarding the multi-mapping reads, is there a "gold standard" procedure (i.e. to include them or exclude them)? Thanks, Luca


Multi-mapped reads are generally excluded since you can't be sure of the gene/region they originated from. Some aligners allow you to place them at a random spot out of all the places that they map to.
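To get a feel for how many alignments are affected before deciding, the NH tag in the BAM records can be tallied. This is a rough sketch that assumes the aligner writes NH (STAR does with its default SAM attributes) and that `samtools` is available to stream the BAM as text; `tally_nh` is a hypothetical helper name. Note that it counts alignment records, so a read that maps to three loci contributes three records to the multi count.

```shell
# Read SAM records on stdin and tally alignments by their NH tag
# (NH:i:1 = the read aligned to a single locus; NH > 1 = multi-mapped).
# Usage: samtools view E12.5/Aligned.sortedByCoord.out.bam | tally_nh
tally_nh() {
  awk -F'\t' '
    /^@/ { next }   # skip header lines if they are present
    {
      nh = 0
      # Optional tags start at field 12; find NH and read its integer value.
      for (i = 12; i <= NF; i++)
        if ($i ~ /^NH:i:/) { nh = substr($i, 6) + 0; break }
      if (nh == 1) unique++
      else if (nh > 1) multi++
      else other++   # NH missing or zero (e.g. unmapped records)
    }
    END { printf "unique\t%d\nmulti\t%d\nother\t%d\n", unique, multi, other }
  '
}
```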

There are alternative strategies (e.g. mapping instead of alignment with salmon, https://salmon.readthedocs.io/en/latest/ ) that can be used to deal with them. Since you have 3'-end-specific data, I am not sure you can use that option.


Thanks! I will follow your suggestion and ignore multi-mapped reads.

Luca
