Question about featureCounts
0
3
Entering edit mode
4.5 years ago
chichaochen ▴ 30

Hi: I was trying featureCounts for with these 2 following settings.
1.

featureCounts -t exon -g gene_id (the rest default, not specified)

2.

featureCounts -t gene -g gene_id (the rest default, not specified)

I expected to see more reads (fragments) in condition 2, since the region of "gene" in the annotation gtf covers both "exon" and intron region. However, it's surprising to see that for a lot of the genes i got more counts in condition 1. Could anybody help with this? Thanks in advance!

RNA-Seq featurecounts • 1.9k views
ADD COMMENT
1
Entering edit mode

I'd guess it depends on overlapping gene loci which can be resolved on exon level.

Have a look at the summary table and compare the ambiguous count.

ADD REPLY
0
Entering edit mode

Just speculating, so not posting it as an answer. Method 1 gets read counts per exon. What does it do if a paired-end read has the forward read aligned to exon 1 and its mate aligned to exon 2, or even a single read aligned across a splice junction? It might add 1 count to exon 1 and 1 count to exon 2. Now if you add up the counts for each exon of that gene, that read pair ends up adding two counts for that gene. If you are just counting over the entire gene region, then that read pair would only add one count. I'm not sure that's actually what featureCounts does, but it's possible.

ADD REPLY
2
Entering edit mode

The first command isn't counting per-exon, it's counting reads/pairs where at least one overlaps a gene's exons. In short, the first command is the standard command for RNA-seq and uses the default values for everything. If one wanted to count per-exon, one would need to change the -g option.

ADD REPLY
0
Entering edit mode

Hey Devon Ryan,

I have only paired end reads (RNAseq) mapping to its genome by bowtie2, and I use featureCounts, too. Could you help me understand my code :

featureCounts -t gene -g "Id" (the rest default, not specified)

Does this code make sure that paired-end reads matching to a gene only count 1? Or 2?

I am not sure should I specify -p to count fragment not reads, and could you help me understand if the following code give the same output as the above one?

featureCounts -p -t gene -g "Id" (the rest default, not specified)

I am wondering since I only have paired reads these two codes should be both OK? right?

Last question: when should I specify strandness? does strandSpecific=2 or 1 give half of count than not specified strandness(0)?

I am very confused about this, and I am not sure if what I understand is right or wrong. I hope to get some confirmation or correction from you.

Thanks a lot. Yanfang

ADD REPLY
0
Entering edit mode

Thanks for the replies. To deal with the overlapping issue as pointed out by you guys, i tried to split the count equally to all overlapping features by :

1.featureCounts -p -O --fraction -t exon -g gene_id
2.featureCounts -p -O --fraction -t gene -g gene_id

I still saw some genes Counts in condition 1 >condition 2. Now i am more confused . @@...

ADD REPLY
3
Entering edit mode

When in doubt, have a look in IGV (make sure to load the GTF you're using) and maybe have featureCounts output the SAM file with reads annotated according to their feature assignment. Then you can debug why reads are being assigned the way they are.

ADD REPLY

Login before adding your answer.

Traffic: 2946 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6