Question: Question about featureCounts
11 months ago by chichaochen (10):

Hi, I was trying featureCounts with the following two settings.
1.

featureCounts -t exon -g gene_id (the rest default, not specified)

2.

featureCounts -t gene -g gene_id (the rest default, not specified)

I expected to see more reads (fragments) in condition 2, since the "gene" regions in the annotation GTF cover both the "exon" and the intron regions. Surprisingly, for many genes I got more counts in condition 1. Could anybody help with this? Thanks in advance!

rna-seq featurecounts

I'd guess it comes down to overlapping gene loci, which can be resolved at the exon level.

Have a look at the summary table of each run and compare the number of reads left unassigned because of ambiguity.

written 11 months ago by michael.ante (3.6k)
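The comparison suggested above can be scripted. This is a minimal sketch, assuming each run's `.summary` file is tab-separated with a header line followed by one `status<TAB>count` row per category (for example `Assigned` and `Unassigned_Ambiguity`), and that only the first sample column matters. The function names are hypothetical, and the exact status labels vary between featureCounts versions.

```python
import csv
import io


def read_summary(text):
    """Parse the text of a featureCounts .summary file into {status: count}.

    Assumes a header line, then tab-separated rows whose first column is
    the status and whose second column is the count for the first sample.
    """
    rows = csv.reader(io.StringIO(text), delimiter="\t")
    next(rows, None)  # skip the header line ("Status", sample name, ...)
    return {row[0]: int(row[1]) for row in rows if len(row) >= 2}


def compare_summaries(exon_text, gene_text):
    """Return {status: (count in the exon run, count in the gene run)}."""
    exon = read_summary(exon_text)
    gene = read_summary(gene_text)
    statuses = sorted(set(exon) | set(gene))
    return {s: (exon.get(s, 0), gene.get(s, 0)) for s in statuses}


# Hypothetical usage, with file names chosen only for illustration:
# with open("exon_run.txt.summary") as e, open("gene_run.txt.summary") as g:
#     for status, pair in compare_summaries(e.read(), g.read()).items():
#         print(status, *pair, sep="\t")
```

If the ambiguity row is much larger in the `-t gene` run, overlapping gene loci are a likely explanation for the lower per-gene counts.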

Just speculating, so not posting it as an answer. Method 1 gets read counts per exon. What does it do if a paired-end read has the forward read aligned to exon 1 and its mate aligned to exon 2, or even a single read aligned across a splice junction? It might add 1 count to exon 1 and 1 count to exon 2. Now if you add up the counts for each exon of that gene, that read pair ends up adding two counts for that gene. If you are just counting over the entire gene region, then that read pair would only add one count. I'm not sure that's actually what featureCounts does, but it's possible.

modified 11 months ago • written 11 months ago by colin.kern (920)

The first command isn't counting per exon; it's counting reads (or pairs where at least one mate overlaps) per gene, using each gene's exons. In short, the first command is the standard command for RNA-seq and uses the default values for everything. If one wanted to count per exon, one would need to change the -g option.

written 11 months ago by Devon Ryan (96k)
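To make the overlapping-loci explanation concrete, here is a toy model of per-gene assignment. It is a sketch under stated assumptions, not featureCounts code: the coordinates are made up, gene B is placed inside an intron of gene A, and a read that hits more than one gene is left uncounted, which is how the default settings treat ambiguous reads.

```python
# Toy model only: made-up coordinates, not real featureCounts code.

def overlaps(read, interval):
    """True if two closed intervals (start, end) share at least one base."""
    return read[0] <= interval[1] and interval[0] <= read[1]


def assign(read, annotation):
    """Return the single gene_id whose intervals the read overlaps.

    Returns None when the read overlaps no gene or more than one gene,
    mimicking the default of leaving ambiguous reads uncounted.
    """
    hits = [gene for gene, intervals in annotation.items()
            if any(overlaps(read, iv) for iv in intervals)]
    return hits[0] if len(hits) == 1 else None


# "-t exon -g gene_id": each gene is the union of its exons.
EXONS = {"A": [(100, 200), (800, 1000)], "B": [(400, 600)]}
# "-t gene -g gene_id": each gene is one interval that includes introns.
GENES = {"A": [(100, 1000)], "B": [(400, 600)]}

read = (450, 500)           # lies in B's exon, inside A's intron
print(assign(read, EXONS))  # B
print(assign(read, GENES))  # None
```

In this toy case gene B gets one count at the exon level and none at the gene level, which is the pattern described in the question.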

Thanks for the replies. To deal with the overlap issue you pointed out, I tried splitting each count equally among all overlapping features:

1. featureCounts -p -O --fraction -t exon -g gene_id
2. featureCounts -p -O --fraction -t gene -g gene_id

I still see some genes with higher counts in condition 1 than in condition 2. Now I am even more confused.

modified 11 months ago by genomax (89k) • written 11 months ago by chichaochen (10)
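One possible reason the pattern survives `-O --fraction`: a read that overlaps n genes contributes 1/n to each of them, so a gene nested in another gene's intron still receives less at the gene level than at the exon level. The sketch below is a toy model with made-up coordinates, not featureCounts code, and `fractional_counts` is a hypothetical helper.

```python
# Toy model only: made-up coordinates, not real featureCounts code.

def overlaps(read, interval):
    """True if two closed intervals (start, end) share at least one base."""
    return read[0] <= interval[1] and interval[0] <= read[1]


def fractional_counts(reads, annotation):
    """Split each read equally among all genes it overlaps (1/n each)."""
    counts = {gene: 0.0 for gene in annotation}
    for read in reads:
        hits = [gene for gene, intervals in annotation.items()
                if any(overlaps(read, iv) for iv in intervals)]
        for gene in hits:
            counts[gene] += 1.0 / len(hits)
    return counts


# Gene B sits inside an intron of gene A.
EXONS = {"A": [(100, 200), (800, 1000)], "B": [(400, 600)]}
GENES = {"A": [(100, 1000)], "B": [(400, 600)]}

reads = [(450, 500)]  # one read in B's exon, inside A's intron
print(fractional_counts(reads, EXONS))  # {'A': 0.0, 'B': 1.0}
print(fractional_counts(reads, GENES))  # {'A': 0.5, 'B': 0.5}
```

Here gene B receives 1.0 at the exon level but only 0.5 at the gene level, so "condition 1 > condition 2" for some genes is still expected under this model.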

When in doubt, have a look in IGV (make sure to load the GTF you're using) and maybe have featureCounts output the SAM file with reads annotated according to their feature assignment. Then you can debug why reads are being assigned the way they are.

written 11 months ago by Devon Ryan (96k)
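Once per-read assignments have been written out for both runs (the `-R` option; its accepted arguments differ between versions, so check the help text of your installation), the reads whose assignment changed can be listed with a short script. This is a sketch that assumes a tab-separated per-read file in which the first column is the read name, the second is the assignment status, and the last is the target list. Verify that layout against your own output before relying on it. The function names are hypothetical.

```python
def read_assignments(lines):
    """Map read name -> (status, targets) from tab-separated per-read lines.

    Assumed layout: first column = read name, second column = status,
    last column = assigned target(s). Lines with fewer than three
    columns are skipped. If a name repeats (for example both mates of a
    pair), the last line seen wins.
    """
    assignments = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        assignments[fields[0]] = (fields[1], fields[-1])
    return assignments


def changed_reads(exon_lines, gene_lines):
    """Return {read name: (exon-run result, gene-run result)} for reads
    present in both runs whose status or targets differ."""
    exon = read_assignments(exon_lines)
    gene = read_assignments(gene_lines)
    return {name: (exon[name], gene[name])
            for name in sorted(exon.keys() & gene.keys())
            if exon[name] != gene[name]}


# Hypothetical usage, with file names chosen only for illustration:
# with open("exon_run.per_read.txt") as e, open("gene_run.per_read.txt") as g:
#     for name, (in_exon, in_gene) in changed_reads(e, g).items():
#         print(name, in_exon, in_gene, sep="\t")
```

The read names it reports are good candidates to look up in IGV alongside the GTF.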