This is my htseq-count command for gene level counts:
htseq-count -f bam -r name -s yes -a 10 -t exon -i gene_id -m union sample_Aligned.out.bam gencode.v23.annotation_subset.gtf > sample_Aligned.out.table.txt ENSG00000075624.13 171006 ENSG00000107554.15 379 ENSG00000107796.12 1508 ENSG00000111640.14 61287 ENSG00000112237.12 1299 ENSG00000112592.12 812 ENSG00000131203.12 27 ENSG00000143632.14 27 ENSG00000144713.12 12691 ENSG00000156508.17 58724 ENSG00000159251.6 5 ENSG00000163017.13 142 ENSG00000163631.16 14 ENSG00000184009.9 131168 ENSG00000188676.13 0 __no_feature 84834542 __ambiguous 0 __too_low_aQual 0 __not_aligned 2970335 __alignment_not_unique 12964119
ENSG00000075624.13 is ACTB an actin gene which has 171006 reads mapped to the gene. And this is my command for quantifying exon level counts where I change the -i parameter to
htseq-count -f bam -r name -s yes -a 10 -t exon -i exon_id -m union sample_Aligned.out.bam gencode.v23.annotation_subset.gtf > sample_Aligned.out.table.txt
For ACTB, I am getting the following counts for its 39 exons sorted by the counts:
ENSE00001902558.1 ACTB 327 ENSE00001731256.1 ACTB 38 ENSE00001911024.1 ACTB 9 ENSE00001775033.1 ACTB 1 ENSE00001958169.1 ACTB 0 ENSE00003669840.1 ACTB 0 ENSE00003664639.1 ACTB 0 ...
So here the total exon counts are 375 which is not even close to the gene level count 171006.
Similarly, at the exon level for ENSG00000131203.12 I am getting 0 counts but at gene level I have 27 counts.
What am I doing wrong? Any help would be much appreciated.