Quantification of a gene that has copies on multiple chromosomes using featureCounts
6 weeks ago

I would appreciate it if anyone could help me understand the following issue I have with gene quantification using featureCounts.

As you can see in the example featureCounts output below, some genes are annotated on more than one chromosome. I think featureCounts computes the gene length by counting the total number of bases in the exons of the gene copies across all of those chromosomes:

GeneID  64109
Chr     chrX;chrX;chrX;chrX;chrX;chrX;chrX;chrX;chrX;chrY;chrY;chrY;chrY;chrY;chrY;chrY;chrY;chrY
Start   1190449;1193218;1196780;1198562;1198801;1202402;1206433;1208806;1212556;1190449;1193218;1196780;1198562;1198801;1202402;1206433;1208806;1212556
End     1191160;1193302;1196900;1198724;1199145;1202535;1206599;1208908;1212815;1191160;1193302;1196900;1198724;1199145;1202535;1206599;1208908;1212815
Strand  -;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-
Length  4180

So, the gene ‘64109’ belongs to chromosome X as well as chromosome Y. The total length is 4180, which is the sum of the lengths of all the exons on both chromosomes (2090 bases on X plus 2090 bases on Y). My concern is: is a gene count based on a gene length that spans multiple chromosomes sensible? For example, what if the copies of gene ‘64109’ on the X and Y chromosomes have different biological functions? I think there is some important understanding I am lacking here. An explanation would be great!
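To check the arithmetic, here is a minimal Python sketch that recomputes the exon bases per chromosome from the Chr, Start and End fields of the row above. The function name `exon_bases_by_chrom` is only illustrative, and the sketch assumes the listed exons do not overlap (true for this row); it is not a reimplementation of how featureCounts derives the Length column in general.

```python
from collections import defaultdict

# Fields copied from the featureCounts row for gene 64109 shown above.
# The same nine exon intervals are listed once for chrX and once for chrY.
starts = "1190449;1193218;1196780;1198562;1198801;1202402;1206433;1208806;1212556"
ends = "1191160;1193302;1196900;1198724;1199145;1202535;1206599;1208908;1212815"
chr_field = ";".join(["chrX"] * 9 + ["chrY"] * 9)
start_field = ";".join([starts, starts])
end_field = ";".join([ends, ends])


def exon_bases_by_chrom(chr_field, start_field, end_field):
    """Sum exon lengths per chromosome, treating Start/End as inclusive.

    Assumes the exons on each chromosome do not overlap one another.
    """
    totals = defaultdict(int)
    chroms = chr_field.split(";")
    exon_starts = [int(s) for s in start_field.split(";")]
    exon_ends = [int(e) for e in end_field.split(";")]
    for chrom, start, end in zip(chroms, exon_starts, exon_ends):
        totals[chrom] += end - start + 1
    return dict(totals)


per_chrom = exon_bases_by_chrom(chr_field, start_field, end_field)
print(per_chrom)                # {'chrX': 2090, 'chrY': 2090}
print(sum(per_chrom.values()))  # 4180, matching the Length column
```

So the reported length is exactly twice the exon length of a single copy, because the identical set of exon coordinates is annotated on both chrX and chrY.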

If this is indeed not the right way to quantify a gene that maps to multiple chromosomes, how do you address this issue? Should such genes be removed from the analysis?
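Before deciding how to handle such genes, it may help to see how many there are. The following is a rough Python sketch that lists genes whose Chr field contains more than one distinct chromosome. The function name `multi_chrom_genes` and the column names `Geneid` and `Chr` are assumptions (adjust them to match your table), and the small example table is made up apart from gene 64109, whose Chr field is abbreviated here.

```python
import csv
import io


def multi_chrom_genes(lines, id_col="Geneid", chr_col="Chr"):
    """Return {gene_id: sorted chromosomes} for genes on more than one chromosome.

    `lines` is any iterable of tab-separated text lines, such as an open file.
    Lines starting with '#' (for example a leading comment line) are skipped.
    The default column names are assumptions, not guaranteed to match your file.
    """
    data = (line for line in lines if not line.startswith("#"))
    reader = csv.DictReader(data, delimiter="\t")
    flagged = {}
    for row in reader:
        chroms = sorted(set(row[chr_col].split(";")))
        if len(chroms) > 1:
            flagged[row[id_col]] = chroms
    return flagged


# Tiny made-up table for illustration; "geneA" is hypothetical.
example = io.StringIO(
    "# hypothetical comment line\n"
    "Geneid\tChr\tLength\n"
    "64109\tchrX;chrX;chrY;chrY\t4180\n"
    "geneA\tchr1;chr1\t1500\n"
)
print(multi_chrom_genes(example))  # {'64109': ['chrX', 'chrY']}
```

This only identifies candidate genes; whether to keep, split or drop them is a separate decision that the sketch does not make.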

gene RNAseq featureCounts
6 weeks ago
Zhilong Jia ★ 1.8k

Humans have 23 pairs of chromosomes, and one of those pairs is made up of the X and Y chromosomes. A figure in the link will clarify this point.


