Change of -t argument in featureCounts affecting annotation
8 weeks ago
Marco Pannone ▴ 670

Hey everybody

I am trying to find the explanation for a dilemma I came across during some RNA-seq data analysis.

I noticed that switching between -t exon and -t gene in featureCounts strongly affects the output read counts for certain genes. For example, some housekeeping genes gave the expected high read counts across all samples with -t exon, whereas with -t gene their read counts dropped to 0 or to very low values.

I can't fully understand why this happens, since I would expect -t gene to cover the exons as well when reads are assigned to the annotation.

I appreciate any comments in this regard.

Thanks!

rnaseq subread annotation featureCounts • 287 views

I don't think any answer or comment here can substitute for looking through the annotation file yourself. Chances are that your assumption (genes include exons), while true in a biological sense, doesn't hold in your file. I don't know the reason, but that almost has to be the explanation. I suggest you look through the annotation file and specifically search for the genes where you observed the discrepancy. I suspect their gene definitions/boundaries are defined incorrectly.
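If it helps, here is a rough Python sketch of that lookup, assuming a standard 9-column, tab-separated GTF whose attribute column contains gene_id "..." (as in Ensembl and Gencode files). The function name features_for_gene and the toy records are made up for illustration; they are not part of featureCounts or of any real annotation.

```python
import re
from collections import defaultdict

# Matches the gene_id attribute in the 9th GTF column.
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def features_for_gene(gtf_lines, gene_id):
    """Group (start, end) intervals by feature type for one gene_id.

    Hypothetical helper: expects an iterable of raw GTF lines.
    """
    by_type = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete GTF record
        match = GENE_ID_RE.search(cols[8])
        if match and match.group(1) == gene_id:
            by_type[cols[2]].append((int(cols[3]), int(cols[4])))
    return dict(by_type)


# Toy records standing in for a real file (invented values).
toy_gtf = [
    "#!genome-build toy\n",
    'chr1\ttoy\tgene\t100\t900\t.\t+\t.\tgene_id "G1";\n',
    'chr1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\ttoy\texon\t800\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\ttoy\tgene\t2000\t3000\t.\t-\t.\tgene_id "G2";\n',
]

result = features_for_gene(toy_gtf, "G1")
for feature_type, intervals in result.items():
    print(feature_type, intervals)

# For a real file, pass the open handle instead of the toy list, e.g.
#   with open("annotation.gtf") as fh:   # path is a placeholder
#       print(features_for_gene(fh, "YOUR_GENE_ID"))
```

Comparing the gene interval with the exon intervals for one of the affected housekeeping genes should show whether the gene record is missing or has odd boundaries. Note that the gene_id has to match exactly, including any version suffix the file uses.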


Thanks for your reply. I have downloaded and tried two different .gtf files, one from Ensembl and one from Gencode, and got the same results described in my post with both. I am definitely going to look through the annotation files, but it seems a bit strange that both of them, coming from highly reliable sources, would have discrepancies.


When you have eliminated the impossible, whatever remains, however improbable, must be the truth.

You probably know the quote, or can find its origin easily. The problem must be with the operator, the program, or the files it uses. Since between the two runs the operator hasn't changed, and featureCounts is a mature and well-tested program, it would appear the culprit is in the annotation file. I suspect that some genes will have only exons defined, but not a complete gene boundary.
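One way to test that suspicion across the whole file, as a rough Python sketch: collect every gene_id that has at least one exon record and report those with no record of type gene. It assumes a standard tab-separated GTF with gene_id "..." in the attribute column; the function name genes_missing_gene_record and the toy lines are invented for illustration.

```python
import re

# Matches the gene_id attribute in the 9th GTF column.
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def genes_missing_gene_record(gtf_lines):
    """Return sorted gene_ids that have exon records but no gene record.

    Hypothetical helper: expects an iterable of raw GTF lines.
    """
    with_gene, with_exon = set(), set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete GTF record
        match = GENE_ID_RE.search(cols[8])
        if not match:
            continue
        if cols[2] == "gene":
            with_gene.add(match.group(1))
        elif cols[2] == "exon":
            with_exon.add(match.group(1))
    return sorted(with_exon - with_gene)


# Toy records (invented): G1 has a gene record, G2 only has exons.
toy_gtf = [
    'chr1\ttoy\tgene\t100\t900\t.\t+\t.\tgene_id "G1";\n',
    'chr1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\ttoy\texon\t1500\t1600\t.\t+\t.\tgene_id "G2"; transcript_id "T2";\n',
    'chr1\ttoy\texon\t1800\t1900\t.\t+\t.\tgene_id "G2"; transcript_id "T2";\n',
]

missing = genes_missing_gene_record(toy_gtf)
print(missing)
```

An empty list on the real file would mean every gene with exons also has a gene record, in which case the next thing to inspect is the coordinates of those gene records for the affected genes.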


I really like the quote! Just googled it, I did not know about it, but from now on I will definitely remember it.

You are right, I will look into the annotation files and find the reason for my concern. Thanks again for the time spent on my question, highly appreciated.