Hey everyone,
I just realized that until GENCODE v41 the so-called "main annotation file for most users" was the Comprehensive gene annotation for CHR. Since v42 it is the Basic gene annotation for CHR. Why the change, and would it make a difference if I only need the exon regions?
Thanks for that. But release 23 still shows the comprehensive file as "main annotation file for most users".
Change happened when going from release 22 ( https://www.gencodegenes.org/human/release_22.html ) to 23.
Ah, I see what you mean. So from 22 to 23 the basic version in general was introduced. But as I see it, from version 42 to 43 GENCODE decided that basic should be the new default.
I think that is just a recommendation for most users. If you are not interested in haplotypes and other complexities, then use the basic.
Hmm, what I am currently trying to do is extract the total length of all non-overlapping exons (see here). However, it makes a difference whether I use the basic annotations or the comprehensive ones. I find 62703 total genes with their corresponding exon length (these are the same as the ones I find with the GenomicFeatures R package). However, 19862 of them differ depending on whether I use basic or comprehensive annotations.
Are you accounting for the haplotypes?
Not really. Actually, I am just looking for a good way to calculate TPM, since for this I need the gene length. The genomic gene length doesn’t make sense, though, since what gets sequenced is mRNA. And since I don’t know which transcript isoform I am actually sequencing, I think the most fitting approach would be to take the total length of all available exons. I know this is not completely correct, but I can’t think of a better way.
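For anyone following along, here is a minimal Python sketch of the approach described above: merge all exons of a gene into non-overlapping intervals, sum their lengths, and use that as the gene length for TPM. The function names (`parse_exons`, `union_length`, `tpm`) are made up for illustration and are not from GenomicFeatures or any GENCODE tool. It assumes a standard 9-column GTF with a `gene_id "..."` attribute, 1-based inclusive coordinates, and that all exons of a given `gene_id` lie on the same chromosome. Since the basic file contains only a subset of the transcripts in the comprehensive file, the merged exon length of a gene can come out shorter with basic, which would be consistent with the differences reported above.

```python
from collections import defaultdict
import re


def parse_exons(gtf_lines):
    """Collect (start, end) exon intervals per gene_id from GTF lines."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # keep exon features only
        match = re.search(r'gene_id "([^"]+)"', fields[8])
        if match:
            exons[match.group(1)].append((int(fields[3]), int(fields[4])))
    return exons


def union_length(intervals):
    """Bases covered by 1-based inclusive intervals, overlaps counted once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end + 1:
            # Gap before this interval: close the current merged block.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or directly adjacent: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def tpm(counts, lengths):
    """TPM from raw counts and gene lengths (dicts keyed by gene_id)."""
    rates = {gene: counts[gene] / lengths[gene] for gene in counts}
    scale = sum(rates.values())
    return {gene: rate / scale * 1e6 for gene, rate in rates.items()}
```

Usage would be something like `lengths = {g: union_length(iv) for g, iv in parse_exons(open(path)).items()}` with `path` pointing at either the basic or the comprehensive GTF, followed by `tpm(counts, lengths)`. Running it on both files and comparing the two `lengths` dicts should show which genes change.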