Hi all, I have come across something I have never seen before. I am working with data from an outside source that appear to be processed RNA-seq files. Like other processed RNA-seq files I have run into, they are tab-delimited files with columns for gene length, expected gene length, TPM, and counts for each probe identifier. Here is where things get weird: for any two samples and the same probe set identifier, the gene lengths are different, and the difference can be quite large! I have never seen this. In my experience the gene lengths have always been the same from sample to sample, while the expected lengths may vary a little bit. This ultimately affects how the TPM is calculated and makes me wonder what I am missing. Does anybody have a clue why this might be the case?
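
To illustrate why the length matters, here is a minimal sketch, assuming the usual TPM definition (per-gene rate = count / length, rescaled so the rates sum to one million). The function name `tpm` and all the numbers are made up for illustration; they are not taken from my files, and I do not know which tool or which length column was actually used to compute the TPM values I received.

```python
def tpm(counts, lengths):
    """TPM under the usual definition (an assumption about how the files were made):
    rate = count / length for each gene, then scale the rates to sum to 1e6."""
    rates = [c / l for c, l in zip(counts, lengths)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


# Hypothetical data: identical counts in both samples, three genes.
counts = [100, 200, 300]

# Sample A and sample B differ only in the length reported for the first gene.
sample_a = tpm(counts, [1000, 2000, 3000])
sample_b = tpm(counts, [500, 2000, 3000])

for gene, (a, b) in enumerate(zip(sample_a, sample_b), start=1):
    print(f"gene {gene}: TPM sample A = {a:.1f}, TPM sample B = {b:.1f}")
```

With identical counts, halving the length of the first gene moves its TPM from about a third of a million to half a million, and the TPM of the other two genes drops as well, because the values are renormalized within each sample.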