For the past week I have been trying to compare RNA-seq isoform-level quantification with NanoString data in order to assess the reproducibility between the two platforms. I have two samples with both types of data available and a gene signature in which I'm especially interested. I have been using STAR for mapping and RSEM for quantification, with hg38 as the reference genome and hg38.ensGene.gtf (downloaded from the UCSC site) as the annotation.
The pipeline runs without problems, but the results do not match my expectations at all. For some of these genes, the "expected_count" and "TPM" values reported by RSEM are 0, even though I can see reads mapping to these isoforms when I open the BAM files in IGV. An example is VEGFA, for which I link a screenshot.
Reads are clearly mapping to VEGFA, yet some of the isoforms have a TPM of 0 (for example the one marked in grey, which is the largest). These results are reproduced when using MapSplice instead of STAR as the alignment algorithm.
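To make the observation concrete, here is a minimal sketch of how the zero-TPM isoforms of one gene could be listed from RSEM's isoform-level output. It assumes the standard tab-separated `*.isoforms.results` layout (transcript_id, gene_id, length, effective_length, expected_count, TPM, FPKM, IsoPct). The function name `zero_tpm_isoforms` and the demo rows are made up for illustration and are not real VEGFA values.

```python
import csv
import io


def zero_tpm_isoforms(handle, gene_id):
    """Return (transcript_id, length, expected_count) for every isoform
    of `gene_id` whose TPM is exactly 0 in an RSEM isoforms.results table."""
    reader = csv.DictReader(handle, delimiter="\t")
    hits = []
    for row in reader:
        if row["gene_id"] == gene_id and float(row["TPM"]) == 0.0:
            hits.append(
                (row["transcript_id"], int(row["length"]), float(row["expected_count"]))
            )
    return hits


# Hypothetical demo table: identifiers and numbers are invented.
demo = (
    "transcript_id\tgene_id\tlength\teffective_length\texpected_count\tTPM\tFPKM\tIsoPct\n"
    "TX_A\tGENE_1\t3600\t3450.00\t0.00\t0.00\t0.00\t0.00\n"
    "TX_B\tGENE_1\t1200\t1050.00\t85.00\t12.50\t9.80\t100.00\n"
    "TX_C\tGENE_2\t900\t750.00\t0.00\t0.00\t0.00\t0.00\n"
)

result = zero_tpm_isoforms(io.StringIO(demo), "GENE_1")
print(result)
```

On a real run, the file would be opened with `open("sample.isoforms.results")` (the file name is an assumption) and `gene_id` would be whatever identifier the GTF assigns to VEGFA.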
Why is this happening? Why is RSEM assigning reads to some isoforms and not others when they are so similar? Please help.
Yours truly, Arturo