I am wondering if anyone has estimated transcript-level counts from plate-based scRNA-seq datasets such as Smart-seq, which capture full-length transcripts instead of 3' ends. I tried to use Salmon, but the estimated counts/TPMs seem to be highly inflated.
For example, the Salmon output for one transcript from a single cell looks like this:
Name                Length  EffectiveLength  TPM      NumReads
ENST00000344843.7   721     250              247.016  471.945
When I looked at the BAM file for the same cell in IGV, that transcript had only around 50 reads mapped, whereas Salmon reports a NumReads of 471.945. This happens for a lot of transcripts.
I am wondering why the values are inflated. One possible reason is the default scaling factor, which may be designed for bulk RNA-seq, where total counts tend to be in the millions.
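One way I could test the scaling idea is sketched below. It assumes, as far as I understand the quant.sf format, that TPM is derived from NumReads and EffectiveLength and normalized to one million, while NumReads itself is not rescaled and should sum to roughly the number of mapped fragments in the cell. The function names `load_quant` and `recompute_tpm` and the toy table are hypothetical and only for illustration; the values are made up and are not from the cell above.

```python
import csv
import io


def load_quant(handle):
    """Parse a quant.sf-style table (tab-separated, with a header row)."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        {
            "Name": row["Name"],
            "EffectiveLength": float(row["EffectiveLength"]),
            "TPM": float(row["TPM"]),
            "NumReads": float(row["NumReads"]),
        }
        for row in reader
    ]


def recompute_tpm(rows):
    """Assumed relation: TPM_i = 1e6 * (n_i / l_i) / sum_j (n_j / l_j)."""
    rates = [r["NumReads"] / r["EffectiveLength"] for r in rows]
    total = sum(rates)
    return [1e6 * rate / total if total > 0 else 0.0 for rate in rates]


# Toy two-transcript table (made-up values, hypothetical transcript names).
toy = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "txA\t721\t250\t0\t100\n"
    "txB\t1500\t1000\t0\t400\n"
)
rows = load_quant(io.StringIO(toy))

# If NumReads were scaled up to bulk-like depth, this sum would be far
# larger than the number of fragments actually mapped for the cell.
total_reads = sum(r["NumReads"] for r in rows)
tpm = recompute_tpm(rows)
```

On a real cell, I would open the quant.sf file in place of the toy string and compare the NumReads total with the mapped-fragment count for that cell. If the two agree, the inflation is probably not a global scaling factor.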
I would like to know if anyone has estimated transcript counts from scRNA-seq data before. There are tools like Alevin, but they seem to work only with 3'-enriched droplet-based methods.