I have RNA-Seq data from an experiment that involves a global shift in transcript abundance, and where spike-ins were added (to try to overcome the bias caused by this global shift). Reads were aligned and counted using STAR and RSEM. I have been trying to find a consistent approach to normalising RNA-Seq counts using ERCC spike-in counts, but all I find are quite disparate approaches.
One of the approaches I read about normalised the RPKMs of the genes by dividing them by the sum of the RPKMs of the spike-ins. Influenced by this, I aimed to normalise TPMs the same way: the TPM of each gene divided by the sum of the TPMs of the spike-ins in the same sample.
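To make the calculation concrete, here is a minimal sketch of what I mean for a single sample. The function name, the `ERCC-` ID prefix used to recognise spike-ins, and the TPM values are all assumptions made up for illustration; they are not taken from any particular pipeline or dataset.

```python
def spike_in_normalise(tpm, spike_prefix="ERCC-"):
    """Divide each endogenous gene's TPM by the summed spike-in TPM of the sample.

    `tpm` maps feature IDs to TPM values for ONE sample.
    Spike-ins are identified by an assumed ID prefix (hypothetical convention).
    """
    spike_total = sum(v for k, v in tpm.items() if k.startswith(spike_prefix))
    if spike_total <= 0:
        raise ValueError("no spike-in signal found in this sample")
    # Keep only endogenous genes, each scaled by the spike-in total.
    return {k: v / spike_total
            for k, v in tpm.items()
            if not k.startswith(spike_prefix)}


# Made-up TPM values for one sample, for illustration only.
sample = {"GeneA": 300.0, "GeneB": 100.0, "ERCC-00002": 40.0, "ERCC-00004": 10.0}
normalised = spike_in_normalise(sample)
```

The same function would be applied to each sample separately, so that every sample is scaled by its own spike-in total.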
Even though it seems quite straightforward to me, I cannot find anyone doing something similar, and I am quite new to the field. Would these spike-in-normalised TPMs be reliable? If not, can someone point out where the mistake would be?
Not something you will like but here goes:
https://support.bioconductor.org/p/88413/
What is ERCC spike-in really saying about my sequencing run?