TPM for scRNAseq
2.5 years ago
little_more ▴ 70

Hi all!

I'm trying to integrate scRNAseq data from 2 different papers: Neftel et al. and Darmanis et al. The Darmanis data were generated with SmartSeq2 and are provided as raw counts, while the Neftel paper provides 2 datasets: one from 10X as raw counts (UMIs, I suppose) and one from SmartSeq2 as TPM (raw data are not available). I'm working with scanpy and I haven't seen any tutorials that use TPM for scRNAseq (usually only CPM), but I figured I need to transform the Neftel 10X data and the Darmanis SmartSeq2 data into TPM so that I can integrate all the datasets together (since I can't recover raw counts from TPM).

Now, I am not sure how to normalize for gene length. As far as I understand, the TPMs provided for SmartSeq2 by the Neftel group were obtained using RSEM, and it seems that RSEM computes an effective length for each gene independently in each sample, as the weighted average of the effective lengths of its isoforms (weighted by 'IsoPct').
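
To make the weighting concrete, here is a minimal sketch of that per-sample average as I understand it. This is not RSEM's own code: the function name `gene_effective_length` is hypothetical, and falling back to an unweighted mean when every isoform has zero weight is my assumption.

```python
import numpy as np

def gene_effective_length(iso_eff_lengths, iso_pct):
    """Weighted average of isoform effective lengths for one gene in one sample.

    iso_eff_lengths: effective length of each isoform of the gene
    iso_pct: relative abundance of each isoform (e.g. 'IsoPct', in percent)
    """
    lengths = np.asarray(iso_eff_lengths, dtype=float)
    weights = np.asarray(iso_pct, dtype=float)
    total = weights.sum()
    if total == 0:
        # Assumption: gene not expressed in this sample, so use a plain mean.
        return float(lengths.mean())
    return float((lengths * weights).sum() / total)

# Two isoforms of 1000 and 2000 bp, expressed at 75% and 25%.
example_length = gene_effective_length([1000, 2000], [75, 25])
```

The point is that the same gene can get a different length in every cell, depending on which isoforms dominate, which is why a single length per gene from Biomart is only an approximation.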

So my questions are:

1) If I just download all transcript lengths from Biomart, compute an average transcript length for each gene, and use that to calculate TPMs, would it be reasonable to use these TPMs to integrate the three datasets? Or would they differ too much from the RSEM TPMs to be compatible?

2) Should I just use 1 as the transcript length for the 10X dataset?

3) Is it even a good idea to transform raw counts to TPM for this type of analysis, or should I just remove the Neftel SmartSeq2 dataset from the analysis and proceed with raw counts and CPM? I also plan to identify cell clusters in the combined dataset and find marker genes for them.
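
For reference, this is a rough sketch of the conversion I have in mind for questions 1 and 2, assuming a cells x genes count matrix and one fixed length per gene. The names `counts_to_tpm` and `counts_to_cpm` are hypothetical helpers, not scanpy functions. Setting every length to 1 reduces TPM to CPM, which is what using 1 for the 10X data would amount to.

```python
import numpy as np

def counts_to_tpm(counts, gene_lengths):
    """Convert a cells x genes count matrix to TPM using one length per gene."""
    counts = np.asarray(counts, dtype=float)
    lengths = np.asarray(gene_lengths, dtype=float)
    rate = counts / lengths                      # reads per base, per gene
    totals = rate.sum(axis=1, keepdims=True)     # per-cell normalizer
    totals[totals == 0] = 1.0                    # leave empty cells as all zeros
    return rate / totals * 1e6

def counts_to_cpm(counts):
    """CPM is the special case of TPM where every gene has length 1."""
    counts = np.asarray(counts, dtype=float)
    return counts_to_tpm(counts, np.ones(counts.shape[1]))

# Toy data: 2 cells x 2 genes, with gene lengths of 1000 and 2000 bp.
toy_counts = np.array([[10, 20], [0, 5]])
toy_tpm = counts_to_tpm(toy_counts, [1000, 2000])
toy_cpm = counts_to_cpm(toy_counts)
```

In the first toy cell the two genes have equal TPM but a 1:2 ratio in CPM, so the choice of length does change the relative expression of genes within a cell.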


Unless the scRNA-seq variants of salmon or kallisto were used to calculate the TPMs, I would recommend either reprocessing the data to get raw counts or excluding the dataset. Calculating TPMs without building transcript-level models often leads to misleading results, since you don't know which isoform(s) are expressed.
