Is it possible to calculate TPM using 10X Genomics public data?
23 months ago
Athena • 0

I'm wondering whether I could calculate TPM from the data linked below: https://support.10xgenomics.com/single-cell-gene-expression/datasets/2.0.1/pbmc4k Or is this data not sufficient for that?

genomics Python R genome • 1.5k views

Why do you want TPM? TPM divides counts by transcript length, but with UMI-tagged data a longer transcript does not necessarily produce more counts. I'd recommend not dividing by transcript length for 10X data.


I'm trying to run a correlation test against my bulk data (which is in RPKM, FPKM, or TPM) and then do some further downstream analysis.

What would be a better method, then, if you do not recommend dividing by transcript length?


Just don't divide by transcript lengths. Just take a gene's UMI count and divide it by the total number of UMIs in a cell (this is essentially what TPM is except we're not dividing by transcript length).
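As a rough illustration of that per-cell scaling, here is a minimal NumPy sketch. It assumes the counts are already loaded as a genes x cells array; the function name `counts_per_million` and the toy matrix are made up for the example, and the "per million" multiplier is just the usual convention, not something specific to 10X.

```python
import numpy as np

def counts_per_million(counts):
    """Scale each cell (column) of a genes x cells UMI matrix to sum to 1e6.

    No transcript-length term is involved, unlike TPM.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=0, keepdims=True)   # total UMIs per cell
    safe = np.where(totals == 0, 1.0, totals)    # leave empty cells as all zeros
    return counts / safe * 1e6

# Toy data (hypothetical): 3 genes, 2 cells
umi = np.array([[10, 0],
                [30, 5],
                [60, 15]])
cpm = counts_per_million(umi)
```

For a real dataset you would keep the matrix sparse rather than converting it to a dense array, but the arithmetic is the same.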


Just use the raw counts. All of the above methods introduce a single linear scaling factor so correlation does not change regardless of the method -- unless you introduce something like a per-gene factor such as length -- which as pointed out makes no sense for 10X data.
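A quick way to convince yourself of this is the sketch below. It takes a single expression profile (for example one cell, or counts summed over cells), and all the numbers, including the gene lengths, are invented for illustration. Pearson correlation is unchanged when the whole profile is multiplied by one constant, but it does change once each gene gets its own divisor.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D profiles."""
    return float(np.corrcoef(x, y)[0, 1])

# Hypothetical values for five genes
sc_counts = np.array([5.0, 20.0, 0.0, 50.0, 12.0])             # summed UMI counts
bulk = np.array([8.0, 15.0, 1.0, 70.0, 9.0])                   # bulk expression
gene_len = np.array([500.0, 4000.0, 1200.0, 800.0, 2500.0])    # made-up lengths

r_raw = pearson(sc_counts, bulk)
# One global factor (divide by the total, multiply by 1e6): same correlation
r_scaled = pearson(sc_counts / sc_counts.sum() * 1e6, bulk)
# A per-gene factor (length): the correlation is no longer the same
r_len = pearson(sc_counts / gene_len, bulk)

print(r_raw, r_scaled, r_len)
```

Note that this invariance is about one global factor per profile. If each cell is scaled by its own total before cells are averaged together, the result is no longer a single rescaling of the summed counts.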

23 months ago

I would apply a standard single-cell normalisation to the counts, such as the LogNormalize method from the Seurat package. Length normalisation only matters for full-length transcriptome sequencing.
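For reference, this is roughly what that kind of log-normalisation does, written as a NumPy sketch rather than Seurat's actual code. It assumes a genes x cells count matrix and the commonly used scale factor of 10,000; the function name `log_normalize` and the toy matrix are hypothetical.

```python
import numpy as np

def log_normalize(counts, scale_factor=10_000):
    """Divide each cell by its total count, multiply by scale_factor, take log1p.

    counts: genes x cells array of UMI counts.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=0, keepdims=True)   # total UMIs per cell
    safe = np.where(totals == 0, 1.0, totals)    # avoid dividing by zero
    return np.log1p(counts / safe * scale_factor)

# Toy data (hypothetical): 3 genes, 2 cells
umi = np.array([[10, 0],
                [30, 5],
                [60, 15]])
norm = log_normalize(umi)
```

Zero counts stay at zero after the transform, which is one reason `log1p` is used instead of a plain log.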
