Using single-sample GSEA with a TPM file
4.4 years ago

I have a single RNA-Seq sample analyzed with kallisto, and the output is a TSV file with 170k lines (transcripts) and TPM values. I want to do single-sample pathway enrichment (gene set enrichment), but I am not sure how to further preprocess the kallisto output so that it fits into GSVA. In particular:

1.) I need Entrez gene ids as input, I guess. However, I have multiple transcripts of a gene in Kallisto output. Is this a problem?

2.) Do I need any z-transformation?

Thank you very much for your help,

Sebastian

RNA-Seq kallisto gene set enrichment SSGSEA • 1.8k views
4.4 years ago
igor 13k

1.) I need Entrez gene ids as input, I guess. However, I have multiple transcripts of a gene in Kallisto output. Is this a problem?

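You do not strictly need Entrez gene IDs, but the identifiers used in your gene sets have to match the identifiers in your expression data. Most gene sets are defined at the gene level, not the transcript level, so having multiple transcripts per gene is a problem only until you summarize them. You can use tximport to convert transcript-level values to gene-level values.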
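
To make the transcript-to-gene step concrete, here is a minimal sketch of the underlying idea in plain Python: summing the TPM values of all transcripts that belong to the same gene. It is not tximport itself (that is an R package), just an illustration. The column names follow kallisto's abundance.tsv layout; the toy rows, the `TX2GENE` map and the helper `gene_level_tpm` are made up for this example.

```python
import csv
import io
from collections import defaultdict

# Toy stand-in for kallisto's abundance.tsv (a real file has ~170k rows).
ABUNDANCE_TSV = (
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "tx1\t1000\t850\t10\t2.5\n"
    "tx2\t1500\t1350\t30\t4.0\n"
    "tx3\t800\t650\t5\t1.5\n"
)

# Hypothetical transcript-to-gene map. In practice, build it from the same
# annotation that was used to create the kallisto index.
TX2GENE = {"tx1": "GENE_A", "tx2": "GENE_A", "tx3": "GENE_B"}


def gene_level_tpm(abundance_handle, tx2gene):
    """Sum transcript TPMs per gene; transcripts missing from the map are skipped."""
    totals = defaultdict(float)
    for row in csv.DictReader(abundance_handle, delimiter="\t"):
        gene = tx2gene.get(row["target_id"])
        if gene is not None:
            totals[gene] += float(row["tpm"])
    return dict(totals)


gene_tpm = gene_level_tpm(io.StringIO(ABUNDANCE_TSV), TX2GENE)
print(gene_tpm)
```

The gene identifiers on the right-hand side of the map should be of the same type (Entrez, Ensembl, symbols) as the ones in the gene sets you plan to test.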

2.) Do I need any z-transformation?

That will depend on what you are doing. Different algorithms have different expectations. GSVA expects logCPM or logTPM values if kcdf="Gaussian" (see docs).
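
As an illustration of what "logTPM" could look like, here is a small Python sketch that applies log2(TPM + 1) to gene-level values. The pseudocount of 1 is a common convention and an assumption on my part, not something prescribed by the GSVA docs; the gene names and the helper `log2_tpm` are made up. Also keep in mind that a per-gene z-score across samples cannot be computed from a single sample.

```python
import math


def log2_tpm(gene_tpm, pseudocount=1.0):
    """Return log2(TPM + pseudocount) per gene; the pseudocount avoids log2(0)."""
    return {gene: math.log2(tpm + pseudocount) for gene, tpm in gene_tpm.items()}


# Toy gene-level TPM values (hypothetical).
gene_tpm = {"GENE_A": 7.0, "GENE_B": 1.0, "GENE_C": 0.0}
log_tpm = log2_tpm(gene_tpm)
print(log_tpm)
```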


Thank you very much Igor for your helpful reply.

One more question: are you aware of a step-by-step walkthrough or example code showing how to (correctly) perform gene set enrichment analysis with one sample, i.e. a table of genes with a single column of gene expression values (in TPM)?

Thanks again for your help,

Sebastian

On Mon, 9 Dec 2019 at 16:35, igor on Biostar <mailer@biostars.org> wrote:


I assume you can just give it a one-column matrix (genes as rows, your single sample as the only column).
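
To show why one column is enough for a rank-based single-sample score, here is a rough Python sketch of a running-sum statistic in the spirit of ssGSEA. This is a simplified illustration, not the GSVA implementation, and its values will not match GSVA output. The function `single_sample_score`, the exponent default `alpha=0.25`, the toy expression values and the gene sets are all assumptions made for the example.

```python
def single_sample_score(expression, gene_set, alpha=0.25):
    """Simplified ssGSEA-style running-sum score for one sample.

    expression: dict mapping gene -> expression value (the single "column").
    gene_set:   iterable of gene identifiers of the same type as the keys.
    """
    # Order genes from highest to lowest expression; the top gene gets rank N.
    ordered = sorted(expression, key=expression.get, reverse=True)
    n = len(ordered)
    members = set(gene_set) & set(ordered)
    if not members or len(members) == n:
        raise ValueError("gene set must overlap the data without covering all genes")

    rank = {gene: n - i for i, gene in enumerate(ordered)}
    hit_total = sum(rank[g] ** alpha for g in members)
    miss_step = 1.0 / (n - len(members))

    # Walk down the ranked list, accumulating the gap between the weighted
    # fraction of set genes seen and the fraction of non-set genes seen.
    p_hit = p_miss = score = 0.0
    for gene in ordered:
        if gene in members:
            p_hit += rank[gene] ** alpha / hit_total
        else:
            p_miss += miss_step
        score += p_hit - p_miss
    return score


# Toy one-sample input: gene -> log-scale expression (hypothetical values).
sample = {"GENE_A": 5.0, "GENE_B": 4.0, "GENE_C": 3.0, "GENE_D": 2.0}

top_score = single_sample_score(sample, ["GENE_A", "GENE_B"])
bottom_score = single_sample_score(sample, ["GENE_C", "GENE_D"])
print(top_score, bottom_score)
```

A set made of the most highly expressed genes scores positive and a set of the lowest expressed genes scores negative, which is all that can be said from one sample: the score reflects ranking within that sample, not a comparison against other samples.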
