Question

TCGA data analysis: DEG analysis

5.7 years ago

aswanikrishnap1994 ▴ 20

I am currently working with a TCGA dataset from the UCSC Xena browser. I have completed a differential gene expression analysis with DESeq2 in GenePattern for one cancer dataset. I have some questions about the results and would like to know how to proceed with the analysis across different cancer samples. I am very new to TCGA and am currently doing the analysis based on

My questions about the analysis are as follows:

1. The HTSeq count file downloaded from Xena has transcript IDs, but I want gene IDs for my analysis. How should I do this?
2. To generate a heatmap of the DEGs from different cancer datasets, should I use the log2 expression values from DESeq2?

RNA-Seq TCGA DEG ANALYSIS • 2.6k views

updated 5.7 years ago by ATpoint 88k • written 5.7 years ago by aswanikrishnap1994 ▴ 20

There is no need to SHOUT. I have removed the excessive uppercase letters from your title.

5.7 years ago by WouterDeCoster 48k

score 2 · Answer 1 · 2019-10-17

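Regarding 1): You should check exactly how this file was created. HTSeq-count output is normally already at the gene level, so first confirm whether the IDs really are transcript IDs (ENST...) rather than Ensembl gene IDs (ENSG...). If the file is indeed transcript-level, aggregate it to the gene level with tximport, whose output integrates seamlessly into DESeq2. Check the respective manuals; code is given there.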
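To make the aggregation step concrete, here is a minimal base-R sketch. It is not what tximport does internally (tximport is built around quantifier output and also handles transcript lengths); it is a simplified stand-in that just sums counts per gene. The matrix `tx.counts`, the table `tx2gene`, and all IDs and values are made up for illustration. It also assumes the values are raw counts; if the downloaded matrix is log-transformed, it would need to be back-transformed first.

```r
# Toy transcript-level count matrix (hypothetical IDs and values).
# matrix() fills column by column: sampleA = 10, 20, 5, 0; sampleB = 30, 40, 15, 2.
tx.counts <- matrix(
  c(10, 20, 5, 0,
    30, 40, 15, 2),
  nrow = 4,
  dimnames = list(c("ENST0001.1", "ENST0002.3", "ENST0003.2", "ENST0004.1"),
                  c("sampleA", "sampleB"))
)

# Hypothetical transcript-to-gene table; in practice this should come from
# the same annotation that was used to produce the counts.
tx2gene <- data.frame(
  tx   = c("ENST0001", "ENST0002", "ENST0003", "ENST0004"),
  gene = c("GENE_A", "GENE_A", "GENE_B", "GENE_B"),
  stringsAsFactors = FALSE
)

# Drop the version suffix (".1", ".3", ...) so the IDs match the table.
tx.ids <- sub("\\.[0-9]+$", "", rownames(tx.counts))

# Look up the gene for each row and make sure nothing is unmapped.
gene.of.tx <- tx2gene$gene[match(tx.ids, tx2gene$tx)]
stopifnot(!anyNA(gene.of.tx))

# Sum the counts of all transcripts belonging to the same gene.
gene.counts <- rowsum(tx.counts, group = gene.of.tx)
gene.counts
```

The resulting `gene.counts` has one row per gene and one column per sample, which is the shape DESeq2 expects for a count matrix.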

2) I would use Z-transformed log2 expression values for clustering. These could be the log2-transformed values from DESeq2 itself, or you can run vst or rlog on the raw gene-level data again; the output of the latter two is already on the log2 scale. Given a matrix of these values with genes in rows and samples in columns (here called fc.matrix), you can do t(scale(t(fc.matrix))) to get the Z-scores. This focuses the clustering on the relative differences between the samples for each gene and keeps outliers, e.g. a few genes showing extreme fold changes, from dominating it, because the Z-score is a relative per-gene measure: it indicates how far each sample diverges from the mean of all samples for that gene, in units of standard deviations. See e.g. the Wikipedia article on Z-scores (standardization).
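As a small illustration of the row-wise Z-transformation, here is a base-R sketch on a toy matrix. The gene and sample names and the values are made up; only the `t(scale(t(...)))` call comes from the answer above.

```r
# Toy log2 expression matrix: genes in rows, samples in columns (made-up values).
fc.matrix <- matrix(
  c( 1,  2,  3,
    10, 12, 14,
     5,  5,  8),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2", "s3"))
)

# scale() standardizes columns, so transpose to put genes in columns,
# scale, and transpose back. Genes with zero variance would give NaN
# and should be removed beforehand.
z.matrix <- t(scale(t(fc.matrix)))

# geneA and geneB sit at very different expression levels but show the
# same pattern across samples, so their Z-scored rows are identical.
z.matrix

# Each gene (row) now has mean 0 and standard deviation 1 across samples.
round(rowMeans(z.matrix), 10)
apply(z.matrix, 1, sd)
```

The Z-scored matrix can then be handed to a heatmap function with its own row scaling turned off, for example base R's `heatmap(z.matrix, scale = "none")`, so that the values are not standardized a second time.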