Question: Do I need to normalize level 3 TCGA data?
chenchen314159 (United States) wrote, 4.7 years ago:

Hi. I have recently been working with TCGA microarray data. According to the description, the Level 3 data has already been normalized. However, when I draw boxplots of the expression matrix, many of the chips (which are all from one platform) have quite different quantiles. My question is: should I normalize and/or log2-transform them?


Level 3 microarray data should be normalized. Which microarray is this? Have you looked at whether these differences correlate to tumor type, or subtype? Depending on the platform, these might be real genome-wide effects.

Reply written 4.7 years ago by Cyriac Kandoth

It's U133A. They are all from GBM tumors (no controls). As for subtypes, the clinical data for this platform does not include that information.

Reply written 4.7 years ago by chenchen314159
Cyriac Kandoth (Memorial Sloan Kettering, New York, USA) wrote, 4.7 years ago:

Normalization happens at Level 2 as explained here. So Level 3 TCGA data should be post-normalization and in a format more suitable for making interpretations. But different tumor-specific working groups may do the job differently. GBM was one of the earliest TCGA projects where a lot of lessons were yet to be learned - like abandoning U133A for RNA-seq based expression data. ;)

In general, you should read the methods sections of the tumor-specific marker papers. For TCGA GBM, the supplement here explains the "Creation of a unified Expression Dataset", but it's not clear whether this needed to be done on top of the Level 3 data.
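
For the check described in the question (do the per-chip quantiles really differ, and is the matrix already on a log2 scale?), a minimal Python sketch is given here. It assumes a genes x samples NumPy matrix. The function names, the threshold of 30 and the toy data are all hypothetical and do not come from any TCGA pipeline. Quantile normalization is included only as one common option. If the differences between chips are real biological effects, forcing every chip onto the same distribution would erase them.

```python
import numpy as np

def sample_quantiles(expr, probs=(0.25, 0.5, 0.75)):
    """Per-sample quantiles; result has shape (len(probs), n_samples)."""
    return np.quantile(expr, probs, axis=0)

def looks_log_scaled(expr, max_log_value=30.0):
    """Rough heuristic (assumed threshold): log2 intensities stay small,
    while raw intensities typically reach the thousands."""
    return float(np.nanmax(expr)) <= max_log_value

def quantile_normalize(expr):
    """Map every sample (column) onto the same reference distribution."""
    order = np.argsort(expr, axis=0)                # row indices that sort each column
    ranks = np.argsort(order, axis=0)               # rank of each value within its column
    reference = np.sort(expr, axis=0).mean(axis=1)  # mean of the k-th smallest values
    return reference[ranks]

# Toy genes x samples matrix with deliberately shifted samples (made-up values).
rng = np.random.default_rng(0)
expr = rng.normal(loc=[7.0, 7.5, 8.5], scale=1.0, size=(1000, 3))

print(sample_quantiles(expr))   # quartiles differ between the three samples
print(looks_log_scaled(expr))   # heuristic check of the scale
normalized = quantile_normalize(expr)
print(sample_quantiles(normalized))  # quartiles now agree across samples
```

Comparing the printed quartiles before and after is the numeric equivalent of the boxplot. Whether to apply the normalization at all is the judgment call discussed in this thread.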


I see. Should I use RNA-seq instead? It seems that RNA-seq has fewer participants than the microarray data, though.

Reply written 4.7 years ago by chenchen314159

Yes, I would recommend RNA-seq RSEMs, but I'm fairly certain that RNA-seq was not done for TCGA GBM. There was a lot of work done in this manuscript to make the most of the microarray data.

Reply written 4.7 years ago by Cyriac Kandoth

Hi! I am actually looking at TCGA level3 RNASeqV2 data. My goal is to look at the DEGs (tumor vs. normal) and I'm looking at LUAD now.

I am using edgeR at the moment, since the original RSEM paper mentioned that RSEM estimates can be processed by edgeR.

I was wondering whether it makes sense to include all the available tumor samples, including those that don't have a matching normal sample from the same participant, when analyzing for DEGs. What kind of normalization method would be recommended if I do so? Or can I just use the edgeR default normalization?

Reply written 4.6 years ago by cafelumiere12
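
To illustrate what a between-sample normalization of count data does, here is a minimal Python sketch of upper-quartile scaling. This is not edgeR's default, which is TMM. Upper-quartile scaling is swapped in only because it is simpler to show, and edgeR offers it as an alternative method. The function name, the target value of 1000 and the toy counts are hypothetical.

```python
import numpy as np

def upper_quartile_scale(counts, target=1000.0):
    """Scale each sample (column) so that the 75th percentile of its
    non-zero counts equals `target` (an arbitrary, assumed constant)."""
    counts = np.asarray(counts, dtype=float)
    scaled = np.empty_like(counts)
    for j in range(counts.shape[1]):
        col = counts[:, j]
        uq = np.percentile(col[col > 0], 75)  # upper quartile of expressed genes
        scaled[:, j] = col / uq * target
    return scaled

# Toy genes x samples counts: sample 2 was sequenced twice as deeply as sample 1.
counts = np.array([[10, 20],
                   [0, 0],
                   [100, 200],
                   [50, 100],
                   [30, 60]])
scaled = upper_quartile_scale(counts)
print(scaled)  # both columns are identical after scaling
```

Because each sample is scaled using only its own counts, this kind of normalization does not require every tumor to have a matched normal. Whether unpaired tumors should be included in the DEG analysis is a separate design question that this sketch does not address.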

I don't have a good answer for you. You should post a new question on Biostars. Do this in general, if your question is even slightly unrelated.

Reply written 4.6 years ago by Cyriac Kandoth
nnabavi wrote, 4.2 years ago:

Hello! I have a similar question: are the normalized values of TCGA Level 3 RNASeqV2 data already normalized to the normal adjacent tissue, or do they represent the tumor tissue alone? The data I downloaded falls under the TN blue category, meaning it's Tumor/Matched Normal. Also, would the unit of measurement be intensity, RPKM/FPKM, or fold-change? Thanks for any help!


hi nnabavi and welcome, an important point: this should be a new question, not an answer to a different question

also, read up some info here:

if still in doubt, delete this answer using the "moderate -> delete" links and post it as a new question. this will allow the Biostar gurus to help you much better :)


Reply written 4.2 years ago by TriS