Question: TCGA RNA-Seq, Expression levels in comparison with normal and connection with CNVs
4.7 years ago by
thmourikis10
European Union
thmourikis10 wrote:

Hello,

I am new to RNA-Seq pipelines/analysis and TCGA data. I am trying to identify genes that are differentially expressed between tumor and normal. The ultimate goal is to connect differential expression with copy number alterations in a patient-specific manner. For most cancer types I cannot find RNA-Seq data for many normal samples (matched with their corresponding tumor sample). Is it true that there isn't much normal RNA-Seq expression data? If so, is there an alternative approach to reach the goal described above?

Thanks a lot in advance.

rna-seq • 4.0k views
modified 4.7 years ago by TriS • written 4.7 years ago by thmourikis

About 10% of the RNA-Seq samples are from normal tissue. To perform differential expression you can still use all the samples, but you have to put a patient-matching indicator and a tumour/normal indicator into your design matrix.
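
As a rough illustration of such a design matrix, here is a minimal Python sketch. The sample sheet, the `paired_design` helper and the column names are all hypothetical; the layout corresponds to the paired model that is usually written `~ patient + condition` in tools such as DESeq2 or edgeR.

```python
def paired_design(samples):
    """Build a design matrix for the model ~ patient + condition.

    samples: list of (sample_id, patient, condition) tuples, where
    condition is "tumour" or "normal" (hypothetical input format).
    Returns (column_names, {sample_id: row}).
    """
    patients = sorted({patient for _, patient, _ in samples})
    # The first patient is the reference level, so it gets no column.
    patient_cols = patients[1:]
    columns = ["intercept", "tumour"] + [f"patient_{p}" for p in patient_cols]

    rows = {}
    for sample_id, patient, condition in samples:
        # Intercept, then the tumour / normal indicator (1 = tumour).
        row = [1, 1 if condition == "tumour" else 0]
        # Patient-matching indicators.
        row += [1 if patient == p else 0 for p in patient_cols]
        rows[sample_id] = row
    return columns, rows


# Hypothetical sample sheet: three patients, each with a matched pair.
samples = [
    ("P1_N", "P1", "normal"), ("P1_T", "P1", "tumour"),
    ("P2_N", "P2", "normal"), ("P2_T", "P2", "tumour"),
    ("P3_N", "P3", "normal"), ("P3_T", "P3", "tumour"),
]

columns, rows = paired_design(samples)
print(columns)
for sample_id, row in rows.items():
    print(sample_id, row)
```

With this layout the coefficient on the `tumour` column is the within-patient tumour vs normal effect, which is why this is not the same as comparing against a pooled normal.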

written 4.7 years ago by russhh

Thanks for your reply russ_hyde. Could you please elaborate a little bit more? Could you give an example? Is that like using a pooled normal?

modified 4.7 years ago • written 4.7 years ago by thmourikis
4.7 years ago by
TriS4.0k
United States, Buffalo
TriS4.0k wrote:

You can find the normal samples by checking the sample's barcode. Basically, if the 14th character is a "0" that's a tumor; if it's a "1" it's a normal.

For example, a matched normal-tumor sample pair would be:

TCGA-12-4567-01-blah-blah --> this is tumor

TCGA-12-4567-11-blah-blah --> this is normal

However, the number of "normal" samples depends on the tumor type; some have more, some have fewer.
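
To check how many patients in a given cohort actually have a matched pair, something like the following sketch could be used. It assumes the TCGA sample-type convention (codes 01-09 are tumor, 10-19 are normal) and that the first three dash-separated fields of the barcode identify the patient; the barcodes and the `matched_pairs` helper are made up for illustration.

```python
from collections import defaultdict


def matched_pairs(barcodes):
    """Return {patient_id: {"tumor": [...], "normal": [...]}} for
    patients that have at least one tumor and one normal sample."""
    by_patient = defaultdict(lambda: {"tumor": [], "normal": []})
    for barcode in barcodes:
        fields = barcode.split("-")
        patient = "-".join(fields[:3])   # e.g. TCGA-12-4567
        code = int(fields[3][:2])        # sample-type code, e.g. 01 or 11
        if 1 <= code <= 9:
            by_patient[patient]["tumor"].append(barcode)
        elif 10 <= code <= 19:
            by_patient[patient]["normal"].append(barcode)
    # Keep only patients with both sample types.
    return {p: s for p, s in by_patient.items() if s["tumor"] and s["normal"]}


# Hypothetical barcodes, truncated after the sample field.
barcodes = [
    "TCGA-12-4567-01A",
    "TCGA-12-4567-11A",
    "TCGA-34-8901-01A",   # tumor only
    "TCGA-56-2345-11A",   # normal only
]

pairs = matched_pairs(barcodes)
print(len(pairs), "patient(s) with matched tumor and normal samples")
```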

Another way to download tumor and normal samples separately is to use TCGA Assembler. It's an R package that allows you to download a variety of TCGA datasets, including clinical data and CNVs.

In the clinical data sheet you will be able to see which patient is which and which ones have clinical data available. You can download the clinical data from the Data Matrix.

written 4.7 years ago by TriS

Thanks for your answer and the useful links. I think blah-01-blah is the tumor and everything >09 is the normal. Anyway, I guess the answer to my question is that there indeed aren't many "normal" RNA-Seq samples.

written 4.7 years ago by thmourikis

So: 01-09 is tumor, 10-19 is normal, 20-29 is controls, 5x is cell lines, 60-61 is xenograft.
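
These ranges can be turned into a small lookup on the barcode. The sketch below is only an illustration: the `sample_type` function and its labels are hypothetical, "5x" is read here as 50-59, and the sample-type code is assumed to be the first two characters of the fourth dash-separated field.

```python
def sample_type(barcode):
    """Classify a TCGA barcode by its two-digit sample-type code."""
    code = int(barcode.split("-")[3][:2])
    if 1 <= code <= 9:
        return "tumor"
    if 10 <= code <= 19:
        return "normal"
    if 20 <= code <= 29:
        return "control"
    if 50 <= code <= 59:      # assumption: "5x" means 50-59
        return "cell line"
    if 60 <= code <= 61:
        return "xenograft"
    return "unknown"


# Hypothetical barcodes, reusing the example from the answer.
for barcode in ["TCGA-12-4567-01-blah-blah", "TCGA-12-4567-11-blah-blah"]:
    print(barcode, "->", sample_type(barcode))
```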

It depends on what you mean by "many", but yes, there are not as many normals as tumors. If you need a wider cohort you might want to tap into the GTEx project.

modified 4.7 years ago • written 4.7 years ago by TriS

thanks once again for your answer TriS!

modified 4.7 years ago • written 4.7 years ago by thmourikis