TCGA-BRCA tissue and RNAseq problem

16 months ago

reventropy2003 • 0

I'm trying to analyze RNAseq data from the TCGA-BRCA project. I've downloaded the STAR count TSV files as well as the clinical and sample manifests/metadata. The problem is that no column in "clinical.tsv" indicates the tissue sample type, so as far as I can tell there is no way to know whether a given STAR counts file comes from normal or disease tissue.

The "gdc_sample_sheet" clearly indicates that some samples are disease and others are normal tissue, but the case_submitter_id in the clinical file lacks the last few characters, and these are what identify the sample type. For instance, case_submitter_id 'TCGA-E2-A154' in the clinical.tsv file is 'TCGA-E2-A154-01A' in the gdc_sample_sheet, with "01A" indicating disease tissue. How do I know which RNAseq files, if any, come from normal tissue?

NIH RNAseq GDC TCGA-BRCA • 722 views

Go to https://portal.gdc.cancer.gov > Exploration, select TCGA and the cancer type you want, then select a case ID to download its files along with all the information you need about the sample.

16 months ago by pinheirofabiano ▴ 20

Thanks! Somehow I missed the "File Name" column in the gdc_sample_sheet.

16 months ago by reventropy2003 • 0
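
For anyone landing here with the same question, here is a minimal sketch of the mapping the thread arrives at: read the sample sheet, take the "File Name" of each row, and classify the file from the sample type code embedded in the sample barcode. It assumes the sheet is tab-separated and has a "Sample ID" column next to "File Name"; check the header of your own download. The file names and the second sample ID are made up for illustration. The code ranges follow the TCGA barcode convention (01-09 tumor, 10-19 normal, 20-29 control).

```python
import csv
import io

# Hypothetical excerpt of a gdc_sample_sheet TSV. The "File Name" column is
# mentioned in the thread; "Sample ID" is an assumed column name.
# File names and the second sample ID are invented for this example.
SAMPLE_SHEET = (
    "File Name\tSample ID\n"
    "example_tumor.star_gene_counts.tsv\tTCGA-E2-A154-01A\n"
    "example_normal.star_gene_counts.tsv\tTCGA-XX-0000-11A\n"
)


def case_id(sample_id):
    """First three barcode fields, the form used by case_submitter_id."""
    return "-".join(sample_id.split("-")[:3])


def sample_type_code(sample_id):
    """Two-digit sample type code: fourth barcode field minus the vial letter."""
    return sample_id.split("-")[3][:2]


def tissue_class(sample_id):
    """Classify a sample barcode by the TCGA sample type code ranges."""
    code = int(sample_type_code(sample_id))
    if 1 <= code <= 9:
        return "tumor"
    if 10 <= code <= 19:
        return "normal"
    return "other"


def map_files(handle):
    """Return {file name: (case ID, tissue class)} for each sample sheet row."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["File Name"]: (case_id(row["Sample ID"]), tissue_class(row["Sample ID"]))
        for row in reader
    }


if __name__ == "__main__":
    for name, (case, tissue) in map_files(io.StringIO(SAMPLE_SHEET)).items():
        print(name, case, tissue)
```

With a real download, pass `open("gdc_sample_sheet.tsv", newline="")` (path is a placeholder) instead of the `io.StringIO` wrapper. The case ID in each result can then be matched against case_submitter_id in clinical.tsv.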
