GDAC Firehose: multiple RNASeq samples correspond to the same clinical record
5.2 years ago
xuren2120 ▴ 20

Hi everyone,

I have a question about the GDAC Firehose data from this website: https://gdac.broadinstitute.org. I downloaded the RNASeq data (illuminahiseq_rnaseqv2-RSEM_genes) and the clinical data (Clinical_Pick_Tier1), and I found that multiple columns in the RNASeq data correspond to the same column in the clinical data. For example, in BRCA:

In RNASeq data, there are columns:
TCGA-BH-A208-11A-51R-A157-07 TCGA-BH-A208-01A-11R-A157-07

However, in the clinical data, there is only one column:
TCGA-BH-A208

I have two questions:
1) Is the sample TCGA-BH-A208 a tumor sample or normal sample?
2) If I wish to match the clinical data to RNASeq data, since there is only one clinical column but two RNASeq columns, which RNASeq column should I use?

I'd appreciate it if anyone could help. Thanks.

RNA-Seq TCGA sequence • 1.5k views
5.2 years ago

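One cannot infer anything about the tissue type from the short TCGA barcode, i.e., TCGA-BH-A208. It identifies only the patient, so we can just say that this is an individual who was part of the TCGA project. The tissue type is encoded in the fourth field of the full barcode, where sample codes 01 to 09 denote tumour and 10 to 19 denote normal tissue. It then follows that: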

  • TCGA-BH-A208-11A-51R-A157-07 = normal tissue from patient with barcode TCGA-BH-A208
  • TCGA-BH-A208-01A-11R-A157-07 = tumour tissue from patient with barcode TCGA-BH-A208

Take a look at the definitions of the fields in these barcodes: Meaning letters in TCGA sample barcode field
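
As an illustration of the barcode fields described above, here is a minimal Python sketch that splits a barcode into the patient ID and a tissue label. The function names `parse_tcga_barcode` and `tumour_columns_by_patient` are hypothetical helpers written for this example and are not part of any TCGA or Firehose tool; the code ranges (01 to 09 tumour, 10 to 19 normal, 20 to 29 control) are assumed from the TCGA sample type code table.

```python
def parse_tcga_barcode(barcode):
    """Return (patient_id, tissue) for a TCGA barcode.

    tissue is None when the barcode has no sample field,
    e.g. the short patient-level barcode TCGA-BH-A208.
    """
    parts = barcode.split("-")
    patient_id = "-".join(parts[:3])  # project, tissue source site, participant
    if len(parts) < 4:
        return patient_id, None
    # The fourth field is e.g. "01A": two-digit sample type code plus vial letter.
    code = int(parts[3][:2])
    if 1 <= code <= 9:
        tissue = "tumour"
    elif 10 <= code <= 19:
        tissue = "normal"
    elif 20 <= code <= 29:
        tissue = "control"
    else:
        tissue = "unknown"
    return patient_id, tissue


def tumour_columns_by_patient(columns):
    """Map each patient ID to the tumour-sample columns found for it."""
    result = {}
    for column in columns:
        patient_id, tissue = parse_tcga_barcode(column)
        if tissue == "tumour":
            result.setdefault(patient_id, []).append(column)
    return result


rnaseq_columns = [
    "TCGA-BH-A208-11A-51R-A157-07",
    "TCGA-BH-A208-01A-11R-A157-07",
]
for column in rnaseq_columns:
    print(column, parse_tcga_barcode(column))
print(tumour_columns_by_patient(rnaseq_columns))
```

The patient ID returned here is what would be matched against the clinical column. Whether to keep the tumour column, the normal column, or both depends on the analysis; the second helper only shows one possible choice, namely selecting the tumour columns per patient.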

Kevin


Got it, thanks Kevin!
