Mapping files to patients in TCGA (RNA-seq)
15 months ago
Roza • 0

Hello & good day,

I need help downloading the TCGA RNA-seq dataset. I have no bioinformatics background; my background is computer science. I am trying to run the code provided here: https://github.com/luisvalesilva/multisurv/blob/master/data/preprocess_omics.ipynb

I have a problem mapping files to patients via cases.0.submitter_id.

When I downloaded the RNA-seq files using the manifest provided by the author (gdc_manifest.2019-08-23.txt), the mapping step failed, I think because of changes made to the GDC portal since then. So I downloaded the updated version of the RNA-seq data from GDC instead. When I then mapped files to patients, only about 1,000 patients could be mapped, whereas about 8,000 could not.

Could you guide me on how to handle this?

Does that mean RNA-seq data is publicly available (access control = open) for only around 1,000 patients?

Your help is much appreciated.

Thank you

RNA-seq TCGA • 992 views

I'm not sure about the particular notebook you mentioned. In general, though, once you have added these files to the GDC cart, you can download a sample_sheet from the cart. This sample sheet is a TSV file that contains the exact file-to-patient mapping you are asking for.
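As a rough illustration of using the sample sheet, here is a minimal Python sketch that builds a file-to-patient lookup from it. It assumes the sheet has "File Name" and "Case ID" columns (check the header of your own download, as the column names may differ), and the function name and the file path in the usage comment are made up for this example.

```python
import csv


def map_files_to_cases(sample_sheet, file_col="File Name", case_col="Case ID"):
    """Return {file name: case (patient) ID} from a sample-sheet TSV.

    `sample_sheet` is any iterable of text lines, e.g. an open file.
    The default column names are assumptions about the sheet's header.
    """
    reader = csv.DictReader(sample_sheet, delimiter="\t")
    mapping = {}
    for row in reader:
        # If a row lists more than one comma-separated case ID,
        # keep only the first one.
        mapping[row[file_col]] = row[case_col].split(",")[0].strip()
    return mapping


# Usage (hypothetical path):
# with open("gdc_sample_sheet.tsv", newline="") as fh:
#     file_to_case = map_files_to_cases(fh)
```

Files that do not appear in the resulting dictionary were not part of the cart the sheet was exported from, so comparing its keys against your downloaded file names is a quick way to see which files remain unmapped.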


Thank you, Zhenyu Zhang. I wasn't aware of this (sample_sheet).


Thank you, Ming Tang.
