Question: TCGA data set with both expression (RNA-seq/microarray) and exome-seq data for the same samples?
cafelumiere12 (United States) wrote, 4.1 years ago:

I have tried downloading TCGA RNA-seq data from the TCGA site before, and I have also tried the TCGA-Assembler download route. However, I don't remember being able to do this:

Does anyone have any ideas or suggestions on how to find data sets for a particular cancer that have both some sort of expression results (RNA-seq or microarray) and exome sequencing results?


I was able to use and found the data I needed.

Tags: cbioportal, tcga, cgdsr
Sean Davis (National Institutes of Health, Bethesda, MD) wrote, 4.1 years ago:

TCGA has both gene expression and exome/genome sequence data for nearly all samples. To get access to the actual sequencing data, you will need to apply for access, as sequencing data for human subjects is almost always controlled access. See here for instructions:


Thank you! After poking around I found

However, when I downloaded data from the above link (for example, through the COADREAD archives), taking both the MAF files and the mRNAseq files, I found only 74 samples that overlap between the mRNAseq data and the Mutation Annotation files. I matched samples on the project, TSS, and participant ID fields of the TCGA barcodes.

When I queried cBioPortal with the R package cgdsr_1.2.5, on the other hand, I found that at least 195 cases in one of the COADREAD studies (Colorectal Adenocarcinoma (TCGA, Nature 2012)) should have complete data (mutation, mRNA, etc.). The only problem I have with using cgdsr to query cBioPortal is that there is no way to do a bulk download; I need to specify particular genes. I am not sure why I get fewer overlapping cases through the GDAC website, though.
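For anyone repeating this comparison, here is a minimal Python sketch of the barcode matching described above. It assumes the barcodes have already been pulled out of the mRNAseq and MAF files into plain lists; the function names and the example barcodes are hypothetical, made up for illustration, and are not taken from the actual COADREAD archives.

```python
def participant_id(barcode: str) -> str:
    """Return the project-TSS-participant prefix of a TCGA barcode.

    Example: 'TCGA-AA-3510-01A-01R-1410-07' -> 'TCGA-AA-3510'.
    """
    fields = barcode.strip().upper().split("-")
    if len(fields) < 3 or fields[0] != "TCGA":
        raise ValueError(f"not a TCGA barcode: {barcode!r}")
    return "-".join(fields[:3])


def overlapping_participants(expression_barcodes, mutation_barcodes):
    """Return the sorted participant IDs present in both barcode lists."""
    expr = {participant_id(b) for b in expression_barcodes}
    mut = {participant_id(b) for b in mutation_barcodes}
    return sorted(expr & mut)


# Hypothetical barcodes, for illustration only.
expression = [
    "TCGA-AA-3510-01A-01R-1410-07",
    "TCGA-AA-3511-01A-21R-1839-07",
    "TCGA-A6-2671-11A-01R-A32Z-07",
]
mutation = [
    "TCGA-AA-3510-01A-01D-1408-10",
    "TCGA-A6-2671-01A-01D-1408-10",
    "TCGA-AG-3999-01A-01D-1408-10",
]

shared = overlapping_participants(expression, mutation)
print(len(shared), shared)
```

Note that matching at the participant level ignores the sample-type field (the fourth barcode field, where 01 is a primary tumor and 11 is normal tissue), so a participant with a normal-tissue expression sample and a tumor mutation sample still counts as an overlap. Comparing the first four fields instead would be stricter.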

(Reply written 4.1 years ago by cafelumiere12.)
nwon (New Zealand) wrote, 4.1 years ago:

All TCGA data has migrated to the Genomic Data Commons.

The legacy TCGA data is available within this web resource, in the legacy database.

pel wrote, 4.1 years ago:

You can find the largest selection of Level 2 and Level 3 data (no human subjects protocol required) for somatic mutations, CNVs, SNPs, methylation, and RNA-Seq and chip-based expression for each tumor in TCGA, across multiple cancer sites, at the PanCancer 12 site.

Recall that, as was pointed out above, you cannot get the sequence data without approval. However, the mutation (.maf) files are Level 2 and contain most of the mutation calls.
