Question: Whole-genome sequencing (WGS) somatic MAF files at TCGA
Asked 15 months ago by always_learning (Doha, Qatar):

Hi All,

I was trying to get WGS MAF files from TCGA. Going through the wonderful post here, "Working with MAF files (Mutation Annotation Format) from the TCGA (The Cancer Genome Atlas)", it looks like TCGA does not have WGS MAFs. As Cyrius mentions in that post: "Almost everything in the TCGA MAFs are from targeted exome sequencing. 50 of the 200 LAML tumors were whole-genome sequenced, and the putative calls were targeted with custom capture arrays". So it seems that TCGA doesn't have WGS MAF files for all tumor types. Even the Firehose MAF Dashboard at https://confluence.broadinstitute.org/display/GDAC/MAF+Dashboard says: "All MAFs on this page are exome only. We do not provide whole genome MAFs."

Could someone please help me find a repository of somatic WGS MAF files for different cancer types?

Thanks,
Najeeb

Tags: cancer, tcga, wgs, maf
Answer, 15 months ago by Chris Miller (Washington University in St. Louis, MO):

You can get access to the WGS calls from TCGA; they are just access-controlled to protect patient privacy. The thought is that germline SNPs slip through somatic pipelines at a non-negligible rate, so offering up the whole-genome variants could expose people to identification. (Whether you agree with this rationale or not, it's the policy.)

The process for requesting access is described here: https://tcga-data.nci.nih.gov/tcga/tcgaAccessTiers.jsp


Ah, and that goes via the dbGaP portal, which is a very complicated task in itself, especially for researchers outside the US.

Reply written 15 months ago by always_learning

Hi Chris. I've obtained access to TCGA protected data, but judging by the very small proportion of variants that fall outside coding regions, all the calls appear to be from targeted sequencing (that said, the Sequence_Source field is empty). I have browsed through the Legacy GDC and noticed some >100 GB BAM files that likely correspond to WGS, but didn't find any WGS MAFs. Could you please provide some additional hints about where to find the WGS variants? Thanks.
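
For anyone wanting to repeat this kind of sanity check, here is a minimal sketch of how it could be done. It assumes a tab-delimited MAF with the standard Variant_Classification and Sequence_Source columns; the function name `summarize_maf` and the set of classifications counted as non-coding are illustrative choices, not part of any TCGA or GDC tool.

```python
import csv
from collections import Counter

# Variant_Classification values treated here as non-coding. This grouping is
# an assumption for illustration; adjust it (e.g. for Splice_Site) as needed.
NONCODING = {"Intron", "IGR", "3'UTR", "5'UTR", "3'Flank", "5'Flank", "RNA"}


def summarize_maf(lines):
    """Return (fraction of non-coding calls, Counter of Sequence_Source values)."""
    # MAF files may begin with '#'-prefixed comment lines before the header row.
    rows = csv.DictReader(
        (ln for ln in lines if not ln.startswith("#")),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,
    )
    total = 0
    noncoding = 0
    sources = Counter()
    for row in rows:
        total += 1
        if row.get("Variant_Classification") in NONCODING:
            noncoding += 1
        # Tally empty or missing Sequence_Source values under one label.
        source = (row.get("Sequence_Source") or "").strip() or "(empty)"
        sources[source] += 1
    fraction = noncoding / total if total else 0.0
    return fraction, sources


# Tiny made-up example, not real TCGA data.
demo = [
    "#version 2.4\n",
    "Hugo_Symbol\tVariant_Classification\tSequence_Source\n",
    "TP53\tMissense_Mutation\t\n",
    "Unknown\tIGR\t\n",
    "DNMT3A\tIntron\tWGS\n",
    "FLT3\tIn_Frame_Ins\tWGS\n",
]
fraction, sources = summarize_maf(demo)
print(f"non-coding fraction: {fraction:.2f}", dict(sources))
```

For a real file, pass an open file handle instead of the list, e.g. `with open(path) as fh: summarize_maf(fh)`, or `gzip.open(path, "rt")` for a compressed MAF. Since most of the genome is non-coding, a call set derived from WGS would be expected to show a high non-coding fraction, while exome-targeted calls would show a low one.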

Reply written 5 months ago by mathieu.lajoie

Are you talking about AML specifically, or WGS in general from TCGA projects?

Reply written 5 months ago by Chris Miller