Question: Whole Genome Sequences somatic MAF files at TCGA
always_learning (Doha, Qatar) wrote 4.5 years ago:

Hi All,

I was trying to get WGS MAF files from TCGA. I was going through the wonderful post "Working with MAF files (Mutation Annotation Format) from the TCGA (The Cancer Genome Atlas)", and it looks like TCGA does not have WGS MAFs. As Cyrius mentions in that post: "Almost everything in the TCGA MAFs are from targeted exome sequencing. 50 of the 200 LAML tumors were whole-genome sequenced, and the putative calls were targeted with custom capture arrays". So it seems that TCGA doesn't have WGS MAF files for the different tumor types at all. Even the FireHose page says: "All MAFs on this page are exome only. We do not provide whole genome MAFs."

Could someone please help me find a repository of somatic WGS MAF files for different types of cancer?

Thanks,
Najeeb

Tags: cancer, tcga, wgs, maf
Chris Miller (Washington University in St. Louis, MO) wrote 4.5 years ago:

You can get access to the WGS calls from TCGA data; they are just access-controlled to help protect patient privacy. The thought is that germline SNPs do slip through somatic pipelines at a non-negligible rate, so if you offer up the whole-genome variants, you potentially expose people to identification. (Whether you agree with this rationale or not, it's the policy.)

The process for requesting access is described here:


Ah, and that goes through the dbGaP portal, which is a very complicated task in itself, especially for applicants outside the US.

Reply written 4.5 years ago by always_learning

Hi Chris. I've obtained access to TCGA protected data, but based on the very small proportion of variants that fall outside coding regions, all the calls appear to be from targeted sequencing (that said, the Sequence_Source field is empty). I have browsed through the Legacy GDC and noticed some >100 GB BAM files that likely correspond to WGS, but I didn't find any WGS MAFs. Could you please provide some additional hints about where to find the WGS variants? Thanks.
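
For anyone who wants to repeat this kind of check, here is a minimal sketch rather than a definitive method. It assumes a tab-delimited MAF with `Variant_Classification` and `Sequence_Source` columns, and the set of classifications counted as non-coding is my own illustrative choice. The function name `summarize_maf` and the example rows are made up, not real TCGA data.

```python
import csv
import io
from collections import Counter

# Variant_Classification values treated as non-coding here (an assumption
# for illustration; adjust to the MAF specification version in use).
NONCODING = {"3'UTR", "3'Flank", "5'UTR", "5'Flank", "IGR", "Intron", "RNA"}


def summarize_maf(handle):
    """Return (non-coding fraction, Counter of Sequence_Source values)."""
    # Skip comment lines such as "#version 2.4" that precede the header row.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t", quoting=csv.QUOTE_NONE)
    total = 0
    noncoding = 0
    sources = Counter()
    for record in reader:
        total += 1
        if record.get("Variant_Classification") in NONCODING:
            noncoding += 1
        source = (record.get("Sequence_Source") or "").strip()
        sources[source or "(empty)"] += 1
    fraction = noncoding / total if total else 0.0
    return fraction, sources


# Tiny made-up example, not real TCGA data.
EXAMPLE = (
    "#version 2.4\n"
    "Hugo_Symbol\tVariant_Classification\tSequence_Source\n"
    "GENE_A\tMissense_Mutation\t\n"
    "GENE_B\tIntron\t\n"
    "GENE_C\tSilent\tWXS\n"
    "GENE_D\tIGR\t\n"
)

fraction, sources = summarize_maf(io.StringIO(EXAMPLE))
print(f"non-coding fraction: {fraction:.2f}")
print(dict(sources))
```

For a real file, pass an open text handle instead of the `io.StringIO` example. A very low non-coding fraction is only a hint that the calls come from exome or targeted capture, not proof.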

Reply written 3.7 years ago by mathieu.lajoie

Are you talking about AML specifically, or WGS in general from TCGA projects?

Reply written 3.7 years ago by Chris Miller