Question: Whole Genome Sequences somatic MAF files at TCGA
24 months ago by
Doha, Qatar
always_learning (870) wrote:

Hi All,

I was trying to get WGS MAF files from TCGA. I was going through the wonderful post here, "Working with MAF files (Mutation Annotation Format) from the TCGA (The Cancer Genome Atlas)", and it looks like TCGA does not have WGS MAFs. As Cyrius mentioned in that post: "Almost everything in the TCGA MAFs are from targeted exome sequencing. 50 of the 200 LAML tumors were whole-genome sequenced, and the putative calls were targeted with custom capture arrays". So it seems that TCGA doesn't have WGS MAF files for all tumor types. Even the Firehose page says: "All MAFs on this page are exome only. We do not provide whole genome MAFs."

Could someone please help me find a repository of somatic WGS MAF files for different types of cancer?

Thanks, Najeeb

cancer tcga wgs maf • 1.2k views
modified 24 months ago by Chris Miller (19k) • written 24 months ago by always_learning (870)
24 months ago by
Chris Miller (19k)
Washington University in St. Louis, MO
Chris Miller (19k) wrote:

You can get access to the WGS calls from TCGA data; it's just that they're protected in order to safeguard patient privacy. The thought is that germline SNPs do slip through somatic pipelines at a non-negligible rate, so if you offer up the whole-genome variants, you potentially expose people to identification. (Whether you agree with this rationale or not, it's the policy.)

The process for requesting access is described here:

written 24 months ago by Chris Miller (19k)

Ah, and that goes via the dbGaP portal, which is a very complicated task in itself, especially for non-US researchers.

written 24 months ago by always_learning (870)

Hi Chris. I've obtained access to TCGA protected data, but based on the very small proportion of variants that fall outside coding regions, all the calls appear to be from targeted sequencing (that said, the Sequence_Source field is empty). I have browsed through the Legacy GDC and noticed some >100 GB BAM files that likely correspond to WGS, but I didn't find any WGS MAFs. Could you please provide some additional hints about where to find the WGS variants? Thanks.

written 14 months ago by mathieu.lajoie (0)
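
For anyone wanting to repeat the sanity check described in the reply above, here is a rough sketch in Python of how one might tally the share of non-coding calls and the contents of the Sequence_Source column in a MAF. The column names Variant_Classification and Sequence_Source come from the MAF format; the function name summarize_maf, the grouping of classifications into "non-coding", and the demo rows (including the GENE1/GENE2 placeholders) are assumptions made up for illustration, not anything from TCGA or GDC tooling.

```python
import csv
import io
from collections import Counter

# Variant_Classification values counted as non-coding here. This grouping is
# an assumption for illustration, not an official TCGA/GDC definition.
NONCODING = {"IGR", "Intron", "5'Flank", "3'Flank", "5'UTR", "3'UTR", "RNA"}


def summarize_maf(handle):
    """Return (total rows, non-coding fraction, Sequence_Source counts)."""
    # MAF files may start with '#' comment lines (e.g. a version line); skip them.
    data_lines = (line for line in handle if not line.startswith("#"))
    rows = csv.DictReader(data_lines, delimiter="\t", quoting=csv.QUOTE_NONE)

    total = 0
    noncoding = 0
    sources = Counter()
    for row in rows:
        total += 1
        if row.get("Variant_Classification") in NONCODING:
            noncoding += 1
        # Treat a missing or blank Sequence_Source as "(empty)".
        source = (row.get("Sequence_Source") or "").strip() or "(empty)"
        sources[source] += 1

    fraction = noncoding / total if total else 0.0
    return total, fraction, sources


# Tiny made-up MAF fragment (hypothetical rows, three columns only).
demo = "\n".join([
    "#version 2.4",
    "Hugo_Symbol\tVariant_Classification\tSequence_Source",
    "TP53\tMissense_Mutation\tWXS",
    "GENE1\tIntron\t",
    "GENE2\tIGR\tWGS",
    "KRAS\tSilent\tWXS",
])

total, fraction, sources = summarize_maf(io.StringIO(demo))
print(total, fraction, dict(sources))
```

On a real file you would pass an open file handle instead of the StringIO object. A very low non-coding fraction would be consistent with exome or targeted capture, but it is only a heuristic and does not prove how the samples were sequenced.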

Are you talking about AML specifically, or WGS in general from TCGA projects?

written 14 months ago by Chris Miller (19k)