WGS for TCGA HNSC project
6.7 years ago
smb310 • 0

My question is somewhat related to this post: "Whole Genome Sequences somatic MAF files at TCGA". I am looking for WGS data for TCGA HNSC samples. I know that some samples had WGS (as opposed to just WES/WXS) because, for example, Supplementary File 1.1.xlsx (in the zipped file) from https://www.nature.com/nature/journal/v517/n7536/full/nature14129.html lists "Whole Genome Sequencing Barcodes."

I do have access to the protected data and, as noted in "Whole Genome Sequences somatic MAF files at TCGA", there are no MAFs for WGS. So I looked at the available VCF files instead (through the Legacy GDC). There are multiple VCF files for each sample. For one specific sample, the VCF file with the largest number of variants (> 100,000) has SequenceSource=WXS. The others say Source=dbGaP or Source=CGHUB. I'm a bit lost at this point as to whether there are any VCF files for WGS at all.
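
For anyone wanting to repeat the header check across many files, here is a minimal Python sketch. It assumes the SequenceSource= and Source= key=value pairs appear somewhere in the "##" meta lines of each VCF; the function name `source_fields` is made up for illustration, and the exact header layout may differ between submitting centers.

```python
import gzip
import re
from collections import Counter

# Key names are taken from the post; where they sit in the header is an assumption.
PAIR = re.compile(r'\b(SequenceSource|Source)=("[^"]*"|[^,>\s]+)')

def source_fields(path):
    """Count SequenceSource/Source values found in the '##' header lines of a VCF."""
    opener = gzip.open if path.endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.startswith("##"):
                break  # header is over once the column line or a record starts
            for key, value in PAIR.findall(line):
                counts[(key, value.strip('"'))] += 1
    return counts
```

Calling `source_fields("some_sample.vcf")` on each VCF for a sample gives a quick side-by-side view of which files claim WXS and which claim something else.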

next-gen • 1.7k views
6.7 years ago
Zhenyu Zhang ★ 1.2k

I don't know why SequenceSource would state otherwise, but if there is an SDRF stating that this VCF was derived from a WGS CGHub analysis_id/BAM file, then it should be WGS.
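
A rough Python sketch of that SDRF lookup, assuming the SDRF is a tab-delimited text file and that the VCF file name appears verbatim in one of its cells. The function name `sdrf_rows_for` is hypothetical, and I am not assuming any particular column names. SDRF files often repeat column headers, so rows are returned as (header, value) pairs rather than a dict.

```python
import csv

def sdrf_rows_for(sdrf_path, vcf_name):
    """Return every SDRF row that mentions vcf_name, as a list of (header, value) pairs."""
    hits = []
    with open(sdrf_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, None)
        if header is None:
            return hits  # empty file
        for row in reader:
            if vcf_name in row:  # exact match against a whole cell
                hits.append(list(zip(header, row)))
    return hits
```

Printing the returned pairs for one VCF shows the whole row at once, so you can see which BAM or analysis_id it points to and whether that entry is described as WGS.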

As for MAFs, people do not make project-level MAFs out of WGS calls, given that the file would be 100 times larger.
