Question: TCGA germline variants status
4.4 years ago by
Sydney, Australia
jan wrote:


I have annotated VCF files from TCGA, and I'm interested in looking at germline variants in TCGA samples. The TCGA VCF contains variant calls from both normal and primary tumor samples. I'm trying to understand how to differentiate between germline and somatic variants.

Would I be able to identify germline variants simply from these two header entries:

##INFO=<ID=SS,Number=1,Type=Integer,Description="Somatic status of sample">      

##FORMAT=<ID=SS,Number=1,Type=Integer,Description="Variant status relative to non-adjacent Normal,0=wildtype,1=germline,2=somatic,3=LOH,4=post-transcriptional modification,5=unknown">


This might sound stupid, but since TCGA VCF files contain both a normal and a primary sample, how can I tell whether the annotation in the INFO column belongs to the normal or the primary sample?

sequencing tcga vcf snpsift • 2.2k views

Where did you get the VCF files? Did you apply for access to the protected germline data?

written 4.4 years ago by Sean Davis, modified 4 months ago by RamRS
3.9 years ago by
echen1 wrote:

The public TCGA data has had tumor variants that overlap with known SNPs (i.e. likely germline variants) curated out of it. So if you want germline variants and you don't have the protected data, you must apply for access to it.


Hi echen1, thank you for your reply.

I have the protected germline mutation files with the right authorization. I have checked that the exome data was derived from blood samples, so I consider variants with somatic status SS=1 to be germline variants.
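
For anyone wanting to apply the same filter, here is a minimal sketch in plain Python, not a definitive implementation. It relies on the general VCF layout: INFO describes the record as a whole, while FORMAT fields such as SS are reported separately in each sample column named on the `#CHROM` header line. The function name `germline_records` and the sample column names `NORMAL` and `PRIMARY` are assumptions for illustration, so check the header of your own files for the actual names and for which column carries the status you need.

```python
def germline_records(vcf_lines, sample="PRIMARY", germline_code="1"):
    """Yield (chrom, pos, ref, alt) for records whose per-sample SS equals germline_code.

    `sample` is the name of a sample column in the #CHROM header line
    (an assumed name here; use whatever your file's header declares).
    """
    sample_col = None
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Columns 0-8 are fixed; sample columns start at index 9.
            sample_col = fields.index(sample, 9)
            continue
        if sample_col is None:
            raise ValueError("no #CHROM header line before the first record")
        keys = fields[8].split(":")  # FORMAT column lists the per-sample keys
        if "SS" not in keys:
            continue
        values = fields[sample_col].split(":")
        ss_idx = keys.index("SS")
        # Guard against sample entries with fewer values than FORMAT keys.
        if ss_idx < len(values) and values[ss_idx] == germline_code:
            yield fields[0], int(fields[1]), fields[3], fields[4]


# Tiny made-up example (not real TCGA data).
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNORMAL\tPRIMARY",
    "1\t100\t.\tA\tG\t.\tPASS\t.\tGT:SS\t0/1:1\t0/1:1",
    "1\t200\t.\tC\tT\t.\tPASS\t.\tGT:SS\t0/0:0\t0/1:2",
]
print(list(germline_records(example)))  # [('1', 100, 'A', 'G')]
```

For a real file, pass an open file handle instead of the list, e.g. `germline_records(open("sample.vcf"))`, where the path is a placeholder.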

written 3.9 years ago by jan