Question: TCGA germline variants status
jan110 wrote:

Hi,

I have annotated VCF files from TCGA, and I'm interested in looking at germline variants in TCGA samples. The TCGA VCFs contain variant calls from both normal and primary tumor samples. I'm trying to understand how to differentiate between germline and somatic variants.

Would I be able to identify germline variants simply from these two header fields?

##INFO=<ID=SS,Number=1,Type=Integer,Description="Somatic status of sample">      

##FORMAT=<ID=SS,Number=1,Type=Integer,Description="Variant status relative to non-adjacent Normal,0=wildtype,1=germline,2=somatic,3=LOH,4=post-transcriptional modification,5=unknown">

 

This might sound stupid, but since the TCGA VCF files contain both normal and primary samples, how can I tell whether the annotations in the INFO column belong to the normal or the primary sample?
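
For reference, a minimal sketch of how the two kinds of annotation are laid out in a VCF record. In the VCF format, the INFO column describes the site as a whole, while each column after FORMAT holds the values for one sample, named in the `#CHROM` header line. The sample names `NORMAL` and `PRIMARY`, the helper name `per_sample_fields`, and the toy record are assumptions for illustration only, not taken from an actual TCGA file.

```python
def per_sample_fields(header_line, record_line):
    """Map each sample column name to its FORMAT key/value pairs."""
    header = header_line.rstrip("\n").lstrip("#").split("\t")
    record = record_line.rstrip("\n").split("\t")
    sample_names = header[9:]            # columns after FORMAT are samples
    format_keys = record[8].split(":")   # e.g. GT:DP:SS
    return {
        name: dict(zip(format_keys, column.split(":")))
        for name, column in zip(sample_names, record[9:])
    }


# Toy header and record; all values are invented for illustration.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNORMAL\tPRIMARY"
record = "1\t12345\t.\tA\tG\t.\tPASS\tSS=1\tGT:DP:SS\t0/1:30:.\t0/1:42:1"

fields = per_sample_fields(header, record)
for sample, values in fields.items():
    print(sample, values)
```

Reading the FORMAT values per sample column this way makes it explicit which sample a given `SS` value belongs to, whereas the INFO `SS` value is a single annotation for the whole record.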

sequencing tcga vcf snpsift
modified 3.5 years ago by echen1 • written 3.9 years ago by jan110

Where did you get the VCF files? Did you apply for access to the protected germline data?

written 3.9 years ago by Sean Davis
echen1 wrote:

The public TCGA data has had tumor variants that overlap with known SNPs (i.e., likely germline variants) filtered out of it. So if you want germline variants and you only have the public data, you must apply for access to the protected data.

written 3.5 years ago by echen1

Hi echen1, thank you for your reply.

I have the protected germline mutation files with the right authorization. I have checked that the exome data was derived from blood samples, so I treat variants with somatic status (SS) = 1 as germline variants.
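
The filter described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the column name `PRIMARY`, the choice to read `SS` from that sample's FORMAT values, the function name `germline_calls`, and the toy records are all hypothetical, so check the `#CHROM` header line of your own files before relying on it.

```python
def germline_calls(vcf_lines, sample="PRIMARY", germline_code="1"):
    """Yield (chrom, pos, ref, alt) for records whose FORMAT SS value
    in the given sample column equals germline_code."""
    sample_index = None
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        cols = line.split("\t")
        if line.startswith("#CHROM"):
            # Locate the sample column by name (assumed name, see above).
            sample_index = cols.index(sample)
            continue
        if sample_index is None:
            raise ValueError("no #CHROM header line seen before records")
        keys = cols[8].split(":")
        values = cols[sample_index].split(":")
        call = dict(zip(keys, values))
        if call.get("SS") == germline_code:
            yield cols[0], int(cols[1]), cols[3], cols[4]


# Toy input; all values are invented for illustration.
toy_vcf = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNORMAL\tPRIMARY",
    "1\t12345\t.\tA\tG\t.\tPASS\t.\tGT:SS\t0/1:.\t0/1:1",
    "1\t67890\t.\tC\tT\t.\tPASS\t.\tGT:SS\t0/0:.\t0/1:2",
]

germline = list(germline_calls(toy_vcf))
print(germline)
```

For a real file, the same function can be given an open file handle, for example one from `gzip.open(path, "rt")` for a compressed VCF.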

written 3.5 years ago by jan110