TCGA germline variants status
8.4 years ago
jan ▴ 170

Hi,

I have annotated VCF files from TCGA, and I'm interested in the germline variants in TCGA samples. The files contain variant calls from both the normal and the primary tumor samples, and I'm trying to understand how to distinguish germline from somatic variants.

Can I identify germline variants from these two header fields alone?

##INFO=<ID=SS,Number=1,Type=Integer,Description="Somatic status of sample">
##FORMAT=<ID=SS,Number=1,Type=Integer,Description="Variant status relative to non-adjacent Normal,0=wildtype,1=germline,2=somatic,3=LOH,4=post-transcriptional modification,5=unknown">

This might sound stupid, but since the TCGA VCF files contain both the normal and the primary sample, how can I tell whether the annotations in the INFO column belong to the normal or the primary sample?
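For context on the two SS definitions above: in the VCF format, INFO values are attached to the record (the site) as a whole, while FORMAT values are given once per sample column, and the sample columns are named on the `#CHROM` header line. The following is a minimal sketch of reading the per-sample values from one record. The sample names `NORMAL` and `PRIMARY` and the record itself are made up for illustration; real TCGA files may name their columns differently.

```python
def per_sample_fields(chrom_header, record):
    """Map each sample column to its FORMAT key/value pairs for one VCF record."""
    columns = chrom_header.lstrip("#").rstrip("\n").split("\t")
    values = record.rstrip("\n").split("\t")
    sample_names = columns[9:]           # columns 0-7 are fixed, 8 is FORMAT
    format_keys = values[8].split(":")   # e.g. ["GT", "SS"]
    return {
        name: dict(zip(format_keys, sample_value.split(":")))
        for name, sample_value in zip(sample_names, values[9:])
    }


# Hypothetical header and record, not taken from a real TCGA file.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNORMAL\tPRIMARY"
record = "1\t12345\t.\tA\tG\t50\tPASS\tSS=1\tGT:SS\t0/1:1\t0/1:1"

fields = per_sample_fields(header, record)
print(fields["NORMAL"]["SS"], fields["PRIMARY"]["SS"])
```

The INFO column (`SS=1` here) is not split per sample, so any per-sample question has to be answered from the FORMAT values.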

vcf TCGA sequencing snpSift • 3.5k views

Where did you get the VCF files? Did you apply for access to the protected germline data?

8.0 years ago
echen1 • 0

The TCGA public data has had tumor variants that overlap with known SNPs (i.e., presumed germline variants) filtered out of it. So if you want germline variants and don't have the protected data, you must apply for access to it.


Hi echen1, thank you for your reply.

I have the protected germline mutation files, with the proper authorization. I have checked that the exome data was derived from blood samples, so I treat variants with somatic status (SS) = 1 as germline variants.
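The SS = 1 rule can be sketched as a small filter in plain Python. This is only an illustration under assumptions: the function name `germline_records`, the sample column name `PRIMARY`, and the example lines are hypothetical, and which sample column actually carries the SS call should be checked against the file's own header.

```python
def germline_records(lines, sample, germline_code="1"):
    """Yield VCF data lines whose FORMAT SS value for `sample` equals germline_code."""
    sample_index = None
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue                      # skip blank and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            sample_index = fields.index(sample)  # locate the sample column by name
            continue
        if sample_index is None:
            raise ValueError("no #CHROM header line before the first record")
        # Pair FORMAT keys with this sample's values; trailing values may be absent.
        sample_values = dict(zip(fields[8].split(":"), fields[sample_index].split(":")))
        if sample_values.get("SS") == germline_code:
            yield line


# Hypothetical input, not taken from a real TCGA file.
vcf_lines = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNORMAL\tPRIMARY",
    "1\t100\t.\tA\tG\t50\tPASS\t.\tGT:SS\t0/1:1\t0/1:1",
    "1\t200\t.\tC\tT\t50\tPASS\t.\tGT:SS\t0/0:0\t0/1:2",
]

kept = list(germline_records(vcf_lines, sample="PRIMARY"))
print(len(kept))
```

For a file on disk, the same function can be given an open file handle in place of the list, since it only iterates over lines.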
