TCGA Germline allelic fraction distribution
3 months ago

I am doing germline variant calling on TCGA data, and I have started noticing something strange.

As a test, I downloaded one tumor/normal genome BAM file pair. I first ran variant calling with Strelka (starting from the alignment file and using the various TCGA reference files) and noticed that the distribution of allelic fractions did not look right: it was skewed (see the right figure below, WGS example) and did not show the bimodal homozygous/heterozygous distribution I would expect, with homozygous variants forming one big peak at 1 and heterozygous variants centered around 0.5 (see the left figure below). Thinking I had done something wrong, I converted the BAM to FASTQ and reran the analysis from scratch, but got the same result. The figure below shows the distribution I would expect, and have observed in other projects, next to what I am seeing on a TCGA tumor/normal pair.
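For anyone who wants to reproduce this kind of check, here is a minimal sketch of how the allelic fractions could be computed from a germline VCF. It assumes the caller writes per-sample allelic depths in the FORMAT/AD field (ref,alt), and it only looks at biallelic records. The function names and the example records are made up for illustration; they are not taken from the original analysis.

```python
from collections import Counter


def allelic_fractions(vcf_lines, sample_index=0):
    """Yield alt / (ref + alt) from FORMAT/AD for biallelic records."""
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10 or "," in fields[4]:
            continue  # skip sites-only and multiallelic records
        keys = fields[8].split(":")
        if "AD" not in keys:
            continue
        values = fields[9 + sample_index].split(":")
        idx = keys.index("AD")
        if idx >= len(values):
            continue  # trailing FORMAT fields were dropped for this sample
        ad = values[idx].split(",")
        if len(ad) != 2 or "." in ad:
            continue
        ref, alt = int(ad[0]), int(ad[1])
        if ref + alt == 0:
            continue
        yield alt / (ref + alt)


def histogram(fractions, n_bins=10):
    """Count fractions in n_bins equal-width bins over [0, 1]."""
    counts = Counter(min(int(f * n_bins), n_bins - 1) for f in fractions)
    return [counts.get(i, 0) for i in range(n_bins)]


# Hypothetical records, for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNORMAL",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:AD\t0/1:10,10",
    "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT:AD\t1/1:0,20",
    "chr1\t300\t.\tG\tA\t50\tPASS\t.\tGT:AD\t0/1:12,8",
]
fractions = list(allelic_fractions(example))
bins = histogram(fractions)
print(bins)
```

On a real germline call set, most of the counts would be expected to fall near 0.5 (heterozygous) and in the last bin (homozygous alt); a strongly skewed histogram is the pattern described above.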

[Figure: expected allelic fraction distribution (left) and distribution observed on the TCGA tumor/normal pair (right)]

Can you explain? Any advice appreciated.

calling variant tcga • 679 views

First I ran variant calling using Strelka (starting from the alignment file and using the various TCGA reference files)...

How did you run variant calling with Strelka (Germline or Somatic mode)? Did you run variant calling for both samples (tumour and normal) together, or each sample separately? If you are performing somatic variant calling, then you would not expect the variant frequencies to conform to your model example.


Hello, thank you very much for your response. The error was actually in how I am calling Strelka: I ran HaplotypeCaller on the same data and got the expected distribution. I am running Strelka through Sarek, so I will check how it is being invoked there.


By the way, this problem is solved. The issue was that TCGA is WXS data, and I was missing the appropriate parameter (--wes in Sarek, which as far as I can tell corresponds to Strelka's own --exome option). Once I added it, we saw the expected distribution.
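For anyone running Strelka directly rather than through Sarek, the following is a rough sketch of how the germline configuration command could be assembled with the exome setting switched on. It assumes Strelka2's `configureStrelkaGermlineWorkflow.py` script and its `--exome` option; the helper name and all file paths are hypothetical, and the command is only built here, not executed.

```python
def build_strelka_germline_command(bam, reference_fasta, run_dir,
                                   exome=False, call_regions=None):
    """Return the argument list for configuring a Strelka2 germline run."""
    cmd = [
        "configureStrelkaGermlineWorkflow.py",
        "--bam", bam,
        "--referenceFasta", reference_fasta,
        "--runDir", run_dir,
    ]
    if exome:
        # Exome/targeted data: turns off the depth-based filters tuned for WGS.
        cmd.append("--exome")
    if call_regions:
        # Optional bgzipped, tabix-indexed BED restricting where calls are made.
        cmd += ["--callRegions", call_regions]
    return cmd


# Hypothetical paths, for illustration only.
cmd = build_strelka_germline_command(
    bam="normal.bam",
    reference_fasta="reference.fa",
    run_dir="strelka_germline_run",
    exome=True,
)
print(" ".join(cmd))
```

With Sarek, the equivalent would be to pass `--wes` to the pipeline and let it set the Strelka option.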
