Question: Calculate allele frequency for structural variant breakpoints
gravatar for nr23
2.7 years ago by
nr2380 wrote:

I have a list of high quality structural variant breakpoints that are the product of multiple different SV callers. While some of the approaches I use estimate the allele frequency for called breakpoints, others do not, so in some cases I have high quality variants that lack allele frequency estimates.

Because of this (and **), I would like to be able to calculate allele frequencies for a list of genomic coordinates.

Allele frequencies in this context refer to the fraction of reads supporting an alternate allele. A homozygous variant supported by all reads would have an allele frequency of 1, whereas a heterozygous would be 0.5 (assuming 100% purity of sample).

This should be possible by looking at the alignment file (.bam) and counting the number of reads supporting and disagreeing for each variant call. E.g. AR/RR+AR (alternate reads/referece reads + alternate reads).

I am currently thinking that I will have to write my own tool to do this (probably using PySam, but if anyone can suggest an existing approach, or an easy way of achieving this I would be happy to not re-invent this particular wheel!

** Also because I see a high discrepancy between the allele frequency values given by different approaches (DELLY, LUMPY+SVtyper, Meerkat, novobreak) for the same variant

next-gen genome • 1.7k views
ADD COMMENTlink written 2.7 years ago by nr2380

Can't you do this using samtools mpileup?

ADD REPLYlink written 2.7 years ago by WouterDeCoster44k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1889 users visited in the last hour