Question: Calculate allele frequency for structural variant breakpoints
gravatar for nr23
9 months ago by
nr2370 wrote:

I have a list of high quality structural variant breakpoints that are the product of multiple different SV callers. While some of the approaches I use estimate the allele frequency for called breakpoints, others do not, so in some cases I have high quality variants that lack allele frequency estimates.

Because of this (and **), I would like to be able to calculate allele frequencies for a list of genomic coordinates.

Allele frequencies in this context refer to the fraction of reads supporting an alternate allele. A homozygous variant supported by all reads would have an allele frequency of 1, whereas a heterozygous would be 0.5 (assuming 100% purity of sample).

This should be possible by looking at the alignment file (.bam) and counting the number of reads supporting and disagreeing for each variant call. E.g. AR/RR+AR (alternate reads/referece reads + alternate reads).

I am currently thinking that I will have to write my own tool to do this (probably using PySam, but if anyone can suggest an existing approach, or an easy way of achieving this I would be happy to not re-invent this particular wheel!

** Also because I see a high discrepancy between the allele frequency values given by different approaches (DELLY, LUMPY+SVtyper, Meerkat, novobreak) for the same variant

next-gen genome • 552 views
ADD COMMENTlink written 9 months ago by nr2370

Can't you do this using samtools mpileup?

ADD REPLYlink written 9 months ago by WouterDeCoster32k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2041 users visited in the last hour