Question: Infer loss of heterozygosity from SNP variant frequencies
9 months ago, ceruleanivy wrote:

I would like to know which methods are most commonly used to infer an LOH event from variant allele frequencies (VAFs) at SNPs. I have targeted sequencing data from FFPE tumors, so that is the only approach available to me. My current problems are:

a) Finding a script (R?) that takes tumor cell content (%) into account.
b) Normalizing my data (quantile?), as quite a few of my frequencies don't cluster at VAFs of 0, 0.5 and 1.

Note that my samples include a normal site and a primary tumor for each case. Initially I tried to use Bioconductor's quantsmooth but couldn't figure out the right way to format my input.

Tags: sequencing, snp, next-gen, R
9 months ago, Amitm wrote:

Hi, I haven't understood some parts of your question, such as what you mean by normalizing allele frequencies, but I will try to explain what I understand about inferring LOH. First off, some variant callers, such as VarScan, report LOH calls alongside somatic calls. So if you have paired samples, it is quite easy.

The part I didn't understand: "...few freq. dont aggregate in 0, 0.5 & 1 VAF". As far as I know, unless you have a 100% pure tumor sample AND no clonal heterogeneity AND no copy-number variation, VAFs will not fall into those three classes. They will instead form a continuous spectrum from 0 to 100%.
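To make the purity point concrete, here is a minimal sketch of the VAF one would expect at a germline heterozygous SNP when tumor and normal cells are mixed. It assumes a simple two-population model (a single tumor clone at fraction `purity`, diploid heterozygous normal cells for the rest); the function and parameter names are illustrative and do not come from any package.

```python
def expected_het_vaf(purity, tumor_total_cn, tumor_alt_copies):
    """Expected alt-allele fraction at a germline het SNP in a tumor/normal mix.

    Assumption: normal cells are diploid and carry exactly one alt copy;
    tumor cells carry `tumor_alt_copies` alt copies out of `tumor_total_cn`.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    if not 0 <= tumor_alt_copies <= tumor_total_cn:
        raise ValueError("alt copies must be between 0 and total copy number")
    alt = purity * tumor_alt_copies + (1.0 - purity) * 1
    total = purity * tumor_total_cn + (1.0 - purity) * 2
    if total == 0:
        raise ValueError("no DNA at this locus (pure tumor, homozygous deletion)")
    return alt / total


purity = 0.6  # hypothetical tumor cell content of 60%
scenarios = {
    "no LOH (CN 2, 1 alt)": (2, 1),
    "copy-neutral LOH, alt retained": (2, 2),
    "copy-neutral LOH, alt lost": (2, 0),
    "single-copy deletion, alt retained": (1, 1),
    "single-copy deletion, alt lost": (1, 0),
}
for label, (total_cn, alt_copies) in scenarios.items():
    print(f"{label}: {expected_het_vaf(purity, total_cn, alt_copies):.3f}")
```

At 60% purity this gives 0.5 without LOH, 0.8 and 0.2 for copy-neutral LOH, and roughly 0.71 and 0.29 for a single-copy deletion, so even clean LOH events land well away from 0 and 1.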

If you are not using software like VarScan, another way to look for LOH is to write custom scripts that examine the VCF files from single-sample variant calling. Most VCFs report allele depths (ref. and var.) for each variant. One simplistic approach is to look for positions with a variant call in the normal sample but no call in the tumor sample (all reads supporting the reference allele instead).
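A rough sketch of such a custom script is below. It assumes each single-sample VCF carries an `AD` entry (ref,alt depths) in its FORMAT column, which many callers emit but not all. The thresholds (`het_range`, `min_shift`, `min_depth`) and the toy records are made up for illustration and would need tuning on real data.

```python
def allele_depths(format_field, sample_field):
    """Return (ref_depth, alt_depth) from a VCF sample column, or None if unusable."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    if "AD" not in keys or keys.index("AD") >= len(values):
        return None
    parts = values[keys.index("AD")].split(",")
    # Only the ref depth and the first alt depth are used (multi-allelic sites are simplified).
    if len(parts) < 2 or not all(p.isdigit() for p in parts[:2]):
        return None
    return int(parts[0]), int(parts[1])


def parse_vcf_lines(lines):
    """Map (chrom, pos, ref, alt) -> (ref_depth, alt_depth) for a single-sample VCF."""
    sites = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10:
            continue
        depths = allele_depths(cols[8], cols[9])
        if depths is not None:
            sites[(cols[0], int(cols[1]), cols[3], cols[4])] = depths
    return sites


def candidate_loh(normal, tumor, het_range=(0.35, 0.65), min_shift=0.2, min_depth=20):
    """List (site, normal_vaf, tumor_vaf) for het-in-normal sites that shift in the tumor.

    tumor_vaf is None when the tumor VCF has no record at the site, which may
    mean loss of the alt allele or simply missing coverage.
    """
    depth_floor = max(min_depth, 1)  # avoids division by zero
    hits = []
    for site, (n_ref, n_alt) in sorted(normal.items()):
        if n_ref + n_alt < depth_floor:
            continue
        n_vaf = n_alt / (n_ref + n_alt)
        if not het_range[0] <= n_vaf <= het_range[1]:
            continue  # only germline-heterozygous sites are informative
        if site not in tumor:
            hits.append((site, n_vaf, None))
            continue
        t_ref, t_alt = tumor[site]
        if t_ref + t_alt < depth_floor:
            continue
        t_vaf = t_alt / (t_ref + t_alt)
        if abs(t_vaf - n_vaf) >= min_shift:
            hits.append((site, n_vaf, t_vaf))
    return hits


# Toy records, invented for this example.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE"
normal_vcf = [
    header,
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:AD\t0/1:52,48",
    "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT:AD\t0/1:45,55",
    "chr1\t300\t.\tG\tA\t50\tPASS\t.\tGT:AD\t1/1:1,99",
    "chr2\t150\t.\tT\tC\t50\tPASS\t.\tGT:AD\t0/1:50,50",
]
tumor_vcf = [
    header,
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:AD\t0/1:80,20",
    "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT:AD\t0/1:48,52",
    "chr1\t300\t.\tG\tA\t50\tPASS\t.\tGT:AD\t1/1:0,100",
]
hits = candidate_loh(parse_vcf_lines(normal_vcf), parse_vcf_lines(tumor_vcf))
for site, n_vaf, t_vaf in hits:
    print(site, round(n_vaf, 2), "no tumor call" if t_vaf is None else round(t_vaf, 2))
```

Restricting the comparison to sites that are heterozygous in the normal sample is a deliberate choice here, since homozygous germline sites carry no information about allelic loss. A site missing from the tumor VCF is reported separately because it cannot be distinguished from a coverage gap without looking at the tumor BAM.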


Thanks for the reply. My problem is that a considerable number of the recorded VAFs fall between 0.2-0.3 and 0.70-0.85, which gives a quite unfamiliar scatter plot. That's why I thought about trying quantile normalization (only for better visualization).
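Before normalizing such values away, it may be worth checking what tumor purity they would imply if they came from LOH at germline heterozygous SNPs. The sketch below assumes a simple mixture of one tumor clone with diploid heterozygous normal cells. Under that assumption, copy-neutral LOH gives VAF = (1 ± purity) / 2, and a single-copy deletion gives a retained-allele fraction of 1 / (2 - purity). The function names are illustrative only.

```python
def folded_vaf(vaf):
    """Distance of a VAF from 0.5, so that 0.2 and 0.8 are treated alike."""
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be between 0 and 1")
    return abs(vaf - 0.5)


def purity_if_copy_neutral_loh(vaf):
    """Purity implied by a het-SNP VAF, assuming copy-neutral LOH in all tumor cells."""
    return 2.0 * folded_vaf(vaf)


def purity_if_single_copy_loss(vaf):
    """Purity implied by a het-SNP VAF, assuming one allele is deleted in all tumor cells."""
    retained_fraction = 0.5 + folded_vaf(vaf)  # fraction of reads from the retained allele
    return 2.0 - 1.0 / retained_fraction


for observed in (0.20, 0.30, 0.70, 0.85):
    print(
        f"VAF {observed:.2f}: "
        f"purity {purity_if_copy_neutral_loh(observed):.2f} if copy-neutral LOH, "
        f"{purity_if_single_copy_loss(observed):.2f} if single-copy loss"
    )
```

If the implied purities are roughly consistent across a chromosomal segment and close to the pathologist's estimate of tumor cell content, the off-center VAFs are more plausibly signal than noise. Plotting the folded values along the genome is also one way to get a cleaner view without altering the data.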

Written 9 months ago by ceruleanivy

I haven't heard of normalizing VAFs (pardon me if I have missed something very obvious); they are supposed to be a continuous spread. One consideration is whether you performed whole-exome/genome sequencing or targeted gene-panel sequencing. If the number of variants is small to start with, the spread will not appear smooth. I am attaching a scatterplot comparing the VAFs of two related tumor samples.

[Image: scatterplot of VAFs in two related tumor samples]

The pink dot was a hallmark mutation and hence highlighted. The grey blobs are private mutations.

Written 9 months ago by Amitm