Question: Infer loss of heterozygosity from SNP variant frequencies
3.3 years ago by
ceruleanivy30 wrote:

I would like to know which methods are most commonly used to infer a LOH event from variant allele frequencies (VAFs) at SNPs. I have targeted sequencing data from FFPE tumors, so that is the only approach available to me. My current problems are:

a) Finding a script (R?) that takes tumor cell content (%) into account.
b) Normalizing my data (quantile?), as quite a few of my frequencies do not cluster at VAFs of 0, 0.5 and 1.

Note that my samples include a normal site and a primary tumor for each case. Initially I tried to use Bioconductor's quantsmooth but couldn't figure out the right way to format my input.

sequencing snp next-gen R • 1.3k views
3.3 years ago by
Amitm1.6k wrote:

Hi, I haven't understood some parts of your question, such as what you mean by normalizing allele frequencies, but I will try to explain what I understand about inferring LOH. First off, some variant callers, such as VarScan, give you LOH calls alongside somatic calls, so if you have paired samples it is quite easy. The part I didn't understand: "...few freq. dont aggregate in 0, 0.5 & 1 VAF". As far as I know, unless you have a 100% pure tumor sample AND no clonal heterogeneity AND no copy-number variation, VAFs will not fall into those 3 classes; rather, they will form a continuous spectrum ranging from 0 to 100%.
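To make the purity point concrete, here is a minimal Python sketch of the usual mixture arithmetic for a germline SNP: the expected VAF is the alternate-allele copies divided by the total copies, each weighted by the tumor and normal cell fractions. The function name `expected_vaf` and its parameters are made up for illustration, and the model assumes a single tumor clone, no reference-allele bias, and a known copy-number state, none of which is guaranteed for FFPE targeted data.

```python
def expected_vaf(purity, tumor_alt_copies, tumor_total_copies,
                 normal_alt_copies=1, normal_total_copies=2):
    """Expected VAF of a germline SNP in a mixture of tumor and normal cells.

    purity: fraction of cells in the sample that are tumor cells (0..1).
    The defaults describe a SNP that is heterozygous in normal cells.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    alt = purity * tumor_alt_copies + (1.0 - purity) * normal_alt_copies
    total = purity * tumor_total_copies + (1.0 - purity) * normal_total_copies
    if total == 0:
        raise ValueError("no copies of the locus in the sample")
    return alt / total


if __name__ == "__main__":
    for purity in (0.3, 0.5, 0.7, 1.0):
        # Copy-neutral LOH: tumor keeps two copies, both the alternate allele.
        cn_loh = expected_vaf(purity, tumor_alt_copies=2, tumor_total_copies=2)
        # Deletion LOH: tumor keeps one copy, the alternate allele.
        del_loh = expected_vaf(purity, tumor_alt_copies=1, tumor_total_copies=1)
        print(f"purity={purity:.1f}  copy-neutral LOH VAF={cn_loh:.2f}  "
              f"deletion LOH VAF={del_loh:.2f}")
```

Under these assumptions a heterozygous SNP only reaches a VAF of 0 or 1 at 100% purity; at lower purity the LOH clusters move toward 0.5, which is one reason the three clean classes rarely appear in real tumor data.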

If you are not using software like VarScan, another way of looking for LOH is to write custom scripts that examine the VCF files from single-sample variant calling. Most VCFs report allele depths (reference and variant) for each variant. Looking for positions with a variant call in the normal sample but no call (all reads supporting the ref. allele instead) in the tumor sample is one simplistic way.
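A rough sketch of such a custom script is below, in plain Python with no VCF library. It assumes each single-sample VCF carries an `AD` entry in the FORMAT column with depths as `ref,alt`; the function names, the thresholds (`min_depth`, `het_range`, `min_shift`) and the toy records are all invented for illustration and would need tuning on real data. Besides the "missing in tumor" case, it also flags sites whose tumor VAF moved away from the normal VAF, which is an extension of the simplistic rule rather than part of it.

```python
def load_allele_depths(vcf_text, sample_column=9):
    """Map (chrom, pos) -> (ref_depth, alt_depth) using the FORMAT/AD field."""
    depths = {}
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) <= sample_column:
            continue
        keys = fields[8].split(":")
        values = fields[sample_column].split(":")
        if "AD" not in keys or keys.index("AD") >= len(values):
            continue
        try:
            ad = [int(x) for x in values[keys.index("AD")].split(",")]
        except ValueError:  # e.g. AD reported as "."
            continue
        if len(ad) >= 2:
            # Only the first ALT allele is considered in this sketch.
            depths[(fields[0], int(fields[1]))] = (ad[0], ad[1])
    return depths


def flag_loh(normal, tumor, min_depth=20, het_range=(0.3, 0.7), min_shift=0.2):
    """Flag sites that look heterozygous in normal but shifted or absent in tumor."""
    calls = []
    for site, (n_ref, n_alt) in normal.items():
        n_depth = n_ref + n_alt
        if n_depth < min_depth:
            continue
        n_vaf = n_alt / n_depth
        if not het_range[0] <= n_vaf <= het_range[1]:
            continue  # only sites heterozygous in normal are informative
        if site not in tumor:
            # No tumor record: possibly all reads support the reference allele,
            # but tumor coverage must be checked (e.g. in the BAM) to confirm.
            calls.append((site, n_vaf, None, "no_tumor_call"))
            continue
        t_ref, t_alt = tumor[site]
        t_depth = t_ref + t_alt
        if t_depth < min_depth:
            continue
        t_vaf = t_alt / t_depth
        if abs(t_vaf - n_vaf) >= min_shift:
            calls.append((site, n_vaf, t_vaf, "LOH_candidate"))
    return calls


HEADER = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE"
# Toy records, made up for this example.
NORMAL_VCF = "\n".join([
    HEADER,
    "chr1\t1000\t.\tA\tG\t50\tPASS\t.\tGT:AD\t0/1:48,52",
    "chr1\t2000\t.\tC\tT\t50\tPASS\t.\tGT:AD\t0/1:55,45",
    "chr1\t3000\t.\tG\tA\t50\tPASS\t.\tGT:AD\t0/1:50,50",
    "chr1\t4000\t.\tT\tC\t50\tPASS\t.\tGT:AD\t1/1:1,99",
])
TUMOR_VCF = "\n".join([
    HEADER,
    "chr1\t1000\t.\tA\tG\t50\tPASS\t.\tGT:AD\t0/1:20,80",
    "chr1\t2000\t.\tC\tT\t50\tPASS\t.\tGT:AD\t0/1:52,48",
    "chr1\t4000\t.\tT\tC\t50\tPASS\t.\tGT:AD\t1/1:0,100",
])

calls = flag_loh(load_allele_depths(NORMAL_VCF), load_allele_depths(TUMOR_VCF))
for site, normal_vaf, tumor_vaf, label in calls:
    print(site, normal_vaf, tumor_vaf, label)
```

A site missing from the tumor VCF is reported separately because absence of a call can mean either "all reads support the reference allele" or simply "no coverage", and only the former is evidence for LOH.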


Thanks for the reply. My problem is that a considerable number of the recorded VAFs fall between 0.2-0.3 and 0.70-0.85, which produces a rather unfamiliar scatter plot. That is why I thought about trying quantile normalization (only for better visualization).
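One way to read those intermediate bands without normalizing them is to ask what tumor fraction they would imply. The Python sketch below inverts the mixture arithmetic for one specific scenario: a SNP that is heterozygous in normal cells and has undergone copy-neutral LOH in every tumor cell, so that VAF = (1 ± purity) / 2. The function name is hypothetical, and the scenario itself is an assumption; deletions, gains, subclones and FFPE noise would all give different numbers.

```python
def implied_purity_copy_neutral_loh(vaf):
    """Tumor fraction implied by a tumor VAF, assuming copy-neutral LOH of a
    germline heterozygous SNP in all tumor cells: VAF = (1 +/- purity) / 2."""
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be between 0 and 1")
    return abs(2.0 * vaf - 1.0)


if __name__ == "__main__":
    # Band edges quoted in the post above.
    for vaf in (0.2, 0.3, 0.7, 0.85):
        purity = implied_purity_copy_neutral_loh(vaf)
        print(f"VAF={vaf:.2f} -> implied tumor fraction ~ {purity:.2f}")
```

If the implied fractions roughly agree with the pathologist's tumor cell content estimate, the bands are what LOH in an impure sample is expected to look like, and forcing them toward 0, 0.5 and 1 would hide that signal.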

written 3.3 years ago by ceruleanivy

I haven't heard of normalizing VAFs (pardon me if I have missed something very obvious); they are supposed to be a continuous spread. One consideration is whether you performed whole-exome/genome sequencing or targeted gene panel sequencing. If the number of variants is small to start with, the spread will not appear smooth. I am attaching a scatterplot comparing the VAFs of two related tumor samples. (scatterplot image) The pink dot was a hallmark mutation and hence highlighted. The grey blobs are private mutations.

written 3.3 years ago by Amitm