Question: LOH and CNV data from VCF files
6.2 years ago by
European Union
GarF20 wrote:

I was asked to find both the CNV and the LOH data from 3 VCF files (all from the same patient, but 2 of them are from different tumors). Here's one of the rows, followed by the relevant header lines:

chr1 843352 . T C 11.3 . DP=39;VDB=0.0412;AF1=0.5;AC1=1;DP4=13,11,5,5;MQ=42;FQ=14.2;PV4=1,1,1.2e-05,1 GT:PL:GQ 0/1:41,0,255:43


##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw read depth">
##INFO=<ID=DP4,Number=4,Type=Integer,Description="# high-quality ref-forward bases, ref-reverse, alt-forward and alt-reverse bases">
##INFO=<ID=MQ,Number=1,Type=Integer,Description="Root-mean-square mapping quality of covering reads">
##INFO=<ID=FQ,Number=1,Type=Float,Description="Phred probability of all samples being the same">
##INFO=<ID=AF1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele frequency (assuming HWE)">
##INFO=<ID=AC1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele count (no HWE assumption)">
##INFO=<ID=PV4,Number=4,Type=Float,Description="P-values for strand bias, baseQ bias, mapQ bias and tail distance bias">
##INFO=<ID=VDB,Number=1,Type=Float,Description="Variant Distance Bias">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">

Since I'm new to this, I did some research and, considering that all I have are these VCFs, it seems that I lack both the BAM/SAM files and the INFO fields in the VCFs that I would need to get this data.
I kept searching anyway and came across a Bioconductor package, SomatiCA, that seems to deliver what I
need from, among other things, lesser allele frequency (LAF) information. I did some more research, but now
I'm struggling to figure out whether and how I can calculate the LAFs from just what I have.

Any help regarding the whole situation will be appreciated, thanks in advance.

cnv loh vcf • 3.2k views
5.8 years ago by
Seattle,WA, USA
ivivek_ngs5.0k wrote:

From the VCF file you can always extract the allele frequency. The lesser allele frequency (LAF) is, by definition, at most 50%. So at each position, calculate the LAF from the read depth (for example, from the DP4 counts in the INFO column), then create a tab-delimited file in the format required by SomatiCA and run the required algorithm. You should be familiar with Unix and shell commands.
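
As a rough illustration of the per-position calculation, here is a minimal Python sketch that reads the DP4 counts from the INFO column and takes the smaller of the reference and alternate read counts over their sum. The function names and the output columns (chromosome, position, depth, LAF) are my own assumptions, not the input format SomatiCA actually requires, so check the package documentation before reshaping the output. It also assumes single-sample records with one ALT allele, like the row in the question.

```python
def parse_dp4(info):
    """Return the four DP4 counts from a VCF INFO string, or None if absent."""
    for field in info.split(";"):
        if field.startswith("DP4="):
            return [int(x) for x in field[4:].split(",")]
    return None


def lesser_allele_frequency(dp4):
    """LAF = min(ref reads, alt reads) / (ref reads + alt reads)."""
    ref = dp4[0] + dp4[1]  # ref-forward + ref-reverse
    alt = dp4[2] + dp4[3]  # alt-forward + alt-reverse
    total = ref + alt
    if total == 0:
        return None
    return min(ref, alt) / total


def laf_rows(vcf_lines):
    """Yield (chrom, pos, depth, laf) for each record that carries DP4."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        cols = line.split()  # assumes no whitespace inside any column
        chrom, pos, info = cols[0], int(cols[1]), cols[7]
        dp4 = parse_dp4(info)
        if dp4 is None:
            continue
        laf = lesser_allele_frequency(dp4)
        if laf is None:
            continue
        yield chrom, pos, sum(dp4), laf


# The row quoted in the question, used here as a small test input.
example_row = (
    "chr1 843352 . T C 11.3 . "
    "DP=39;VDB=0.0412;AF1=0.5;AC1=1;DP4=13,11,5,5;MQ=42;FQ=14.2;"
    "PV4=1,1,1.2e-05,1 GT:PL:GQ 0/1:41,0,255:43"
)

if __name__ == "__main__":
    # Hypothetical tab-delimited layout: chrom, pos, depth, LAF.
    for chrom, pos, depth, laf in laf_rows([example_row]):
        print(f"{chrom}\t{pos}\t{depth}\t{laf:.3f}")
```

For the row in the question, DP4=13,11,5,5 gives 24 reference reads and 10 alternate reads, so the LAF is 10/34, or about 0.29. Note that DP4 counts only high-quality bases, which is why its sum (34) is lower than the raw DP of 39.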
