Question: VCF merge and VCF intersect
5.6 years ago, Kousik wrote:

I want to use vcf-merge or vcf-intersect on multiple VCF files and then filter the variants based on their QUAL and DP values. I have a basic question about the way vcftools works.


When I intersect two or more VCF files, the resulting VCF file always shows the QUAL and DP values of a variant from the 1st VCF file. The QUAL/DP values from the remaining VCF files are ignored. If that is the case, then filtering after intersection is difficult.


When I merge 2 VCF files, I get 2 separate columns for the different genotypes, which is completely fine, but why are the average of the QUAL values and the sum of the DP values shown? For example, if the 1st VCF file has a SNP with QUAL=40 and DP=2 and the 2nd VCF file has the same SNP with QUAL=200 and DP=50, then the merged VCF file will have the variant with QUAL=120 and DP=52. That looks quite promising even though the variant's quality and depth were bad in one of the VCF files. In this case, too, filtering after merging is difficult.
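The arithmetic in the example above can be checked with a few lines of Python. This is only a sketch of the averaging and summing behaviour as described in the question, not of vcf-merge itself; the file names and the thresholds `MIN_QUAL` and `MIN_DP` are hypothetical and chosen purely for illustration.

```python
from statistics import mean

# Per-file values for the same SNP, taken from the example in the question.
# The file names are placeholders.
records = [
    {"file": "first.vcf", "QUAL": 40.0, "DP": 2},
    {"file": "second.vcf", "QUAL": 200.0, "DP": 50},
]

# Hypothetical filter thresholds, for illustration only.
MIN_QUAL = 50.0
MIN_DP = 10


def passes(qual, dp):
    """Return True if a record meets both illustrative thresholds."""
    return qual >= MIN_QUAL and dp >= MIN_DP


# Merged values as described in the question: mean of QUAL, sum of DP.
merged_qual = mean(r["QUAL"] for r in records)
merged_dp = sum(r["DP"] for r in records)

merged_passes = passes(merged_qual, merged_dp)
per_file_passes = {r["file"]: passes(r["QUAL"], r["DP"]) for r in records}

print(merged_qual, merged_dp, merged_passes)
print(per_file_passes)
```

With these assumed thresholds the merged record passes the filter while the record from the first file on its own does not, which is exactly the masking effect described above.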

Should I filter the variants before merging them? Any suggestions will be highly appreciated.


5.6 years ago, RamRS (Houston, TX) wrote:

Well, the VCF intersect documentation has a disclaimer about these unreliable results. Personally, I have yet to use these tools, but I don't see how they could be a stable part of any pipeline. That said, I think this is the most the tool can do, given that it cannot be expected to predict which value to pick from the multiple input files. I'd suggest using the intersect just to get the chr and pos values, and then using vcftools (the binary, not the Perl module) to filter the variants once you decide which file each variant should get its information from.

I'd suggest not relying on the Perl module's merge and isec as part of any stable pipeline. They can provide information, but they will not fit in and give reliable output that you can use.
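One way to act on this, sketched here in plain Python rather than with vcftools, is to apply the QUAL/DP filter to each file separately and only then intersect on chromosome and position, so that no averaged or summed values are involved. Everything in the sketch is an assumption made for illustration: the toy VCF contents, the thresholds, the helper names `parse_vcf` and `passing_sites`, and the premise that DP is stored in the INFO column.

```python
def parse_vcf(text):
    """Yield (chrom, pos, qual, dp) for each data line of a VCF string."""
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        chrom, pos, qual, info = fields[0], int(fields[1]), fields[5], fields[7]
        dp = None
        for entry in info.split(";"):
            if entry.startswith("DP="):
                dp = int(entry[3:])
        yield chrom, pos, (float(qual) if qual != "." else None), dp


def passing_sites(text, min_qual, min_dp):
    """Return the set of (chrom, pos) whose own QUAL and DP pass the filter."""
    sites = set()
    for chrom, pos, qual, dp in parse_vcf(text):
        if qual is None or dp is None:
            continue  # missing values cannot pass a threshold
        if qual >= min_qual and dp >= min_dp:
            sites.add((chrom, pos))
    return sites


HEADER = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"

# Toy inputs (hypothetical data, not from any real sample).
VCF_A = HEADER + "\n".join([
    "chr1\t100\t.\tA\tG\t40\t.\tDP=2",
    "chr1\t200\t.\tC\tT\t150\t.\tDP=30",
])
VCF_B = HEADER + "\n".join([
    "chr1\t100\t.\tA\tG\t200\t.\tDP=50",
    "chr1\t200\t.\tC\tT\t180\t.\tDP=45",
    "chr1\t300\t.\tG\tA\t90\t.\tDP=20",
])

# Hypothetical thresholds, for illustration only.
MIN_QUAL = 50.0
MIN_DP = 10

# Filter each file on its own values first, then intersect the positions.
shared = passing_sites(VCF_A, MIN_QUAL, MIN_DP) & passing_sites(VCF_B, MIN_QUAL, MIN_DP)

for chrom, pos in sorted(shared):
    print(f"{chrom}\t{pos}")
```

Here chr1:100 is dropped because it fails in the first file, and chr1:300 is dropped because it is present in only one file. The tab-separated chr/pos lines that remain could then be saved to a file and used as a site list for the vcftools binary (for example through its `--positions` option, assuming your version provides it).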
