Question: VCF merge and VCF intersect
Kousik (Germany) wrote 4.8 years ago:

I want to use vcf-merge or vcf-intersect on multiple VCF files and then filter the variants based on their QUAL and DP values. I have a basic question about the way vcftools works.

vcf-intersection:

When I intersect two or more VCF files, the resulting VCF file always shows the QUAL and DP values of the variant from the first VCF file. The QUAL/DP values from the remaining VCF files are ignored. If that is the case, then filtering after intersection is difficult.

vcf-merge:

When I merge two VCF files I get two separate columns for the different genotypes, which is completely fine, but why are the average of the QUAL values and the sum of the DP values shown? For example, if the first VCF file has a SNP with QUAL=40 and DP=2 and the second VCF file has the same SNP with QUAL=200 and DP=50, then the merged VCF file will show the variant with QUAL=120 and DP=52. That looks quite promising even though the variant quality and DP were bad in one of the VCF files. In this case, too, filtering after merging is difficult.
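To make the arithmetic in this example concrete, here is a minimal Python sketch. It reproduces the behaviour as described in this post (mean of QUAL, sum of DP); it is not taken from the vcf-merge source. The thresholds `MIN_QUAL` and `MIN_DP` and the file names are hypothetical, chosen only for illustration.

```python
from statistics import mean

# Per-file values for the same SNP, taken from the example in the post.
# The file names are placeholders.
records = [
    {"file": "first.vcf", "qual": 40.0, "dp": 2},
    {"file": "second.vcf", "qual": 200.0, "dp": 50},
]

# Hypothetical filtering thresholds, for illustration only.
MIN_QUAL = 50.0
MIN_DP = 10


def passes(qual, dp):
    """Return True if a record meets both thresholds."""
    return qual >= MIN_QUAL and dp >= MIN_DP


# Behaviour described in the post: QUAL is averaged, DP is summed.
merged_qual = mean(r["qual"] for r in records)
merged_dp = sum(r["dp"] for r in records)

merged_passes = passes(merged_qual, merged_dp)
per_file_passes = {r["file"]: passes(r["qual"], r["dp"]) for r in records}

print(merged_qual, merged_dp, merged_passes)
print(per_file_passes)
```

With these thresholds the merged record passes, while the record from the first file on its own does not, which is exactly the masking effect the question is about.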

Should I filter the variants before merging them? Any suggestions will be highly appreciated.

-Kousik

RamRS (Houston, TX) wrote 4.8 years ago:

Well, the VCF intersect documentation has a disclaimer about these unreliable results. Personally, I have yet to use these tools, but I don't see how they could be a stable part of any pipeline. That said, I think this is the most the tool can do, given that it cannot be expected to predict which genotype value to pick from the multiple input files. I'd suggest using the intersect just to get the chr and pos values, and then using vcftools (the binary http://vcftools.sourceforge.net/man_latest.html, not the Perl module) to filter the variants once you decide where each variant should get its information from.

I'd suggest not relying on the Perl module's merge and isec as part of any stable pipeline. They can provide information, but they won't fit in and give reliable output that you can use.
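One way to act on this, and on the "filter before merging" idea from the question, is to filter each file on its own QUAL and DP first and only then intersect on site keys, keeping every file's values. The following is a rough Python sketch of that idea, not a replacement for vcftools. It assumes DP is stored in the INFO column, uses made-up records and hypothetical thresholds, and the helper names `parse_sites` and `shared_sites` are invented for this example.

```python
def parse_sites(vcf_lines, min_qual, min_dp):
    """Map (chrom, pos, ref, alt) -> (qual, dp) for records passing both thresholds."""
    sites = {}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt, qual, _filter, info = fields[:8]
        # Assumption: DP lives in the INFO column as DP=<int>.
        info_map = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
        if qual == "." or "DP" not in info_map:
            continue  # cannot judge a record without QUAL and DP
        q, dp = float(qual), int(info_map["DP"])
        if q >= min_qual and dp >= min_dp:
            sites[(chrom, int(pos), ref, alt)] = (q, dp)
    return sites


def shared_sites(per_file_sites):
    """Keep sites present in every (already filtered) file, with each file's values."""
    keys = set.intersection(*(set(s) for s in per_file_sites))
    return {k: [s[k] for s in per_file_sites] for k in sorted(keys)}


# Made-up example records; the first site mirrors the QUAL/DP values from the question.
header = ["##fileformat=VCFv4.1", "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"]
vcf_a = header + ["chr1\t100\t.\tA\tG\t40\t.\tDP=2", "chr1\t200\t.\tC\tT\t180\t.\tDP=35"]
vcf_b = header + ["chr1\t100\t.\tA\tG\t200\t.\tDP=50", "chr1\t200\t.\tC\tT\t150\t.\tDP=40"]

# Hypothetical thresholds: QUAL >= 50 and DP >= 10, applied per file.
filtered = [parse_sites(v, min_qual=50.0, min_dp=10) for v in (vcf_a, vcf_b)]
result = shared_sites(filtered)
print(result)
```

Here the site at position 100 is dropped because it fails in the first file, even though its values in the second file are good, and the surviving site keeps both files' QUAL/DP pairs so you can decide later which to trust.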
