Filter 2 mapped VCF files so the "POS" are the same?
1
0
Entering edit mode
23 months ago
JBC • 0

I have two VCF files from different individuals. They are both mapped and have a number of non-overlapping positions. Is there a way to filter then so only the overlapping POS remain in the file?

File 1:

  #CHROM    POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  P1
  NW_014726.1   404 .   A   G   225 .   DP=15;VDB=0.310396;SGB=-0.686358;MQSB=0.640197;MQ0F=0;AC=2;AN=2;DP4=0,0,7,7;MQ=43   GT:PL   1/1:255,42,0
    NW_014726.1 495 .   T   C   225 .   DP=16;VDB=0.351783;SGB=-0.689466;MQSB=0.948436;MQ0F=0;AC=2;AN=2;DP4=0,0,6,10;MQ=39  GT:PL   1/1:255,48,0
    NW_014726.1 542 .   T   G   225 .   DP=16;VDB=0.909865;SGB=-0.688148;MQSB=0.959011;MQ0F=0;AC=2;AN=2;DP4=0,0,8,7;MQ=41   GT:PL   1/1:255,45,0
    NW_014726.1 722 .   G   A   225 .   DP=24;VDB=0.680279;SGB=-0.692352;MQSB=0.769328;MQ0F=0;AC=2;AN=2;DP4=0,0,8,13;MQ=40  GT:PL   1/1:255,63,0
    NW_014726.1 750 .   G   C   225 .   DP=22;VDB=0.501443;SGB=-0.692352;MQSB=0.95645;MQ0F=0;AC=2;AN=2;DP4=0,0,7,14;MQ=39   GT:PL   1/1:255,63,0

File 2:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  M2
NW_014726.1 404 .   A   G   225 .   DP=14;VDB=0.237864;SGB=-0.683931;MQSB=0.679965;MQ0F=0;AC=2;AN=2;DP4=0,0,5,8;MQ=43   GT:PL   1/1:255,39,0
NW_014726.1 495 .   T   C   225 .   DP=15;VDB=0.67567;SGB=-0.686358;MQSB=0.939413;MQ0F=0;AC=2;AN=2;DP4=0,0,4,10;MQ=39   GT:PL   1/1:255,42,0
NW_014726.1 530 .   T   G   225 .   DP=16;VDB=0.650373;SGB=-0.686358;MQSB=0.411112;MQ0F=0;AC=2;AN=2;DP4=0,0,9,5;MQ=42   GT:PL   1/1:255,42,0
NW_014726.1 745 .   G   A   225 .   DP=18;VDB=0.178812;SGB=-0.690438;MQSB=1;MQ0F=0;AC=2;AN=2;DP4=0,0,6,11;MQ=41 GT:PL   1/1:255,51,0
NW_014726.1 750 .   G   C   206 .   DP=17;VDB=0.787081;SGB=-0.680642;RPB=0.893597;MQB=0.031048;MQSB=1;BQB=0.965874;MQ0F=0;ICB=1;HOB=0.5;AC=1;AN=2;DP4=2,3,4,8;MQ=41 GT:PL   0/1:239,0,115

The output I am looking for would remove the POS that are not in common and would look like this:

File 1:

  #CHROM    POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  P1
  NW_014726.1   404 .   A   G   225 .   DP=15;VDB=0.310396;SGB=-0.686358;MQSB=0.640197;MQ0F=0;AC=2;AN=2;DP4=0,0,7,7;MQ=43   GT:PL   1/1:255,42,0
    NW_014726.1 495 .   T   C   225 .   DP=16;VDB=0.351783;SGB=-0.689466;MQSB=0.948436;MQ0F=0;AC=2;AN=2;DP4=0,0,6,10;MQ=39  GT:PL   1/1:255,48,0
    NW_014726.1 750 .   G   C   225 .   DP=22;VDB=0.501443;SGB=-0.692352;MQSB=0.95645;MQ0F=0;AC=2;AN=2;DP4=0,0,7,14;MQ=39   GT:PL   1/1:255,63,0

File 2:

   #CHROM   POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  M2
    NW_014726.1 404 .   A   G   225 .   DP=14;VDB=0.237864;SGB=-0.683931;MQSB=0.679965;MQ0F=0;AC=2;AN=2;DP4=0,0,5,8;MQ=43   GT:PL   1/1:255,39,0
    NW_014726.1 495 .   T   C   225 .   DP=15;VDB=0.67567;SGB=-0.686358;MQSB=0.939413;MQ0F=0;AC=2;AN=2;DP4=0,0,4,10;MQ=39   GT:PL   1/1:255,42,0
    NW_014726.1 750 .   G   C   206 .   DP=17;VDB=0.787081;SGB=-0.680642;RPB=0.893597;MQB=0.031048;MQSB=1;BQB=0.965874;MQ0F=0;ICB=1;HOB=0.5;AC=1;AN=2;DP4=2,3,4,8;MQ=41 GT:PL   0/1:239,0,115

I tried using

bcftools merge

with the hope of just filtering a single file, but i keep running into problems, so no I am looking for a way to filter the individuals files based on the positions in the other.

Looking for any solution in R, BCFtools/VCFtools, or whatever combination of awk, grep, ... works.

VCF bcftools • 712 views
ADD COMMENT
1
Entering edit mode

use bcftools isec

ADD REPLY
0
Entering edit mode
23 months ago
JBC • 0

I found a solution! In case anyone else is trying to do the same thing, I used bedtools:

bedtools intersect -sorted -a File1.vcf -b File2.vcf > NewFile1.vcf

Where NewFile1.vcf is File1.vcf with POS removed that are not in File2.vcf

ADD COMMENT
1
Entering edit mode

this is not want you asked for. I you have any indel in your VCF1, it might overlap a variant in VCF2 while not having the same POS.

ADD REPLY
0
Entering edit mode

OK. I did not know that. But I did already filter out all indels with:

> vcftools  --remove-indels

Does that get around the problem?

ADD REPLY

Login before adding your answer.

Traffic: 2405 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6