bcftools merge: does it merge only overlapping variants, or all variants?
5.6 years ago

Hi there,

I'm not sure about the following: if you use bcftools merge to merge vcf.gz files, does it merge only the overlapping variants, or does it also include variants that are present in one dataset but not in the other?

Thanks!

SNP bcftools merging • 5.9k views

Thank you very much for the answer. Is there an option to only merge the overlapping variants from the beginning?


Hello again,

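Please use the ADD REPLY button below the post you'd like to reply to.

I'm not aware of such an option. If you can make sure that the VCF files you'd like to merge contain no ./. genotypes to begin with, you could filter out those sites after merging: any site that is missing from one of the inputs ends up with ./. genotypes for that input's samples, so excluding sites with ./. leaves only the overlapping variants.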

$ bcftools merge in1.vcf.gz in2.vcf.gz | bcftools filter -e 'GT="./."' > out.vcf

fin swimmer


Okay, thanks. But if I do it that way, will the variant still be present in one part of the dataset (for the samples that had it before merging), or will it be removed from the whole dataset?

Isabel


Your initial datasets will be kept untouched. In the filtered output, the whole site is removed, for all samples.

If you are interested in finding out which variants are present in all your VCF files without merging them, then the term you are looking for is intersect.

Have a look at the man page of bcftools isec for some examples. So e.g. this might be a useful command:

# Extract and write records from A shared by both A and B using exact allele match
   bcftools isec A.vcf.gz B.vcf.gz -p dir -n =2 -w 1
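
To merge only the overlapping variants from the start, one possibility is to intersect first and then merge the two intersected files. The following is a minimal sketch, not a tested pipeline: the function name `merge_shared_sites` and the file names in the usage note are made up for illustration, and it assumes that both inputs are bgzipped and indexed, that their sample names differ, and that `bcftools isec -p` names its per-input outputs `0000.vcf.gz` and `0001.vcf.gz` (the `README.txt` it writes into the directory lists the actual files).

```shell
# Hypothetical helper: keep only records present in both inputs, then merge them.
# $1, $2: bgzipped and indexed input VCFs
# $3:     scratch directory for the isec output
# $4:     merged output file (bgzipped VCF)
merge_shared_sites() {
    # -n=2 keeps records found in exactly two files, i.e. in both inputs;
    # without -w, one output file per input is written into "$3".
    bcftools isec -n=2 -p "$3" -Oz "$1" "$2" || return 1

    # bcftools merge needs indexed inputs; -f overwrites an existing index.
    bcftools index -f -t "$3/0000.vcf.gz" || return 1
    bcftools index -f -t "$3/0001.vcf.gz" || return 1

    # Both files now hold the same set of sites, so no ./. padding is introduced.
    bcftools merge -Oz -o "$4" "$3/0000.vcf.gz" "$3/0001.vcf.gz"
}
```

It could be called as `merge_shared_sites A.vcf.gz B.vcf.gz isec_dir shared.vcf.gz`. Unlike filtering on ./. after merging, this does not drop sites that already had missing genotypes in the original files.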

fin swimmer

5.6 years ago

Hello,

It also merges variants that are present in one file but not in the other. In that case the genotype of the samples from the file lacking the variant is set to ./. by default. You can have it set to 0/0 instead with the --missing-to-ref option.
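
The two behaviours can be sketched side by side. The function names and the file names in the usage note are placeholders of mine, and the sketch assumes bgzipped, indexed inputs with distinct sample names:

```shell
# Hypothetical helpers; the first argument is the output file,
# the remaining arguments are the VCFs to merge.

# Default: sites absent from an input get ./. for that input's samples.
merge_missing_as_nocall() {
    out=$1
    shift
    bcftools merge -Oz -o "$out" "$@"
}

# With --missing-to-ref: those genotypes are written as 0/0 instead.
merge_missing_as_ref() {
    out=$1
    shift
    bcftools merge --missing-to-ref -Oz -o "$out" "$@"
}
```

For example, `merge_missing_as_ref merged.vcf.gz in1.vcf.gz in2.vcf.gz`. Keep in mind that 0/0 is an assumption here: the site was simply not reported in that file, which is not the same as the sample having been called homozygous reference.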

fin swimmer
