Interpreting bcftools isec output
11 weeks ago
pabe ▴ 10

Hi, I attempted to find overlapping variants across my four panels using

bcftools isec -p dir -n=4 <file1.vcf.gz> <file2.vcf.gz> <file3.vcf.gz> <file4.vcf.gz>

I got this list of outputs:

0000.vcf 
0001.vcf 
0002.vcf 
0003.vcf 
README.txt 
sites.txt

Where the README.txt says:

dir/0000.vcf  for stripped  <file1.vcf.gz>
dir/0001.vcf  for stripped  <file2.vcf.gz>
dir/0002.vcf  for stripped  <file3.vcf.gz>
dir/0003.vcf  for stripped  <file4.vcf.gz>

My questions are: 1) What does “for stripped” mean? 2) Is sites.txt the list of overlapping variants across the four panels? I’ve searched for documentation that explains the outputs but haven’t been successful. Thanks for your help!

bcftools isec intersection vcf • 337 views
11 weeks ago
Ram 34k

You're asking bcftools to pick sites found in all 4 VCF files (-n=4) and, since no -w option is given, to write one output file per input file. So bcftools "strips" each input VCF down to the loci common to all 4 files and writes the result to new files numbered in the order the inputs were given on the command line (0000.vcf for the first input, 0001.vcf for the second, and so on).

Yes, sites.txt is the list of common sites across the input files.
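
If you want to double-check that on your own output, something like the following may help. It is only a minimal sketch: it assumes sites.txt is tab-separated with CHROM, POS, REF, ALT and a string of 0/1 presence flags (one digit per input file) in the fifth column. The file name sites_example.txt and its three rows are made up for illustration and are not real bcftools output.

```shell
# Hypothetical stand-in for dir/sites.txt (illustrative rows only).
# Assumed columns: CHROM, POS, REF, ALT, presence flags (one digit per input file).
printf '1\t1000\tA\tG\t1111\n1\t2000\tC\tT\t1111\n2\t3500\tG\tA\t1111\n' > sites_example.txt

# Count sites per presence pattern (column 5).
# With -n=4 and four inputs, the only pattern expected is 1111.
summary=$(awk -F'\t' '{ n[$5]++ } END { for (p in n) print p, n[p] }' sites_example.txt)
echo "$summary"
```

To run it on real output, point awk at dir/sites.txt instead of the example file. Any pattern other than 1111 would suggest the layout assumption above does not hold for your version.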

Unfortunately, the documentation is insufficient on this and I speak from educated guesses based on experience.


Got it. Thanks for your help!
