Interpreting bcftools isec output
2.7 years ago
pabe ▴ 30

Hi, I attempted to find overlapping variants across my four panels using

bcftools isec -p dir -n=4 <file1.vcf.gz> <file2.vcf.gz> <file3.vcf.gz> <file4.vcf.gz>

I got this list of outputs:

0000.vcf 
0001.vcf 
0002.vcf 
0003.vcf 
README.txt 
sites.txt

Where the README.txt says:

dir/0000.vcf  for stripped  <file1.vcf.gz>
dir/0001.vcf  for stripped  <file2.vcf.gz>
dir/0002.vcf  for stripped  <file3.vcf.gz>
dir/0003.vcf  for stripped  <file4.vcf.gz>

My questions are: 1) What does “for stripped” mean? 2) Is sites.txt the list of overlapping variants across the four panels? I’ve searched for documentation that explains the outputs but haven’t found any. Thanks for your help!

bcftools isec intersection vcf • 3.7k views
2.7 years ago
Ram 43k

You're asking bcftools to keep only sites present in all 4 VCF files (-n=4) and, because no -w option is given, to write one output file per input file. So bcftools "strips" each input VCF down to the loci common to all 4 files and writes the results to new files numbered in the order the inputs were given on the command line.

Yes, sites.txt is the list of common sites across the input files.

Unfortunately, the documentation is thin on this, so I'm speaking from educated guesses based on experience.
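One way to sanity-check the sites.txt claim is to tally its last column. The sketch below assumes sites.txt is tab-separated with CHROM, POS, REF, ALT and a presence mask holding one character per input file; the sample rows and the file name sites_sample.txt are made up for illustration, so point the `cut` at your own dir/sites.txt instead.

```shell
# Hypothetical stand-in for dir/sites.txt. Assumed columns:
# CHROM, POS, REF, ALT, presence mask (one character per input file).
workdir=$(mktemp -d)
printf 'chr1\t1000\tA\tG\t1111\nchr1\t2500\tC\tT\t1111\nchr2\t300\tG\tGA\t1111\n' \
  > "$workdir/sites_sample.txt"

# Count sites per presence mask. With -n=4 and four inputs,
# every row is expected to carry the mask 1111.
mask_counts=$(cut -f5 "$workdir/sites_sample.txt" | sort | uniq -c | awk '{print $2 ":" $1}')
echo "$mask_counts"
```

If any mask other than 1111 shows up in a real run with -n=4, that would suggest the assumption about the file layout is wrong rather than that the intersection failed.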

Got it. Thanks for your help!

So after trimming, will all 4 output VCF files be the same? If not, what will differ between the stripped VCF files? Please advise.

Can't say off the top of my head, but my best guess is no, the 4 files won't be identical. Their loci (CHROM, POS, REF, ALT) should match (assuming the intersection compared all 4 fields), but the remaining information won't, since each output is derived directly from its own input VCF.
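If you want to check this guess rather than take my word for it, something along these lines should work on uncompressed output files. It is only a sketch: the two tiny VCFs are invented stand-ins for dir/0000.vcf and dir/0001.vcf, and `loci` is a helper name I made up that keeps the CHROM, POS, REF and ALT columns of the data lines.

```shell
workdir=$(mktemp -d)

# Two hypothetical stripped outputs: same loci, different ID/QUAL/INFO.
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\nchr1\t1000\t.\tA\tG\t50\tPASS\tDP=30\nchr1\t2500\trs1\tC\tT\t99\tPASS\tDP=12\n' \
  > "$workdir/0000.vcf"
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\nchr1\t1000\t.\tA\tG\t42\tPASS\tDP=8\nchr1\t2500\t.\tC\tT\t77\tPASS\tDP=21\n' \
  > "$workdir/0001.vcf"

# Helper (illustrative name): drop header lines, keep CHROM, POS, REF, ALT.
loci() { grep -v '^#' "$1" | cut -f1,2,4,5; }

# Compare the locus columns only.
if [ "$(loci "$workdir/0000.vcf")" = "$(loci "$workdir/0001.vcf")" ]; then
  loci_match=yes
else
  loci_match=no
fi

# Compare the files byte for byte.
if cmp -s "$workdir/0000.vcf" "$workdir/0001.vcf"; then
  files_identical=yes
else
  files_identical=no
fi

echo "loci match: $loci_match; files identical: $files_identical"
```

On real isec output I would expect the same pattern (loci match, files differ), but that is exactly the guess being tested, so treat a mismatch as a cue to look at how records were matched.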
