Question: Identifying private SNPs between multi-sample VCF files
Asked 5 months ago by nataliagru1

Dear Community,

I hope all is well. I am having difficulty finding the best way to quantify private SNPs between my multi-sample VCF files. I have 110 samples in a VCF file that I generated via cohort calling with GATK, and I have split that VCF into one file per genus.

So I now have 4 VCF files (populations) that I would like to compare. I would like to know the total number of SNPs that are private to each population relative to the others.

However, when I try a command such as bcftools isec:

bcftools isec Genus1.vcf.gz Genus2.vcf.gz -p /dir/out

It outputs the expected files but is unable to identify shared or private sites between the multi-sample VCFs.

When I used vcf-compare:

 vcf-compare -g Genus1.vcf.gz  Genus2.vcf.gz

it is only able to output the total number of SNPs. It can't discern any differences between the multi-sample VCF files.

Note: When I run these commands on VCFs that each contain only one sample, they execute perfectly and output the appropriate data.

Note: I have compressed my files with bgzip and indexed them with tabix.

Can anyone offer guidance on how to quantify the total number of private SNPs in one multi-sample VCF file compared to another multi-sample VCF file?

Thank you for taking the time to read my post and for your help!
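
To make the goal concrete: a SNP is private to a population if its site is in that population's variant set and in none of the others. Below is a minimal Python sketch of that set arithmetic across four populations. The function name `private_sites`, the `Site` key of (CHROM, POS, REF, ALT), and the toy data are all hypothetical and only illustrate the logic; they are not taken from bcftools or vcftools.

```python
from typing import Dict, Set, Tuple

# A site is keyed by (CHROM, POS, REF, ALT); this key choice is an assumption.
Site = Tuple[str, int, str, str]


def private_sites(sites_by_pop: Dict[str, Set[Site]]) -> Dict[str, Set[Site]]:
    """For each population, return the sites seen in no other population."""
    result: Dict[str, Set[Site]] = {}
    for pop, sites in sites_by_pop.items():
        # Union of the variant sets of every other population.
        others = set().union(
            *(s for name, s in sites_by_pop.items() if name != pop)
        )
        result[pop] = sites - others
    return result


# Hypothetical toy data: the variant sites observed in each genus.
sites_by_pop = {
    "Genus1": {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T")},
    "Genus2": {("chr1", 200, "C", "T"), ("chr1", 300, "G", "A")},
    "Genus3": {("chr1", 200, "C", "T")},
    "Genus4": {("chr1", 400, "T", "C"), ("chr1", 100, "A", "G")},
}

private = private_sites(sites_by_pop)
for pop in sorted(private):
    print(pop, len(private[pop]))
```

The result is only meaningful if each population's set contains just the sites that actually vary within that population, not every record present in its file.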


Check this out:

Hope this helps.

written 5 months ago by prasundutta87

An update: bcftools isec works as it should. It could not identify private SNPs between my multi-sample VCF files (Genus1.vcf vs. Genus2.vcf) because I had originally split them out of a single VCF containing all species (Genusall.vcf), subsetting by genus with bcftools view.

For some reason, bcftools isec cannot identify private or shared SNPs between VCF files that were split with bcftools view. It works fine when the files are merged rather than split from a master VCF file.
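
A plausible explanation, offered as an assumption rather than something confirmed in this thread: subsetting samples out of a cohort VCF keeps every site record, including sites where none of the retained samples carries the alternate allele. Each split file then lists the same sites, so a record-level comparison reports everything as shared. The Python sketch below contrasts a record-level comparison with a genotype-aware one on two toy "split" files. The helpers `make_vcf`, `record_sites`, and `carried_sites` and the sample names are hypothetical, and the parser handles only simple diploid GT fields and treats a multi-allelic ALT as a single key.

```python
from typing import Iterator, List, Set, Tuple

Site = Tuple[str, int, str, str]  # (CHROM, POS, REF, ALT)
Row = Tuple[str, int, str, str, List[str]]

HEADER = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT"


def make_vcf(samples: List[str], rows: List[Row]) -> str:
    """Build a tiny tab-delimited VCF text, for illustration only."""
    lines = ["##fileformat=VCFv4.2", HEADER + "\t" + "\t".join(samples)]
    for chrom, pos, ref, alt, gts in rows:
        fixed = [chrom, str(pos), ".", ref, alt, "50", "PASS", ".", "GT"]
        lines.append("\t".join(fixed + gts))
    return "\n".join(lines)


def _records(vcf_text: str) -> Iterator[List[str]]:
    """Yield the tab-split fields of every non-header line."""
    for line in vcf_text.splitlines():
        if line and not line.startswith("#"):
            yield line.split("\t")


def record_sites(vcf_text: str) -> Set[Site]:
    """Every site with a record in the file, ignoring genotypes."""
    return {(f[0], int(f[1]), f[3], f[4]) for f in _records(vcf_text)}


def carried_sites(vcf_text: str) -> Set[Site]:
    """Sites where at least one sample carries a non-reference allele."""
    sites: Set[Site] = set()
    for f in _records(vcf_text):
        keys = f[8].split(":")
        if "GT" not in keys:
            continue  # no genotypes to inspect for this record
        gt_index = keys.index("GT")
        for sample in f[9:]:
            gt = sample.split(":")[gt_index]
            alleles = gt.replace("|", "/").split("/")
            # "0" is the reference allele and "." is a missing call.
            if any(a not in ("0", ".") for a in alleles):
                sites.add((f[0], int(f[1]), f[3], f[4]))
                break
    return sites


# Two files "split" from one cohort VCF: identical records, different genotypes.
genus1 = make_vcf(["S1", "S2"], [
    ("chr1", 100, "A", "G", ["0/1", "1/1"]),
    ("chr1", 200, "C", "T", ["0/0", "0/0"]),
])
genus2 = make_vcf(["S3", "S4"], [
    ("chr1", 100, "A", "G", ["0/0", "./."]),
    ("chr1", 200, "C", "T", ["0/1", "0/0"]),
])

by_record = record_sites(genus1) - record_sites(genus2)
by_genotype = carried_sites(genus1) - carried_sites(genus2)
print("private to Genus1, record-level:", len(by_record))
print("private to Genus1, genotype-aware:", len(by_genotype))
```

With bcftools itself, a comparable effect is usually obtained by dropping sites that have no alternate allele left in the subset, for example with `bcftools view --min-ac 1` run as a separate step after the sample subsetting, before passing the files to `bcftools isec`. Treat that as a suggestion to verify against your own bcftools version and data.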

modified 27 days ago • written 27 days ago by nataliagru1