Question: Filtering a VCF by GT to keep only variants that differ between samples
4 months ago, kadamek49 wrote:


I have a VCF file with 3 samples and would like to filter out variants whose genotype (GT) is the same across all 3 samples (e.g. 0/1 0/1 0/1). I am looking for differences between the samples, so I want to keep only variants where at least one sample has a different genotype from the others (e.g. 0/0 0/0 0/1). Most of my genotypes are heterozygous. Does anyone have suggestions on how to do this?

Thank you!

snp vcftools heterozygous vcf

Definitely not an ideal solution: it only applies when the VCF has exactly 3 samples (taken as the last 3 columns) and the genotypes are unphased.

perl -lane '{if($_ =~ /^#/){print }else{my %geno=map{[split /:/,$_]->[0]=>0 } @F[-3 .. -1];if(scalar keys %geno != 1){print } } }' test.vcf

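It prints header lines (starting with #) unchanged. For every other line it splits on whitespace, takes the last 3 columns, and stores each sample's genotype (the first colon-separated field) as a key in a hash. If the hash ends up with exactly one key, all three genotypes are the same and the line is dropped; otherwise the line is printed.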
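For anyone who needs the same idea with more than 3 samples or with phased calls, here is a rough Python sketch of the approach described above. It is not the one-liner's exact behaviour: it assumes a standard tab-separated VCF where the sample columns start at column 10 and GT is the first FORMAT field, and it sorts the alleles so that 0/1, 1/0 and 0|1 count as the same genotype. Missing calls such as ./. are simply treated as their own genotype. The function names and the example records are made up for illustration.

```python
import re


def genotype_key(sample_field):
    """Return the GT of one sample column as a sorted tuple of allele strings."""
    gt = sample_field.split(":", 1)[0]       # GT is assumed to be the first subfield
    return tuple(sorted(re.split(r"[/|]", gt)))  # ignore phasing and allele order


def differs_between_samples(line):
    """True if the sample columns of a VCF data line do not all share one genotype."""
    fields = line.rstrip("\n").split("\t")
    samples = fields[9:]                     # columns 1-8 are fixed, column 9 is FORMAT
    return len({genotype_key(s) for s in samples}) > 1


def filter_vcf(lines):
    """Yield header lines unchanged plus the data lines whose genotypes differ."""
    for line in lines:
        if line.startswith("#") or differs_between_samples(line):
            yield line


# Small in-memory example (made-up records, not from the original post).
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:10\t0/1:12\t0/1:9\n",
    "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT:DP\t0/0:10\t0/0:12\t0/1:9\n",
]
kept = list(filter_vcf(example))

# To filter a real file, something like this should work:
#   with open("test.vcf") as fh, open("filtered.vcf", "w") as out:
#       out.writelines(filter_vcf(fh))
```

Collecting the genotypes in a set plays the same role as the hash keys in the Perl version: one distinct value means every sample agrees, so the line is dropped.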

written 4 months ago by microfuge

Thank you! This worked and accomplished what I was asking. Really appreciate it!

written 4 months ago by kadamek49