Question: Filtering a VCF by GT to keep only variants that differ between samples
4 months ago, kadamek49 wrote:

Hello,

I have a VCF file with 3 samples and would like to filter out variants whose genotype (GT) is the same across all 3 samples (e.g. 0/1 0/1 0/1). I want to keep only variants where at least one sample has a different genotype from the others (e.g. 0/0 0/0 0/1). Most of my genotypes are heterozygous. Does anyone have suggestions on how to do this?

Thank you!

snp vcftools heterozygous vcf • 184 views

This is definitely not an ideal solution: it only applies when the VCF has exactly 3 samples (in the last 3 columns) and the genotypes are unphased.

perl -lane '{if($_ =~ /^#/){print }else{my %geno=map{[split /:/,$_]->[0]=>0 } @F[-3 .. -1];if(scalar keys %geno != 1){print } } }' test.vcf

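Header lines (starting with #) are printed unchanged. For every other line, it splits on whitespace, takes the last 3 columns, and stores the genotype (the first colon-separated subfield of each) as a hash key. If the hash ends up with exactly one key, all three genotypes are the same and the line is dropped; otherwise it is printed.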
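
For anyone with more than 3 samples or with phased calls, the same idea can be sketched in Python rather than Perl. This is only a rough sketch under a few assumptions: the file is tab-delimited with sample columns starting at column 10, GT is the first subfield of each sample column, and 0|1, 1|0 and 0/1 should all count as the same genotype. The function names are made up for illustration. A missing call such as ./. is treated as a genotype of its own, so 0/0 0/0 ./. would be kept.

```python
import re


def gt_key(sample_field):
    """Return the GT subfield as a key that ignores phasing and allele order."""
    gt = sample_field.split(":", 1)[0]
    # Split on "/" (unphased) or "|" (phased) and sort the alleles,
    # so that 0|1, 1|0 and 0/1 all produce the same key.
    return tuple(sorted(re.split(r"[/|]", gt)))


def keep_line(line):
    """True for header lines and for records whose samples do not all share one genotype."""
    if line.startswith("#"):
        return True
    fields = line.rstrip("\n").split("\t")
    samples = fields[9:]  # 8 fixed columns, then FORMAT, then one column per sample
    return len({gt_key(s) for s in samples}) > 1


def filter_vcf(lines):
    """Yield only the lines that pass keep_line, in their original order."""
    return (line for line in lines if keep_line(line))
```

To run it on a file, something like `sys.stdout.writelines(filter_vcf(open("test.vcf")))` (after `import sys`) should write the header plus the records that differ between samples.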

written 4 months ago by microfuge

Thank you! This worked and accomplished what I was asking. Really appreciate it!

written 4 months ago by kadamek49