VCF filtering: keep only variants whose GT differs between samples
3.9 years ago
kadamek49 • 0

Hello,

I have a VCF file with 3 samples and would like to filter out variants whose genotype (GT) is the same across all 3 samples (e.g. 0/1 0/1 0/1). I only want to keep variants where the genotype differs in at least one sample (e.g. 0/0 0/0 0/1). Most of my genotypes are heterozygous. Does anyone have suggestions on how to do this?

Thank you!

SNP vcftools vcf heterozygous • 1.1k views

This is definitely not an ideal solution: it only applies when the VCF has exactly 3 samples and the genotypes are unphased.

perl -lane '{if($_ =~ /^#/){print }else{my %geno=map{[split /:/,$_]->[0]=>0 } @F[-3 .. -1];if(scalar keys %geno != 1){print } } }' test.vcf

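It splits each line into fields, takes the last 3 columns (the samples), and stores each genotype as a key in a hash. If the hash ends up with exactly one key, all three genotypes are the same and the line is skipped; otherwise the line is printed. Header lines (starting with #) are always printed.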
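For anyone who prefers something easier to adapt, here is a rough Python sketch of the same idea. It is not a translation of the one-liner: the function name `differing_records` and the `n_samples` parameter are made up for illustration, and it assumes a tab-delimited VCF in which GT is the first colon-separated subfield of each sample column. As with the Perl version, genotypes are compared as plain strings, so 0/1 and 0|1 (or 0/1 and 1/0) would count as different.

```python
import sys


def differing_records(lines, n_samples=3):
    """Yield header lines, plus records whose sample GTs are not all identical.

    Hypothetical helper: assumes the last `n_samples` columns are the samples
    and that GT is the first colon-separated subfield in each of them.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            # Keep all header lines unchanged.
            yield line
            continue
        fields = line.split("\t")
        # Collect the distinct GT strings across the sample columns.
        genotypes = {sample.split(":")[0] for sample in fields[-n_samples:]}
        # More than one distinct GT means at least one sample differs.
        if len(genotypes) != 1:
            yield line


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for record in differing_records(handle):
            print(record)
```

Saved as a script (the file name is up to you), it would be run with the VCF path as the only argument and the output redirected to a new file. Changing `n_samples` should let it handle a different number of samples, provided the sample columns are still the last ones on each line.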


Thank you! This worked and accomplished what I was asking. Really appreciate it!
