Question: How to check and extract if a SNP is heterozygous or homozygous in a vcf file from the SNP ID
2.8 years ago by
akang90 wrote:

I have a VCF file and a list of SNP IDs. I checked whether a given SNP is heterozygous or homozygous in my samples with:

vcftools --vcf my_vcf_file.vcf --snp snp1 --extract-FORMAT-info GT --stdout | grep "0/1"

Now I want to extract the IDs of the samples whose genotype is either 0/1 or 1/1, and I also want to run an odds ratio test. Is there a way to do that in vcftools?

snp vcf • 2.1k views

What format do you need the data in?

If you do not want to write a simple pysam script, you could do something like:

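First, filter the VCF so that you keep only SNP sites where at least one sample carries a non-reference allele. Note that this keeps both 0/1 and 1/1 calls, not only hets.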

vcf-subset --exclude-ref --type SNPs in.vcf.gz > out.vcf

Then you can run vcftools on the filtered file, with no need to grep.
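If you do go the script route, here is a minimal sketch of the idea using only the Python standard library rather than pysam. It assumes an uncompressed, tab-delimited VCF with diploid genotypes; the function names, the sample names S1 to S4, and the example records are all made up for illustration.

```python
# Sketch: list the samples that are het or hom-alt at one SNP ID.
# EXAMPLE_VCF is illustration data, not real genotypes.

def classify_gt(gt):
    """Return 'het', 'hom_alt', 'hom_ref' or 'missing' for a diploid GT string."""
    alleles = gt.replace("|", "/").split("/")  # treat phased calls like unphased
    if len(alleles) != 2 or "." in alleles:
        return "missing"
    if alleles[0] != alleles[1]:
        return "het"
    return "hom_ref" if alleles[0] == "0" else "hom_alt"


def samples_by_genotype(vcf_lines, snp_id):
    """Map genotype class -> sample IDs for the record whose ID column is snp_id.

    Returns None if the SNP ID is not found.
    """
    samples = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample IDs start at column 10
            continue
        if fields[2] != snp_id:
            continue
        gt_index = fields[8].split(":").index("GT")  # position of GT in FORMAT
        result = {"het": [], "hom_alt": [], "hom_ref": [], "missing": []}
        for sample, call in zip(samples, fields[9:]):
            result[classify_gt(call.split(":")[gt_index])].append(sample)
        return result
    return None


EXAMPLE_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\tS4",
    "1\t100\tsnp1\tA\tG\t.\tPASS\t.\tGT:DP\t0/1:12\t1/1:9\t0/0:15\t./.:0",
    "1\t200\tsnp2\tC\tT\t.\tPASS\t.\tGT\t0/0\t0/1\t0/0\t1/1",
])

calls = samples_by_genotype(EXAMPLE_VCF.splitlines(), "snp1")
print("het:", calls["het"])
print("hom_alt:", calls["hom_alt"])
```

For a real file you would pass an open file handle instead of the example lines, e.g. `samples_by_genotype(open("my_vcf_file.vcf"), "snp1")`, and loop over your list of SNP IDs.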

modified 2.8 years ago • written 2.8 years ago by geek_y • 9.8k

To grep multiple patterns you can use the following command:

egrep "0/1|1/1"
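grep only filters the genotype lines; it does not cover the odds ratio part of the question. A rough sketch of that step, assuming you already have the set of carrier samples (0/1 or 1/1) and a case/control label per sample, could look like the following. The sample names, the case/control split, and the function names are hypothetical; the interval is a simple Wald (Woolf) interval on the log odds ratio, with a 0.5 continuity correction when any cell is zero.

```python
import math


def carrier_table(carriers, cases, controls):
    """Build the 2x2 counts (a, b, c, d) from sets of sample IDs.

    a: carrier cases, b: non-carrier cases,
    c: carrier controls, d: non-carrier controls.
    """
    a = len(cases & carriers)
    b = len(cases - carriers)
    c = len(controls & carriers)
    d = len(controls - carriers)
    return a, b, c, d


def odds_ratio(a, b, c, d, z=1.96):
    """Return (odds ratio, lower, upper) with a Wald interval (z=1.96 is ~95%)."""
    if 0 in (a, b, c, d):
        # Haldane-Anscombe correction: add 0.5 to every cell to avoid dividing by zero
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


# Hypothetical input: carriers come from the genotype step, labels from your phenotype file.
carriers = {"S1", "S2", "S5"}
cases = {"S1", "S2", "S3"}
controls = {"S4", "S5", "S6", "S7"}

table = carrier_table(carriers, cases, controls)
estimate, lower, upper = odds_ratio(*table)
print("2x2 table (a, b, c, d):", table)
print("odds ratio: %.2f (%.2f to %.2f)" % (estimate, lower, upper))
```

With counts this small the Wald interval is very wide and not reliable; for a p-value on sparse tables, something like Fisher's exact test (`scipy.stats.fisher_exact`, if SciPy is available) is the more usual choice.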
written 2.8 years ago by iraun • 3.6k