Is there a quick way to extract genotype information for a single sample from a VCF file? For a given VCF, I would essentially like to create the following table for a given sample: POS CHROM GT
Did you try a couple of 'cut's? For example: cut -f 1,2,10 | cut -d ':' -f 1 (column 10 is the first sample; change it to the column of the sample you want).
I have a 10-line Python script. Do you know how to run a Python script? I can send it to you. If you want to extract all positions, regardless of whether the given sample is polymorphic (compared to the reference) at that position, you can simply use the Unix "cut" command.
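The script mentioned above was not posted in the thread. As a rough idea of what such a script could look like, here is a minimal sketch using only the Python standard library. The function name extract_genotypes is hypothetical, and the sketch assumes an uncompressed, tab-delimited VCF with a #CHROM header line and the FORMAT field in column 9.

```python
def extract_genotypes(lines, sample):
    """Yield (chrom, pos, gt) tuples for one sample from an iterable of VCF lines.

    Hypothetical helper, not the script from the thread. Assumes a standard
    tab-delimited VCF: 8 fixed columns, FORMAT in column 9, samples from column 10.
    """
    sample_col = None
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Sample names start at index 9; raises ValueError if the sample is absent.
            sample_col = fields.index(sample, 9)
            continue
        if sample_col is None:
            raise ValueError("no #CHROM header line before the first record")
        keys = fields[8].split(":")               # e.g. ["GT", "DP"]
        values = fields[sample_col].split(":")    # e.g. ["0/1", "12"]
        gt = values[keys.index("GT")] if "GT" in keys else "."
        yield fields[0], fields[1], gt


# Example use (file and sample names are placeholders):
#   with open("in.vcf") as handle:
#       for chrom, pos, gt in extract_genotypes(handle, "SAMPLE1"):
#           print(pos, chrom, gt, sep="\t")
```

Unlike the plain "cut" approach, this looks the sample up by name and finds GT through the FORMAT column, so it does not depend on knowing the sample's column number in advance.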
I would like to have a look at that script and will most probably use it. Could you share it?
You can do this in vcftools:
vcftools --vcf <your_vcf> --indv <your_sample> --extract-FORMAT-info GT --out <prefix>
The results will be in a file called "<prefix>.GT.FORMAT".
I am having trouble reading the exported file (extension GT.FORMAT) in my Python script, even after copying its contents into a text file (on OS X). Is some conversion required to read the file properly?
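The thread does not say what actually went wrong here, so the following is only a guess at a workable approach. To my knowledge the .GT.FORMAT file is plain tab-delimited text with a header row (CHROM, POS, then the sample name), so no conversion should be needed. A minimal sketch with the standard csv module, which also tolerates different line endings, might look like this. The function name read_gt_format and the file name in the example are placeholders.

```python
import csv


def read_gt_format(path):
    """Read a tab-delimited table with a header row into a list of dicts.

    Assumes the layout described above: CHROM, POS, then one column per sample.
    """
    # newline="" lets the csv module handle "\n", "\r\n" and "\r" line endings itself.
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return [dict(row) for row in reader]


# Example use (placeholder file name):
#   for row in read_gt_format("prefix.GT.FORMAT"):
#       print(row["CHROM"], row["POS"], row)
```

If rows come back with a single key containing the whole line, the file is probably not tab-delimited any more, which can happen when the contents are pasted through an editor that converts tabs to spaces.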
Use GATK VariantsToTable.
There's a vcftools script that does exactly what you were asking for: vcf-to-tab. The syntax is very simple:
vcf-to-tab <in.vcf >out.tab
Check vcflib for various VCF utility programs, including vcfgenotypes.
You can clone the Git repository and install it.