Hi All,
I am looking for a way (possibly using bcftools) to extract the samples that carry a given variant, but I haven't found one yet.
For instance:
CHROM POS REF ALT SAMPLE.A SAMPLE.B SAMPLE.C
chr1 10 A TAA 0/0 0/1 1/1
chr1 10 C GGA 0/0 0/0 0/1
I want the samples that are het or hom-alt for the variant chr1-10-A-TAA, so the ideal output would be something like:
chr1 10 A TAA SAMPLE.B SAMPLE.C
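To make the selection I have in mind explicit, here is a minimal Python sketch run on the toy table above. The function name carriers and the TABLE string are made up for illustration, it assumes biallelic records (so a carrier is any sample with allele 1 in its genotype), and it is only meant to show the logic, not to be fast enough for real files.

```python
# Toy illustration only: plain-Python scan of a whitespace-separated genotype table.
TABLE = """\
CHROM POS REF ALT SAMPLE.A SAMPLE.B SAMPLE.C
chr1 10 A TAA 0/0 0/1 1/1
chr1 10 C GGA 0/0 0/0 0/1
"""

def carriers(table, chrom, pos, ref, alt):
    """Return the names of samples with at least one ALT allele at the given variant."""
    lines = table.strip().splitlines()
    samples = lines[0].split()[4:]          # sample names follow CHROM POS REF ALT
    for line in lines[1:]:
        fields = line.split()
        if fields[:4] != [chrom, str(pos), ref, alt]:
            continue                        # not the requested variant
        hits = []
        for name, gt in zip(samples, fields[4:]):
            alleles = gt.replace("|", "/").split("/")  # accept phased or unphased GT
            if "1" in alleles:              # assumption: biallelic, ALT is allele 1
                hits.append(name)
        return hits
    return []                               # variant not present in the table

print("chr1 10 A TAA", *carriers(TABLE, "chr1", 10, "A", "TAA"))
# prints: chr1 10 A TAA SAMPLE.B SAMPLE.C
```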
The files I am working with are huge, so I need the most computationally efficient approach.
So far I have come up with this:
bcftools query -r chr1:1-15 -i 'GT=="AA" | GT=="AR"' -f '%CHROM %POS %REF %ALT [\t%SAMPLE=%GT]\n' file_name.bcf
Any suggestions that could improve the speed?
Thanks to whoever spends some time helping me.