Subset a VCF to keep only samples heterozygous or homozygous for the alternate allele of a given variant
2.2 years ago
curious ▴ 750

Position 45094160 on chromosome 15 in my VCF of 100 samples corresponds to a missense variant.

I want to create a new VCF that contains only those of the 100 samples that are heterozygous or homozygous for the alternate allele at position 45094160. I realize I could do this with Python or grep with some work, but is there a handy way to do it with bcftools, or some other high-level tool, using the position or ID?

Thanks.

bcftools • 724 views
2.2 years ago
cfos4698 ★ 1.1k

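If you want to use bcftools, you first need a list of the samples that are heterozygous or homozygous for the alternate allele at that position. Then pass that list to bcftools view, either inline as a comma-separated list with -s, or as a file with one sample name per line with -S: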

bcftools view -s sample1,sample2 file.vcf > filtered.vcf
bcftools view -S sample_file.txt file.vcf > filtered.vcf

(https://bioinformatics.stackexchange.com/questions/3477/how-to-subset-samples-from-a-vcf-file)
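
For the initial step of building the sample list, here is a minimal sketch using awk on an uncompressed VCF. The function name alt_carriers is hypothetical, and the sketch assumes the chromosome is named "15" in the file (not "chr15") and that GT is the first FORMAT subfield, as the VCF specification requires when GT is present. Any sample with at least one allele other than 0 or . at the site counts as a carrier, so both 0/1 and 1/1 are kept.

```shell
# Print the names of samples carrying at least one ALT allele at CHROM:POS.
# Hypothetical helper; expects an uncompressed VCF.
# Usage: alt_carriers file.vcf 15 45094160
alt_carriers() {
  awk -F'\t' -v chrom="$2" -v pos="$3" '
    /^##/     { next }                                   # skip meta-information lines
    /^#CHROM/ { for (i = 10; i <= NF; i++) name[i] = $i  # sample names start at column 10
                next }
    $1 == chrom && $2 == pos {
      for (i = 10; i <= NF; i++) {
        split($i, field, ":")                  # GT is the first FORMAT subfield
        n = split(field[1], allele, "[/|]")    # handles unphased (/) and phased (|) calls
        for (j = 1; j <= n; j++)
          if (allele[j] != "0" && allele[j] != ".") {
            if (!seen[i]++) print name[i]      # print each carrier only once
            break
          }
      }
    }' "$1"
}

# Example (file names as in the answer above):
#   alt_carriers file.vcf 15 45094160 > sample_file.txt
#   bcftools view -S sample_file.txt file.vcf > filtered.vcf
```

The resulting sample_file.txt has one sample name per line, which is the format -S expects. For a compressed file, the input would need to be decompressed first.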
