Removing individuals from a VCF file and plotting allele frequency differences
9 weeks ago
Ollie • 0

Hello,

I have a VCF file containing variants for a number of individuals. I need to remove some of the individuals so that only those from specific populations remain, and then create allele frequency difference dot plots for those populations. I'm very new to this type of work and don't really know where to start.

I've started by creating a conda environment and installing GATK into it, but I can only find instructions for removing variants, not for removing samples (individuals), from the VCF. If anybody could point me in the right direction, it would be greatly appreciated. I can provide more details if needed.


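Check out bcftools view and its -s (--samples) option, which selects the samples to keep; prefix the list with ^ to exclude those samples instead.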
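As a rough sketch of what that could look like: -s takes a comma-separated list of sample names, -S reads the names from a file with one name per line, and a leading ^ turns either into an exclusion list. Everything named here is a placeholder rather than a file from this thread (input.vcf, sampleA, sampleB, remove.txt, the output names), and list_samples is just a helper made up for illustration.

```shell
#!/bin/sh
# Placeholder names throughout: input.vcf, sampleA, sampleB, remove.txt.
vcf="input.vcf"

# Hypothetical helper: print the sample names of an uncompressed VCF,
# i.e. columns 10 onward of the #CHROM header line, one per line.
# (bcftools query -l input.vcf reports the same list.)
list_samples() {
    grep -m1 '^#CHROM' "$1" | cut -f10- | tr '\t' '\n'
}

# Only run when bcftools and the placeholder inputs actually exist.
if command -v bcftools >/dev/null 2>&1 && [ -f "$vcf" ] && [ -f remove.txt ]; then
    # Check the exact sample names before subsetting.
    list_samples "$vcf"

    # Keep only the named samples.
    bcftools view -s sampleA,sampleB -o kept.vcf "$vcf"

    # Drop every sample listed in remove.txt (one name per line).
    bcftools view -S ^remove.txt -o dropped.vcf "$vcf"
fi
```

Listing the sample names first is worth the extra step, because the names passed to -s or -S have to match the header exactly.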


Thanks, I'm trying this method out now, but I'm getting an error. I've posted the error in a reply to another comment.


Thanks for this. I've just tried it out, but I'm getting the following error (I'll include the command I used as well):

bcftools view -S ^indiv.txt test_1.vcf > filtered.vcf
[W::vcf_parse] Contig '237' is not defined in the header. (Quick workaround: index the file with tabix.)
Undefined tags in the header, cannot proceed in the sample subset mode.


Quick workaround: index the file with tabix.

Have you tried indexing the file with tabix or adding the contig to the header?
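For reference, both suggestions could look roughly like the sketch below. test_1.vcf and indiv.txt are the names from the post above; the output file names are assumptions. add_contigs is a hypothetical helper written for this reply, not part of bcftools: it adds a bare ##contig=<ID=...> line for every contig name that appears in the records but is not declared in the header. The tabix route compresses the file with bgzip first, since tabix only indexes bgzip-compressed files. I have not verified which of the two resolves the error for this particular file.

```shell
#!/bin/sh
# Hypothetical helper (not a bcftools command): print the VCF to stdout
# with a minimal ##contig line added for each contig that is used in the
# records but not declared in the header. Reads the file twice.
add_contigs() {
    awk -F'\t' '
        NR == FNR {                              # pass 1: collect names
            if ($0 ~ /^##contig=<ID=/) {
                id = $0
                sub(/^##contig=<ID=/, "", id)
                sub(/[,>].*/, "", id)
                declared[id] = 1
            } else if ($0 !~ /^#/ && $1 != "" && !($1 in seen)) {
                seen[$1] = 1
                order[++n] = $1
            }
            next
        }
        /^#CHROM/ {                              # pass 2: emit before #CHROM
            for (i = 1; i <= n; i++)
                if (!(order[i] in declared))
                    print "##contig=<ID=" order[i] ">"
        }
        { print }
    ' "$1" "$1"
}

# Only run when bcftools and the files from the post actually exist.
if command -v bcftools >/dev/null 2>&1 && [ -f test_1.vcf ] && [ -f indiv.txt ]; then
    # Option 1: declare the missing contigs, then subset.
    add_contigs test_1.vcf > test_1.contigs.vcf
    bcftools view -S ^indiv.txt -o filtered.vcf test_1.contigs.vcf

    # Option 2: compress and index, then subset the indexed file.
    if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
        bgzip -c test_1.vcf > test_1.vcf.gz
        tabix -p vcf test_1.vcf.gz
        bcftools view -S ^indiv.txt -o filtered_indexed.vcf test_1.vcf.gz
    fi
fi
```

If a reference FASTA index (.fai) for the assembly is available, taking the contig names and lengths from it would be more complete than the bare IDs this helper writes.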
