9 weeks ago
Ollie • 0
I have a VCF file containing multiple variants across several individuals. I need to remove some of the individuals from the VCF so that I can focus only on those from specific populations, and then create allele frequency difference dot plots for them. I'm very new to this type of work and don't really know where to start.
I've started by creating a conda environment and installing GATK into it, but I can only find how to remove variants, not sampled individuals, from the VCF. If anybody could point me in the right direction, it would be greatly appreciated. I can provide more details if needed.
Thanks, I'm trying this method out now, but I'm getting an error. I've posted the error in a reply to another comment.
Removal of multiple samples from vcf file
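To illustrate what removing samples from a VCF involves, here is a minimal Python sketch. It assumes an uncompressed, tab-delimited VCF read as text lines; the function name `keep_samples` and the sample names in the usage note are hypothetical, not from the thread. It only drops sample columns and does not recompute INFO fields such as AC, AN, or AF, so those may be stale afterwards. In practice a dedicated tool (for example `bcftools view -s`) is probably the better choice.

```python
def keep_samples(vcf_lines, keep):
    """Yield VCF lines restricted to the sample names in `keep`.

    Assumes plain-text VCF lines; INFO fields are NOT recomputed.
    """
    keep = set(keep)
    cols = None  # column indices to retain, decided at the #CHROM line
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith("##"):
            yield line  # meta-information lines pass through unchanged
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # The first 9 columns (CHROM..FORMAT) are fixed; samples follow.
            missing = keep - set(fields[9:])
            if missing:
                raise ValueError(f"samples not in VCF: {sorted(missing)}")
            cols = list(range(9)) + [
                i for i in range(9, len(fields)) if fields[i] in keep
            ]
        if cols is None:
            raise ValueError("data line found before the #CHROM header")
        yield "\t".join(fields[i] for i in cols)
```

Usage might look like `print("\n".join(keep_samples(open("input.vcf"), ["S1", "S3"])))`, where `input.vcf`, `S1`, and `S3` are placeholders for your own file and sample names.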
Thanks for this. I've just tried it out but I'm getting the following error (I'll include the code I used as well):
Have you tried indexing the file with tabix or adding the contig to the header?
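Since the actual error isn't shown here, this is only a guess at how to check the second suggestion. The sketch below (plain Python, uncompressed VCF assumed; `missing_contigs` is a hypothetical helper name) lists contigs that appear in the records but have no `##contig` line in the header. For the first suggestion, indexing is usually done with `tabix -p vcf` on a bgzip-compressed file.

```python
import re

def missing_contigs(vcf_lines):
    """Return contigs used in records but not declared via ##contig lines."""
    declared = set()
    used = []  # keeps first-seen order of contig names
    for line in vcf_lines:
        if line.startswith("##contig="):
            # Header lines look like ##contig=<ID=chr1,length=...>
            match = re.search(r"ID=([^,>]+)", line)
            if match:
                declared.add(match.group(1))
        elif not line.startswith("#") and line.strip():
            chrom = line.split("\t", 1)[0]
            if chrom not in used:
                used.append(chrom)
    return [c for c in used if c not in declared]
```

For each name returned, a header line of the form `##contig=<ID=name>` could then be added before the `#CHROM` line.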