Is there an easy way to remove duplicates from a VCF file? I want to remove every copy of each duplicated SNP, not keep one copy and drop the rest.
I already tried bcftools, but it didn't work. Now I am trying to delete them using --exclude intervals from GATK, but I would like to find another solution if possible. Is there a quick way to just delete specific lines/SNPs from a VCF file?
Thanks for the quick answer, but I have already seen those, and they are not what I need. I want to delete not only the duplicate but also the "nonduplicate" (the first copy).
So, if there are two records, SNP1 4356789 and SNP1 4356789, I want to get rid of both: not keep one of them, but delete both.
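For anyone with the same requirement, here is a minimal sketch in Python of one way to do it. It assumes an uncompressed, tab-delimited VCF and treats two records as duplicates when they share the same CHROM and POS. The function name `drop_all_duplicated_sites` and the file names in the usage comment are made up for illustration, and this is not necessarily what bcftools or GATK do internally.

```python
import sys
from collections import Counter


def site_key(line):
    """Return (CHROM, POS) for a VCF record line (assumes tab-delimited columns)."""
    return tuple(line.split("\t", 2)[:2])


def is_record(line):
    """True for data lines, False for header lines ('#...') and blank lines."""
    return bool(line.strip()) and not line.startswith("#")


def drop_all_duplicated_sites(lines):
    """Keep headers and only those records whose (CHROM, POS) occurs exactly once."""
    lines = list(lines)
    # First pass: count how many records share each site.
    counts = Counter(site_key(line) for line in lines if is_record(line))
    # Second pass: drop every record at a site seen more than once.
    return [
        line for line in lines
        if not is_record(line) or counts[site_key(line)] == 1
    ]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python drop_dups.py input.vcf output.vcf
    with open(sys.argv[1]) as src:
        kept = drop_all_duplicated_sites(src)
    with open(sys.argv[2], "w") as dst:
        dst.writelines(kept)
```

The file is read twice over in memory (count, then filter) because a record cannot be known to be unique until the whole file has been seen. If duplicates should instead be defined by CHROM, POS, REF and ALT together, `site_key` would need to include those columns as well.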
Thank you, this worked!