I am trying to use PLINK2's --extract flag to extract a specific set of variants from the UK Biobank dataset in *.bgen format. Unfortunately, the variant IDs in the .bgen file are in rs_id format, which is not unique.
For example, two different SNPs at the same locus are given exactly the same rs_id, but I only want to extract the information for one of them.
Is there a way to ask PLINK2 to extract variants based not only on the variant ID but also on the Allele1 and Allele2 information?
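
To make the intent concrete, here is a minimal Python sketch of the filtering I have in mind. It is not PLINK2 code; the variant records, the rs_id values, and the function name `extract_by_id_and_alleles` are all made up for illustration.

```python
# Hypothetical variant records: (variant ID, allele1, allele2).
variants = [
    ("rs0001", "A", "G"),
    ("rs0001", "A", "T"),  # same rs_id as above, different second allele
    ("rs0002", "C", "T"),
]

# Hypothetical target list, keyed by ID AND both alleles.
wanted = {("rs0001", "A", "T")}


def extract_by_id_and_alleles(records, targets):
    """Keep only records whose (ID, allele1, allele2) triple is in targets."""
    return [rec for rec in records if rec in targets]


# Matching on the ID alone is ambiguous: it returns both rs0001 records.
by_id_only = [rec for rec in variants if rec[0] == "rs0001"]

# Matching on the full triple picks out the single variant of interest.
selected = extract_by_id_and_alleles(variants, wanted)
print(selected)
```

Matching on the ID alone returns both records that share the rs_id, while matching on the full triple returns only the one I want; that second behaviour is what I am hoping to get from PLINK2.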