I have a vcf file with phased data ("dataset1"), which I want to analyse together with some other genotype data ("dataset2").
For some loci in dataset1, the ref/alt alleles are opposite to those in dataset2, i.e. for a given SNP I get A/G and G/A respectively.
I have two questions:
Is there any way I can either
(i) quickly check the ref/alt consistency across all my loci in the two datasets and ideally remove all inconsistent positions? Would the --diff-site-discordance flag from vcftools do something like that?
(ii) swap the ref/alt information for the SNPs of my choice directly in the VCF files? I want to avoid converting to PLINK format because I don't want to lose the phase information.
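To make the two operations concrete, here is a minimal Python sketch of what I mean, assuming biallelic SNPs and plain tab-separated VCF data lines. The function names (classify, flip_gt, swap_record) are hypothetical and not part of vcftools or any other tool. Note that swapping REF and ALT also means flipping the 0/1 codes in every GT field, and that this sketch does not touch INFO values such as AF or AC.

```python
def classify(ref1, alt1, ref2, alt2):
    """Compare the REF/ALT pair of one site in dataset1 and dataset2."""
    if (ref1, alt1) == (ref2, alt2):
        return "same"
    if (ref1, alt1) == (alt2, ref2):
        return "swapped"
    return "other"  # e.g. strand flip or a different allele altogether


_FLIP = {"0": "1", "1": "0"}


def flip_gt(gt):
    """Flip allele codes 0 <-> 1, keeping '|', '/' and '.' unchanged."""
    return "".join(_FLIP.get(ch, ch) for ch in gt)


def swap_record(line):
    """Swap REF and ALT on one VCF data line and recode the genotypes.

    Assumes a biallelic site; phase separators are preserved.
    INFO fields (AF, AC, ...) are NOT updated by this sketch.
    """
    fields = line.rstrip("\n").split("\t")
    ref, alt = fields[3], fields[4]
    if "," in alt:
        raise ValueError("only biallelic sites are handled")
    fields[3], fields[4] = alt, ref
    if len(fields) > 9:
        fmt = fields[8].split(":")
        if "GT" in fmt:
            gt_idx = fmt.index("GT")
            for i in range(9, len(fields)):
                parts = fields[i].split(":")
                parts[gt_idx] = flip_gt(parts[gt_idx])
                fields[i] = ":".join(parts)
    return "\t".join(fields)
```

For example, a line with A/G and phased genotypes 0|1 and 1|1 would come out as G/A with genotypes 1|0 and 0|0, so the haplotypes themselves are unchanged.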
Any ideas will be very much appreciated.