I followed the SNPRelate package tutorial to prune a VCF by LD and then run a PCA. My problem is that when the VCF is converted to GDS, the SNP and chromosome IDs are replaced by a different nomenclature:
sample.id, a unique identifier for each sample.
snp.id, a unique identifier for each SNP.
How can I get the original SNP IDs (as they appear in the VCF) for the SNPs that are kept after LD pruning?
Thanks in advance.
setwd("~/Desktop/snprelate")
library(gdsfmt)
library(SNPRelate)

cap.vcf <- "all_148.vcf"
snpgdsVCF2GDS(cap.vcf, "cap.gds", method = "biallelic.only")
snpgdsSummary("cap.gds")
genofile <- snpgdsOpen("cap.gds")

pop_code <- scan("pop.txt", sep = "\t", what = list("pop.txt"))
pop_code

set.seed(1000)
cap.LD <- snpgdsLDpruning(genofile, remove.monosnp = TRUE, ld.threshold = .1,
                          maf = .1, missing.rate = 0.1, method = "composite",
                          slide.max.bp = 500000, verbose = TRUE)
snp_id_list <- unlist(cap.LD)
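One possible way to get at the original IDs, sketched under the assumption that snpgdsVCF2GDS stored the VCF ID column in a node called snp.rs.id, next to snp.chromosome and snp.position (snpgdsSummary or printing genofile should show whether those nodes are present). The helper name lookup_snps and the output file name pruned_snps.txt are made up for illustration. The second half continues the session above and only runs when genofile and snp_id_list already exist.

```r
# Hypothetical helper: return the annotation rows for the pruned snp.id values,
# in the same order as kept_ids. Pure base R; snp_map needs a snp.id column.
lookup_snps <- function(snp_map, kept_ids) {
  snp_map[match(kept_ids, snp_map$snp.id), , drop = FALSE]
}

# Continues the session above: needs the open `genofile` and `snp_id_list`.
if (exists("genofile") && exists("snp_id_list")) {
  # Assumption: these nodes were created by snpgdsVCF2GDS.
  snp_map <- data.frame(
    snp.id     = read.gdsn(index.gdsn(genofile, "snp.id")),
    rs.id      = read.gdsn(index.gdsn(genofile, "snp.rs.id")),
    chromosome = read.gdsn(index.gdsn(genofile, "snp.chromosome")),
    position   = read.gdsn(index.gdsn(genofile, "snp.position")),
    stringsAsFactors = FALSE
  )

  kept <- lookup_snps(snp_map, snp_id_list)
  print(head(kept))

  # Hypothetical output file with one row per SNP kept after pruning.
  write.table(kept, "pruned_snps.txt", sep = "\t",
              quote = FALSE, row.names = FALSE)
}
```

If the ID column of the VCF was empty, rs.id will not be informative, and the chromosome and position columns are probably the more reliable way to match SNPs back to the VCF.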