How to correct PLINK input for SNPRelate so that samples are not read in as SNPs
8.7 years ago
moithuti • 0

I have some 1000 Genomes Phase 3 data as binary PLINK files. Reading in the files results in the following:

read in files from directory

fam.chr1.1000GP3 <- "chr1.common.variants.1000GP3_20130502_allchrSNPs_nodup2_updatefidparentidgender.fam"
bim.chr1.1000GP3 <- "chr1.common.variants.1000GP3_20130502_allchrSNPs_nodup2_updatefidparentidgender.bim"
bed.chr1.1000GP3 <- "chr1.common.variants.1000GP3_20130502_allchrSNPs_nodup2_updatefidparentidgender.bed"

convert to gds format

snpgdsBED2GDS(bed.chr1.1000GP3, bim.chr1.1000GP3, fam.chr1.1000GP3, "chr1.1000GP3.gds", snpfirstdim = FALSE)
Start snpgdsBED2GDS ...
BED file: "chr1.common.variants.1000GP3_20130502_allchrSNPs_nodup2_updatefidparentidgender.bed" in the SNP-major mode (Sample X SNP)
FAM file: "chr1.common.variants.1000GP3_20130502_allchrSNPs_nodup2_updatefidparentidgender.bim", DONE.
BIM file: "chr1.common.variants.1000GP3_20130502_allchrSNPs_nodup2_updatefidparentidgender.fam", DONE.
Tue Feb 14 16:43:56 2017 store sample id, snp id, position, and chromosome.
start writing: 530207 samples, 2504 SNPs ...
Tue Feb 14 16:43:56 2017 0%
Tue Feb 14 16:44:21 2017 100%
Tue Feb 14 16:44:24 2017 Done.
Optimize the access efficiency ...
Clean up the fragments of GDS file:
open the file "chr1.1000GP3.gds" (size: 334054845).
# of fragments in total: 39.
save it to "chr1.1000GP3.gds.tmp".
rename "chr1.1000GP3.gds.tmp" (size: 334054593).
# of fragments in total: 18.
Warning message:
In snpgdsBED2GDS(bed.chr1.1000GP3, bim.chr1.1000GP3, fam.chr1.1000GP3, :
NAs introduced by coercion

Essentially the data are swapped: instead of 2504 samples with 530207 SNPs, it is the other way round. How do I change my PLINK files so that the input reflects 2504 individuals and ~500k SNPs?

SNPRelate PLINK Data format PCA • 2.3k views
8.7 years ago
moithuti • 0

I took a simple workaround to solve this: I converted the binary PLINK files to VCF, and the input was then read properly, with the expected numbers of samples and SNPs. If anyone can get the PLINK format to work, I would still like to know how to correct the PLINK files so that they are read in with samples as samples and SNPs as SNPs.
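
For anyone landing here with the same problem: the log in the question suggests the PLINK files themselves are probably fine. It reports the .bim file as the "FAM file" and the .fam file as the "BIM file", which is what happens when the two are passed in the wrong positions. snpgdsBED2GDS takes its file arguments in the order bed.fn, fam.fn, bim.fn, whereas the call in the question passes bed, bim, fam. Below is a minimal sketch of the corrected call, assuming the same file names as in the question and that all three files sit in the working directory; the `prefix` variable is only introduced here for brevity. Naming the arguments avoids depending on their position.

```r
library(SNPRelate)

# Shared file prefix from the question (the variable name `prefix` is just for illustration)
prefix <- "chr1.common.variants.1000GP3_20130502_allchrSNPs_nodup2_updatefidparentidgender"

# Pass each file by argument name so that .fam and .bim cannot be swapped
snpgdsBED2GDS(
  bed.fn      = paste0(prefix, ".bed"),
  fam.fn      = paste0(prefix, ".fam"),
  bim.fn      = paste0(prefix, ".bim"),
  out.gdsfn   = "chr1.1000GP3.gds",
  snpfirstdim = FALSE
)

# Print the sample and SNP counts stored in the new GDS file
snpgdsSummary("chr1.1000GP3.gds")
```

If the swapped arguments were the only issue, the summary should now show 2504 samples and 530207 SNPs, and the "NAs introduced by coercion" warning (plausibly caused by parsing .fam columns as .bim positions) should no longer appear.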
