Question: Position error using write.plink
10 months ago, ida.larsson wrote:

Hi, I am having a problem with write.plink from the R package snpStats. It writes three PLINK files (.bed, .bim and .fam), and the position of each SNP in the resulting .bim file is completely wrong (see below).

1   rs6678176   0   5   A   G
1   rs78286437  0   7   C   T
1   rs72961022  0   17  C   T
1   rs12082355  0   46  C   T
1   rs11166268  0   53  A   C
1   rs12059609  0   73  A   G
1   rs6657921   0   92  C   G
1   rs7514596   0   115 C   G

In addition to the SnpMatrix with all genotypes (800,000 SNPs, 95 samples), I also give write.plink a subject.data data frame containing information about the samples and a snp.data data frame containing information about the SNPs, e.g. chromosome and position. When I manually go through parts of snp.data, the positions seem to be correct, so I guess the error occurs inside write.plink. This is my code for writing the PLINK files:

library("snpStats")

write.plink('plinkfromR', snps = snps, subject.data = famFile,
            pedigree = pedigree, id = id, father = father, mother = mother,
            sex = sex, phenotype = phenotype, snp.data = smc_test$map,
            chromosome = chromosome, genetic.distance = genetic.distance,
            position = position, allele.1 = allele.1, allele.2 = allele.2)

and here is the snp.data dataframe containing correct positions for the SNPs:

           chromosome genetic.distance  position allele.1 allele.2
rs6678176           1               NA 100000827        A        G
rs78286437          1               NA 100000843        C        T
rs72961022          1               NA 100003476        C        T
rs12082355          1               NA 100007454        C        T
rs11166268          1               NA 100008607        A        C
rs12059609          1               NA 100012794        A        G

I would be very grateful for any help or suggestions on how to resolve this. This is causing me a lot of errors later in my analysis!

Ida

Tags: write.plink, snpstats, plink, snp, R

Are some arguments not supposed to be in quotes? E.g.: sex = "sex" ?

written 10 months ago by zx8754

That might be the case, since the arguments refer to column names of the data frames "famFile" and "smc_test$map". However, I couldn't tell that from the documentation, and most of the information seems to be read correctly anyway (sex, for example, is correct in the output .fam file). But I can try changing it and see what happens!

written 10 months ago by ida.larsson

Can you also post the map file you are using here? snp.data = smc_test$map

Is there a discordance between your snp.data and this map?

written 10 months ago by Vivek

smc_test$map is the snp.data data frame, so there should be no discordance; I posted part of it above. Or am I misunderstanding something about snp.data?

> head(smc_test$map)
           chromosome genetic.distance  position allele.1 allele.2
rs6678176           1               NA 100000827        A        G
rs78286437          1               NA 100000843        C        T
rs72961022          1               NA 100003476        C        T
rs12082355          1               NA 100007454        C        T
rs11166268          1               NA 100008607        A        C
rs12059609          1               NA 100012794        A        G
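
The head() output above does not show column types, so one thing worth checking is whether position is stored as a factor rather than a number. The small, increasing integers in the .bim file would be consistent with factor level codes, but that is only a guess; nothing in this thread confirms it. A minimal base R sketch, where `map` is a hypothetical stand-in for smc_test$map built from three rows of the table above:

```r
# Hypothetical stand-in for smc_test$map; storing position as a factor
# is an ASSUMPTION made for illustration, not a confirmed diagnosis.
map <- data.frame(
  chromosome = c(1L, 1L, 1L),
  position   = factor(c("100000827", "100000843", "100003476")),
  row.names  = c("rs6678176", "rs78286437", "rs72961022")
)

# Printing the data frame looks fine; class() reveals the storage type.
print(map)
class(map$position)

# Coercing a factor directly to integer returns its level codes,
# not the numbers shown when the column is printed.
codes <- as.integer(map$position)
print(codes)

# Going through character first recovers the actual coordinates.
map$position <- as.integer(as.character(map$position))
print(map$position)
```

If class(smc_test$map$position) reports "integer" or "numeric" in the real data, this explanation does not apply and the cause lies elsewhere.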
written 10 months ago by ida.larsson