Question: R: read.plink Error
zwang10 (United States) wrote 4.6 years ago:

Using read.plink from snpMatrix

I use the following command to read a .bed file in R:

read.plink("/home/chr22_pos_16854001_to_16857000",na.strings = "-9")

However, it fails with:

Error in `row.names<`(`*tmp*`, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique value when setting 'row.names': ‘.’

Strangely, when I read chr22_pos_16050001_to_16053000.bed with the same command, there is no such error. Can someone help me here?

I used the following plink command to create the .bed file:

$ /home/zwang10/research/plink --bfile chr22 --chr 22 --from-bp 16854001 --to-bp 16857001 --make-bed --out chr22_16854001_to_16857001
    PLINK v1.90b3.35 64-bit (25 Mar 2016)
    (C) 2005-2016 Shaun Purcell, Christopher Chang   GNU General Public License v3
    Logging to chr22_16854001_to_16857001.log.
    Options in effect:
      --bfile chr22
      --chr 22
      --from-bp 16854001
      --out chr22_16854001_to_16857001
      --to-bp 16857001

    32134 MB RAM detected; reserving 16067 MB for main workspace.
    9 variants loaded from .bim file.
    1497 people (0 males, 0 females, 1497 ambiguous) loaded from .fam.
    Ambiguous sex IDs written to chr22_16854001_to_16857001.nosex .
    Using 1 thread (no multithreaded calculations invoked).
    Before main variant filters, 1497 founders and 0 nonfounders present.
    Calculating allele frequencies... done.
    9 variants and 1497 people pass filters and QC.
    Note: No phenotypes present.
    --make-bed to chr22_16854001_to_16857001.bed + chr22_16854001_to_16857001.bim +
    chr22_16854001_to_16857001.fam ... done.

I then ran awk '{print $2}' chr22_16854001_to_16857001.fam | sort | uniq -d to check for duplicated individuals, but got no output. The family IDs are all the same, but the individual IDs differ. Here is part of the .fam file:

UK10K ALS5085249 0 0 0 -9
UK10K ALS5085250 0 0 0 -9
UK10K ALS5085251 0 0 0 -9
UK10K ALS5085252 0 0 0 -9
UK10K ALS5085253 0 0 0 -9
UK10K ALS5085254 0 0 0 -9
UK10K ALS5085255 0 0 0 -9
UK10K ALS5085256 0 0 0 -9
UK10K ALS5085257 0 0 0 -9
UK10K ALS5085258 0 0 0 -9

One interesting thing: when I set pos1=16050001 and pos2=16053000 to produce chr22_16050001_to_16053000.bed and read it with the same command, there is no such error.

Tags: bed, plink, bioconductor, R

It says "duplicate 'row.names' are not allowed", so maybe check your file for duplicate row names?

Something like cut -f 1 FILE | sort | uniq -d

Comment written 4.6 years ago by Floris Brenk

I did not get any output.

Reply written 4.6 years ago by zwang10
chrchang523 (United States) wrote 4.6 years ago:

The duplicate "." IDs are in the .bim file, not the .fam file. --set-missing-var-ids provides one way to replace them with unique IDs.
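The check tried earlier on the .fam file can be pointed at the .bim file instead. The following is only a minimal sketch: the sample .bim rows (including the ID rs0001) are made up for illustration and are not the poster's data, and the fileset names in the commented plink call are placeholders.

```shell
#!/bin/sh
# Hypothetical sample .bim (columns: chr, variant ID, cM, bp, allele 1, allele 2).
# These rows are invented for illustration only.
tmpdir=$(mktemp -d)
bim="$tmpdir/sample.bim"
printf '22\t.\t0\t16854100\tA\tG\n'      >  "$bim"
printf '22\trs0001\t0\t16854500\tC\tT\n' >> "$bim"
printf '22\t.\t0\t16855200\tG\tA\n'      >> "$bim"

# Column 2 of the .bim holds the variant IDs.
# Print every ID that occurs more than once.
dups=$(awk '{print $2}' "$bim" | sort | uniq -d)
echo "duplicated variant IDs: $dups"

# If "." is reported, PLINK 1.9 can assign chr:pos IDs to the unnamed
# variants, for example (fileset names are placeholders):
#   plink --bfile chr22_16854001_to_16857001 \
#         --set-missing-var-ids @:# \
#         --make-bed --out chr22_16854001_to_16857001_ids
# '@' expands to the chromosome code and '#' to the base-pair position.

rm -rf "$tmpdir"
```

If two unnamed variants share the same position, a chr:pos template alone would still produce duplicate IDs; the PLINK documentation describes how to add allele codes to the template for that case.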


Thank you very much for your help. Let me try.

Reply written 4.6 years ago by zwang10