Question: Error in read.table(file = file, header = header, sep = sep, quote = quote, : duplicate 'row.names' are not allowed
13 months ago by
xxxxxxxx20
xxxxxxxx20 wrote:

My matrix looks like this (the full matrix is 13367 x 13367):

        NBAS    DNAH9   NRAS    NRAS    TP53    TP53    TP53    SCYL2   RNF19A
    NBAS    1   0   0   0   0   0   0   0   0
    DNAH9   0   1   0   0   0   0   0   0   0
    NRAS    0   0   1   0   0   0   0   0   0
    NRAS    0   0   0   1   0   0   0   0   0
    TP53    0   0   0   0   1   0   0   0   0
    TP53    0   0   0   0   0   1   0   0   0
    TP53    0   0   0   0   0   0   1   0   0
    SCYL2   0   0   0   0   0   0   0   1   0
    RNF19A  0   0   0   0   0   0   0   0   1

I need to extract all pairs of row and column headers for which the value is equal to 1. I'm using the following R script:

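# This call fails while the first column contains duplicated labels
Pmatrix <- read.csv("file.csv", header = TRUE, row.names = 1)
# arr.ind = TRUE returns a two-column matrix: column 1 = row index, column 2 = column index
sig_values <- which(Pmatrix == 1, arr.ind = TRUE)
cbind.data.frame(rowIDs = rownames(Pmatrix)[sig_values[, 1]], colIDs = colnames(Pmatrix)[sig_values[, 2]])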

but I get this error:

Error in read.table(file = file, header = header, sep = sep, quote = quote,  : 
  duplicate 'row.names' are not allowed

If I set row.names = NULL, R assumes there are no row names and numbers the rows instead. But I need the row names and column names, not the numbers.

matrix rstudio csv R • 1.4k views
ADD COMMENTlink modified 13 months ago by swbarnes27.7k • written 13 months ago by xxxxxxxx20

You should not duplicate your post ....

ADD REPLYlink written 13 months ago by Titus910

But the duplicated headers correspond to different values inside the matrix. How can I delete those duplicates? I need those duplicate headers.

ADD REPLYlink written 13 months ago by xxxxxxxx20

Why are there multiple TP53 entries? Rather than appending suffixes to the row/column names, or similarly modifying on import - work out why you ended up with duplicated gene symbols in this dataset and fix that.

ADD REPLYlink written 13 months ago by russhh5.3k

Because these are different mutations of the same gene.

ADD REPLYlink written 13 months ago by xxxxxxxx20

Not without further information they aren't. Suppose your desired workflow worked, you were able to show that TP53 and BRAF were co-mutated or whatever, and you printed out "TP53\tBRAF". When you come back to the dataset, how will you know which of the different TP53 mutations it was that associates with BRAF? Get your data annotated such that you can disambiguate one row from another and one column from another. Do this before you export the matrix; if you do, the matrix will import properly, because the row names will be unique.
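
As a minimal sketch of that idea (not the poster's actual data): assuming a mutation identifier is available for each row, pasting it onto the gene symbol before export gives unique labels, and the original read.csv call with row.names = 1 then works. The mutation labels "mutA", "mutB", "mutC" and the temporary file are made-up placeholders for illustration only.

```r
# Gene symbols with duplicates, as in the question
genes <- c("NRAS", "NRAS", "TP53", "TP53", "TP53")
# Hypothetical per-row mutation identifiers (placeholders, not real annotations)
mutations <- c("mutA", "mutB", "mutA", "mutB", "mutC")

# Combine gene and mutation into one label per row/column
labels <- paste(genes, mutations, sep = "_")
stopifnot(anyDuplicated(labels) == 0)  # every label must now be unique

# Toy identity matrix standing in for the real 13367 x 13367 matrix
m <- diag(length(labels))
dimnames(m) <- list(labels, labels)

# Export, then import the same way the question does
tmp <- tempfile(fileext = ".csv")
write.csv(m, tmp)
back <- read.csv(tmp, header = TRUE, row.names = 1, check.names = FALSE)
```

With unique labels, each printed pair points back to exactly one mutation rather than to "TP53" in general.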

ADD REPLYlink written 13 months ago by russhh5.3k

Cross posted at StackOverflow, and closed as duplicate, see linked posts there.

ADD REPLYlink modified 13 months ago • written 13 months ago by zx87549.2k

Thank you, but those answers are not working for me.

ADD REPLYlink modified 13 months ago • written 13 months ago by xxxxxxxx20
13 months ago by
swbarnes27.7k
United States
swbarnes27.7k wrote:

Throw a "_1", "_2", etc. on those non-unique names to make them unique.
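
A rough sketch of how that could look in R, under some assumptions: the labels are read as an ordinary first column (row.names = NULL), check.names = FALSE keeps the header as written, and a small helper adds the suffixes. The helper name add_suffix and the inline csv_text (a stand-in for file.csv, with an assumed "gene" header cell) are hypothetical. Base R's make.unique() is an alternative, but it leaves the first occurrence unsuffixed.

```r
# Small stand-in for file.csv (assumed layout: first column holds the gene symbols)
csv_text <- "gene,NBAS,NRAS,NRAS,TP53
NBAS,1,0,0,0
NRAS,0,1,0,0
NRAS,0,0,1,0
TP53,0,0,0,1"

# row.names = NULL avoids the duplicate 'row.names' error;
# check.names = FALSE stops R from rewriting the column headers
raw <- read.csv(text = csv_text, header = TRUE, row.names = NULL, check.names = FALSE)

# Hypothetical helper: append _1, _2, ... only to names that occur more than once
add_suffix <- function(x) {
  ave(x, x, FUN = function(g) if (length(g) > 1) paste0(g, "_", seq_along(g)) else g)
}

row_ids <- add_suffix(as.character(raw[[1]]))
col_ids <- add_suffix(names(raw)[-1])

# Build a numeric matrix and attach the now-unique names explicitly
m <- as.matrix(raw[-1])
dimnames(m) <- list(row_ids, col_ids)

# Extract every (row, column) pair whose value is 1
hits <- which(m == 1, arr.ind = TRUE)
pairs <- data.frame(rowIDs = rownames(m)[hits[, "row"]],
                    colIDs = colnames(m)[hits[, "col"]])
print(pairs)
```

Note that the suffix only records the order in which duplicates appear in the file; it says nothing about which mutation each row represents, and it would collide with any name that already ends in "_1" and so on.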

ADD COMMENTlink modified 13 months ago • written 13 months ago by swbarnes27.7k