Question

"duplicate 'row.names' are not allowed" error while reading a file in R

0

4.9 years ago

imanbh • 0

I am trying to analyse count data in R, but it throws the error "duplicate 'row.names' are not allowed". I have already checked the files for discrepancies (whether any duplicate rows are present, and whether the sample data matches the count data) and found none.

Can anyone help?

I found this comment on a blog: "From what I've understood from answers to similar questions, a possible problem may be that the read.csv command is not recognizing the zeros in the last column as values, so the program reads it as if the first row contained one fewer field than the number of columns, and hence uses the first column for the row names. However, when I create a "fake" table with actual zeros, blanks, or "NA" in the same positions as shown in the example above, the program has no trouble recognizing them and reading the file."

But I cannot work out how to use this to solve my problem. Can anyone help?
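The mismatch described in the quoted comment (a header line with one fewer field than the data rows) can be checked from the shell before the file is read into R. The following is only a minimal sketch: the file name counts_example.csv and its contents are made up for illustration, and it assumes a comma-separated file with no quoted commas.

```shell
# Hypothetical input: data rows end with a trailing comma, so they have
# one more field than the header line.
printf 'gene,s1,s2\nA,1,0,\nA,2,0,\n' > counts_example.csv

# For each distinct number of fields, print how many lines have it.
awk -F',' '{ n[NF]++ } END { for (k in n) print k, n[k] }' counts_example.csv \
  | sort -n > field_counts.txt
cat field_counts.txt
# 3 1
# 4 2
```

If the header turns out to have one fewer field than the other rows, R uses the first column as row names, and any repeated value there triggers the error. In that case, removing the trailing separator or passing row.names = NULL to read.csv may be enough.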

RNA-Seq deseq2 Differential-Gene-expression R • 1.5k views

ADD COMMENT • link updated 16 days ago by Ram 43k • written 4.9 years ago by imanbh • 0

2

Without example data and the commands you ran, it is impossible to help you.

ADD REPLY • link 4.9 years ago by ATpoint 82k

0

I recently had a similar issue that was caused by the text encoding being UTF-16. By default, read.table (which is used by the other read.xxx functions) assumes the file is in the native encoding, so a UTF-16 file can be misread. Try specifying the encoding with the fileEncoding argument (the encoding argument only marks strings and does not re-encode the input), or open the file and resave it as UTF-8.
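Resaving as UTF-8 can also be done from the command line. This is a rough sketch rather than a definitive recipe: the file names are hypothetical, the UTF-16 file is created here only so the example is self-contained, and it assumes iconv is installed (the file utility is optional).

```shell
# Create a small UTF-16 file to stand in for the problematic count table.
printf 'gene\ts1\nA\t1\n' | iconv -f UTF-8 -t UTF-16 > counts_utf16.txt

# If the `file` utility is available, ask it to guess the encoding.
if command -v file >/dev/null 2>&1; then
  file --mime-encoding counts_utf16.txt
fi

# Convert to UTF-8 so R can read the table without encoding options.
iconv -f UTF-16 -t UTF-8 counts_utf16.txt > counts_utf8.txt
cat counts_utf8.txt
```

The converted file can then be passed to read.table or read.csv as usual.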

ADD REPLY • link 4.9 years ago by Jean-Karim Heriche 27k

0

Check the row.names column with uniq -d to see whether any duplicate gene/transcript IDs are present. Sort the column first, because uniq only compares adjacent lines.
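A minimal sketch of that check, assuming a tab-separated file with one header line and the IDs in the first column (the file name counts_example.tsv and its contents are made up for illustration):

```shell
# Hypothetical count table in which gene ID "A" appears twice.
printf 'gene\ts1\ts2\nA\t1\t0\nB\t2\t0\nA\t3\t1\n' > counts_example.tsv

# Skip the header, keep the first column, sort it, and print each ID
# that occurs more than once.
tail -n +2 counts_example.tsv | cut -f1 | sort | uniq -d > duplicate_ids.txt
cat duplicate_ids.txt
# A
```

An empty result means the first column has no exact duplicates, so the cause is more likely a field-count or encoding problem.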

ADD REPLY • link 4.9 years ago by Arup Ghosh 3.2k