Duplicate row name not allowed while reading the file in R
4.9 years ago
imanbh • 0

I am trying to analyse count data in R, but it throws the error "duplicate 'row.names' are not allowed". I have checked the files for discrepancies (whether any duplicate rows are present, and whether the sample data matches the count data) and found none.

Can anyone help?

I found this comment on a blog: "From what I've understood from answers to similar questions, a possible problem may be that the read.csv command is not recognizing the zeros in the last column as values, so the program reads it as if the first row contained one fewer field than the number of columns, and hence uses the first column for the row names. However, when I create a "fake" table with actual zeros, blanks, or "NA" in the same positions as shown in the example above, the program has no trouble recognizing them and reading the file."
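The mechanism described in that comment can be reproduced with a small sketch. The toy table below is hypothetical (not the poster's data): its header has one fewer field than the data rows, so read.csv takes the first column as row names, and because that column repeats an ID the read fails. Passing row.names = NULL is one way to get the file loaded so the first column can be inspected.

```r
# Hypothetical toy data: the header has 2 fields, each data row has 3,
# so read.csv() uses the first column as row names.
txt <- "s1,s2\ngeneA,1,2\ngeneA,3,4\n"

# The repeated ID in the first column triggers the error; capture its message.
err <- tryCatch(read.csv(text = txt), error = function(e) conditionMessage(e))
print(err)

# row.names = NULL forces automatic row numbering, so the table loads
# and the ID column is kept as an ordinary column.
df <- read.csv(text = txt, row.names = NULL)
print(names(df))
print(df)
```

If the real file loads this way, the first column can then be checked for repeated IDs before it is used as row names.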

But I can't work out how to solve this. Can anyone help?

RNA-Seq deseq2 Differential-Gene-expression R • 1.5k views

Without data examples and command lines it is impossible to help you.


I recently had a similar issue that was caused by the text encoding being UTF-16. read.table (which is used by the other read.xxx functions) does not detect the file's encoding; it assumes the native encoding unless told otherwise, so a UTF-16 file can be misread. Try specifying the encoding with the fileEncoding argument (the encoding argument only marks strings as Latin-1 or UTF-8 and does not re-encode the input), or open the file and resave it as UTF-8.
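A minimal sketch of reading a UTF-16 file, assuming a little-endian file such as Windows "Unicode" text exports usually are. In read.table and read.csv, fileEncoding is the argument that re-encodes the file on input. The file contents and names here are made up for illustration.

```r
# Write a small hypothetical count table as UTF-16LE to a temporary file.
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, open = "w", encoding = "UTF-16LE")
writeLines(c("gene,s1,s2", "geneA,1,2", "geneB,3,4"), con)
close(con)

# Read it back, declaring the encoding explicitly; the first column
# becomes the row names.
counts <- read.csv(tmp, fileEncoding = "UTF-16LE", row.names = 1)
print(counts)

unlink(tmp)
```

Whether the real file is UTF-16 at all is worth confirming first, for example by opening it in a text editor that reports the encoding.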


Check the row.names column with sort | uniq -d to see whether any gene/transcript ID is duplicated (uniq only detects adjacent duplicates, so the column has to be sorted first).
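The same check can be done inside R once the ID column is available as a vector. This is a sketch on a made-up vector of IDs; duplicated() does not need sorted input.

```r
# Hypothetical gene IDs, as they might appear in the first column of a count file.
ids <- c("geneA", "geneB", "geneA", "geneC", "geneB", "geneA")

# IDs that occur more than once, each listed a single time.
dup_ids <- unique(ids[duplicated(ids)])
print(dup_ids)

# How many times each duplicated ID occurs.
print(table(ids)[dup_ids])
```

An empty result means the IDs are unique, in which case the error more likely comes from how the file is being parsed than from the data itself.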
