I'm trying to analyse count data in R, and it is throwing the error "duplicate 'row.names' are not allowed". I have already checked the files for discrepancies (duplicate rows, and whether the sample data matches the count data) and found none.
Can anyone help?
I found this comment on a blog: "From what I've understood from answers to similar questions, a possible problem may be that the read.csv command is not recognizing the zeros in the last column as values, so the program reads it as if the first row contained one fewer field than the number of columns, and hence uses the first column for the row names. However, when I create a 'fake' table with actual zeros, blanks, or 'NA' in the same positions as shown in the example above, the program has no trouble recognizing them and reading the file."
But I can't work out how to apply this to my problem. Can anyone help?
Without data examples and command lines it is impossible to help you.
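I recently had a similar issue that was caused by the text encoding being UTF-16. read.table (which the other read.xxx functions call) does not reliably detect the file's encoding; unless told otherwise it assumes your session's native encoding, so a UTF-16 file can be misread. Try declaring the encoding with the fileEncoding argument (e.g. fileEncoding = "UTF-16"; the similarly named encoding argument only marks strings and does not re-encode the input), or open the file and resave it as UTF-8.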
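If you would rather check and convert the file from the command line, something like the following may work. It is only a sketch: the file names are made up, the demo file is created just so the commands can be run as-is, and it assumes `iconv` and `od` are available on your system.

```shell
# Hypothetical file names; replace with your own.
src="counts_utf16.csv"
dst="counts_utf8.csv"

# Demo input only: write a tiny table encoded as UTF-16.
printf 'gene_id,s1\ngeneA,5\n' | iconv -f UTF-8 -t UTF-16 > "$src"

# Look at the first two bytes. "ff fe" or "fe ff" is a UTF-16
# byte-order mark, a strong hint that the file is not UTF-8.
head -c 2 "$src" | od -An -tx1

# Re-encode to UTF-8 so that R can read the file with its defaults.
iconv -f UTF-16 -t UTF-8 "$src" > "$dst"
```

After the conversion, point read.csv or read.table at the new file instead of the original.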
Check the row.names column with uniq -d to see if any duplicate gene/transcript IDs are present.
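A rough way to run that check, assuming a comma-separated file with a header line and the IDs in the first column (the file name and the demo contents are invented for illustration):

```shell
# Hypothetical file name; replace with your own count matrix.
counts="counts.csv"

# Demo input only: geneB appears twice on purpose.
printf 'gene_id,s1,s2\ngeneA,5,0\ngeneB,3,1\ngeneB,7,2\n' > "$counts"

# Skip the header, keep the first column, sort it (uniq only
# compares adjacent lines), then print each ID that repeats.
dups=$(tail -n +2 "$counts" | cut -d',' -f1 | sort | uniq -d)
echo "$dups"
```

For a tab-separated file, drop the `-d','` option, since `cut` splits on tabs by default. Note that `cut` does not understand quoted fields, so this will mislead you if the IDs themselves contain commas.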