Question

Error when running 'DESeqDataSetFromMatrix' from DESeq2

2.4 years ago

Bioinf_Questions • 0

Hello. I'm having a problem at the very beginning of the DESeq2 pipeline. When I try to run the command

dds <- DESeqDataSetFromMatrix(countData=countData, colData=metaData, design=~dex, tidy = TRUE)

I keep getting the error message:

Error in `.rowNamesDF<-`(x, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘ACX1’, ‘CYCB1’

To check, I ran rownames(data)[duplicated(rownames(data))], and the result was 'character(0)' for both the count data and the metadata.

What could be wrong with my dataset?

Thanks in advance

Deseq2 Dataset • 1.6k views

score 2 · Answer 1 · 2021-11-28

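If you set tidy=TRUE, DESeq2 tries to convert the first column into the rownames. What that error is telling you is that you have duplicated values (gene names) in your first column, in this case the genes ACX1 and CYCB1. That is also why your rownames() check came back empty: the duplicates are in the first column of countData, not in its row names.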
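To see the difference between the two checks, here is a minimal sketch on a made-up table (the gene GAPDH, the sample names, and all counts are invented for illustration; only ACX1 and CYCB1 come from your error message). It assumes your countData has the gene names in its first column, as tidy=TRUE expects.

```r
# Toy count table in the "tidy" layout: gene names in the first column.
# All values are made up for illustration only.
countData <- data.frame(
  gene    = c("ACX1", "CYCB1", "ACX1", "GAPDH", "CYCB1"),
  sample1 = c(10, 5, 3, 100, 7),
  sample2 = c(12, 6, 4, 90, 8),
  stringsAsFactors = FALSE
)

# The row names are just the automatic 1..n here, so this check finds nothing
rowname_dups <- rownames(countData)[duplicated(rownames(countData))]
rowname_dups

# Checking the first column instead reveals the duplicated gene names
dup_genes <- unique(countData[[1]][duplicated(countData[[1]])])
dup_genes

# Show every row that carries one of those names, to inspect them side by side
countData[countData[[1]] %in% dup_genes, ]
```

Looking at the duplicated rows next to each other usually makes it clear whether they are exact copies or different features sharing a name.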

To remove rows with duplicated gene names in the first column, run the following code. Note that it drops every copy of a duplicated name, not just the extra ones.

countData <- countData[!(duplicated(countData[[1]]) | duplicated(countData[[1]], fromLast=TRUE)), ]
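If you would rather not lose those genes entirely, one possible alternative (not necessarily the right choice for your data) is to add up the counts of rows that share a name with base R's rowsum(). This is only a sketch on made-up numbers, and it assumes that rows with the same name really do represent the same gene, which you should verify first.

```r
# Toy table again; gene names in the first column, all counts invented.
countData <- data.frame(
  gene    = c("ACX1", "CYCB1", "ACX1", "GAPDH", "CYCB1"),
  sample1 = c(10, 5, 3, 100, 7),
  sample2 = c(12, 6, 4, 90, 8),
  stringsAsFactors = FALSE
)

# Option A (same logic as the one-liner above): drop every row whose name
# occurs more than once, so ACX1 and CYCB1 disappear completely
is_dup  <- duplicated(countData[[1]]) | duplicated(countData[[1]], fromLast = TRUE)
dropped <- countData[!is_dup, ]

# Option B (assumption: same name means same gene): sum the counts per name.
# rowsum() returns one row per unique name, sorted by name.
summed <- rowsum(as.matrix(countData[, -1]), group = countData[[1]])

# Put the names back into a first column so the table is "tidy" again
summed_tidy <- data.frame(gene = rownames(summed), summed,
                          row.names = NULL, stringsAsFactors = FALSE)
summed_tidy
```

Either result has unique values in its first column, so it can be passed to DESeqDataSetFromMatrix with tidy = TRUE without triggering the duplicate 'row.names' error.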

It's also important to figure out why there are duplicate values in your data. It could be caused by an error in your code, or be a consequence of the annotation used, such as multiple gene IDs being associated with a single gene name. If the latter is true, it might be better to keep everything as gene IDs (which I generally recommend anyway).