DESeq Not Recognizing My Row.Names
10.9 years ago
Alicia ▴ 20

Hi,

I'm trying to use DESeq to find the differentially expressed genes in my datasets, but DESeq does not seem to recognize my row.names, so I can't create my cds.

My .csv input file looks like:

transcript_id,C4,CRL_2APR10,CRL_1_15JUL11,CRL_2_15JUL11 
comp1000201_c0_seq1,5.00,0.00,0.00,0.00
comp1000297_c0_seq1,7.00,0.00,0.00,0.00
comp100036_c0_seq1,0.00,0.00,0.00,0.00
comp10003_c1_seq1,2.00,0.00,0.00,0.00
comp100041_c0_seq1,3.00,0.00,0.00,0.00
comp100041_c0_seq2,0.00,0.00,0.00,0.00
comp100041_c0_seq3,0.00,0.00,0.00,0.00
comp100051_c0_seq1,0.00,0.00,0.00,0.00
comp1000890_c0_seq1,3.00,0.00,0.00,0.00

This is what I'm running:

> spercysts_vs_embryos = read.csv (
+   file.choose(), 
+   header = TRUE, 
+   row.names=1, 
+   sep = ",", 
+   dec = ".")

> head(spercysts_vs_embryos)
                    C4 CRL_2APR10 CRL_1_15JUL11 CRL_2_15JUL11
comp1000201_c0_seq1  5          0             0             0
comp1000297_c0_seq1  7          0             0             0
comp100036_c0_seq1   0          0             0             0
comp10003_c1_seq1    2          0             0             0
comp100041_c0_seq1   3          0             0             0
comp100041_c0_seq2   0          0             0             0

> cond = factor(c("SP", "SP", "EB", "EB"))

> spercysts_vs_embryosDesign = data.frame(
+   row.names = colnames( spercysts_vs_embryos ), 
+   condition = c( "SP", "SP", "EB", "EB" ), 
+   libType = c( "paired-end", "paired-end", "paired-end", "paired-end" ) )
> spercysts_vs_embryosDesign
              condition    libType
C4                   SP paired-end
CRL_2APR10           SP paired-end
CRL_1_15JUL11        EB paired-end
CRL_2_15JUL11        EB paired-end

> str(spercysts_vs_embryos)
'data.frame':    307048 obs. of  4 variables:
 $ C4           : num  5 7 0 2 3 0 0 0 3 0 ...
 $ CRL_2APR10   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ CRL_1_15JUL11: num  0 0 0 0 0 0 0 0 0 10 ...
 $ CRL_2_15JUL11: num  0 0 0 0 0 0 0 0 0 3 ...

So everything looks fine to me. But when I try to create my cds:

> cds <- newCountDataSet(spercysts_vs_embryos, cond)
Error in newCountDataSet(spercysts_vs_embryos, cond) : 
  The countData is not integer.

So I checked for missing values, and there are none:

> which( is.na(spercysts_vs_embryos), arr.ind=TRUE )
     row col

Any suggestions? Thanks!

10.9 years ago

I think the problem is exactly what the error states: your counts are not integers. They look like integers when printed, but the values in your input file have decimal points, so R reads them as numeric (note the "num" columns in your str() output). DESeq expects integer counts, because it is not possible for, say, 10.38 reads to map to a sequence.

Try:

is.integer(spercysts_vs_embryos$C4[1])

and you will probably get FALSE, which means that your 5 is still stored as 5.00 (a double). You can change that by applying as.integer() to each of your columns.
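
To make the suggestion above concrete, here is a minimal sketch in base R. It assumes the counts really are whole numbers written with a trailing ".00", as in the rows shown in the question. The names csv_text, counts and counts_int are made up for this illustration, and the toy table is just the first three rows from the post.

```r
# Toy data copied from the first rows of the CSV in the question.
csv_text <- "transcript_id,C4,CRL_2APR10,CRL_1_15JUL11,CRL_2_15JUL11
comp1000201_c0_seq1,5.00,0.00,0.00,0.00
comp1000297_c0_seq1,7.00,0.00,0.00,0.00
comp100036_c0_seq1,0.00,0.00,0.00,0.00"

counts <- read.csv(text = csv_text, header = TRUE, row.names = 1)

# Values written as 5.00 are parsed as double, so every column reports FALSE.
print(sapply(counts, is.integer))

# as.integer() truncates silently, so first confirm that no value
# has a fractional part (assumption: the counts are whole numbers).
stopifnot(all(as.matrix(counts) == round(as.matrix(counts))))

# Convert each column; assigning into counts_int[] keeps the data frame
# shape, the column names, and the row names.
counts_int <- counts
counts_int[] <- lapply(counts, as.integer)

print(sapply(counts_int, is.integer))
print(rownames(counts_int))

# With DESeq loaded, the call from the question would then be
# (not run here, since it needs the DESeq package):
# cds <- newCountDataSet(counts_int, cond)
```

The row names are untouched by the conversion, so they were never the issue. If the check for fractional parts fails on the full table, some counts are genuinely non-integer, and whether to round them is a separate decision rather than a pure type conversion.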
