Question

edgeR error while trying to read the data

10.0 years ago

ygowtha ▴ 20

Hello All

I am new to edgeR, so please excuse me if the question is trivial. I am trying to use edgeR to perform statistical analysis on RNA-Seq data. I was following an example from the edgeR wiki and adapting it to my experiment. However, I hit a problem at the initial stage, where I am trying to get edgeR to read my data from a text file. My code is below.

#raw data

library(edgeR)
raw.data<-read.table(file="All_Samples1.txt",header=TRUE)
head(raw.data)

    Geneid DP12101 DP12102 DP1251 DP1252 DP1201 DP1202 DH101 DH102 DH51 DH52
1 rna22510       0       0      0      0      0      0     0     0    0    0
2 rna22511       0       0      0      0      0      0     0     0    0    0
3 rna22512       0       0      0      0      0      0     0     0    0    0
4 rna22513       0       0      0      0      0      0     0     0    0    0
5 rna22514       0       0      0      0      0      0     0     0    0    0
6 rna22515       0       0      0      0      0      0     0     0    0    0

(Background on the example I am following: the raw data in that example has an extra column specifying the length of the reads. The code in that example is the same as the code I used below.)

#reading data into edgeR

counts<-raw.data[,-c(1,ncol(raw.data))]
rownames(counts)<-raw.data[,1]
colnames(counts) <- paste (c (rep ("DP_10",2), rep("DP_5",2), rep("DP_0",2), rep("DH_10",2), rep("DH_5",2)), c(1:2,1:2,1:2,1:2,1:2), sep=" ")

Error in `colnames<-`(`*tmp*`, value = c("DP_10 1", "DP_10 2", "DP_5 1",  : 'names' attribute [10] must be the same length as the vector [9]

I believe the error arises because my code supplies 10 names but there are only 9 columns. It seems that somewhere in the code there is a line telling it to skip the final column. Since I am new, it would be great if you could help me correct the error.

PS: I did proceed by adding a dummy column at the end, but I would like to learn why I could not fix it without the dummy column. I tried different variations of the code, such as omitting the 1 in the first two lines, but was not successful.

Thank You

edgeR RNA-Seq statistics R • 3.4k views

ADD COMMENT • link updated 2.6 years ago by Ram 43k • written 10.0 years ago by ygowtha ▴ 20

Ram · Accepted Answer · 2014-05-08

10.0 years ago

Devon Ryan 104k

You might just:

raw.data<-read.table(file="All_Samples1.txt",header=TRUE, row.names=1)
counts<-raw.data
colnames(counts) <- paste (c (rep ("DP_10",2), rep("DP_5",2), rep("DP_0",2), rep("DH_10",2), rep("DH_5",2)), c(1:2), sep=" ")

Your problem, btw, is due to this: counts<-raw.data[,-c(1,ncol(raw.data))], which removes both the first and last columns of the raw data when creating counts, while you only want to remove the first one. Either just do what I mentioned above or change this to counts<-raw.data[,-1].
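To see the effect of the negative index on its own, here is a minimal sketch using a made-up stand-in for All_Samples1.txt (three genes, all-zero counts; the toy data frame and the names dropped.two and new.names are assumptions for illustration, not part of the original code). It compares the two indexing forms and adds a length check before renaming the columns.

```r
# Toy stand-in for All_Samples1.txt: one ID column plus 10 count columns.
# The values are invented purely for illustration.
samples <- c("DP12101", "DP12102", "DP1251", "DP1252", "DP1201", "DP1202",
             "DH101", "DH102", "DH51", "DH52")
raw.data <- data.frame(Geneid = c("rna22510", "rna22511", "rna22512"),
                       matrix(0, nrow = 3, ncol = 10,
                              dimnames = list(NULL, samples)))

# Original indexing: drops column 1 AND the last column, leaving 9 columns.
dropped.two <- raw.data[, -c(1, ncol(raw.data))]
ncol(dropped.two)

# Corrected indexing: drops only the Geneid column, keeping all 10 samples.
counts <- raw.data[, -1]
rownames(counts) <- raw.data[, 1]

# Build the 10 new names; 1:2 is recycled across the five groups.
new.names <- paste(rep(c("DP_10", "DP_5", "DP_0", "DH_10", "DH_5"), each = 2),
                   1:2, sep = " ")

# Guard against the length mismatch before assigning.
stopifnot(length(new.names) == ncol(counts))
colnames(counts) <- new.names
```

The stopifnot() line is optional, but it reports a mismatch between the number of names and the number of columns before the less readable `colnames<-` error can appear.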

ADD COMMENT • link updated 4.3 years ago by Ram 43k • written 10.0 years ago by Devon Ryan 104k

Yes, that makes sense. Thank you.

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 10.0 years ago by ygowtha ▴ 20