DESeq2
5 weeks ago
rheab1230 ▴ 30

Hello everyone,

I am trying to perform DESeq2 analysis on my gene count file to normalize it. This is where I got the gene count file: https://www.ebi.ac.uk/arrayexpress/files/E-GEUV-1/GD660.GeneQuantCount.txt.gz

My gene count file looks like this:

    TargetID Gene_Symbol Chr Coord HG00096.1.M_111124_6 HG00097.7.M_120219_2 HG00099.1.M_120209_6 HG00099.5.M_120131_3 HG00100.2.M_111215_8 HG00101.1.M_111124_4 HG00102.3.M_120202_8 HG00103.4.M_120208_3 HG00104.1.M_111124_5 HG00105.1.M_120209_7 HG00105.3.M_120223_6 HG00106.4.M_120208_5 HG00108.7.M_120219_2 HG00109.1.M_120209_4 HG00109.3.M_120202_5 HG00110.2.M_120131_2 HG00111.1.M_120209_8 HG00111.2.M_111215_4 HG00112.6.M_120119_2 HG00114.1.M_120209_3 HG00114.6.M_120217_1 HG00115.6.M_120119_1 HG00116.2.M_120131_1 HG00117.1.M_111124_2 HG00117.1.M_120209_1 HG00117.2.M_111216_4 HG00117.3.M_120202_6 HG00117.4.M_120208_4 HG00117.5.M_120131_3 HG00117.6.M_120217_1 HG00117.7.M_120219_4 HG00118.4.M_120208_5 HG00119.1.M_120209_3 HG00119.2.M_111216_6 HG00120.3.M_120202_2 HG00121.1.M_111124_7 HG00122.6.M_120119_1 HG00123.4.M_120208_7 HG00124.3.M_120223_7

The code is:

    GD_dat = read.delim("GD660.GeneQuantCount.txt",header=TRUE,row.names = NULL)
    GD_dat = GD_dat[,-c(1:3)]
    head(GD_dat)
    dim(GD_dat)
    colnames(GD_dat) = substr(colnames(GD_dat),1,7)
    rownames(GD_dat) = substr(rownames(GD_dat),1,15)
    geneNames<-GD_dat[,1]
    rownames(GD_dat)<-geneNames
    GD_dat<-GD_dat[,2:ncol(GD_dat)]
    sample_info<-DataFrame(condition=names(GD_dat), row.names=names(GD_dat))
    library("DESeq2")

    # runs the DESeq2
    ds<-DESeqDataSetFromMatrix(countData=GD_dat, colData=sample_info, design= ~condition)
    keep_genes<-rowSums(counts(ds))>0

I am getting this error:

      NA20814.2.M_111215_6 NA20815.5.M_120131_5 NA20816.3.M_120202_7
    1                    0                    0                    0
    2                    0                    0                    0
    3                    0                    0                    0
    4                   10                    8                   16
    5                    0                    0                    0
    6                 4860                 6782                 4952
      NA20819.3.M_120202_2 NA20826.1.M_111124_1 NA20828.2.M_111216_8
    1                    0                    2                0.000
    2                    0                    0                0.000
    3                    0                    0                0.000
    4                    6                   16                8.000
    5                    0                    0                0.000
    6                 1864                 3446             4814.479
    [1] 53934   661
    Error in `.rowNamesDF<-`(x, value = value) :
      duplicate 'row.names' are not allowed
    Calls: rownames<- ... row.names<- -> row.names<-.data.frame -> .rowNamesDF<-
    In addition: Warning message:
    non-unique values when setting 'row.names': '333174', '568198', '668559', '1976363', '2182439', '2637270', '2637585', '2795614', '3417146', '5115909', '7291199', '7307416', '7440175', '9212383', '9215731', '10490159', '10697357', '12203078', '12267546', '12794843', '15130775', '15489611', '16739015', '17046652', '18118499', '18507325', '18967449', '19015949', '19303400', '19612838', '19627036', '20408712', '20829598', '21180973', '22788423', '24682679', '25042238', '27401462', '27932953', '29952206', '30501206', '30893010', '31799523', '31895475', '32635667', '32806599', '34117481', '34252878', '34880704', '36871979', '37126773', '37823505', '37962056', '37979892', '38023636', '38080696', '38858438', '39240459', '39347289', '39817308', '40509629', '41754280', '42120283', '42640301', '43009842', '44245583', '45911744', '46854048', '47012325', '50101948', '50155854', '50747584', '50837249', '52009066', '53063128', '53704282', '53835525', '54379303', '54385522', '54427734', '56109820', '5 [... truncated]
    Execution halted

I don't know how to arrange the data so that the genes are in one column and the samples in another, with their count values. For me it comes out as one sample and its corresponding genes.

DESEq2 rna-seq error • 173 views

Hi,

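I think you have built your row names and geneNames from the 'Coord' values in the data rather than from the Gene_Symbol column. GD_dat[,-c(1:3)] drops TargetID, Gene_Symbol and Chr, so Coord is left as the first column, and geneNames<-GD_dat[,1] then picks up the coordinates. Those are not unique (the values in your warning, such as '333174', look like coordinates), which is why R refuses them as row names.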
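A minimal sketch of one way to apply this, using a small made-up data frame in place of the real file. The gene identifiers, symbols and counts below are invented for illustration, and it is an assumption that Gene_Symbol is unique in the real file (the check will stop if it is not). The DESeq2 part is optional and also assumes that only normalization is wanted, so it uses a design of ~ 1 and rounds the counts, since the file contains non-integer values such as 4814.479.

```r
# Toy stand-in for GD660.GeneQuantCount.txt; all values here are hypothetical.
GD_dat <- data.frame(
  TargetID    = c("geneA", "geneB", "geneC"),
  Gene_Symbol = c("symA", "symB", "symC"),
  Chr         = c("1", "1", "2"),
  Coord       = c(333174, 333174, 568198),
  HG00096.1.M_111124_6 = c(0, 10, 4860),
  HG00097.7.M_120219_2 = c(2, 8, 6782),
  HG00099.1.M_120209_6 = c(0, 16, 4814.479),
  stringsAsFactors = FALSE
)

# What the original code does: dropping columns 1:3 leaves Coord first,
# and Coord contains repeated values, hence the duplicate 'row.names' error.
after_drop <- GD_dat[, -c(1:3)]
first_col_is_coord <- identical(colnames(after_drop)[1], "Coord")
coord_has_dups <- anyDuplicated(after_drop[, 1]) > 0

# Alternative: take gene names from Gene_Symbol before dropping annotation.
# Assumption: Gene_Symbol is unique; stop early if that does not hold.
stopifnot(!anyDuplicated(GD_dat$Gene_Symbol))
count_mat <- as.matrix(GD_dat[, -c(1:4)])   # keep only the sample columns
rownames(count_mat) <- GD_dat$Gene_Symbol   # genes in rows, samples in columns

# DESeq2 expects integer counts, so round and store as integer.
count_mat <- round(count_mat)
storage.mode(count_mat) <- "integer"

# Optional: size-factor normalization, only if DESeq2 is installed.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  sample_info <- data.frame(sample = colnames(count_mat),
                            row.names = colnames(count_mat))
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = count_mat,
                                        colData = sample_info,
                                        design = ~ 1)
  dds <- DESeq2::estimateSizeFactors(dds)
  norm_counts <- DESeq2::counts(dds, normalized = TRUE)
}
```

On the real file you would replace the toy data frame with the result of read.delim(). If Gene_Symbol turns out to contain duplicates, TargetID or make.unique() on the symbols may be options, depending on what the identifiers mean in your data.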
