Question: Error in `rownames<-`(`*tmp*`, value = colnames(countData)): attempt to set 'rownames' on an object with no dimensions
12 weeks ago, katarzyna.wieciorek wrote:

Hello,

I am using the DESeq2 library, following section 3.2 of the manual, "Starting from count matrices". I imported my countdata and coldata from CSV files. I suspect the countdata file is the problem here, but I don't understand what exactly is wrong.

My code:

library(DESeq2)

NGS <- read.csv2(paste0(datadir,"/CLN3_NGS_orig.csv"), header = T,stringsAsFactors = F)
Sinfo <- read.csv2(paste0(datadir,"/Sampleinfo.csv"), header = T,stringsAsFactors = F)
head(NGS)
head(Sinfo)

coldata <- DataFrame(Sinfo)
coldata <- lapply(coldata, as.factor)
coldata

lapply(NGSnum, class)
NGSnum <- data.frame(NGS[1], apply(NGS[2:13],2, as.numeric))
NGSFull <- DESeqDataSetFromMatrix(
countData = NGSnum,
colData = coldata,
design = ~ Genotype + Treatment)
NGSFull

NGS$Genotype <- relevel(NGSFull$Genotype, "WT")

deseqNGS <- DESeq(NGS)
res <- results(deseqNGS)
res

My error after applying DESeqDataSetFromMatrix:

Error in `rownames<-`(`*tmp*`, value = colnames(countData)) : attempt to set 'rownames' on an object with no dimensions

My coldata and countdata files on pastebin: countdata & coldata

By the way, my countdata contain transcripts, and sometimes several transcripts (ENST) correspond to a single gene (ENSG). Can DESeq2 sort this out for me and give me output with genes only? It is easy to convert transcript IDs to gene IDs, but harder to collapse several entries into one.

Thank you in advance, Kasia

deseq2 next-gen bioconductor R

You should aggregate transcript counts to the gene level first, e.g. using the tximport package. Otherwise DESeq2 does not know which transcripts belong to which gene; it just sees counts and treats every row as one gene. Please also include the necessary example data in the post itself, as you cannot expect users to download files from a cloud service first; they could contain any kind of malware. dput() is useful for providing example data suitable for copy/pasting.

Comment written 12 weeks ago by ATpoint
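
To make "one row per gene" concrete, here is a minimal base-R sketch of collapsing a transcript-level count matrix to gene level. It is not what tximport does (tximport works from quantification output); it is a plain summation with rowsum(), shown only as an illustration. The transcript IDs, gene IDs, counts, and the tx2gene table are all invented for this example.

```r
# Toy transcript-level counts; all IDs and numbers here are invented for illustration.
tx_counts <- matrix(
  c(10L, 5L, 0L, 7L,
    12L, 3L, 1L, 9L),
  nrow = 4,
  dimnames = list(c("ENST_a1", "ENST_a2", "ENST_b1", "ENST_c1"),
                  c("sample1", "sample2"))
)

# Hypothetical transcript-to-gene map; in practice this would come from an annotation.
tx2gene <- data.frame(
  transcript = c("ENST_a1", "ENST_a2", "ENST_b1", "ENST_c1"),
  gene       = c("ENSG_a", "ENSG_a", "ENSG_b", "ENSG_c"),
  stringsAsFactors = FALSE
)

# Look up the gene for each row, then sum the rows that share a gene.
gene_of_row <- tx2gene$gene[match(rownames(tx_counts), tx2gene$transcript)]
gene_counts <- rowsum(tx_counts, group = gene_of_row)

gene_counts
```

Calling dput(head(gene_counts)) on such an object prints a text version that can be pasted straight into a post.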

Thank you, I will try tximport first. I edited the post so that it now contains pastebin links. Is that OK?

Reply written 12 weeks ago by katarzyna.wieciorek
12 weeks ago, ATpoint (Germany) wrote:

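Your colData is a list, not a data frame as required: lapply() returns a plain list, and a list has no dimensions, which is why setting its rownames fails.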

library(DESeq2)

NGS <- read.csv2("~/Desktop/counts.csv", header = TRUE, stringsAsFactors = FALSE)

## Gene names must be rownames and values must be integers.
## apply() with as.numeric drops rownames, so set them after the conversion.
tx_ids <- NGS$Transcript
NGS$Transcript <- NULL
NGS <- round(apply(NGS, 2, as.numeric))
rownames(NGS) <- tx_ids

## coldata are fine as read (a data frame with factor columns), no need to do anything here
coldata <- read.csv2("~/Desktop/coldata.csv", header = TRUE, stringsAsFactors = TRUE)

NGSFull <- DESeqDataSetFromMatrix(
  countData = NGS,
  colData = coldata,
  design = ~ Genotype + Treatment)

As said, use tximport to get gene-level counts first. Then use DESeqDataSetFromTximport().
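
The list-versus-data-frame diagnosis can be reproduced without DESeq2. The sketch below uses a base data.frame as a stand-in for the sample table (an assumption; the original code used a DataFrame), with column names taken from the thread and invented values. It shows that lapply() yields a dimensionless list, that setting rownames on it raises the reported error, and one way to convert columns to factors while keeping the data frame.

```r
# Stand-in sample table; column names follow the thread, values are invented.
sinfo <- data.frame(
  Genotype  = c("WT", "WT", "KO", "KO"),
  Treatment = c("ctrl", "drug", "ctrl", "drug"),
  stringsAsFactors = FALSE
)

# lapply() returns a plain list, so the result has no dimensions.
as_list <- lapply(sinfo, as.factor)
is.data.frame(as_list)
dim(as_list)

# Setting rownames on the list fails; capture the message instead of stopping.
msg <- tryCatch({
  rownames(as_list) <- paste0("sample", 1:4)
  "no error"
}, error = function(e) conditionMessage(e))
msg

# Assigning into as_df[] keeps the data frame and converts each column to a factor.
as_df <- sinfo
as_df[] <- lapply(sinfo, as.factor)
is.data.frame(as_df)
sapply(as_df, is.factor)
```

Reading the file with stringsAsFactors = TRUE, as in the answer, avoids the conversion step altogether.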
