Hello,
I am using the DESeq2 library, following section 3.2 of the manual, "Starting from count matrices". I imported my countdata and coldata from CSV files. I suspect the countdata file is the problem, but I don't understand exactly what is wrong.
My code:
library(DESeq2)
NGS <- read.csv2(paste0(datadir,"/CLN3_NGS_orig.csv"), header = T,stringsAsFactors = F)
Sinfo <- read.csv2(paste0(datadir,"/Sampleinfo.csv"), header = T,stringsAsFactors = F)
head(NGS)
head(Sinfo)
coldata <- DataFrame(Sinfo)
coldata <- lapply(coldata, as.factor)
coldata
lapply(NGSnum, class)
NGSnum <- data.frame(NGS[1], apply(NGS[2:13],2, as.numeric))
NGSFull <- DESeqDataSetFromMatrix(
countData = NGSnum,
colData = coldata,
design = ~ Genotype + Treatment)
NGSFull
NGS$Genotype <- relevel(NGSFull$Genotype, "WT")
deseqNGS <- DESeq(NGS)
res <- results(deseqNGS)
res
My error after applying DESeqDataSetFromMatrix:
Error in `rownames<-`(`*tmp*`, value = colnames(countData)) : attempt to set 'rownames' on an object with no dimensions
My coldata and countdata files on pastebin: countdata & coldata
By the way, my countdata contains transcripts, and sometimes several transcripts (ENST) correspond to a single gene (ENSG). Can DESeq2 sort this out for me and give me output with genes only? It is easy to convert transcript IDs to gene IDs, but harder to collapse several rows into one.
Thank you in advance, Kasia
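On the error itself: judging only from the code shown, one likely cause is the line `coldata <- lapply(coldata, as.factor)`. `lapply()` returns a plain list, so `coldata` stops being a table, and a list has no dimensions on which row names could be set. The sketch below reproduces this in base R and shows one way to convert columns to factors while keeping the table. All sample names, IDs, and values are invented for illustration and are not taken from the pastebin files.

```r
# Toy stand-in for Sampleinfo.csv (invented values).
Sinfo <- data.frame(
  Sample    = c("S1", "S2", "S3", "S4"),
  Genotype  = c("WT", "WT", "KO", "KO"),
  Treatment = c("ctrl", "drug", "ctrl", "drug"),
  stringsAsFactors = FALSE
)

# lapply() returns a plain list: the table structure is gone.
as_list <- lapply(Sinfo, as.factor)
class(as_list)  # "list"
dim(as_list)    # NULL

# Assigning into selected columns keeps the data.frame intact.
coldata <- Sinfo
design_cols <- c("Genotype", "Treatment")
coldata[design_cols] <- lapply(coldata[design_cols], factor)
rownames(coldata) <- coldata$Sample
coldata$Genotype <- relevel(coldata$Genotype, ref = "WT")

# Toy stand-in for the count table: IDs in column 1, counts in the rest.
NGS <- data.frame(
  ID = c("ENST_a", "ENST_b"),
  S1 = c(10, 0), S2 = c(12, 1), S3 = c(30, 2), S4 = c(25, 0),
  stringsAsFactors = FALSE
)

# Move the IDs into row names so that only counts remain in the matrix.
counts <- as.matrix(NGS[, -1])
rownames(counts) <- NGS$ID

# Columns of the counts should line up with rows of the sample table.
all(colnames(counts) == rownames(coldata))
```

With objects shaped like these, the call from the post would take `countData = counts` and `colData = coldata`. The later lines would then presumably need to operate on `NGSFull` (the DESeqDataSet) rather than on `NGS` (the raw table), i.e. relevel `NGSFull$Genotype` and pass `NGSFull` to `DESeq()`.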
You should aggregate transcript counts to the gene level first, e.g. using the tximport package. Otherwise DESeq2 does not know which transcripts belong to which gene; it only sees counts and treats each row as one gene. Please make sure you provide the necessary example data in the post itself, as you cannot expect users to download things from a cloud first. These could be any kind of malware. dput is useful for providing example data suitable for copy/pasting.
Thank you, I will try tximport first. I edited the post so that it now contains pastebin links. Is that OK for you?
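For a plain count matrix like the one in the question, the aggregation step can be sketched in base R with `rowsum()`, which sums rows that share a group label. This is a simple stand-in for illustration, not a description of how tximport works; tximport is built around importing transcript-level quantification output. The IDs, counts, and the `tx2gene` mapping table below are invented.

```r
# Invented transcript-level counts: rows are transcripts, columns are samples.
tx_counts <- matrix(
  c(10, 5, 0,    # sample S1
    20, 7, 3),   # sample S2
  nrow = 3,
  dimnames = list(c("ENST_a", "ENST_b", "ENST_c"), c("S1", "S2"))
)

# Hypothetical transcript-to-gene mapping table.
tx2gene <- data.frame(
  tx   = c("ENST_a", "ENST_b", "ENST_c"),
  gene = c("ENSG_x", "ENSG_x", "ENSG_y"),
  stringsAsFactors = FALSE
)

# Look up the gene for every row of the count matrix, in row order.
gene_of_row <- tx2gene$gene[match(rownames(tx_counts), tx2gene$tx)]

# Sum all transcript rows that map to the same gene.
gene_counts <- rowsum(tx_counts, group = gene_of_row)
gene_counts
#        S1 S2
# ENSG_x 15 27
# ENSG_y  0  3
```

The result has one row per gene and the same sample columns as the input, so it has the shape of a gene-level count matrix.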