Question: error in getting results from DESeq2
0
gravatar for nazaninhoseinkhan
15 months ago by
Iran, Islamic Republic Of
nazaninhoseinkhan370 wrote:

Dear all, I am trying to run DESeq2 to get the list of differential expressed miRNAs between 330 tumor and 42 normal samples. However when I run the following code:

> dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ Condition)

> dds <- estimateSizeFactors(dds)

> nc <- counts(dds, normalized=TRUE)

> filter <- rowSums(nc >= 10) >= 2

> dds <- dds[filter,]

> dds$condition<-relevel(dds$Condition, ref="normal")

> design(dds) <- formula(~ Type + Condition)

> dds <- DESeq(dds)

using pre-existing size factors

estimating dispersions

gene-wise dispersion estimates

mean-dispersion relationship

final dispersion estimates

fitting model and testing

32 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest

-- replacing outliers and refitting for 124 genes

-- DESeq argument 'minReplicatesForReplace' = 7 

-- original counts are preserved in counts(dds)

estimating dispersions

fitting model and testing

29 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest.

I tried to fix this problem with using higher threshold in this code:

dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ Condition)

> keep <- rowSums((counts(dds)) >= 10) >=300 

> dds <- dds[keep,]

> dds$Condition<-relevel(dds$Condition, ref="normal")

> design(dds) <- formula(~ Type + Condition)

> dds <- estimateSizeFactors(dds)

> dds <- estimateDispersions(dds)

gene-wise dispersion estimates

mean-dispersion relationship

-- note: fitType='parametric', but the dispersion trend was not well captured by the
   function: y = a/x + b, and a local regression fit was automatically substituted.
   specify fitType='local' or 'mean' to avoid this message next time.

final dispersion estimates

> dds <- nbinomWaldTest(dds, maxit=1000)

> res <- results(dds)

> write.table(res,"D:\\FromD\\EndocrinologyAndMetabolicDisorderIns\\GDC\\deseq2-21khordadWhiteUsingFilter10-300.txt", sep="\t")

> plotMA(res, ylim = c(-2,2))

> res <- results(dds, name="condition_Cancer_vs_normal")

However this time I get this error for the last line: Error: subscript contains invalid names.

Now my question is what is the best threshold that we have to choose when we are working on large number of samples.

I did not face with this kind of errors with smaller number of samples (55 samples).

I would appreciate any comment

Nazanin

results error deseq2 • 2.0k views
ADD COMMENTlink modified 15 months ago by Devon Ryan91k • written 15 months ago by nazaninhoseinkhan370

You get the error immediately upon running the last line or elsewhere? What threshold are you asking about in your post?

ADD REPLYlink written 15 months ago by Devon Ryan91k

Yes, I got the error when I use

res <- results(dds, name="condition_Cancer_vs_normal")

I want to know what thresholds we are allowed to use in :

keep <- rowSums((counts(dds)) >= 10) >=300

and in:

dds <- nbinomWaldTest(dds, maxit=1000)
ADD REPLYlink modified 8 weeks ago by RamRS24k • written 15 months ago by nazaninhoseinkhan370

Any particular reason for choosing 10 and 300 in

> rowSums((counts(dds)) >= 10) >=300

I have seen several papers/tutorials using different strategy like rowSums>0 to filter low count genes, but I could never get the real reason.

ADD REPLYlink modified 8 weeks ago by RamRS24k • written 14 months ago by Arindam Ghosh170
0
gravatar for Devon Ryan
15 months ago by
Devon Ryan91k
Freiburg, Germany
Devon Ryan91k wrote:

You can skip using keep entirely, that's unlikely to help you in any way. The maxit value should be high enough that you get convergence, so if you didn't get an error with that then it's fine. You're receiving Error: subscript contains invalid names. because you mistyped one of the levels. My guess is that you should be using name="condition_cancer_vs_normal" (note the lowercase C).

ADD COMMENTlink written 15 months ago by Devon Ryan91k

Hi,

Thank you so much for your reply.

Yes, this morning I realized that I misspelled the "Condition".

And thanks for your advice about maxit. I think it is about the number of iterations and does not affect my data.

ADD REPLYlink written 15 months ago by nazaninhoseinkhan370
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1812 users visited in the last hour