Question: error in getting results from DESeq2
Asked 6 months ago by nazaninhoseinkhan (Iran, Islamic Republic Of):

Dear all, I am trying to run DESeq2 to get the list of differentially expressed miRNAs between 330 tumor and 42 normal samples. However, when I run the following code, some rows fail to converge:

> dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ Condition)

> dds <- estimateSizeFactors(dds)

> nc <- counts(dds, normalized=TRUE)

> filter <- rowSums(nc >= 10) >= 2

> dds <- dds[filter,]

> dds$condition<-relevel(dds$Condition, ref="normal")

> design(dds) <- formula(~ Type + Condition)

> dds <- DESeq(dds)

using pre-existing size factors

estimating dispersions

gene-wise dispersion estimates

mean-dispersion relationship

final dispersion estimates

fitting model and testing

32 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest

-- replacing outliers and refitting for 124 genes

-- DESeq argument 'minReplicatesForReplace' = 7 

-- original counts are preserved in counts(dds)

estimating dispersions

fitting model and testing

29 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest.

I tried to fix this problem by using a higher filtering threshold in this code:

dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ Condition)

> keep <- rowSums((counts(dds)) >= 10) >=300 

> dds <- dds[keep,]

> dds$Condition<-relevel(dds$Condition, ref="normal")

> design(dds) <- formula(~ Type + Condition)

> dds <- estimateSizeFactors(dds)

> dds <- estimateDispersions(dds)

gene-wise dispersion estimates

mean-dispersion relationship

-- note: fitType='parametric', but the dispersion trend was not well captured by the
   function: y = a/x + b, and a local regression fit was automatically substituted.
   specify fitType='local' or 'mean' to avoid this message next time.

final dispersion estimates

> dds <- nbinomWaldTest(dds, maxit=1000)

> res <- results(dds)

> write.table(res,"D:\\FromD\\EndocrinologyAndMetabolicDisorderIns\\GDC\\deseq2-21khordadWhiteUsingFilter10-300.txt", sep="\t")

> plotMA(res, ylim = c(-2,2))

> res <- results(dds, name="condition_Cancer_vs_normal")

However, this time I get the following error for the last line: Error: subscript contains invalid names.

My question is: what is the best threshold to choose when working with a large number of samples?

I did not encounter this kind of error with a smaller number of samples (55 samples).

I would appreciate any comments.

Nazanin


Do you get the error immediately upon running the last line, or elsewhere? Which threshold are you asking about in your post?

Reply written 6 months ago by Devon Ryan

Yes, I got the error when I ran

"res <- results(dds, name="condition_Cancer_vs_normal")".

I want to know which thresholds we are allowed to use in:

keep <- rowSums((counts(dds)) >= 10) >=300

and in:

dds <- nbinomWaldTest(dds, maxit=1000)
Reply written 6 months ago by nazaninhoseinkhan (modified 6 months ago by Vijay Lakhujani)

Is there any particular reason for choosing 10 and 300 in

rowSums((counts(dds)) >= 10) >=300

I have seen several papers and tutorials use different strategies, such as rowSums > 0, to filter low-count genes, but I could never find the actual rationale.

Reply written 4 months ago by ag1805x
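
To make the two numbers in rowSums((counts(dds)) >= 10) >= 300 concrete: the first is a per-sample count cutoff and the second is the minimum number of samples that must reach it. The sketch below uses a made-up 4 x 6 matrix in base R (cts_toy and the gene and sample names are illustrative, not from the post) and smaller sample thresholds so the effect is easy to see.

```r
# Toy count matrix: 4 genes (rows) x 6 samples (columns); all values are made up.
cts_toy <- matrix(
  c( 0,  0,  1,  0,  2,  0,   # geneA: essentially unexpressed
    12, 15,  0,  0,  3,  1,   # geneB: count >= 10 in exactly 2 samples
    50, 40, 60, 55, 45, 70,   # geneC: count >= 10 in all 6 samples
    10,  9, 11, 10,  8, 12),  # geneD: count >= 10 in 4 samples
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", LETTERS[1:4]), paste0("s", 1:6))
)

# For each gene, the number of samples whose count is at least 10
n_samples_ge10 <- rowSums(cts_toy >= 10)

# Lenient rule: keep genes with count >= 10 in at least 2 samples
keep_lenient <- n_samples_ge10 >= 2

# Stricter rule: keep genes with count >= 10 in at least 5 samples
keep_strict <- n_samples_ge10 >= 5

print(n_samples_ge10)
print(rownames(cts_toy)[keep_lenient])
print(rownames(cts_toy)[keep_strict])
```

With 330 tumor and 42 normal samples, a sample threshold of 300 can only be met by genes that reach the count cutoff in most tumors, so a gene expressed only in the normal group would be removed by it.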
Answer written 6 months ago by Devon Ryan (Freiburg, Germany):

You can skip using keep entirely; that's unlikely to help you in any way. The maxit value should be high enough that you get convergence, so if you didn't get an error with that, then it's fine. You're receiving "Error: subscript contains invalid names." because you mistyped one of the levels. My guess is that you should be using name="condition_cancer_vs_normal" (note the lowercase c).
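
The exact spelling depends on the column name and factor levels in colData, and the match is case-sensitive. In DESeq2, resultsNames(dds) lists the names that results() accepts. The base-R sketch below only mimics the naming pattern <variable>_<level>_vs_<reference> on a stand-in data frame; the column name "Condition" and the level "Cancer" are assumptions taken from the code in the question, not confirmed values.

```r
# Stand-in for colData. Column name and level spellings are assumptions.
coldata_toy <- data.frame(
  Condition = factor(c("normal", "Cancer", "Cancer", "normal"),
                     levels = c("normal", "Cancer"))  # first level = reference
)

# Build names following the pattern <variable>_<level>_vs_<reference>
variable   <- "Condition"
lv         <- levels(coldata_toy[[variable]])
coef_names <- paste0(variable, "_", lv[-1], "_vs_", lv[1])

# The name requested in the question (lowercase "c" in "condition")
requested <- "condition_Cancer_vs_normal"

print(coef_names)
print(requested %in% coef_names)                    # exact, case-sensitive match
print(tolower(requested) %in% tolower(coef_names))  # match when case is ignored
```

Printing the available names first and copying one of them avoids this kind of typo.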


Hi,

Thank you so much for your reply.

Yes, this morning I realized that I misspelled the "Condition".
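
A similar case mismatch appears in the first code block of the question, where dds$condition <- relevel(dds$Condition, ref="normal") assigns to a lowercase column. The sketch below uses a plain data frame as a stand-in for colData, on the assumption that column assignment on dds behaves the same way; coldata_toy is a made-up example.

```r
# Stand-in for colData with a single factor column named "Condition"
coldata_toy <- data.frame(
  Condition = factor(c("Cancer", "normal", "Cancer", "normal"))
)
ref_initial <- levels(coldata_toy$Condition)[1]  # default ordering puts "Cancer" first

# Assigning to a lowercase name creates a NEW column;
# the original "Condition" column keeps its old reference level.
coldata_toy$condition <- relevel(coldata_toy$Condition, ref = "normal")
cols_after_typo <- names(coldata_toy)
ref_after_typo  <- levels(coldata_toy$Condition)[1]

# Assigning back to the same name actually changes the reference level.
coldata_toy$Condition <- relevel(coldata_toy$Condition, ref = "normal")
ref_after_fix <- levels(coldata_toy$Condition)[1]

print(cols_after_typo)
print(c(initial = ref_initial, after_typo = ref_after_typo, after_fix = ref_after_fix))
```

If the design uses Condition, only the last assignment affects which group is treated as the reference.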

And thanks for your advice about maxit. I think it is about the number of iterations and does not affect my data.

Reply written 6 months ago by nazaninhoseinkhan
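
The point about maxit being only an iteration cap can be illustrated with base R's glm(), used here purely as an analogy; this is not DESeq2's fitting routine, and the data are made up. Too few iterations leave the fit unconverged, while raising the cap beyond what is needed leaves the estimates unchanged.

```r
# Made-up data for a simple Poisson regression (analogy only, not DESeq2)
x <- (1:20) / 10
y <- c(2, 1, 3, 2, 4, 3, 5, 4, 6, 8, 7, 9, 10, 12, 11, 15, 14, 18, 20, 22)

# Cap of 1 iteration: the fit stops before converging (warning suppressed)
fit_short <- suppressWarnings(
  glm(y ~ x, family = poisson(), control = glm.control(maxit = 1))
)

# Generous caps: the fit stops as soon as it converges, whatever the cap
fit_long   <- glm(y ~ x, family = poisson(), control = glm.control(maxit = 100))
fit_longer <- glm(y ~ x, family = poisson(), control = glm.control(maxit = 1000))

print(c(short = fit_short$converged, long = fit_long$converged))
print(all.equal(coef(fit_long), coef(fit_longer)))
```

In DESeq2 the analogous check is the betaConv column mentioned in the message from the question, mcols(object)$betaConv.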