Question: Note on DESeq2
0
6 weeks ago by Uday Madappa (40), National Institute of Technology, Calicut, India:

Hi all,

While running DESeq2 I encountered the following note:

-- note: fitType='parametric', but the dispersion trend was not well captured by the function: y = a/x + b, and a local regression fit was automatically substituted. specify fitType='local' or 'mean' to avoid this message next time.

Can somebody please explain this note? Thanks.

rna-seq deseq2 • 183 views
modified 5 weeks ago by h.mon (15k) • written 6 weeks ago by Uday Madappa (40)
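For readers unfamiliar with the curve named in the note, here is a minimal sketch of its shape. It assumes x is a gene's mean count and y is the fitted dispersion. The coefficients `a` and `b` below are hypothetical values chosen only for illustration, not values DESeq2 would estimate for any real data set.

```python
def parametric_trend(mean_count, a, b):
    """Evaluate the trend y = a/x + b quoted in the note (x = mean count)."""
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")
    return a / mean_count + b


# Hypothetical coefficients, for illustration only.
a, b = 2.0, 0.1

# The trend falls as the mean count grows and levels off towards b.
for x in (1, 10, 100, 1000):
    print(x, round(parametric_trend(x, a, b), 4))
```

When the gene-wise dispersion estimates do not follow a decreasing curve of this shape, the parametric fit fails and, as the note says, a local regression is substituted.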
1

You may have low sample numbers or it could be that many of your transcripts have 0 counts. Can you confirm? Did you do any pre-filtering?

written 6 weeks ago by Kevin Blighe (19k)

I have 25 diseased-normal sample pairs and, yes, approximately 1k of the 20k genes have 0 counts. I did not do any pre-filtering. How serious is the impact on the results?

written 5 weeks ago by Uday Madappa (40)
1

'Quite' serious. You should definitely remove genes that only have zeros (i.e. 100% zeros), and then remove other genes that have a high proportion of zeros (i.e. >50% zeros).

Alternatively, you can remove all genes whose mean raw count across all samples is < 10.

written 5 weeks ago by Kevin Blighe (19k)
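The two filtering rules suggested in the reply above can be sketched as follows. This is a plain-Python illustration, not DESeq2 code: the function names, gene names, and counts are all hypothetical, and the second rule is read as "mean raw count below 10".

```python
def passes_zero_filter(counts, max_zero_fraction=0.5):
    """Keep a gene unless more than max_zero_fraction of its counts are zero."""
    zeros = sum(1 for c in counts if c == 0)
    return zeros / len(counts) <= max_zero_fraction


def passes_mean_filter(counts, min_mean=10):
    """Keep a gene if its mean raw count across all samples is at least min_mean."""
    return sum(counts) / len(counts) >= min_mean


# Hypothetical raw counts: gene name -> counts across six samples.
raw = {
    "geneA": [0, 0, 0, 0, 0, 0],            # all zeros
    "geneB": [0, 0, 0, 0, 120, 90],         # 4 of 6 zeros, mean 35
    "geneC": [3, 5, 2, 4, 6, 1],            # no zeros, mean 3.5
    "geneD": [150, 200, 180, 90, 75, 210],  # well expressed
}

by_zeros = [g for g, c in raw.items() if passes_zero_filter(c)]
by_mean = [g for g, c in raw.items() if passes_mean_filter(c)]
print(by_zeros)  # ['geneC', 'geneD']
print(by_mean)   # ['geneB', 'geneD']
```

The two rules do not select the same genes, so the choice between them (and the exact thresholds) is a judgment call for the data set at hand.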

Thanks for your insight. I noticed that the results section said something like "19375 out of 20530 non zero genes/ variables". As I understand it, the package has automatically eliminated the genes that have only zeros; please correct me if I'm wrong. Regarding genes with a high proportion of zeros: in my experimental design, the normal and cancerous readings from the same sample are adjacent to each other, and I have used 25 such samples. What if, for a particular gene, the normal count is 0 and the corresponding cancerous count is some positive number in every sample? Won't that gene be eliminated despite being quite significant?

written 5 weeks ago by Uday Madappa (40)
1

Yes, that is a 'flaw' in the current way that we conduct differential expression analysis. Microarrays overcome this, to some extent. You may want to consider less harsh thresholds than those that I mentioned.

I presume that you are interested in some antisense transcript or non-coding RNA?

written 5 weeks ago by Kevin Blighe (19k)
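To make the scenario in this exchange concrete, here is a small check of where such a gene falls relative to a ">50% zeros" rule and a "mean below 10" rule. The counts are invented: the value 40 for the cancerous samples is purely hypothetical, as is the variant with three dropouts.

```python
# Hypothetical gene: zero in all 25 normal samples, 40 in all 25 cancerous samples.
normal = [0] * 25
cancerous = [40] * 25
paired = normal + cancerous

zero_fraction = paired.count(0) / len(paired)
mean_count = sum(paired) / len(paired)
print(zero_fraction)        # 0.5
print(mean_count)           # 20.0
print(zero_fraction > 0.5)  # False: not removed by a strict ">50% zeros" rule

# Same gene, but with three cancerous samples also reading zero.
with_dropouts = normal + [0] * 3 + [40] * 22
dropout_fraction = with_dropouts.count(0) / len(with_dropouts)
print(dropout_fraction)        # 0.56
print(dropout_fraction > 0.5)  # True: now removed
```

Under these assumed numbers the gene sits exactly on the boundary of the zero-proportion rule, so a handful of zero readings in the cancerous samples is enough to drop it, which is one reason to consider a gentler threshold.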

mRNA. I have eliminated genes whose count is less than 50. It seems to have only a very minor impact on the resulting number of genes (i.e., plus or minus 10 genes).

written 5 weeks ago by Uday Madappa (40)
1

That is on raw counts, right? And, after that, you re-normalise?

The algorithms can handle zeros. In your case, though, you also have a low sample number.

written 5 weeks ago by Kevin Blighe (19k)

The counts are RSEM-normalised. I rounded them before inputting them to DESeq2. What is the minimum number of samples one should have?

written 4 weeks ago by Uday Madappa (40)
1

There is no set number, but the groups that you're comparing should also be balanced. For example, 20 vs. 20 is better than 20 vs. 5.

written 4 weeks ago by Kevin Blighe (19k)

Oh, alright. Mine is 25 vs 25.

written 4 weeks ago by Uday Madappa (40)
3
5 weeks ago by h.mon (15k), Brazil:

See Michael Love's answer at "DESEq2 warning while using GLM fitting".

written 5 weeks ago by h.mon (15k)