Question

Note on DESeq2

0

6.0 years ago

bioinfo456 ▴ 150

Hi all,

While running DESeq2, I encountered the following note:

-- note: fitType='parametric', but the dispersion trend was not well captured by the function: y = a/x + b, and a local regression fit was automatically substituted. specify fitType='local' or 'mean' to avoid this message next time.

Can somebody please explain this note? Thanks.

RNA-Seq DESeq2 • 3.0k views

updated 6.0 years ago by h.mon 35k • written 6.0 years ago by bioinfo456 ▴ 150

1

You may have low sample numbers or it could be that many of your transcripts have 0 counts. Can you confirm? Did you do any pre-filtering?

6.0 years ago by Kevin Blighe 87k

0

I have 25 diseased-normal sample pairs and, yes, approximately 1k of the 20k genes have 0 counts. I did not do any pre-filtering. How serious is the impact on the results?

6.0 years ago by bioinfo456 ▴ 150

1

'Quite' serious. You should definitely remove genes that only have zeros (i.e. 100% zeros), and then remove other genes that have a high proportion of zeros (i.e. >50% zeros).

Alternatively, you can remove all genes whose mean raw count across all samples is < 10.
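As a rough illustration of the two filtering rules above, here is a minimal sketch in plain Python rather than DESeq2 code. The gene names and counts are made up, the function names are hypothetical, and the mean-count cutoff is read as 10.

```python
def passes_zero_filter(counts, max_zero_fraction=0.5):
    """Keep a gene unless more than max_zero_fraction of its counts are zero.

    A gene with 100% zeros has a zero fraction of 1.0, so it is dropped too.
    """
    zeros = sum(1 for c in counts if c == 0)
    return zeros / len(counts) <= max_zero_fraction


def passes_mean_filter(counts, min_mean=10):
    """Keep a gene only if its mean raw count across all samples is >= min_mean."""
    return sum(counts) / len(counts) >= min_mean


# Hypothetical toy data: gene name -> raw counts across 6 samples.
toy_counts = {
    "geneA": [0, 0, 0, 0, 0, 0],        # all zeros
    "geneB": [0, 0, 0, 0, 120, 95],     # mostly zeros, but a high mean
    "geneC": [3, 5, 2, 4, 6, 1],        # no zeros, but a low mean
    "geneD": [40, 55, 38, 61, 47, 52],  # well expressed everywhere
}

kept_by_zero_rule = [g for g, c in toy_counts.items() if passes_zero_filter(c)]
kept_by_mean_rule = [g for g, c in toy_counts.items() if passes_mean_filter(c)]

print("kept by zero-proportion rule:", kept_by_zero_rule)
print("kept by mean-count rule:", kept_by_mean_rule)
```

On this toy data the two rules disagree about geneB and geneC, so they are alternatives rather than equivalents, and the thresholds are worth checking against your own data.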

6.0 years ago by Kevin Blighe 87k

0

Thanks for your insight. I noticed that the results section said "19375 out of 20530 non zero genes/ variables" (or something like this). As I understand it, the package has automatically eliminated the genes that have only zeros. Please correct me if I'm wrong. Regarding the high proportion of zeros: my experiment is designed so that each sample has a normal and a cancerous reading adjacent to each other, and I've used 25 such samples. Now, what if, for a particular gene, the normal count is 0 and the corresponding cancerous count is some positive number in all samples? Won't it be eliminated despite being quite significant?

6.0 years ago by bioinfo456 ▴ 150

1

Yes, that is a 'flaw' in the current way that we conduct differential expression analysis. Microarrays overcome this, to some extent. You may want to consider less harsh thresholds than those that I mentioned.
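To make the paired case concrete, here is a quick check in plain Python with made-up counts (the tumour values are purely hypothetical). It assumes 25 normal samples that are all zero and 25 tumour samples that are all positive, and tests that gene against a ">50% zeros" rule and a mean-count cutoff of 10.

```python
# Hypothetical gene: zero in every normal sample, positive in every tumour sample.
normal_counts = [0] * 25
tumour_counts = [30 + i for i in range(25)]  # made-up positive counts, 30..54
all_counts = normal_counts + tumour_counts

zero_fraction = all_counts.count(0) / len(all_counts)
mean_count = sum(all_counts) / len(all_counts)

# Rule 1: drop genes with strictly more than 50% zeros.
removed_by_zero_rule = zero_fraction > 0.5
# Rule 2: drop genes whose mean raw count is below 10.
removed_by_mean_rule = mean_count < 10

print("zero fraction:", zero_fraction)
print("removed by >50% zeros rule:", removed_by_zero_rule)
print("mean count:", mean_count)
print("removed by mean < 10 rule:", removed_by_mean_rule)
```

With balanced pairs such a gene sits at exactly 50% zeros, so a strict ">50%" rule would keep it, but a "50% or more" rule, or a single extra zero among the tumour samples, would drop it. Whether it survives the mean-count rule depends entirely on how high the tumour counts are.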

I presume that you are interested in some antisense transcript or non-coding RNA?

6.0 years ago by Kevin Blighe 87k

0

mRNA. I have eliminated genes whose count is less than 50. It seems to have a very minor impact on the resulting number of genes (i.e., ±10 genes).

6.0 years ago by bioinfo456 ▴ 150

1

Those are raw counts, right? And after filtering, you re-normalise?

The algorithms can handle zeros. In your case, though, you also have a low sample number.

6.0 years ago by Kevin Blighe 87k

0

The counts are RSEM-normalised. I rounded them off before inputting them to DESeq2. What is the minimum number of samples one should have?

6.0 years ago by bioinfo456 ▴ 150

1

There is no set number, but the groups that you're comparing should also be balanced. For example, 20 vs. 20 is better than 20 vs. 5.

6.0 years ago by Kevin Blighe 87k

0

Oh, alright. Mine is 25 vs 25.

6.0 years ago by bioinfo456 ▴ 150

score 3 · Answer 1 · 2018-04-15

3

6.0 years ago

h.mon 35k

See Michael Love's answer at "DESEq2 warning while using GLM fitting".
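For context, the function named in the note, y = a/x + b, describes the dispersion (y) as a function of the mean normalised count (x); the DESeq2 documentation writes it as dispersion = asymptDisp + extraPois / mean. The sketch below is plain Python with made-up coefficients. It only evaluates that curve to show its shape and does not reproduce DESeq2's fitting procedure.

```python
def parametric_trend(mean_count, extra_pois, asympt_disp):
    """Evaluate y = a/x + b, with a = extra_pois and b = asympt_disp."""
    return extra_pois / mean_count + asympt_disp


# Hypothetical coefficients, chosen only for illustration.
a, b = 2.0, 0.05
means = [1, 10, 100, 1000]
trend_values = [parametric_trend(m, a, b) for m in means]

for m, y in zip(means, trend_values):
    print(f"mean count {m:>5}: trend dispersion {y:.4f}")
```

The curve falls as the mean count rises and flattens out at b. When the gene-wise dispersion estimates do not follow that shape, DESeq2 substitutes a local regression, as the note says; specifying fitType='local' up front simply avoids the message.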

6.0 years ago by h.mon 35k