Question

High fold change values

1

Entering edit mode

4.4 years ago

Lluís R. ★ 1.2k

I am studying some RNA-seq data with edgeR and limma. The data was obtained from 59 samples using Illumina HiSeq 2500 and analyzing following the workflow RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR .

The data is quite complex with 4 different factors interacting between them we don't see a batch effect on the data. I have made 62 comparisons, some of them have differentially expressed genes (DEG) with very high log2 fold changes, so I think I made a mistake somewhere upstream.

On one approach with a model with the the intercept and 9 coefficients, the highest log2FC value is 40 which means a fold change of 1 099 511 627 776. However, the log fold change is not influenced by the average expression, the genes are already filtered by low median expression. I have some comparisons with up to 66% of differentially expressed genes (DEG) while other comparisons don't have any DEG.

Another approach (without using any model, just using the samples we want to compare) resulted on a fold change of 2609 and some comparisons up to 75% being DEG. Again I don't observe any influence of the average expression in the fold change.

14 comparisons for both methods don't have any DEG, so I don't think that the either of methods are wrong. My question is related to those high values of fold change. Below I provide an example comparing the log2 fold changes for a comparison. Is there some expected maximum or are these values normal? Any advice?

Thanks in advance.

Comparison of logFC of different methods

RNA-Seq • 1.9k views

ADD COMMENT • link 4.4 years ago by Lluís R. ★ 1.2k

2

Entering edit mode

However, the log fold change is not influenced by the average expression, the genes are already filtered by low median expression.

Maybe you also need to filter by a minimal expression level, if you have a gene expressed in one condition as 20 cpm but in the control condition with 0.00001 cpm, that is a 2,000,000X fold-change.

Also, filter your DEG by FDR besides FC.

ADD REPLY • link 4.4 years ago by JC 13k

0

Entering edit mode

I filtered using edgeR::filterByExpr, that is min.count = 10 and min.total.count = 15. But I haven't looked at cpm, I'll look at it. When I mention the 75% of DEG is looking only to the FDR < 0.05.

ADD REPLY • link 4.4 years ago by Lluís R. ★ 1.2k

0

Entering edit mode

check if your normalization is not reducing values to zero, a quick box-plot for all your samples can visualize that

ADD REPLY • link 4.4 years ago by JC 13k

0

Entering edit mode

I don't think it is. You can find the boxplot here, also here after voom normalization

ADD REPLY • link 4.4 years ago by Lluís R. ★ 1.2k