I am studying some RNA-seq data with
limma. The data was obtained from 59 samples using Illumina HiSeq 2500 and analyzing following the workflow RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR .
The data is quite complex with 4 different factors interacting between them we don't see a batch effect on the data. I have made 62 comparisons, some of them have differentially expressed genes (DEG) with very high log2 fold changes, so I think I made a mistake somewhere upstream.
On one approach with a model with the the intercept and 9 coefficients, the highest log2FC value is 40 which means a fold change of 1 099 511 627 776. However, the log fold change is not influenced by the average expression, the genes are already filtered by low median expression. I have some comparisons with up to 66% of differentially expressed genes (DEG) while other comparisons don't have any DEG.
Another approach (without using any model, just using the samples we want to compare) resulted on a fold change of 2609 and some comparisons up to 75% being DEG. Again I don't observe any influence of the average expression in the fold change.
14 comparisons for both methods don't have any DEG, so I don't think that the either of methods are wrong. My question is related to those high values of fold change. Below I provide an example comparing the log2 fold changes for a comparison. Is there some expected maximum or are these values normal? Any advice?
Thanks in advance.