Question

RNA-seq for different tissue samples

0

16 months ago

galaxy • 0

Hello,

I am running an RNA-seq analysis on a number of samples from a livestock organism. I ran a fairly typical pipeline: featureCounts for the counting step, RPKM for normalization, and both DESeq2 and edgeR for the differential expression step. However, because these samples come from different tissues rather than from different conditions (e.g. drug vs. control), the differential expression results are very overblown (i.e. far too many genes look differentially expressed). For example, I have 4 replicates from brain, 4 from muscle, and another 4 from kidney. Are there any tools or methods out there that take this kind of design into account in order to get better results? Alternatively, are there statistical methods or cutoffs that I should change or be aware of in my analysis? I know this is a very vague question, but this is my first time running an analysis like this and I am still learning a lot. Any help would be greatly appreciated.

RNASeq Gene Expression EDGER Differential DESeq2 • 970 views

ADD COMMENT • link updated 16 months ago by ATpoint 81k • written 16 months ago by galaxy • 0

1

Hello,

What is the biological question you want to answer with such an experiment? Of course you can compare brain samples with kidney samples, but these two organs have totally different functions. Since different metabolic and signalling pathways are active in each, any comparison will lead to "way too many" differentially expressed genes.

For instance, this publication shows a large scale transcriptome analysis for different organs.

ADD REPLY • link 16 months ago by michael.ante ★ 3.8k

score 1 · Answer 1 · 2022-12-01

1

16 months ago

ATpoint 81k

The organ differences are indeed expected to be large. One data-driven way to prioritize genes with large fold changes is to test against a log fold change threshold other than zero. By default, both edgeR and DESeq2 use the null hypothesis that the true log fold change is zero, i.e. no difference between the comparison groups. You can use the edgeR function glmTreat with the lfc argument, or the DESeq2 functions results or lfcShrink with the lfcThreshold argument, to test against a chosen minimum fold change. For a setup such as this, that could be 2. Any significant gene then has statistical evidence for a fold change larger than that threshold. It is quite stringent but can help narrow down the top genes, and it is statistically superior to simply filtering the existing results tables by fold change, because fold changes can be large yet carry big standard errors and therefore have little statistical support.
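To illustrate why testing against a threshold differs from filtering afterwards, here is a minimal Python sketch using a plain Wald-type calculation. It is not what glmTreat or DESeq2 compute internally (those use moderated dispersion estimates and their own test statistics), and the gene names, log2 fold changes, standard errors and the threshold of 1 are all made-up values for illustration.

```python
import math


def normal_upper_tail(z):
    """Return P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def wald_p_vs_zero(lfc, se):
    """Two-sided p-value for H0: log2 fold change == 0."""
    return min(1.0, 2.0 * normal_upper_tail(abs(lfc) / se))


def wald_p_vs_threshold(lfc, se, threshold):
    """Conservative p-value for H0: |log2 fold change| <= threshold."""
    z = (abs(lfc) - threshold) / se
    return min(1.0, 2.0 * normal_upper_tail(z))


# Hypothetical genes: (name, estimated log2 fold change, standard error).
genes = [
    ("gene_precise", 3.0, 0.4),  # large effect, small uncertainty
    ("gene_noisy", 3.0, 1.4),    # same effect, large uncertainty
]

threshold = 1.0  # assumed minimum |log2FC| of interest (a 2-fold change)

for name, lfc, se in genes:
    p_zero = wald_p_vs_zero(lfc, se)
    p_thr = wald_p_vs_threshold(lfc, se, threshold)
    # Post-hoc filtering: significant against zero AND |log2FC| above cutoff.
    passes_filter = p_zero < 0.05 and abs(lfc) > threshold
    # Threshold test: significant against the cutoff itself.
    passes_test = p_thr < 0.05
    print(f"{name}: filter={passes_filter} threshold_test={passes_test}")
```

In this toy example both genes survive the post-hoc filter, but only the gene with the small standard error is significant against the threshold itself, which is the behaviour described above.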

ADD COMMENT • link 16 months ago by ATpoint 81k

0

Thank you so much for the response. I'll definitely be looking into those functions to see what statistical support they may be able to provide. Additionally however, I've also been looking into other threads and discussions on the same topic such as: Differential expression for two very different samples.

Do you have any thoughts on or experience with the logratioTrim parameter of calcNormFactors()? Adjusting it seems to be another commonly suggested way to alleviate problems caused by too many differentially expressed genes.

Another one I have seen discussed is quantile normalization.

Do you think either of these might be important to consider when running my kind of RNAseq experiment?
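For context, this is my rough understanding of what the trimming does, written as a small Python sketch. It is a heavy simplification of TMM: it trims only on the log ratios, whereas calcNormFactors() also trims on average abundance and applies precision weights. The function name and the counts are made up for illustration; 0.3 is the edgeR default for logratioTrim.

```python
import math


def trimmed_log_ratio_factor(sample, reference, trim=0.3):
    """Scaling factor from a trimmed mean of per-gene log2 ratios.

    Simplified sketch: no trimming on average abundance, no weights.
    """
    if not 0.0 <= trim < 0.5:
        raise ValueError("trim must be in [0, 0.5)")
    n_s, n_r = sum(sample), sum(reference)
    # log2 ratios of library-size-scaled counts; genes with a zero are skipped
    ratios = sorted(
        math.log2((s / n_s) / (r / n_r))
        for s, r in zip(sample, reference)
        if s > 0 and r > 0
    )
    if not ratios:
        raise ValueError("no genes with non-zero counts in both samples")
    k = int(len(ratios) * trim)  # genes dropped from each tail
    kept = ratios[k:len(ratios) - k]
    return 2.0 ** (sum(kept) / len(kept))


# Made-up counts: 8 genes unchanged, 2 genes strongly up in the sample.
reference = [100] * 10
sample = [100] * 8 + [2000] * 2

for trim in (0.0, 0.3):
    factor = trimmed_log_ratio_factor(sample, reference, trim=trim)
    effective_size = factor * sum(sample)
    print(f"trim={trim}: factor={factor:.3f} effective_size={effective_size:.0f}")
```

With trimming, the two strongly changed genes are dropped and the effective library size matches the reference for the unchanged genes. Without trimming they pull the factor away from that. If most genes differ between tissues, though, I am not sure the trimmed set still represents "unchanged" genes, which is why I am asking.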

ADD REPLY • link 16 months ago by galaxy • 0

1

What is the actual question you want to answer with this experiment? Brain, kidney and muscle are as biologically different as it gets in terms of organ function, so many hundreds if not thousands of DEGs are expected a priori. Why did you do this experiment, and what outcome did you expect?

ADD REPLY • link 16 months ago by ATpoint 81k