Can I perform a correlation test with 3 biological replicates per condition?
5 weeks ago

Hello all,

Apologies for the not so technical question.

I am currently analyzing biological replicates from an RNA-seq experiment, 3 for each condition that I am comparing. These 3 replicates are for a specific tissue and cell type.

Having obtained a list of DEGs, I now want to assess the correlation between these genes.

However, even though 3 biological replicates per condition seems to be the norm in these experiments, many papers argue that we need at least 6 replicates per condition, with 12 being ideal.

You can find the links to the papers and articles I read on the subject.

My question is: does it make sense to perform a Pearson correlation test, for example, or to run WGCNA, in a scenario where I have so few replicates? How is correlation usually inferred in experiments with 3 vs. 3 replicates?

Literature:

Thank you

RNA-Seq deseq2 • 430 views

However, even though 3 biological replicates per condition seems to be the norm in these experiments, many papers discuss that we need at least 6 replicates per sample, with 12 being ideal.

There is no general statement that will hold here. If you have human data that are not paired, you can easily need dozens or hundreds of replicates to reach significance, depending on the true effect size. Likewise, with cell line replicates and large effects, even 2 can be enough to get results. In the end it is a combination of available specimens, money, feasibility, and what you can expect in terms of biological effects.

WGCNA

No. The WGCNA FAQ clearly says that, iirc, fewer than 20 samples is pointless. Check its docs.


Thank you, ATpoint. My apologies, maybe I should rephrase the question. I have already read the WGCNA docs, and for the Pearson correlation test they say 25 is the minimum per condition. However, I have seen published papers where Pearson correlation is computed with just 3 biological replicates per condition, for example. If there are large differences in gene expression between conditions, can a correlation analysis be carried out? Or does it always hold that you need a minimum number of samples per condition?

Example paper using Pearson correlation with 3 replicates: https://www.researchgate.net/figure/Pearson-correlation-coefficient-between-the-three-biological-replicates-The-first_fig2_341033248

5 weeks ago
ATpoint 83k

I think you're misunderstanding some concepts here.

WGCNA is a framework that defines gene modules based on which genes are correlated with each other, and how strongly, using the Pearson correlation. That is essentially a gene-vs-gene correlation computed across samples, which is then aggregated into modules.

What you link is simply the correlation between samples, not genes. Imagine a plot with the x-axis being the expression values for all genes in sample 1 and the y-axis being the same for sample 2. Then apply cor() to these two vectors of expression values and you get a correlation result that tells you how "similar" (whatever this means) the samples are. Obviously the minimum number of samples for this is 2, because the correlation is one sample vs. the other. It depends on which question you want to answer.
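
To make the distinction concrete, here is a minimal sketch in plain Python (not the R cor() call mentioned above, and not WGCNA itself). The gene names and expression values are made up for illustration. It shows that a sample-vs-sample correlation uses one point per gene, whereas a gene-vs-gene correlation uses one point per sample, which is only 3 points with 3 replicates.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical log-expression values: one entry per gene, one value per replicate.
expr = {
    "geneA": [5.1, 5.3, 5.0],
    "geneB": [8.2, 8.0, 8.4],
    "geneC": [2.1, 2.4, 2.2],
    "geneD": [10.5, 10.1, 10.6],
    "geneE": [6.7, 6.9, 6.5],
}

# Sample vs. sample: each vector holds ALL genes of one replicate,
# so the number of points equals the number of genes.
sample1 = [values[0] for values in expr.values()]
sample2 = [values[1] for values in expr.values()]
r_samples = pearson(sample1, sample2)

# Gene vs. gene: each vector holds ONE gene across replicates,
# so the number of points equals the number of samples (here only 3).
r_genes = pearson(expr["geneA"], expr["geneB"])

print(f"sample1 vs sample2: r = {r_samples:.3f} from {len(sample1)} points")
print(f"geneA vs geneB:     r = {r_genes:.3f} from {len(expr['geneA'])} points")
```

In a real experiment the sample-vs-sample vectors would contain thousands of genes, while the gene-vs-gene vectors would still contain only as many points as there are samples.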


Ah, I see. So, given that what I want is the per-gene correlation, the minimum of 20/25 samples for the Pearson correlation still applies. Thank you very much for your time, ATpoint. I was indeed mixing up concepts.


You're welcome. Yes, unfortunately you need quite a large sample size for WGCNA, simply because correlations computed from only a few samples are vastly underpowered.
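
As a rough illustration of why so few samples are a problem, here is a small simulation sketch in plain Python. It assumes two "genes" that are truly uncorrelated (independent standard normal noise) and counts how often their Pearson correlation still looks strong. The function name `frac_extreme`, the 0.9 threshold, the number of simulated pairs, and the seed are all arbitrary choices made for this example.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def frac_extreme(n_samples, n_pairs=2000, threshold=0.9, seed=0):
    """Fraction of truly uncorrelated pairs whose |r| exceeds the threshold."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = 0
    for _ in range(n_pairs):
        # Two independent vectors: the true correlation is zero.
        x = [rng.gauss(0, 1) for _ in range(n_samples)]
        y = [rng.gauss(0, 1) for _ in range(n_samples)]
        if abs(pearson(x, y)) > threshold:
            hits += 1
    return hits / n_pairs

frac_n3 = frac_extreme(3)
frac_n20 = frac_extreme(20)
print(f"n = 3:  fraction of null pairs with |r| > 0.9 = {frac_n3:.3f}")
print(f"n = 20: fraction of null pairs with |r| > 0.9 = {frac_n20:.3f}")
```

With 3 samples, a sizeable share of unrelated pairs clears the threshold by chance alone, so a high per-gene correlation says very little. With 20 samples this becomes rare.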
