I recently obtained RNA-Seq data from tumor samples in a pilot study in my lab and have just finished running a differential gene expression analysis on them.
The Background: Because of fiscal-year restrictions on the funding budgeted for this experiment, only 15 samples could be run. The person we consulted about the experiment advised us to use three technical replicates, something we later (after the experiment was finished) found to be unnecessary (?). As a result, our data consist of:
2 biological replicates of Condition A, each with 3 technical replicates
AND
3 biological replicates of Condition B, each with 3 technical replicates
Definitions
Biological replicates: Samples from different individuals, matched as closely as possible on tumor profile and clinical confounders, each exhibiting the factor of interest (either Condition A or Condition B)
Technical replicates: RNA from the same sample run on the same day (same batch)
The Result of Analysis
After summing the technical replicates' counts, we used DESeq2 and found 6 differentially expressed genes (adjusted p-value < 0.05) between the two conditions.
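For clarity, here is a minimal sketch of what "summing the technical replicates' counts" means, assuming raw counts are stored per sequencing run. The function name, the run IDs, and the toy counts are all hypothetical and only illustrate the idea; in R, DESeq2's `collapseReplicates` serves the same purpose.

```python
from collections import defaultdict


def collapse_technical_replicates(counts, run_to_sample):
    """Sum raw counts over all runs that belong to the same biological sample.

    counts:        {run_id: {gene: raw_count}}
    run_to_sample: {run_id: biological_sample_id}
    """
    collapsed = defaultdict(lambda: defaultdict(int))
    for run_id, gene_counts in counts.items():
        sample = run_to_sample[run_id]
        for gene, n in gene_counts.items():
            collapsed[sample][gene] += n
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {sample: dict(genes) for sample, genes in collapsed.items()}


# Toy data, made up for illustration: two biological samples,
# three technical replicates (runs) each.
counts = {
    "A1_t1": {"GENE1": 10, "GENE2": 0},
    "A1_t2": {"GENE1": 12, "GENE2": 1},
    "A1_t3": {"GENE1": 9, "GENE2": 0},
    "B1_t1": {"GENE1": 30, "GENE2": 5},
    "B1_t2": {"GENE1": 28, "GENE2": 4},
    "B1_t3": {"GENE1": 33, "GENE2": 6},
}
run_to_sample = {run_id: run_id.split("_")[0] for run_id in counts}

collapsed = collapse_technical_replicates(counts, run_to_sample)
print(collapsed)
```

After collapsing, each biological replicate contributes a single column of counts, so the comparison is 2 samples versus 3 samples rather than 6 runs versus 9 runs.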
The Question
- From what I understand, sample size determines power, which is the probability of rejecting the null hypothesis when it is in fact false (that is, 1 - β, where β is the probability of a type II error, or false negative). Am I correct in assuming that sample size does not have any effect on α (the probability of a type I error, or false positive)? I have read in this forum and in some journal articles that the sample size in an RNA-Seq experiment should be at least 3 vs 3.
- There has been talk of (A) just using these data instead of (B) designing a larger experiment (which is obviously a more expensive option). Is option (A) still a scientifically (and statistically) valid option, considering the sample size?
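To make the first question concrete, here is a rough sketch of how power and the false-positive rate respond to sample size. It assumes a two-sided two-sample z-test with a known, common standard deviation, which is a deliberate simplification and not the negative binomial model DESeq2 fits; the function name, effect size, and sample sizes are all hypothetical.

```python
from statistics import NormalDist


def z_test_power(effect, sigma, n_a, n_b, alpha=0.05):
    """Probability that a two-sided two-sample z-test rejects the null.

    effect: true difference in group means (0 means the null is true)
    sigma:  known standard deviation, assumed equal in both groups
    n_a, n_b: number of biological replicates per group
    alpha:  significance threshold chosen by the analyst
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    standard_error = sigma * (1 / n_a + 1 / n_b) ** 0.5
    shift = effect / standard_error
    # Rejection happens in either tail of the test statistic.
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


for n_a, n_b in [(2, 3), (3, 3), (6, 6), (12, 12)]:
    power = z_test_power(effect=1.0, sigma=1.0, n_a=n_a, n_b=n_b)
    false_positive_rate = z_test_power(effect=0.0, sigma=1.0, n_a=n_a, n_b=n_b)
    print(f"{n_a} vs {n_b}: power={power:.3f}, "
          f"false-positive rate={false_positive_rate:.3f}")
```

In this idealized setting the rejection probability under the null equals the chosen threshold regardless of sample size, while power rises as replicates are added. With real RNA-Seq data the variance has to be estimated from very few replicates, so this sketch should not be read as a statement about how well the false-positive rate is controlled at 2 vs 3.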
Thank you for considering my questions. This is my first post in this forum! I have just started working in the field, and browsing past questions here has helped answer my questions on many occasions. I look forward to contributing in the years to come.
Best regards,
Michael
This is something new to me. I skimmed through some epidemiology papers after reading your answer, and now I get the feeling that they don't talk enough about this in Statistics classes. Thank you, Ian, for the amazing insight!
Cheers!