I have a question related to a previous post, "Technical/Biological Replicates In Rna-Seq For Two Cell Lines", but my situation differs in a few ways.
I'd greatly appreciate your help.
I have several cell lines derived from human fibroblasts, which I grew in vitro and used to prepare RNA-seq libraries. I want to find genes differentially expressed between two conditions using DESeq2.
For group 1 (WT/non-disease), I have 4 lines (4 individuals): 1A, 1B, 1C, 1D.
For group 2 (disease), I have just 2 lines (2 individuals): 2A, 2B.
Furthermore, from each of WT/non-disease lines 1A and 1B, I have 3 different colonies, grown separately for 30 days (1A-1, 1A-2, 1A-3 and 1B-1, 1B-2, 1B-3).
From WT/non-disease lines 1C and 1D, I only grew one colony from each for 30 days (1C-1, 1D-1).
For the disease lines, I have 2 different colonies from 2A, grown separately for 30 days (2A-1, 2A-2), and 3 different colonies from 2B (2B-1, 2B-2, 2B-3).
For each colony, I did only one RNA extraction and one round of library prep and sequencing, so I have no strictly technical replicates. That gives a total of 13 libraries, each from a distinct colony grown separately for 30 days, although some come from the same human cell line.
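To make the layout concrete, here is a minimal sketch of the sample sheet in plain Python (standard library only). The column names `library`, `individual`, and `condition` are just my own labels for illustration, not anything DESeq2 requires.

```python
from collections import Counter

# Number of colonies sequenced per individual cell line, as described above.
COLONIES = {"1A": 3, "1B": 3, "1C": 1, "1D": 1, "2A": 2, "2B": 3}


def build_sample_sheet(colonies=COLONIES):
    """Return one row per library (colony) with its individual and condition."""
    rows = []
    for individual, n_colonies in colonies.items():
        # Lines starting with "1" are WT/non-disease, lines starting with "2" are disease.
        condition = "WT" if individual.startswith("1") else "disease"
        for colony in range(1, n_colonies + 1):
            rows.append(
                {
                    "library": f"{individual}-{colony}",
                    "individual": individual,
                    "condition": condition,
                }
            )
    return rows


sheet = build_sample_sheet()

# Libraries per condition (colony level).
libraries_per_condition = Counter(row["condition"] for row in sheet)

# Distinct individuals per condition (human level).
individual_pairs = {(row["condition"], row["individual"]) for row in sheet}
individuals_per_condition = Counter(condition for condition, _ in individual_pairs)

for row in sheet:
    print(row["library"], row["individual"], row["condition"], sep="\t")
```

Counting at the colony level gives 8 WT and 5 disease libraries, while counting at the level of individuals gives only 4 WT and 2 disease.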
My gut feeling is that each library should be treated as a biological replicate, since they were all derived from separate colonies and are not really technical replicates. However, I am also aware that there are two levels of biological variation in my experiment: some colonies are from different humans, and others are colonies from the same human grown in parallel for 30 days.
Should I collapse the colonies from the same human individual into a single column? Or is it better to keep each colony as a separate biological replicate, given that it captures more variation within the condition than a technical replicate would?
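To be explicit about what I mean by "collapse": I am assuming it means summing the raw counts of all colonies from the same individual, so that 13 columns become 6. Below is a minimal Python sketch of that operation for a single gene. The counts are made up purely for illustration, and the helper name `collapse_by_individual` is hypothetical.

```python
from collections import defaultdict


def collapse_by_individual(counts):
    """Sum per-library raw counts into one value per individual.

    Library names are assumed to look like '1A-2' (individual, dash, colony number).
    """
    collapsed = defaultdict(int)
    for library, count in counts.items():
        individual = library.rsplit("-", 1)[0]
        collapsed[individual] += count
    return dict(collapsed)


# Hypothetical raw counts for ONE gene across the 13 libraries (not real data).
toy_counts = {
    "1A-1": 10, "1A-2": 12, "1A-3": 8,
    "1B-1": 20, "1B-2": 18, "1B-3": 22,
    "1C-1": 15,
    "1D-1": 9,
    "2A-1": 40, "2A-2": 44,
    "2B-1": 35, "2B-2": 30, "2B-3": 37,
}

collapsed = collapse_by_individual(toy_counts)
print(collapsed)
```

After collapsing, each individual contributes exactly one column (4 WT vs 2 disease), and the colony-to-colony variation within an individual is no longer visible to the model.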