I want to run RNA-seq on 30 cancer patients, who can be divided into two groups of 15. Each patient has paired normal and cancerous tissue. The problem is that we ONLY have money for 36 samples in total.
At first I was thinking of sequencing 3 normals for each group of 15, so that I would have normal samples for the DE analysis. But then it occurred to me: why not make 3 pools of 5 normal tissues for each group, so that every normal tissue is represented while using fewer RNA-seq runs?
Group 1 - 15 cancer samples - 3 pools of 5 normal paired
Group 2 - 15 cancer samples - 3 pools of 5 normal paired
We were planning 100 million paired-end reads, which is the standard at our institution.
The point of normal samples is not just to get a figure for what the average "normal" level is. It's also to get an idea of what the natural variation is between "normal" samples.
So first, I'm not sure that 3 normals is sufficient to get a good handle on the normal variation. Second, three averaged pools are just going to be a mush that obliterates whatever variation there is between your normal samples.
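To put a rough number on the pooling point, here is a minimal simulation sketch. It is not your data: it assumes a single hypothetical gene whose expression across 15 normals is normally distributed with a standard deviation of 1, and it assumes that a pool behaves like the arithmetic mean of equal contributions from its 5 members. The function name `simulate` and all parameter values are made up for illustration.

```python
import random
import statistics

def simulate(n_normals=15, pool_size=5, true_sd=1.0, n_trials=2000, seed=0):
    """Compare the variance seen across individual normals with the
    variance seen across pools of those same normals (hypothetical gene)."""
    rng = random.Random(seed)
    indiv_vars, pool_vars = [], []
    for _ in range(n_trials):
        # Expression of one gene in each normal sample (assumed Gaussian).
        samples = [rng.gauss(0.0, true_sd) for _ in range(n_normals)]
        # Assumption: a pool measures the mean of its members.
        pools = [
            statistics.fmean(samples[i:i + pool_size])
            for i in range(0, n_normals, pool_size)
        ]
        indiv_vars.append(statistics.variance(samples))
        pool_vars.append(statistics.variance(pools))
    # Average the variance estimates over all simulated experiments.
    return statistics.fmean(indiv_vars), statistics.fmean(pool_vars)

indiv_var, pool_var = simulate()
print(f"variance across individual normals: {indiv_var:.2f}")
print(f"variance across pools of 5:         {pool_var:.2f}")
```

Under these assumptions the variance between pools comes out at roughly one fifth of the variance between individual samples, so the three pools look far more alike than the patients actually are. A DE test fed with the pools would see that shrunken spread rather than the real patient-to-patient one.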
Also, if all you care about is DE, you might not need 100 bp paired-end reads. Paired-end costs twice as much, but doesn't give you double the gene counts.
It's not only DE. We also want to run many other analyses, such as mutation and isoform analysis, and whatever else we can do with RNA-seq.
Could you expand the explanation of why variation between normal samples is important?