Hi,
I downloaded an RNA-seq read count matrix for a specific tissue from GTEx, and I was hoping to do differential expression analysis based on the age of the donor. Say there are 30 donors classified as young and 30 donors classified as old: would I be able to use these two groups in DESeq2 for differential expression analysis? I was wondering whether the within-group variance would be too large for this to work.
Thanks!
Thank you both! I basically did what you described. For the PCA, I assume I should apply the same cutoff that I used for the DEG analysis? With all the available samples I only saw a cloud and no clear structure. I then selected the old and the young donors (the top and bottom 10% by age), performed the DEG analysis, and actually found quite a lot of DEGs with alpha = 0.01.
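In case it helps anyone following along, the selection step can be sketched roughly like this. This is only an illustration, not my actual script: the donor IDs and ages are made up, and the function name `split_by_age_extremes` and its `percent` parameter are hypothetical.

```python
def split_by_age_extremes(ages, percent=10):
    """Return (young, old): donor IDs in the bottom and top `percent` by age.

    `ages` maps a donor ID to a numeric age. Ties in age are broken by
    donor ID so that the result is reproducible.
    """
    if not 0 < percent <= 50:
        raise ValueError("percent must be in (0, 50]")
    ranked = sorted(ages, key=lambda donor: (ages[donor], donor))
    # Integer arithmetic avoids floating-point surprises; keep at least one donor.
    k = max(1, len(ranked) * percent // 100)
    return ranked[:k], ranked[-k:]


# Hypothetical donors with ages 20, 22, ..., 78 (not real GTEx data).
ages = {f"donor_{i:02d}": 20 + 2 * i for i in range(30)}
young, old = split_by_age_extremes(ages, percent=10)
print("young:", young)
print("old:  ", old)
```

The same `young` and `old` ID lists can then be used to subset the samples for both the PCA and the DEG analysis, so that both look at the same donors.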
I concur with everything ATpoint has mentioned.
You will not know until you've looked at the data and tested it. That's exactly what the stats implemented in the usual differential expression tools are there to help you decide: whether the factor of interest (in your case: age) has a systematic effect on the expression pattern of a given gene. If the variability within your two groups is large, the p-values will indicate that, i.e. you will probably not find any genes with p/q-values that'll meet the usual criteria for statistical significance.
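To make the point about within-group variability concrete, here is a minimal sketch for a single gene. Note that this is not what DESeq2 does internally: it swaps in a simple exact permutation test on the difference in group means, purely to show how the same mean difference yields a small p-value when the groups are tight and a large one when they are noisy. The expression values and the function name `exact_permutation_pvalue` are made up for illustration.

```python
from itertools import combinations


def exact_permutation_pvalue(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in group means.

    Enumerates every way of splitting the pooled values into groups of the
    original sizes and counts the splits whose absolute mean difference is
    at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    hits = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        n_splits += 1
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-9:
            hits += 1
    return hits / n_splits


# Hypothetical normalized expression of one gene; group means are 100 vs 120 in both cases.
young_tight = [98, 99, 100, 100, 101, 102]
old_tight = [118, 119, 120, 120, 121, 122]
young_noisy = [40, 70, 90, 110, 130, 160]
old_noisy = [60, 90, 110, 130, 150, 180]

p_tight = exact_permutation_pvalue(young_tight, old_tight)
p_noisy = exact_permutation_pvalue(young_noisy, old_noisy)
print(f"low within-group variance:  p = {p_tight:.4f}")
print(f"high within-group variance: p = {p_noisy:.4f}")
```

With 6 samples per group there are only 924 possible splits, so full enumeration is cheap; with 30 per group you would need random permutations instead, or simply let DESeq2's own model and tests handle it.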