I am stuck on a study design that combines correlated and independent data. Here's my dilemma:

Say X is a drug (explanatory variable) and Y is a gene expression level (response variable).

Normally, you would give the drug to a randomly chosen half of your group and a placebo to the rest, measure the gene expression for each person in the group, and conduct a differential expression analysis between the two conditions using a standard package such as DESeq.

However, let's say between-subject variability is huge, so you also measured the gene expression before each subject took the pill. In other words, each subject has two data points (before and after taking the pill). How do you incorporate this information into the analysis? I tried subtracting the 'before' value from the 'after' value for each patient and running a t-test on those differences between conditions (drug vs. placebo), but I am not sure if I am doing it right.
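To make the approach I tried concrete, here is a minimal sketch for a single gene in base R. The data are simulated and every name and number in it (`n`, the Poisson means, the `+ 1` pseudocount, the log2 transform) is an assumption for illustration, not something from a real study.

```r
# Simulated counts for ONE gene; all values below are hypothetical.
set.seed(42)
n <- 10  # subjects per arm (assumed)

before_drug    <- rpois(n, lambda = 50)
after_drug     <- rpois(n, lambda = 65)
before_placebo <- rpois(n, lambda = 50)
after_placebo  <- rpois(n, lambda = 50)

# Per-subject change on the log2 scale; the +1 pseudocount avoids log2(0).
change_drug    <- log2(after_drug + 1)    - log2(before_drug + 1)
change_placebo <- log2(after_placebo + 1) - log2(before_placebo + 1)

# Welch two-sample t-test on the change scores (drug vs. placebo).
res <- t.test(change_drug, change_placebo)
print(res$p.value)
```

This removes each subject's baseline, but it treats the transformed differences as roughly normal and does not model the counts directly, which is part of why I am unsure about it.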

Is there a standard, established way to do this? You can assume Y follows a Poisson or negative binomial distribution.

Also, check out the edgeR documentation. The authors have included many example analyses, possibly some that are similar to what you described.

To see the user's guide, you can do this inside R:

`# load the library`

library(edgeR)

`# open the user's guide`

edgeRUsersGuide()
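
For the design in the question (drug vs. placebo, each subject measured before and after), one possible setup is sketched below. It is modeled on the kind of "comparisons both between and within subjects" analysis covered in the user's guide, but the simulated counts, the factor names (`group`, `time`, `subject`), and the sample sizes are all assumptions made up for illustration, so treat it as a starting point rather than a definitive recipe.

```r
library(edgeR)

# Simulated data; every number here is hypothetical.
set.seed(1)
n_per_group <- 4    # subjects per arm (assumed)
n_genes     <- 200  # genes (assumed)

group   <- factor(rep(c("placebo", "drug"), each = 2 * n_per_group),
                  levels = c("placebo", "drug"))
time    <- factor(rep(c("before", "after"), times = 2 * n_per_group),
                  levels = c("before", "after"))
# Subject index is re-used within each group, i.e. nested in group.
subject <- factor(rep(rep(seq_len(n_per_group), each = 2), times = 2))

counts <- matrix(rnbinom(n_genes * length(group), mu = 100, size = 10),
                 nrow = n_genes)
rownames(counts) <- paste0("gene", seq_len(n_genes))

# Subject baselines are absorbed by group + group:subject;
# group:time gives a separate before/after effect for each arm.
design <- model.matrix(~ group + group:subject + group:time)

y   <- DGEList(counts = counts)
y   <- calcNormFactors(y)
y   <- estimateDisp(y, design)
fit <- glmQLFit(y, design)

# Contrast: (after vs. before in drug) minus (after vs. before in placebo).
contrast <- numeric(ncol(design))
contrast[colnames(design) == "groupdrug:timeafter"]    <-  1
contrast[colnames(design) == "groupplacebo:timeafter"] <- -1

res <- glmQLFTest(fit, contrast = contrast)
topTags(res)
```

The contrast asks whether the before-to-after change differs between the two arms, which is the same question as the t-test on differences, but answered within a negative binomial model.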