Question: Correct for confounding variable (variability in knockdown efficiency) in paired t-test for change in gene expression
2.7 years ago by
Scott80 wrote:

This is primarily a statistics question, but the biological context may help frame the problem.

There are two samples treated with different shRNAs. 1 = control shRNA, 2 = target shRNA.

Then PCR is performed on several other (non-target) genes of interest. Several genes change in expression between control and target shRNA treatments, but the extent to which they change is highly correlated with the extent to which the target gene has been knocked down.

Since baseline gene expression (Ct values) is fairly variable between biological replicates, a paired t-test is appropriate. I have done this for each gene of interest and then corrected the P values for multiple testing. This works; however, I would like to increase my power by including more replicates of the experiment with variable knockdown efficiency, while taking the efficiency of knockdown into account. I basically need a way to correct for a confounding variable in a paired t-test.

It seems my options may include: 1) ANCOVA. This would likely work well for raw Ct values, but does not seem to easily accommodate paired data. 2) Multiple linear regression. Any advice on how to actually go about doing this would be appreciated (if it is indeed the recommended method).


statistics qpcr ancova gene t-test • 1.1k views

Am I correct in assuming that the pairing is between a control shRNA and a target shRNA in the same sample (it's unclear what the actual relevant system is in this case)?

Reply written 2.7 years ago by Devon Ryan96k

No. One sample has been treated with control shRNA and another sample treated with target shRNA. They are treated in parallel though, hence the pairing. Hope that makes sense.

Reply written 2.7 years ago by Scott80
2.7 years ago by
Devon Ryan96k (Freiburg, Germany) wrote:

If you loaded your data into R and made a data frame that looked something like:

sample  batch   treatment   efficiency  Ct
s1      b1      control     0.0         30
s2      b1      shRNA       0.3         25
s3      b2      control     0.0         29
s4      b2      shRNA       0.2         27
s5      b3      control     0.0         30
s6      b3      shRNA       0.8         20
s7      b4      control     0.0         29
s8      b4      shRNA       0.9         21

Then the linear model for what you want is lm(Ct ~ batch + efficiency), and you're interested in whether the coefficient for efficiency differs from 0. You don't need the treatment column; I've put it in so it's easy to see that all control shRNA samples have an efficiency of 0. The batch column designates your pairs; feel free to rename it.
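
As a rough sketch, the model could be fitted to the example table like this. The data frame name `d` and the way the table is typed in by hand are assumptions made for illustration; only the formula Ct ~ batch + efficiency comes from the answer itself.

```r
# Example data copied from the table above; `d` is an arbitrary name.
d <- data.frame(
  sample     = paste0("s", 1:8),
  batch      = factor(rep(paste0("b", 1:4), each = 2)),
  treatment  = rep(c("control", "shRNA"), times = 4),
  efficiency = c(0.0, 0.3, 0.0, 0.2, 0.0, 0.8, 0.0, 0.9),
  Ct         = c(30, 25, 29, 27, 30, 20, 29, 21)
)

# One intercept per pair (batch) plus a common slope for knockdown efficiency.
fit <- lm(Ct ~ batch + efficiency, data = d)

# The "efficiency" row holds the slope estimate, its standard error,
# the t statistic and the two-sided p-value.
eff <- coef(summary(fit))["efficiency", ]
print(eff)
```

On this toy data the slope is negative, because Ct falls as knockdown efficiency rises, so the test of interest is two-sided (slope different from 0) rather than a check for a positive value.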

At the end of the day, this is just a tweaked version of a paired t-test, where instead of testing if the paired difference between shRNA and control is 0, you test whether this difference as a function of the efficiency has a non-zero slope. You can also do this somewhat manually, by subtracting the control from the shRNA within each pair and then regressing that difference vs. KD efficiency.
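
The manual route might look like the sketch below, using the same example numbers. The vector names are illustrative. Dropping the intercept is an assumption on my part rather than something stated above: it encodes "zero knockdown means zero change", which is what Ct ~ batch + efficiency implies when every control has an efficiency of 0.

```r
# Same example data, one entry per pair (batch); names are illustrative.
ctrl_Ct    <- c(30, 29, 30, 29)      # control shRNA Ct per pair
kd_Ct      <- c(25, 27, 20, 21)      # target shRNA Ct per pair
efficiency <- c(0.3, 0.2, 0.8, 0.9)  # knockdown efficiency per pair

# Within-pair difference: target shRNA minus control.
dCt <- kd_Ct - ctrl_Ct

# Regress the difference on efficiency with no intercept (`- 1`),
# so the fitted line is forced through the origin.
fit_diff <- lm(dCt ~ efficiency - 1)

# Slope estimate, standard error, t statistic and two-sided p-value.
slope <- coef(summary(fit_diff))["efficiency", ]
print(slope)
```

With exactly two samples per pair and controls fixed at efficiency 0, this regression through the origin should give the same slope and p-value as the batch model. Keeping the intercept would instead fit a different model, one that allows a shift from shRNA treatment even at zero knockdown.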

FYI, the p-value for KD efficiency (or shRNA treatment controlling for efficiency, if you prefer) in the above example is ~0.004.


Erm... I'm not sure that you would expect the relationship to be linear, since a difference of 1 Ct represents a ~2-fold difference in target expression. Also, I'd not be confident that the variance would be equal at low and high Ct values.

Reply written 2.7 years ago by i.sudbery9.3k

True, but it's a useful first stab at things.

Reply written 2.7 years ago by Devon Ryan96k