How to control for age and gender effects in expression data using DESeq2
6.1 years ago
Lalit ▴ 20

I have a dataset in which I want to see the effect of a drug on patients who did or did not respond to treatment. I collected blood from each patient at three time points (visits), and I have each patient's age and sex. Because of the three visits, I am using DESeq2 to run a time-series differential expression analysis. I want to control for the effects of age and gender so that I can look at the interaction between responder group and time point. Here are the sample table and my DESeq2 design formulas:

sample  Phenotype     visit  Age  Gender
1       NonResponder  1      42   female
2       NonResponder  2      42   female
3       NonResponder  3      42   female
4       NonResponder  1      49   female
5       NonResponder  2      49   female
6       NonResponder  3      49   female
7       NonResponder  1      27   male
8       NonResponder  2      27   male
9       NonResponder  3      27   male
10      Responder     1      77   female
11      Responder     2      77   female
12      Responder     3      77   female
13      Responder     1      51   male
14      Responder     2      51   male
15      Responder     3      51   male
16      Responder     1      47   male
17      Responder     2      47   male
18      Responder     3      47   male

So which design should I use to control for the effects of age and gender in my data?

**Design 1:**

design(dds) <- ~ age + gender + visit + phenotype + visit:phenotype + age:phenotype + gender:phenotype
dds <- DESeq(dds)

**Design 2:**

design(dds) <- ~ age + gender + visit + phenotype + visit:phenotype
dds <- DESeq(dds, test = "LRT", reduced = ~ age + gender)

I would greatly appreciate any help with this.

Best,

Lalit

RNA-Seq DESeq2 differential expression analysis • 3.8k views
6.1 years ago

Is that your complete dataset?

If it is, I'm pretty sure it is not possible to use age with so few samples. You have no way of knowing whether the differences are age-based, or person-based. Same thing with sex; is the female responder different because she is female, or just because she is a different person? You can't know.

If you had ten times the number of patients, you could bin the patients into groups by age or sex, and compare between the groups, but you can't do that here.

I think your design has to be just phenotype, or phenotype + visit for the responders. If you only have the samples you listed, it's just too small to take into account all the variables you tracked. I'd also check out the PCA; see if your samples at least separate by phenotype.
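The confounding described above can be checked with base R alone. The sketch below rebuilds the sample table from the question and adds a hypothetical `patient` column, assuming that each run of three consecutive samples comes from one person (the original table does not say so explicitly). Because age never changes within a patient, a design containing both patient and age has one more column than its rank, which is the situation in which DESeq2 reports that the model matrix is not full rank.

```r
# Sample table from the question. `patient` is a hypothetical column added
# here, assuming every three consecutive samples belong to one person.
coldata <- data.frame(
  sample    = 1:18,
  patient   = factor(rep(paste0("P", 1:6), each = 3)),
  phenotype = factor(rep(c("NonResponder", "Responder"), each = 9)),
  visit     = factor(rep(1:3, times = 6)),
  age       = rep(c(42, 49, 27, 77, 51, 47), each = 3),
  gender    = factor(rep(c("female", "female", "male",
                           "female", "male", "male"), each = 3))
)

# Number of distinct ages per patient: always 1, so age is a patient-level label.
per_patient_ages <- tapply(coldata$age, coldata$patient,
                           function(x) length(unique(x)))

# Design with both patient and age: the age column is a linear combination
# of the intercept and the patient indicator columns.
mm      <- model.matrix(~ patient + age, data = coldata)
rank_mm <- qr(mm)$rank

cat("columns:", ncol(mm), " rank:", rank_mm, "\n")
```

With only six people, the 18 samples carry six independent pieces of information about age or sex, which is why these effects cannot be told apart from person-to-person differences here.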


Dear Swbarnes2, thank you so much for the reply. What I posted here is only a small subset of the data. I have 10 non-responder patients and 9 responder patients, with blood collected at three time points (visits), so 57 samples in total. I ran a PCA using the phenotype, age, and visit information, but I did not see separate clusters, and the first two components explained little of the variance: PC1 18% and PC2 9%. I am still not sure which design to use to find genes that are affected by visit and phenotype but not by age and gender. I want to correct the data for age and gender. Should I use design 1 as the full model:

design(dds) <- ~ age + gender + visit + phenotype + visit:phenotype + age:phenotype + gender:phenotype
dds <- DESeq(dds)

or design 2, with a reduced model to correct the data for age and gender:

design(dds) <- ~ age + gender + visit + phenotype + visit:phenotype
dds <- DESeq(dds, test = "LRT", reduced = ~ age + gender)

Best regards, Lalit
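One way to see what design 2 actually tests is to compare the columns of the full and reduced model matrices: the likelihood ratio test in DESeq2 asks whether the terms present in the full design but absent from the reduced one improve the fit. The sketch below uses base R only and a made-up sample table with the shape described above (19 patients, 3 visits each); the ages and genders are placeholders, not real data, and `tested_terms` is a hypothetical helper written for this illustration.

```r
# Placeholder sample table: 10 non-responders and 9 responders, 3 visits each.
# Ages and genders are invented values used only to build the design matrices.
n_patients <- 19
coldata <- data.frame(
  phenotype = factor(rep(rep(c("NonResponder", "Responder"),
                             times = c(10, 9)), each = 3)),
  visit     = factor(rep(1:3, times = n_patients)),
  age       = rep(seq(27, by = 2, length.out = n_patients), each = 3),
  gender    = factor(rep(rep(c("female", "male"),
                             length.out = n_patients), each = 3))
)

full      <- ~ age + gender + visit + phenotype + visit:phenotype
reduced_a <- ~ age + gender                      # reduced model from design 2
reduced_b <- ~ age + gender + visit + phenotype  # drops only the interaction

# Hypothetical helper: coefficients in the full model but not the reduced one,
# i.e. the coefficients the likelihood ratio test evaluates jointly.
tested_terms <- function(full, reduced, data) {
  setdiff(colnames(model.matrix(full, data)),
          colnames(model.matrix(reduced, data)))
}

terms_a <- tested_terms(full, reduced_a, coldata)
terms_b <- tested_terms(full, reduced_b, coldata)
print(terms_a)
print(terms_b)
```

Under these assumptions, `reduced = ~ age + gender` tests visit, phenotype, and their interaction together, whereas `reduced = ~ age + gender + visit + phenotype` isolates the visit:phenotype interaction, which may be closer to the stated goal. In both cases age and gender stay in the model as covariates.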
