Question: GLM design for RNAseq analysis using DESeq
gravatar for illinois.ks
5.6 years ago by
Korea, Republic Of
illinois.ks160 wrote:



I am working with RNAseq analysis.. and I would like to get confirm for my model design before doing the analysis. 

I have 8 samples from 4 patients (each patient has two samples before treatment and after treatment.. ) I would like to find DEG genes which are differentially expressed by treatment.. However, I suspect that it also be affected by sex factor. so I would like to consider the sex factor (but among 4 patient, I only have one male, three female)

Here is the exact description for data model.


sample1         patient1 pre_treatment  female 

sample2         patient1 post_treatment female

sample3         patient2 pre_treamment male

sample4         patient2 post_treament male

sample5         patient3 pre_treamment female

sample6         patient3 post_treament female

sample7         patient4 pre_treamment female

sample8         patient4 post_treament female



df_enzyme <- factor(rep(c("pre", "post"), len=8))
df_patient <- rep(1:4, each=2,  len=8)
df_sex <- factor(c("F","F","M","M","F","F","F","F"))
df_enzyme <- relevel(df_enzyme, ref="pre")

df_meta <- data.frame(row.names=colnames(data), patient = df_patient, enzyme = df_enzyme, sex = df_sex)

# sincce I don't have replication, for the dispersion estimation, I used blind.. 

d_fabry <- estimateDispersions(d_fabry, method="blind", sharingMode="fit-only")

I made the following model. 

d_fit0a = fitNbinomGLMs(d_fabry, count ~ patient)
d_fit0b = fitNbinomGLMs (d_fabry, count ~ enzyme)
d_fit0c = fitNbinomGLMs (d_fabry, count ~ sex)

d_fit1a = fitNbinomGLMs(d_fabry, count ~ patient + enzyme)
d_fit1b = fitNbinomGLMs(d_fabry, count ~ patient + sex)

d_fit2 = fitNbinomGLMs(d_fabry, count ~ patient + enzyme + sex)
d_fit3 = fitNbinomGLMs(d_fabry, count ~ patient + enzyme + sex + enzyme:sex


for the treatment(enzyme) effect, which one is right?? maybe the first one?? ( it assume that sex would have effect.. right?) 

dh_pvalsGLM = nbinomGLMTest( dh_fit2, dh_fit1b ) 


dh_pvalsGLM = nbinomGLMTest( dh_fit1a, dh_fit0a) 

for the treatment and sex interaction effct, (since I suspect that depending on different sex, effect of treatment might be different or not), the following modle is right??

dh_pvalsGLM = nbinomGLMTest( dh_fit3, dh_fit2 )


Again, I would like to find DEG which are responsive to treament(enzyme), but I suspect that depending on sex, the responsiveness might be different (maybe not) I also, test with this assumption. Does my model do the right thing for this?? 

rna-seq • 3.7k views
ADD COMMENTlink modified 5.6 years ago • written 5.6 years ago by illinois.ks160
gravatar for Devon Ryan
5.6 years ago by
Devon Ryan94k
Freiburg, Germany
Devon Ryan94k wrote:

You can't look at gender because it's completely consumed by the patient effect. Your experiment is described by either ~patient+enzyme or ~patient*enzyme, depending on whether you want to allow an interaction or not. I would generally suggest the former model, since that's what you're really interested in.

ADD COMMENTlink written 5.6 years ago by Devon Ryan94k

Thank you for your helps. if then (as you say, gender is consumed by the patient effect), is there any way to see if there is any different reaction depending on gender??  as you suggest, I can test with  ~patient+ enzyme(full) and ~ patient(reduced)  to see the effect of enzyme.. ( I mean the DEG which are responsive to the enzyme.)

But, I also would like to see whether there is any different reaction over gender? (I know I only have one female and three male..) do you think is it possible? or I can only test with enzyme??

ADD REPLYlink modified 5.6 years ago by Devon Ryan94k • written 5.6 years ago by illinois.ks160

I would say, no, there's no good way with the current dataset to look at gender. If you had a couple more male samples then perhaps you could get some meaningful results. Lacking that, don't waste your time.

ADD REPLYlink written 5.6 years ago by Devon Ryan94k

Oh.. Devon

I have two more quick questions,

Q1) I will have one more male patient. So total of two male, three female.

Do you think in this case, it would be okay ( although not enough), to test with gender?? 

The reason I am asking this is because, this specific disease are x-linked.. so there should be a difference depending on gender... So I would like to consider the gender as factor even though I could not test it. 

Then, do you think 



case 1)

full mode : ~ patient + gender + enzyme

reduced model : ~ patient + gender


case2 ) 

full model : ~ patient + enzyme

reduced model : ~ patient 


do you think case 1 and case 2 is same?? or still case 1 is meaninful if I would like to consider the fact that disease is x-linked?? 

ADD REPLYlink modified 5.6 years ago • written 5.6 years ago by illinois.ks160

You could try, though with the numbers of patients you have you'll be very limited unless the effect is VERY large (human data is noisy).

When you want to look at gender while keeping a patient effect, you have to remember that the patient effect will need to be recoded. Otherwise, the gender and patient effects will be completely confounded. So, you'll end up needing to label all of the patients 1-3 and then give each of them a gender such that there's a male patient 1 and a female patient 1 (and so forth). That will produce a full rank model matrix (well, you'll have to remove a column since you have unbalanced groups, but you get the idea).

ADD REPLYlink written 5.6 years ago by Devon Ryan94k

Thank you for your response.

If I understand your point correctly, you suggest that I need to make a new column "patient-gender" instead of patient or gender(remove patient and gender column). Is it right? So somthing like this.. 


                      patient-gender               enzyme. 

sample1         patient1_female         pre_treatment  

sample2         patient1_female         post_treatment

sample3         patient2_male             pre_treamment 

sample4         patient2_male             post_treament 

sample5         patient3_female          pre_treamment

sample6         patient3_female          post_treament 

sample7         patient4_female          pre_treamment 

sample8         patient4_female post_treament 


full model : ~ patient-gender + enzyme

reduced model : ~ patient-gender 



Is it right?  Could you please confirm for this one more time? I really appreciate your comments!! 

ADD REPLYlink written 5.6 years ago by illinois.ks160

More like this:

    patient    gender    enzyme
sample1    1    f    pre
sample2    1    f    post
sample3    1    m    pre
sample4    1    m    post
sample5    2    f    pre

and so one. There's an example where they do this in either the limma or edgeR user's guide (I don't recall which). If you didn't do something like this, the patient effect would suck up too many of the degrees of freedom.

ADD REPLYlink written 5.6 years ago by Devon Ryan94k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 642 users visited in the last hour