Question: DESeq2: can I correct for relatedness when using data from multiplex families?
tiphaine wrote:

Hi All.

My question is in the same vein as this post: C: DESeq2: can I correct for relatedness when using data from multiplex families?

In your solution, the model only accounts for whether samples belong to the same family. I would like to know how to add the degree of relatedness, such as MZ twins (100%) and DZ twins (50%). Currently, I have a variable zygosity with two levels (MZ and DZ).

Would you therefore suggest using the following model?

~ Covariate + Technical.Confounder + familyID + zygosity + condition

And how would we do the same thing if we also have other family members?

Regards,

Tiphaine

rna-seq deseq2 relatedness
written 4.4 years ago by tiphaine

Do you have multiple individuals per family? That is, do you only have the twins or do you have other individuals as well?

written 4.4 years ago by Devon Ryan

Currently, I have two cases.

One model where I have only twin pairs (DZ and MZ), and another where I have a mixture of twin pairs and singletons, for which I still know whether they are MZ or DZ.

I do not yet have other family members in my data. I am asking mostly out of curiosity, to know how to deal with that case too.

written 4.4 years ago by tiphaine
Devon Ryan wrote:

The only real way to do this in a generalized linear model is to add a column to the model matrix denoting the twin type. In cases where you have only a single set of twins per family, then there's no gain from doing this (in fact, the model matrix will be rank deficient). This would only work for cases where you have more individuals per family than simply the twins. If this approach isn't acceptable, then the only alternative is to use a different kind of model, where you can input a sample correlation or relatedness structure of some sort. Limma provides some methods in that regard, though I don't know enough about them to say how much more useful they may be in this case.
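To make the rank-deficiency point concrete, here is a minimal sketch in plain Python (not DESeq2 itself). It builds a treatment-coded model matrix for a hypothetical data set of four families with one twin pair each, and computes its rank exactly. The sample data and the helper names `matrix_rank` and `design_matrix` are illustrative assumptions.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current pivot row with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def design_matrix(samples, families):
    """Columns: intercept, family dummies (first family is the reference),
    zygosity (DZ = 1), condition."""
    rows = []
    for fam, zyg, cond in samples:
        fam_dummies = [1 if fam == f else 0 for f in families[1:]]
        rows.append([1] + fam_dummies + [1 if zyg == "DZ" else 0, cond])
    return rows

# Hypothetical data: (familyID, zygosity, condition), one twin pair per family.
twins_only = [
    ("F1", "MZ", 0), ("F1", "MZ", 1),
    ("F2", "MZ", 0), ("F2", "MZ", 1),
    ("F3", "DZ", 0), ("F3", "DZ", 1),
    ("F4", "DZ", 0), ("F4", "DZ", 1),
]
X = design_matrix(twins_only, ["F1", "F2", "F3", "F4"])
n_columns = len(X[0])
rank = matrix_rank(X)
print(n_columns, rank)
```

In this toy example the zygosity column is exactly the sum of the F3 and F4 family columns, so the matrix has fewer independent columns than coefficients to estimate. That is the situation in which the zygosity effect cannot be separated from the family effects.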
 

written 4.4 years ago by Devon Ryan

Thanks for this.

Currently, I use lme4 with random effects for zygosity and family. So I can use DESeq to normalise my data, but not to test for differential expression.
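For the normalisation step on its own, the following is a rough sketch of median-of-ratios size factors, the idea behind DESeq-style normalisation, written in plain Python. The function name `size_factors` and the toy counts are assumptions for illustration; in practice the package's own size factor estimation should be preferred.

```python
import math
from statistics import median

def size_factors(counts):
    """counts: one list per gene, each holding the raw counts per sample."""
    n_samples = len(counts[0])
    # Log geometric mean per gene; genes containing a zero count are skipped.
    log_geo_means = [
        sum(math.log(c) for c in gene) / n_samples
        if all(c > 0 for c in gene) else None
        for gene in counts
    ]
    factors = []
    for j in range(n_samples):
        # Log ratio of this sample's count to the gene's geometric mean.
        log_ratios = [
            math.log(gene[j]) - lgm
            for gene, lgm in zip(counts, log_geo_means)
            if lgm is not None
        ]
        factors.append(math.exp(median(log_ratios)))
    return factors

# Hypothetical counts: sample 2 was sequenced twice as deeply as sample 1.
toy_counts = [[10, 20], [100, 200], [5, 10], [0, 3]]
sf = size_factors(toy_counts)
# Dividing each column by its size factor gives counts on a common scale.
normalised = [[c / f for c, f in zip(gene, sf)] for gene in toy_counts]
print(sf)
```

The normalised counts (or the size factors, used as an offset) could then be passed to a mixed model fitted outside DESeq2.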

 

written 4.4 years ago by tiphaine

Keep in mind that unless you have a large number of samples, you'll have lower power with lme4.

written 4.4 years ago by Devon Ryan

Oh, I didn't know that. I am going to look into it, and I may come back to you if I am not sure what to do.

Do you have a link that explains that?

Thanks

written 4.4 years ago by tiphaine

The paper on limma is probably the most appropriate place to start, since everything else (DESeq2, edgeR, etc.) really follows from it. Yes, it describes microarrays, but the underlying statistical argument still applies. In short, lme4 treats each gene individually, whereas DESeq2/edgeR/limma/etc. share information across genes to better estimate things like variance. They also incorporate priors for variance and fold-change shrinkage, which has some of the same effect as using a mixed model.
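As a rough illustration of what "sharing information across genes" means, here is a sketch of variance moderation in the spirit of limma: each gene's sample variance is pulled toward a shared prior variance, weighted by degrees of freedom. The function name, the toy values, and the hand-picked `prior_var` and `prior_df` are assumptions; limma estimates the prior from all genes, and DESeq2 shrinks dispersions by a different procedure.

```python
from statistics import variance

def moderated_variances(per_gene_values, prior_var, prior_df):
    """Shrink each gene's sample variance toward a shared prior variance."""
    shrunk = []
    for values in per_gene_values:
        df = len(values) - 1          # residual degrees of freedom for this gene
        s2 = variance(values)         # ordinary per-gene sample variance
        shrunk.append((prior_df * prior_var + df * s2) / (prior_df + df))
    return shrunk

# Hypothetical log-expression values for two genes, three samples each.
genes = [
    [5.0, 5.1, 4.9],   # very small observed variance
    [2.0, 6.0, 4.0],   # large observed variance
]
raw = [variance(g) for g in genes]
post = moderated_variances(genes, prior_var=1.0, prior_df=4)
print(raw, post)
```

With only three samples per gene the raw variances are very noisy. The moderated values sit between the raw estimate and the prior, which is where the extra power comes from when sample sizes are small.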

written 4.4 years ago by Devon Ryan

OK, thanks. I am going to read the papers to be sure about my models, because I was planning to use the same protocol for my other omics data that come as tables of per-feature counts for each sample, such as microbiome data.

written 4.4 years ago by tiphaine

I have a question about the filter/normalisation step: do you use DESeq (first version) to perform it? For instance, as explained in this vignette: http://www.bioconductor.org/packages/release/bioc/vignettes/genefilter/inst/doc/independent_filtering.pdf?

written 4.4 years ago by alesssia

You could do it manually, but it's simpler to just use DESeq2, which will do that for you.

written 4.4 years ago by Devon Ryan

Thank you for your answer, but I believe my question was not well posed, sorry about that.

Like Tiphaine, I am interested in using a GLMM to find differentially expressed genes, but I think that to get meaningful results I need to perform a filtering/normalisation step beforehand. A better question would have been: "How do I perform these steps on raw count data before applying a GLMM?"

To the best of my knowledge, DESeq2 performs the filtering step within the results() function. It chooses the filter threshold that maximises the number of adjusted p-values below a given value, and sets the adjusted p-values of the genes that do not pass the filter to NA. How can these outcomes be used in my context?
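The filtering idea described above can be sketched outside DESeq2, for example on p-values coming from a GLMM. The code below is a simplified illustration in plain Python: it tries a few cutoffs on the mean count, applies Benjamini-Hochberg to the genes that pass, and keeps the cutoff giving the most rejections. The function names, the toy data, and the fixed list of cutoffs are assumptions; this is not the exact procedure the package uses to choose its threshold.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):      # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def independent_filter(mean_counts, pvalues, cutoffs, alpha=0.1):
    """Return (cutoff, n_rejected, padj) for the cutoff with most rejections."""
    best = None
    for cutoff in cutoffs:
        keep = [i for i, m in enumerate(mean_counts) if m >= cutoff]
        padj_kept = bh_adjust([pvalues[i] for i in keep])
        n_rejected = sum(p < alpha for p in padj_kept)
        if best is None or n_rejected > best[1]:
            padj = [None] * len(pvalues)   # None plays the role of NA
            for i, p in zip(keep, padj_kept):
                padj[i] = p
            best = (cutoff, n_rejected, padj)
    return best

# Hypothetical data: eight low-count genes with no signal, four expressed genes.
mean_counts = [1, 2, 1, 3, 2, 4, 1, 2, 150, 90, 300, 60]
pvalues = [0.3, 0.5, 0.7, 0.9, 0.4, 0.6, 0.8, 0.35, 0.01, 0.02, 0.03, 0.04]
cutoff, n_rejected, padj = independent_filter(mean_counts, pvalues, cutoffs=[0, 10])
print(cutoff, n_rejected)
```

The filter statistic (here the mean count) has to be independent of the test statistic under the null hypothesis, otherwise the adjusted p-values are no longer valid.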

Btw, do you think it would be better to start a new thread? IMHO, my intervention is making this one messy...

written 4.4 years ago by alesssia

Yeah, it might be a bit cleaner to start a new thread.

written 4.4 years ago by Devon Ryan