Question: DESeq2 design analysis
Teresa20 wrote:

Hi,

I have done an RNA-seq experiment with four groups of samples: samples from Spain with condition X, healthy controls from Spain, samples from France with condition Y, and healthy controls from France. I want to use the DESeq2 package to find the genes that are differentially expressed between condition X and condition Y. How should I normalise for gene expression changes that are due to geography?

Thanks a lot

rna-seq deseq2 • 230 views
modified 4 months ago by WouterDeCoster39k • written 4 months ago by Teresa20

You have two types of samples, and they combine two effects: the treatment (X and Y) and the sampling location (Spain and France).

You will not be able to disentangle whether a difference is due to the treatment or to the sampling location, because the two are confounded. Only with support from the literature, or from what you know about your system, can you discuss whether it is more likely to be one or the other.
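To make the confounding concrete, here is a minimal sketch in plain Python. The sample sheets and the `is_confounded` helper are hypothetical, made up for illustration; they are not part of DESeq2. The helper only checks whether every country maps to exactly one condition and vice versa, in which case the two factors cannot be told apart.

```python
def is_confounded(samples):
    """Return True when country and condition map one-to-one,
    i.e. knowing one factor fully determines the other."""
    pairs = {(s["country"], s["condition"]) for s in samples}
    countries = {country for country, _ in pairs}
    conditions = {condition for _, condition in pairs}
    return len(pairs) == len(countries) == len(conditions)


# Hypothetical sample sheet with only the two patient groups.
patients_only = [
    {"country": "Spain", "condition": "X"},
    {"country": "Spain", "condition": "X"},
    {"country": "France", "condition": "Y"},
    {"country": "France", "condition": "Y"},
]

# Same sheet with healthy controls from both countries added.
with_controls = patients_only + [
    {"country": "Spain", "condition": "Control"},
    {"country": "France", "condition": "Control"},
]

print(is_confounded(patients_only))   # country and condition are aliased
print(is_confounded(with_controls))   # controls break the one-to-one mapping
```

With the controls included the two factors are no longer perfectly aliased. Even so, separating X from Y still relies on assuming that the country effect seen in the controls is the same in the patients.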

With regard to count normalisation, I would proceed the usual way (as if you had 2 samples in condition A and 2 samples in condition B), following for instance the recommendations of this paper: https://f1000research.com/articles/5-1408/v3

modified 4 months ago • written 4 months ago by Gautier Richard280

Thanks for your reply.

Just to be sure: wouldn't it be possible to use the counts from the healthy controls of each country to normalise the counts of the samples with the condition from the same country, and then compare condition X and condition Y?

written 4 months ago by Teresa20

If you follow the link above (f1000research), you will see that they use the group information and the lane information. You could define the groups X, Y, Control_France and Control_Spain (or ControlFR and ControlS). That way you can build a contrast matrix that gives the comparisons X vs Y, Control France vs Control Spain, X vs Control Spain and Y vs Control France, I guess.

Design matrix:

    ControlFR  ControlS  X  Y
1   1          0         0  0
2   1          0         0  0
3   0          1         0  0
4   0          1         0  0
5   0          0         0  1
6   0          0         0  1
7   0          0         1  0
8   0          0         1  0

Contrast matrix:

            XvsY  FRvsS  XvsControl  YvsControl
ControlFR      0      1           0          -1
ControlS       0     -1          -1           0
X              1      0           1           0
Y             -1      0           0           1
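
As a rough illustration of what these contrast columns do, the sketch below applies each one to the group means of a single gene. The mean log2 expression values are invented for the example. The last contrast, `XvsY_adjusted`, is not in the matrix above: it is an assumed "difference of differences", (X - ControlS) - (Y - ControlFR), and is only meaningful if the country effect is the same in patients and controls.

```python
# Group order matches the rows of the contrast matrix.
groups = ["ControlFR", "ControlS", "X", "Y"]

# Contrast weights, one list per column of the contrast matrix.
contrasts = {
    "XvsY":       [0, 0, 1, -1],
    "FRvsS":      [1, -1, 0, 0],
    "XvsControl": [0, -1, 1, 0],
    "YvsControl": [-1, 0, 0, 1],
    # Assumption, not in the original matrix: (X - ControlS) - (Y - ControlFR)
    "XvsY_adjusted": [1, -1, 1, -1],
}

# Made-up mean log2 expression of one gene in each group (illustrative only).
mean_log2 = {"ControlFR": 5.0, "ControlS": 4.0, "X": 6.5, "Y": 6.0}


def apply_contrast(weights, means):
    """Weighted sum of group means = log2 fold change for that contrast."""
    return sum(w * means[g] for w, g in zip(weights, groups))


log2fc = {name: apply_contrast(w, mean_log2) for name, w in contrasts.items()}
for name, value in log2fc.items():
    print(name, value)
```

In this toy example the direct XvsY contrast gives 0.5, while the control-adjusted version gives 1.5, because part of the raw difference is attributed to the France vs Spain shift seen in the controls.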

The main issue may be the number of replicates. How many do you have for each combination of country and treatment? (In the example above there are two replicates per group.)

As far as I know, the normalisation is computed for each sample, independently of the groups.
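
For intuition, here is a minimal pure-Python sketch of the median-of-ratios idea behind DESeq2's size factors. It is a simplified re-implementation for illustration, not the package's code, and the counts are toy numbers. Note that the function never receives any group labels.

```python
import math
from statistics import median


def size_factors(counts):
    """counts: dict mapping sample name -> list of raw counts (same gene order).
    Returns one size factor per sample using the median-of-ratios approach."""
    samples = list(counts)
    n_genes = len(counts[samples[0]])

    # Log geometric mean of each gene across all samples;
    # genes with a zero count in any sample are skipped.
    log_geo_means = []
    for g in range(n_genes):
        values = [counts[s][g] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)

    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][g]) - lgm
            for g, lgm in enumerate(log_geo_means)
            if lgm is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors


# Toy counts: sample "B" was sequenced exactly twice as deeply as sample "A".
toy_counts = {"A": [10, 20, 30, 40, 0], "B": [20, 40, 60, 80, 5]}
factors = size_factors(toy_counts)
print(factors)
```

Here the ratio of the two size factors comes out as 2, matching the twofold difference in depth. Dividing each sample's counts by its factor puts the samples on a common scale before any group comparison is made.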

Another way to proceed is to normalise separately for each comparison, by running DESeq2 / edgeR only on the samples you want to compare directly (for example, one run with ControlFR and Y, another run with ControlS and X, etc.). Both approaches are possible. The paper above uses the first one (normalise everything together, then perform the comparisons you want), while the snakePipes pipeline uses the second one (normalise and compare one group against another). For more information: https://snakepipes.readthedocs.io/en/latest/content/workflows/RNA-seq.html#rna-seq

modified 4 months ago • written 4 months ago by Gautier Richard280