specific RNA-SEQ GLM design
5.6 years ago
gero007 • 0

Hi,

I received RNA-seq data to analyze without having been involved in the experimental design before the sequencing was done. I'm using GLMs in edgeR to compare the conditions, but for this particular setup, adapting the models presented in the edgeR user's guide does not seem trivial (at least with my limited statistical knowledge).

They have a cell line that can be transformed (Outcome) by a single point mutation (Mutation). They tested whether knocking out (KO) either of two different genes can prevent this transformation. The results showed that both knockouts prevent it. However, from the preliminary data analysis I can tell that the effect of the mutation on the expression profiles is much larger than the effect of either knockout.

The data available can be represented like this:

sample  Mutation  KO    Outcome
1       wt        wt    Non_Transformed
2       wt        wt    Non_Transformed
3       wt        wt    Non_Transformed
4       G12V      wt    Transformed
5       G12V      wt    Transformed
6       G12V      wt    Transformed
7       G12V      genX  Non_Transformed
8       G12V      genX  Non_Transformed
9       G12V      genX  Non_Transformed
10      G12V      genY  Non_Transformed
11      G12V      genY  Non_Transformed
12      G12V      genY  Non_Transformed


While the effect of the mutation is overwhelming compared with that of the knockouts, the expression profiles of the knockout samples are extremely close to the control (wt for Mutation + wt for KO). The idea here is to understand the slight difference that prevents the transformed outcome.

To model this, I coded:

> # Explicit levels keep the unmutated / control samples as the reference level
> Mutation <- factor(c(rep("NoMut", 3), rep("G12V", 9)), levels = c("NoMut", "G12V"))

> KO <- factor(c(rep("ctrl", 6), rep("genx", 3), rep("geny", 3)), levels = c("ctrl", "genx", "geny"))

> design <- model.matrix(~KO+Mutation, dgeCountsClean$samples)

which gives the design matrix:

>design
            (Intercept) KOgenyKO KOgenxKO MutationG12V
wt_1                 1         0        0            0
wt_2                 1         0        0            0
wt_3                 1         0        0            0
G12V_1               1         0        0            1
G12V_2               1         0        0            1
genxG12V_1           1         0        1            1
genxG12V_2           1         0        1            1
genxG12V_3           1         0        1            1
genyG12V_1           1         1        0            1
genyG12V_2           1         1        0            1
genyG12V_3           1         1        0            1

I tested the KOgenyKO and KOgenxKO coefficients for differentially expressed genes, but unfortunately this design does not seem to be sensitive enough to pick up differences in the expression profiles of these samples. I thought that modelling the GLM as ~KO*Mutation and testing the interaction coefficients (KOgenyKO:MutationG12V and KOgenxKO:MutationG12V) might help. The problem is that, because I don't have KO samples without the mutation, the columns KOgenxKO:MutationG12V and KOgenxKO (and likewise for geny) are identical, so the design matrix is not of full rank.
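
To make the rank problem concrete, here is a minimal base-R sketch built only from the sample layout in the table above. It counts the coefficients of the interaction model against the rank of its design matrix, and then builds one possible alternative: a single factor with one level per observed condition, which is full rank by construction. The group labels (`wt`, `G12V`, `G12V_genxKO`, `G12V_genyKO`) and the contrast names are illustrative assumptions, not something taken from the real data, and this is a sketch of one option rather than a recommended analysis.

```r
# Sample layout from the table above: 3 replicates per condition.
Mutation <- factor(rep(c("NoMut", "G12V", "G12V", "G12V"), each = 3),
                   levels = c("NoMut", "G12V"))
KO <- factor(rep(c("ctrl", "ctrl", "genx", "geny"), each = 3),
             levels = c("ctrl", "genx", "geny"))

# Interaction model: more coefficients than observed conditions.
design_int <- model.matrix(~ KO * Mutation)
n_coef   <- ncol(design_int)      # number of coefficients requested
rank_int <- qr(design_int)$rank   # number that can actually be estimated

# The KO main-effect column and its interaction column are identical,
# because every KO sample also carries the mutation.
same_genx <- all(design_int[, "KOgenx"] == design_int[, "KOgenx:MutationG12V"])

# One-factor alternative: one coefficient per observed condition.
# Group labels are hypothetical names chosen for this sketch.
group <- factor(rep(c("wt", "G12V", "G12V_genxKO", "G12V_genyKO"), each = 3),
                levels = c("wt", "G12V", "G12V_genxKO", "G12V_genyKO"))
design_grp <- model.matrix(~ 0 + group)
colnames(design_grp) <- levels(group)

# Contrast vectors, one entry per column of design_grp.
contrasts_grp <- cbind(
  genxKO_vs_G12V = c( 0, -1, 1, 0),
  genyKO_vs_G12V = c( 0, -1, 0, 1),
  genxKO_vs_wt   = c(-1,  0, 1, 0),
  genyKO_vs_wt   = c(-1,  0, 0, 1)
)
rownames(contrasts_grp) <- colnames(design_grp)
```

With this layout the interaction model asks for 6 coefficients while only 4 conditions were observed, so its rank is 4. If a group design like `design_grp` were used to fit the edgeR model, a single column of `contrasts_grp` could presumably be passed through the `contrast` argument of `glmLRT()` or `glmQLFTest()`; whether these particular comparisons answer the biological question is a separate matter.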

So if anyone could give me a piece of advice, a tutorial to read, or any tip to help me get out of this conundrum, I would be extremely grateful.

Cheers!

Gero.

RNA-Seq R Design GLM • 1.1k views