Question: Making a design matrix for samples with one wild type and four different results
moru_all wrote (12 months ago):

Hello, I'm new to the bioinformatics field and am learning R to find differentially expressed genes (DEGs) in my samples. (I apologize for my English; I'm not used to writing in English.)

I have four samples from the same cell line that received the same treatment, but they are assumed to have different characteristics because of other factors (this is just a characteristic of the experiment).

I know that I need to use the model.matrix() function in R to set up the design for edgeR or voom (limma).

Because of my limited knowledge of statistics and bioinformatics, I'm not sure how to build the design matrix properly.

Can anybody give me an idea of how to design the matrix?

I have thought of a few options, but I'm not sure about them because there is no replicate for the wild type.

(1) First try:

             | DIFF | WT
ResultCell.1 |    1 |  0
ResultCell.2 |    1 |  0
ResultCell.3 |    1 |  0
ResultCell.4 |    1 |  0
WildTypeC    |    0 |  1

The code:

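library(edgeR)
# YY is assumed to be a DGEList holding the counts for the five samples
TAG = factor(c(rep("DIFF",4),"WT"))
design = model.matrix(~0+TAG)   # columns: TAGDIFF, TAGWT
YY = estimateGLMCommonDisp(YY, design, verbose=TRUE)
YY = estimateGLMTrendedDisp(YY, design)
YY = estimateGLMTagwiseDisp(YY, design)
FIT = glmFit(YY, design)
LRT = glmLRT(FIT, contrast = c(1,-1))   # tests DIFF - WT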
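
As a quick check of what this first design encodes, the sketch below rebuilds the same matrix in base R and counts the libraries behind each coefficient. The row names are illustrative labels taken from the table above, and nothing here needs edgeR or the count data.

```r
# Rebuild the first design in base R (no edgeR needed).
TAG <- factor(c(rep("DIFF", 4), "WT"))
design <- model.matrix(~ 0 + TAG)

# Row names are illustrative labels taken from the table in the post.
rownames(design) <- c(paste0("ResultCell.", 1:4), "WildTypeC")
print(design)

# Number of libraries supporting each coefficient.
samples_per_group <- colSums(design)
print(samples_per_group)

# Label the contrast vector so it is clear which column gets +1 and which -1.
contrast <- c(1, -1)
names(contrast) <- colnames(design)
print(contrast)
```

Naming the contrast vector this way makes it easy to confirm that c(1,-1) means DIFF minus WT, and that the WT coefficient rests on a single library.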

(2) Second try:

             | DIFF | WT
ResultCell.1 |    1 |  0
WildTypeCel  |    0 |  1
ResultCell.2 |    1 |  0
WildTypeCel  |    0 |  1
ResultCell.3 |    1 |  0
WildTypeCel  |    0 |  1
ResultCell.4 |    1 |  0
WildTypeCel  |    0 |  1

The code:

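# y is assumed to be a DGEList with eight columns, in which the single WT library is listed four times
SAMPLES = factor(c(1,1,2,2,3,3,4,4))
GROUP = factor(rep(c("DIFF","WT"),4))
design = model.matrix(~GROUP+SAMPLES)   # columns: (Intercept), GROUPWT, SAMPLES2, SAMPLES3, SAMPLES4
y = estimateGLMCommonDisp(y, design, verbose=TRUE)
y = estimateGLMTrendedDisp(y, design)
y = estimateGLMTagwiseDisp(y, design)
FIT = glmFit(y, design)
LRT = glmLRT(FIT, contrast = c(0,-1,0,0,0))   # -GROUPWT, i.e. DIFF - WT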
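
To see which coefficient that five-element contrast actually picks out, the second design can be rebuilt in base R and the contrast labelled with the column names. This is only a sketch of the bookkeeping; it does not touch the counts, and it does not change the fact that the four WT rows are copies of one library.

```r
# Rebuild the second design in base R (no edgeR needed).
SAMPLES <- factor(c(1, 1, 2, 2, 3, 3, 4, 4))
GROUP <- factor(rep(c("DIFF", "WT"), 4))
design <- model.matrix(~ GROUP + SAMPLES)

# With an intercept, DIFF is the baseline level and GROUPWT is WT relative to DIFF.
print(colnames(design))

# Attach the column names to the contrast used in glmLRT.
contrast <- c(0, -1, 0, 0, 0)
names(contrast) <- colnames(design)

# Only the non-zero entry matters: -1 on GROUPWT, i.e. DIFF minus WT.
print(contrast[contrast != 0])
```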

I tried both, and the results are really different. The plot from the second design matrix looked much better to me because it was much more organized. I tried to follow the edgeR help pages and searched online, but I could not find a case that compares a single wild type against several other samples (as far as I could tell from my search).

Tags: voom, deg, rna-seq, edger, R

Marks wrote (12 months ago):

You do not have enough replicates for DE analysis. There's probably a way to perform 1v1 comparison but it's not statistically sound. Do you have a second replicate of WildTypeCel? Depending on what you aim to do, you will also need more replicates for the other ResultCell.#.

If you have the minimum required replicates, and ResultCell.1/2/3/4 are all different, your design matrix would look like this:

             | DIFF | WT
ResultCell.1 |    1 |  0
ResultCell.1 |    1 |  0
WildTypeC    |    0 |  1
WildTypeC    |    0 |  1

And so on for the rest of the ResultCell.# samples. There are ways to set up more complicated comparisons using model formulas, but I won't go into them here.
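
One way such a multi-group comparison could be laid out is sketched below in base R. It assumes a hypothetical experiment with two replicates for each ResultCell and for the wild type (the data in the question has only one of each), and the group labels RC1 to RC4 and the helper make_vs_wt are made up for illustration.

```r
# Hypothetical layout: two replicates per ResultCell and two for the wild type.
group <- factor(rep(c("RC1", "RC2", "RC3", "RC4", "WT"), each = 2),
                levels = c("RC1", "RC2", "RC3", "RC4", "WT"))

# One column per group, no intercept.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Illustrative helper: +1 for the chosen ResultCell, -1 for WT, 0 elsewhere.
make_vs_wt <- function(cell) {
  v <- setNames(numeric(ncol(design)), colnames(design))
  v[cell] <- 1
  v["WT"] <- -1
  v
}

# One contrast column per ResultCell-versus-WT comparison.
contrasts_vs_wt <- sapply(c("RC1", "RC2", "RC3", "RC4"), make_vs_wt)
print(contrasts_vs_wt)
```

Each column could then be passed to glmLRT in turn, for example contrast = contrasts_vs_wt[, "RC1"], to test that ResultCell against the wild type.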

If the ResultCell.# samples are all very similar and essentially replicates of each other, then your matrix would look like this:

             | DIFF | WT
ResultCell.1 |    1 |  0
ResultCell.2 |    1 |  0
ResultCell.3 |    1 |  0
ResultCell.4 |    1 |  0
WildTypeC    |    0 |  1
WildTypeC    |    0 |  1

Note the 2 replicates for WildTypeC.

Currently, you do not have enough replicates. Unfortunately, this means you won't be able to perform a proper DE analysis. There is a section in the edgeR manual on what to do if you have no replicates. The manual is fantastic, and I highly recommend you read it thoroughly.
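
For reference, one of the options described in that section is to supply a plausible dispersion value instead of estimating it. Below is a minimal sketch using simulated counts and an assumed BCV of 0.4, a value the guide mentions as typical for human data. The gene names, sample names and counts are made up, and any p-values obtained this way are only as trustworthy as the assumed dispersion.

```r
library(edgeR)

set.seed(1)   # reproducible simulated counts
bcv <- 0.4    # assumed biological coefficient of variation, not estimated

# Simulated counts: 20 genes x 2 libraries, standing in for one ResultCell and the WT.
counts <- matrix(rnbinom(40, size = 1 / bcv^2, mu = 10), nrow = 20, ncol = 2,
                 dimnames = list(paste0("gene", 1:20),
                                 c("ResultCell.1", "WildTypeC")))

y <- DGEList(counts = counts, group = factor(c("DIFF", "WT")))

# The dispersion is supplied by hand because it cannot be estimated
# from one library per group.
# By default exactTest compares the second group level against the first,
# so logFC here is WT relative to DIFF.
et <- exactTest(y, dispersion = bcv^2)
print(topTags(et))
```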

Good luck
