Question: EdgeR matrix design and comparisons for paired samples
silas008 (Brazil) wrote, 6 weeks ago:

Hi guys,

I am a bit confused about the statistics for paired samples in edgeR.

I have 4 different treatments, A, B, C and D, each one with 4 samples. 2 of those samples are "before" treatment and the other 2 are "after" treatment.

If I am reading the edgeR manual correctly, the design matrix should be built like this:

> groups <- factor(targets$Group)
> treatment <- factor(targets$Treatment, levels=c("before","after"))
> design <- model.matrix(~groups+treatment)

But my data is a simple table with the genes in the first column and the samples in the other columns. How can I construct the model matrix for this table format?

I think I can simply load the table as a matrix and assign the factors to the samples:

> rownames(my_table) <- my_table[, 1]        # gene IDs are in the first column
> my_table <- data.matrix(my_table[, -1])    # keep only the count columns
> groups <- factor(rep(c("A", "B", "C", "D"), each = 4))
> treatment <- factor(rep(c("before", "before", "after", "after"), times = 4), levels = c("before", "after"))
> design <- model.matrix(~groups+treatment)
> y <- DGEList(counts=my_table, group=groups)

But I don't know if this is correct.

Can anyone help with that? I'd really appreciate it.

Thanks

Tags: edger, rna-seq
h.mon (Brazil) wrote, 6 weeks ago:

The "correct" way depends on what A, B, C, D, before and after are, and on what you are interested in testing. It seems to me that a better approach in your case (not that what you did is wrong) would be to create a single factor combining both group and treatment:

Group <- factor( paste( groups, treatment, sep = "." ) )
design <- model.matrix( ~ 0 + Group )
y <- DGEList( counts = my_table, group = Group )
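
To make the combined-factor idea concrete, here is a minimal base R sketch. It assumes the 16 samples are ordered as four groups (A to D), each with two "before" samples followed by two "after" samples; that layout and the name `contrast_A` are assumptions for illustration, not something stated in the question.

```r
# Assumed sample layout: 4 groups x (2 before + 2 after), in column order.
groups    <- factor(rep(c("A", "B", "C", "D"), each = 4))
treatment <- factor(rep(c("before", "before", "after", "after"), times = 4),
                    levels = c("before", "after"))

# One level per group/treatment combination, e.g. "A.before", "A.after".
Group  <- factor(paste(groups, treatment, sep = "."))
design <- model.matrix(~ 0 + Group)
colnames(design) <- levels(Group)   # drop the "Group" prefix from column names

# Contrast vector for "after vs before" within group A:
# +1 on A.after, -1 on A.before, 0 on every other column.
contrast_A <- setNames(numeric(ncol(design)), colnames(design))
contrast_A["A.after"]  <-  1
contrast_A["A.before"] <- -1
print(contrast_A)
```

With a design like this, each column is the mean of one group/treatment combination, so any comparison can be written as a contrast between columns. A vector such as `contrast_A` could then be passed as the `contrast` argument of edgeR's `glmLRT()` or `glmQLFTest()` after fitting the model. The rows of `design` must be in the same order as the sample columns of the count table.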

You wrote: "I have 4 different treatments, A, B, C and D, each one with 4 samples. 2 of those samples are 'before' treatment and the other 2 are 'after' treatment."

If A, B, C and D are treatments, why do you name the factor that describes them "group"? And if before and after are sampling times, why do you call that factor "treatment" instead of "time"? Naming variables adequately will make your code easier to understand.
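
As a rough illustration of the naming point, the sample annotation could be kept in a small table with descriptive column names. The data frame `samples` and its column names below are hypothetical, and the sample order is assumed to be the one described in the question.

```r
# Hypothetical sample sheet: one row per sample, in count-table column order.
samples <- data.frame(
  sample    = paste0(rep(c("A", "B", "C", "D"), each = 4), 1:4),
  treatment = factor(rep(c("A", "B", "C", "D"), each = 4)),
  time      = factor(rep(c("before", "before", "after", "after"), times = 4),
                     levels = c("before", "after"))
)

# Additive design from the question, now with self-explanatory coefficients.
design <- model.matrix(~ treatment + time, data = samples)
print(colnames(design))
```

Here the coefficient names are built from the factor names, so the last column reads as the overall "after vs before" effect rather than an ambiguous "treatment" effect.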
