Question: Model Matrix Design - Replicates Within and Between Batches
adamlangenbucher wrote (2.7 years ago):

Hello all,

I'm trying to combine two datasets for analysis and have some questions about setting up the design matrix. I need to account for differences between two batches, two conditions, and technical replicates. The target list looks something like the following:

Sample     Condition     Batch     Individual    
  1           1            1           1   
  2           1            1           2   
  3           1            1           3   
  4           1            1           4   
  5           1            1           5   
  6           1            1           6   
  7           1            1           7   
  8           1            1           8   
  9           1            2           1   
  10          1            2           1   
  11          1            2           2   
  12          1            2           2   
  13          2            2           9   
  14          2            2           9   
  15          2            2           10   
  16          2            2           10    
  17          2            2           11    
  18          2            2           11

As you can see, there are both technical replicates within batches (all individuals in batch 2 have replicates), as well as replicates between batches (individuals 1 and 2 are present in both batches).

I can't create a model matrix as is, presumably because the variables are not linearly independent. Is there any way to develop a consistent model matrix to account for all these variables, given that not all individuals have replicates?

Thanks for your help,

-Adam

russhh (UK, U. Glasgow) wrote (2.7 years ago):

With targets as defined:

targets.df <- data.frame(
    Sample = 1:18,
    Condition = c(rep(1, 12), rep(2, 6)),
    Batch = c(rep(1, 8), rep(2, 10)),
    Individual = c(1:8, rep(c(1,2,9,10,11), each = 2))
    )
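
For context, here is a quick sketch of why the obvious additive model fails. It assumes the naive formula would enter Condition, Batch and Individual as factors (the question does not show the exact call), and it uses base R only:

```r
# Targets table from the question, rebuilt so the block runs on its own
targets.df <- data.frame(
    Sample = 1:18,
    Condition = c(rep(1, 12), rep(2, 6)),
    Batch = c(rep(1, 8), rep(2, 10)),
    Individual = c(1:8, rep(c(1, 2, 9, 10, 11), each = 2))
)

# Naive additive model (assumed formula): every variable entered as a factor
naive <- model.matrix(
    ~ factor(Condition) + factor(Batch) + factor(Individual),
    data = targets.df
)

n_cols <- ncol(naive)      # number of coefficients requested
n_rank <- qr(naive)$rank   # number of linearly independent columns

# Each individual appears under one condition only, so the Condition
# column equals the sum of the Individual columns for 9, 10 and 11
cond2 <- naive[, "factor(Condition)2"]
ind_sum <- rowSums(naive[, paste0("factor(Individual)", 9:11)])
nested <- all(cond2 == ind_sum)

print(c(columns = n_cols, rank = n_rank))
print(nested)
```

The rank comes out lower than the column count because Condition is nested within Individual, so at least one individual column has to be dropped by hand.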

It should be doable, assuming that no individual is exposed to both conditions and that no individual is replicated within batch 1. It does need a hand-built design matrix, though:

You need binary columns for: i) the intercept; ii) Condition 2; iii) Batch 1. Then you need a binary column for each individual who is present in both batches (here, individuals 1 and 2). Finally, you need a binary column for all but one of the remaining individuals in batch 2 (here, individuals 9 and 10, leaving individual 11 as the reference).

So for the targets data.frame you've posted (apologies for the ugly code),

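design <- with(targets.df,
  data.frame(
    intercept = 1,
    cond2 = ifelse(Condition == 2, 1, 0),   # condition effect
    batch1 = ifelse(Batch == 1, 1, 0),      # batch effect
    # individuals present in both batches
    match1 = ifelse(Individual == 1, 1, 0),
    match2 = ifelse(Individual == 2, 1, 0),
    # all but one of the remaining batch 2 individuals (11 is the reference)
    match9 = ifelse(Individual == 9, 1, 0),
    match10 = ifelse(Individual == 10, 1, 0)
    ))

# Convert to a matrix before checking the rank
Matrix::rankMatrix(as.matrix(design)) # 7, i.e. full column rank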
> design
   intercept cond2 batch1 match1 match2 match9 match10
1          1     0      1      1      0      0       0
2          1     0      1      0      1      0       0
3          1     0      1      0      0      0       0
4          1     0      1      0      0      0       0
5          1     0      1      0      0      0       0
6          1     0      1      0      0      0       0
7          1     0      1      0      0      0       0
8          1     0      1      0      0      0       0
9          1     0      0      1      0      0       0
10         1     0      0      1      0      0       0
11         1     0      0      0      1      0       0
12         1     0      0      0      1      0       0
13         1     1      0      0      0      1       0
14         1     1      0      0      0      1       0
15         1     1      0      0      0      0       1
16         1     1      0      0      0      0       1
17         1     1      0      0      0      0       0
18         1     1      0      0      0      0       0

I'd strongly urge you to test this design though.
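
One possible way to run such a test, as a rough sketch in base R: check that the design has full column rank, then confirm that known coefficients are recovered from noise-free simulated data. The values in `beta` are made up purely for illustration, and the block rebuilds the targets and design so it runs on its own.

```r
# Targets and design rebuilt from the thread so the block is self-contained
targets.df <- data.frame(
    Sample = 1:18,
    Condition = c(rep(1, 12), rep(2, 6)),
    Batch = c(rep(1, 8), rep(2, 10)),
    Individual = c(1:8, rep(c(1, 2, 9, 10, 11), each = 2))
)

design <- with(targets.df, cbind(
    intercept = 1,
    cond2 = as.numeric(Condition == 2),
    batch1 = as.numeric(Batch == 1),
    match1 = as.numeric(Individual == 1),
    match2 = as.numeric(Individual == 2),
    match9 = as.numeric(Individual == 9),
    match10 = as.numeric(Individual == 10)
))

# Check 1: every column is linearly independent of the others
full_rank <- qr(design)$rank == ncol(design)

# Check 2: simulate a response with known (hypothetical) coefficients and
# no noise, then see whether a least-squares fit returns the same values
beta <- c(intercept = 5, cond2 = 2, batch1 = -1,
          match1 = 0.5, match2 = -0.5, match9 = 0.3, match10 = -0.3)
y <- drop(design %*% beta)
fit <- lm.fit(design, y)
recovered <- isTRUE(all.equal(unname(fit$coefficients), unname(beta)))

print(full_rank)
print(recovered)
```

If either check fails after the targets change (for example, when more individuals are added), the set of per-individual columns needs to be revisited.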
