Question: How to create a design matrix for cpg.annotate?
3.6 years ago by
c.ryder330
c.ryder330 wrote:
> head(ICGC_2)
              naive.1   memoryCS.1   naive.2   memoryCS.2   naive.3  memoryCS.3
cg00000029  0.6199970   0.5703951  0.6383819   0.5831206  0.7012571  0.6000816
cg00000108  0.9083578   0.9105157  0.9030611   0.9103147  0.9115842  0.8947593
cg00000109  0.8694214   0.7525098  0.8478160   0.7725212  0.8645145  0.7636347
cg00000165  0.1911901   0.3050081  0.1810569   0.3750369  0.2250429  0.3094155
cg00000236  0.8666489   0.8382011  0.8586420   0.8369283  0.8860430  0.8439371
cg00000289  0.6653662   0.5512665  0.5815338   0.4773868  0.6254710  0.5408634

Above is a snippet of a data frame I have in R that contains 450K methylation beta values for 6 samples, 3 of which are from naive B cells and 3 of which are from memory class-switched B cells.

I would eventually like to identify differentially methylated genomic regions in the naive samples compared to the memoryCS samples using the Bioconductor package DMRcate.

However, I'm stuck on creating a design matrix for cpg.annotate. I've tried following the workflow available here: https://www.bioconductor.org/help/workflows/methylationArrayAnalysis/ but it doesn't explain very clearly how to go about creating a design matrix.

Can anyone explain how I can go about creating a design matrix that will allow me to compare the naive and memoryCS samples?

Thank you

3.6 years ago by
e.rempel1000
Germany, Heidelberg
e.rempel1000 wrote:

I will provide a short answer here, but I strongly recommend reading more about statistics, and about linear models in particular.

In the workflow you mention, the authors use the factors of interest to create the design matrix with the function model.matrix. In your case, the factor of interest is the cell type: naive or memoryCS. So you can create your design matrix like this:

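# One label per column of ICGC_2, in column order: naive.1, memoryCS.1, naive.2, ...
type_cells <- factor(rep(c("naive","memoryCS"),3), levels = c("naive","memoryCS"))
# model.matrix comes from the stats package (loaded by default), not from limma
design <- model.matrix(~0 + type_cells)
colnames(design) <- c("naive","memoryCS")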
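If you would rather not rely on the columns alternating in a fixed order, here is a minimal sketch that derives the cell type from the column names instead. It assumes the column names always follow the `<type>.<replicate>` pattern shown in the question; the two-probe data frame is only a toy stand-in for the real ICGC_2.

```r
# Toy stand-in for ICGC_2: the first two probes from the question.
ICGC_2 <- data.frame(
  naive.1    = c(0.6199970, 0.9083578),
  memoryCS.1 = c(0.5703951, 0.9105157),
  naive.2    = c(0.6383819, 0.9030611),
  memoryCS.2 = c(0.5831206, 0.9103147),
  naive.3    = c(0.7012571, 0.9115842),
  memoryCS.3 = c(0.6000816, 0.8947593),
  row.names  = c("cg00000029", "cg00000108")
)

# Strip the trailing ".1", ".2", ... to get the cell type of each column.
type_cells <- factor(sub("\\.[0-9]+$", "", colnames(ICGC_2)),
                     levels = c("naive", "memoryCS"))

# One indicator column per cell type, no intercept.
design <- model.matrix(~ 0 + type_cells)
colnames(design) <- levels(type_cells)
rownames(design) <- colnames(ICGC_2)

# Each row is a sample; a 1 marks the group that sample belongs to.
print(design)
```

Printing the matrix is a quick sanity check: every sample should have exactly one 1, in the column of its own cell type.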

Then you can fit a linear model to your data:

library(limma)
fit <- lmFit(ICGC_2, design)

From there you can continue with the rest of the workflow.
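For the DMRcate step itself, the workflow passes the design matrix together with a contrast to cpg.annotate. Below is a rough sketch of what that could look like for your comparison. The wrapper name `run_cpg_annotate` is hypothetical, the argument names follow the call shown in the workflow and may differ between DMRcate versions, and it assumes limma and DMRcate are installed and that the input has probe IDs as row names.

```r
# Hypothetical helper, not part of limma or DMRcate.
# betas:  data frame or matrix of 450K beta values, probe IDs as row names
# design: design matrix with columns named "naive" and "memoryCS"
run_cpg_annotate <- function(betas, design) {
  # Contrast of interest: naive minus memoryCS.
  cont_matrix <- limma::makeContrasts(naive - memoryCS, levels = design)

  DMRcate::cpg.annotate(
    datatype      = "array",
    object        = as.matrix(betas),   # cpg.annotate expects a matrix here
    what          = "Beta",             # the input holds beta values
    arraytype     = "450K",
    analysis.type = "differential",
    design        = design,
    contrasts     = TRUE,
    cont.matrix   = cont_matrix,
    coef          = "naive - memoryCS"  # column name created by makeContrasts
  )
}

# Usage on the real data (not run here; needs the full 450K matrix):
# myAnnotation <- run_cpg_annotate(ICGC_2, design)
```

The function is only defined, not called, because cpg.annotate needs the full set of probes and the array annotation packages to run.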

HTH
