design matrix for dge
5.5 years ago

I am doing cross-platform normalization and network-based analysis of gene expression data to find potential cancer biomarkers. I have merged 3 datasets without any treatment (two on the Affymetrix Human Genome U133A Array and one on the Affymetrix Human Genome U133 Plus 2.0 Array; platforms GPL96 and GPL570) using the inSilicoMerging package. I have 229 cancer samples and 74 normal samples, i.e. a ratio of about 3:1. I want to do a linear model analysis to find differentially expressed genes.

limma r differential gene expression • 1.8k views

Wow, meta-analysis is considerably more difficult than single-dataset analysis. Have you analysed each dataset separately yet?


Yes, I have normalized each dataset separately and then merged them using inSilicoMerging. Also, I am not doing a meta-analysis; this is a cross-platform normalization method, which is different.

5.5 years ago

Actually, I don't know how to write the code for the design matrix.


I'd suggest you follow @WouterDeCoster's suggestion and look at the limma User's Guide.


And afterwards, if you have more specific questions, feel free to come back. But we're not here to help you perform your analysis from A to Z. If you get stuck and have a specific problem, we would be happy to help. Quicker answer and free cookie if you give an extensive explanation of what you want, what doesn't work and what you've tried so far.


Code that I have tried for merging the datasets:

library(inSilicoMerging)

esets = list(ALLSet1, ALLSet2, ALLSet3, ALLSet4); esets

eset_NONE = merge(esets, method="NONE");

eset_COMBAT = merge(esets, method="COMBAT")

exprs(eset_COMBAT)

pData(eset_NONE)

colnames(pData(eset_NONE))

# VALIDATION

plotMDS(eset_NONE,
colLabel = "X.Sample_characteristics_ch1", symLabel = "Study", main = "NONE (No Transformation)")

plotMDS(eset_COMBAT, colLabel = "X.Sample_characteristics_ch1", symLabel = "Study", main = "COMBAT")

plotRLE(eset_NONE, colLabel = "Study", main = "NONE (No Transformation)"); plotRLE(eset_COMBAT, colLabel = "Study", main = "COMBAT");

gene = sample(rownames(exprs(eset_NONE)), 1)

plotGeneWiseBoxPlot(eset_NONE, batchLabel = "Study", colLabel = "X.Sample_characteristics_ch1", gene = gene, main = "NONE (No Transformation)");

plotGeneWiseBoxPlot(eset_COMBAT, batchLabel = "Study", colLabel = "X.Sample_characteristics_ch1", gene = gene, main = "COMBAT");

idx = sample(1:ncol(eset_NONE), 40);

plotDensities(eset_NONE[,idx], batchAnnot = "Study", main = "NONE (No Transformation)", legend = FALSE);

plotDensities(eset_COMBAT[,idx], batchAnnot = "Study", main = "COMBAT", legend = FALSE)

Finally, I have an expression matrix that is normalized, and now I want to do a limma analysis, i.e. a linear model analysis. I am confused and not able to write the code for the design matrix.
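For readers with the same question, a design matrix for a two-group comparison can be sketched with base R alone. This is only an illustration under stated assumptions: the `group` vector and its level names are made up here (the counts match the 229/74 split given in the question), and in a real analysis the labels would come from the phenotype data of the merged ExpressionSet, for example the `X.Sample_characteristics_ch1` column used in the plots above.

```r
# Hypothetical sample annotation: 229 tumour and 74 normal samples.
# In practice these labels would be read from pData() of the merged set.
group <- factor(c(rep("TUMOR", 229), rep("NORMAL", 74)),
                levels = c("TUMOR", "NORMAL"))

# One column per group and no intercept, so each coefficient is a group mean.
design <- model.matrix(~ 0 + group)

# Take the column names from the factor levels so labels cannot be swapped.
colnames(design) <- levels(group)

dim(design)      # samples by groups
colSums(design)  # number of samples in each group
```

Setting the factor levels explicitly matters because R otherwise orders levels alphabetically, which may not match the order in which column names are assigned by hand.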


Context is important. I'd propose that you edit your original question to include well-formatted code, what you're trying to achieve, the experimental question, how many samples you have, what treatments were applied, what arrays were used, etc.


Thank you @andrewj.skeleton73 for your reply. My original question is updated. If you want any other information, I will provide it.


Is there anything else you need to know?


Hi @andrewj.skeleton73, I tried some code but I am getting an error.

My code is:

samples <- c(eset1$characteristics, eset$characteristics)

samples <- as.factor(samples)

samples

design <- model.matrix(~0 + samples)

colnames(design) <- c( "TUMOR", "NORMAL")

design

levels(samples)

library(limma)

fit <- lmFit(filteredEset, design)

contrast.matrix <- makeContrasts("TUMOR-NORMAL", levels = "samples")


You seem to be on the right track. Try:

contrast.matrix <- makeContrasts("TUMOR-NORMAL", levels = colnames(design))
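To put the fix in context, here is a minimal sketch of the remaining limma steps (contrast fit, empirical Bayes moderation, and a ranked gene table). It assumes the limma package is installed; the expression matrix `expr` and the sample labels are simulated purely for illustration and are not the poster's data. One caveat worth checking in the code above: factor levels are sorted alphabetically by default, so assigning `c("TUMOR", "NORMAL")` by hand can mislabel the columns if the underlying labels sort the other way. The sketch sets the levels explicitly and reuses them as column names.

```r
library(limma)

set.seed(1)
# Simulated log2 expression: 100 genes x 12 samples (illustrative only).
expr <- matrix(rnorm(100 * 12, mean = 8), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("s", 1:12)))
samples <- factor(rep(c("TUMOR", "NORMAL"), each = 6),
                  levels = c("TUMOR", "NORMAL"))
# Shift the first five genes upward in the tumour samples.
expr[1:5, samples == "TUMOR"] <- expr[1:5, samples == "TUMOR"] + 3

design <- model.matrix(~ 0 + samples)
colnames(design) <- levels(samples)  # column order follows the factor levels

fit <- lmFit(expr, design)
contrast.matrix <- makeContrasts(TUMOR - NORMAL, levels = colnames(design))
fit2 <- eBayes(contrasts.fit(fit, contrast.matrix))

# Ten top-ranked genes for tumour vs normal, with BH-adjusted p-values.
top <- topTable(fit2, coef = 1, number = 10, adjust.method = "BH")
head(top)
```

A positive `logFC` in this table means higher expression in tumour than in normal, because the contrast is written as TUMOR minus NORMAL.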


Thanks, it worked. Now I have a list of DEGs. Can you tell me what could be done next?


It depends on your experimental question. Downstream of differential expression analysis, common options include pathway analysis and GO term enrichment.
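As a rough illustration of what a GO term over-representation test computes, the core is a hypergeometric tail probability. The numbers below are entirely hypothetical and only show the arithmetic; dedicated enrichment tools add annotation lookup and multiple-testing correction across terms on top of this.

```r
# Toy numbers (hypothetical): 10000 genes measured, 200 called DE,
# 150 genes annotated to one GO term, 12 of those are DE.
universe  <- 10000
n_de      <- 200
n_term    <- 150
n_overlap <- 12

# Probability of seeing 12 or more annotated genes among the DE genes
# if the DE genes were drawn at random from the universe.
p_enrich <- phyper(n_overlap - 1, n_term, universe - n_term, n_de,
                   lower.tail = FALSE)
p_enrich
```

With these toy numbers only about 3 annotated genes would be expected among the DE genes by chance (200 × 150 / 10000), so an overlap of 12 gives a small p-value.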


Is this work sufficient to be published, or do I need to go further with network analysis (or whatever you suggest) before publishing? Publishing a paper is mandatory to complete my degree...


You mention in your OP that your aim is to find a "cancer biomarker". Combining three datasets and doing a differential expression test is probably not enough unless you have something novel. It would be plausible if you found something significantly differentially expressed in the combined dataset that you don't see in each dataset individually, but even then I'd be skeptical and say you'd have to validate by qPCR at least. Even after that, you need something that ties it to the pathology of the disease... I think you should consult with your supervisor about this.