Analysis of HGU133 set microarrays / Merge ExpressionSets
6.1 years ago
giroudpaul ▴ 70

Hello,

I am currently trying to extract data from a GEO dataset that was generated on the Affymetrix HGU133 platform, meaning that every sample was run on two different chips: hgu133a and hgu133b.

Following the advice given here:

https://support.bioconductor.org/p/5339/#5341

https://support.bioconductor.org/p/2411/

Combining Two Platforms Affy Hgu133A And Hgu133B

I ran the RMA normalization separately on each chip, so I now have two expression sets:

data.rmaA = rma(dataA)
data.rmaB = rma(dataB)


The next step would be to combine these two expression sets into one before continuing my analysis with limma.

What is the best way to do this?

This appears to be a common question that is not specific to my case. The trouble is that the solutions I found are always for combining different chips that share probes, or strictly identical chips, whereas in my case there are two "sister" chips that were run in parallel (same samples, at the same time).
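To make the "same samples on both chips" pairing concrete, here is a minimal sketch that derives a shared sample key from the CEL file names. It only uses the three names per chip that are visible in the ExpressionSet printouts, and `sample_key` is a hypothetical helper written for this naming pattern, not a function from any package.

```r
# CEL file names as printed for the two ExpressionSets (three of the 15 per chip).
namesA <- c("GSM115046_M0_D1_chipA.CEL",
            "GSM115047_M0_D2_chipA.CEL",
            "GSM115060_M2_D3_chipA.CEL")
namesB <- c("GSM115061_M0_D1_chipB.CEL",
            "GSM115062_M0_D2_chipB.CEL",
            "GSM115075_M2_D3_chipB.CEL")

# Hypothetical helper: drop the GSM accession and the chip suffix,
# keeping only the part that identifies the biological sample (e.g. "M0_D1").
sample_key <- function(x) sub("^GSM[0-9]+_(.*)_chip[AB]\\.CEL$", "\\1", x)

keyA <- sample_key(namesA)
keyB <- sample_key(namesB)

# For each chip A sample, the position of its partner among the chip B samples.
idx <- match(keyA, keyB)

# Every chip A sample should have exactly one chip B partner.
stopifnot(!anyNA(idx), !anyDuplicated(idx))
```

With the real objects, the same keys could be computed from `sampleNames(data.rmaA)` and `sampleNames(data.rmaB)`, assuming all 15 file names follow this pattern.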

So far I have tried two solutions, and neither worked:

First, I tried combineTwoExpressionSet from the a4Base package, which gives the error below, probably because it is not meant to merge ExpressionSets from different chips with different numbers of probes (as I have).

data.rmaAB = combineTwoExpressionSet(data.rmaA, data.rmaB)
Error in cbind(assayData(x)$exprs, assayData(y)$exprs) :
number of rows of matrices must match (see arg 2)
> dim(assayData(data.rmaA)$exprs)
[1] 22283    15
> dim(assayData(data.rmaB)$exprs)
[1] 22645    15
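The dimension mismatch can be reproduced with small mock matrices. This is only a sketch: `exprsA` and `exprsB` are made-up stand-ins for `exprs(data.rmaA)` and `exprs(data.rmaB)`, with different probes (rows) but the same samples (columns). It shows why a column-wise bind fails here, and that a row-wise bind is the operation whose shape fits, provided the columns are first put in the same sample order.

```r
set.seed(1)
samples <- c("M0_D1", "M0_D2", "M2_D3")

# Mock expression matrices: 4 probes on "chip A", 5 probes on "chip B".
exprsA <- matrix(rnorm(4 * 3), nrow = 4,
                 dimnames = list(paste0("probeA_", 1:4), samples))
exprsB <- matrix(rnorm(5 * 3), nrow = 5,
                 dimnames = list(paste0("probeB_", 1:5), samples))

# cbind() requires equal row counts, so it fails as in the error above.
cbind_result <- tryCatch(cbind(exprsA, exprsB),
                         error = function(e) conditionMessage(e))

# rbind() stacks probes. It matches columns by position, not by name,
# so the sample order has to be checked explicitly beforehand.
stopifnot(identical(colnames(exprsA), colnames(exprsB)))
exprsAB <- rbind(exprsA, exprsB)
dim(exprsAB)
```

Whether stacking the two RMA outputs like this is statistically appropriate is a separate question from whether the shapes fit.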


Second, the inSilicoMerging package seemed promising, but it did not work for me either, because it looks for probes common to the ExpressionSets, and there are only 168 of them (the Affymetrix QC probes).
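The overlap between the two chips can be checked directly from the probe IDs. The sketch below uses made-up IDs in place of `featureNames(data.rmaA)` and `featureNames(data.rmaB)`; the chip-specific IDs are invented for illustration, and the assumption that control probes can be recognised by an "AFFX" prefix should be checked against the real annotation.

```r
# Mock probe IDs: two chip-specific probes each, plus two shared control probes.
idsA <- c("a_0001_at", "a_0002_at", "AFFX-BioB-5_at", "AFFX-BioC-5_at")
idsB <- c("b_0001_at", "b_0002_at", "AFFX-BioB-5_at", "AFFX-BioC-5_at")

# Probes present on both chips.
shared <- intersect(idsA, idsB)
length(shared)

# Assumption: control probes carry the "AFFX" prefix.
only_control <- all(grepl("^AFFX", shared))

# Probes that exist on one chip only, which a merge on common probes would drop.
onlyA <- setdiff(idsA, idsB)
onlyB <- setdiff(idsB, idsA)
```

If the shared set really consists only of control probes, a merge restricted to common probes discards nearly all of the data on both chips.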

Still, plotMDS and plotRLE from this same package point to a batch effect between the hgu133A and hgu133B ExpressionSets. So do I still need to perform some kind of normalization between hgu133A and hgu133B?

My ExpressionSets:

> data.rmaA
ExpressionSet (storageMode: lockedEnvironment)
assayData: 22283 features, 15 samples
element names: exprs
protocolData
sampleNames: GSM115046_M0_D1_chipA.CEL GSM115047_M0_D2_chipA.CEL ...
GSM115060_M2_D3_chipA.CEL (15 total)
varLabels: ScanDate
phenoData
sampleNames: GSM115046_M0_D1_chipA.CEL GSM115047_M0_D2_chipA.CEL ...
GSM115060_M2_D3_chipA.CEL (15 total)
varLabels: sample
featureData: none
experimentData: use 'experimentData(object)'
Annotation: hgu133a
> data.rmaB
ExpressionSet (storageMode: lockedEnvironment)
assayData: 22645 features, 15 samples
element names: exprs
protocolData
sampleNames: GSM115061_M0_D1_chipB.CEL GSM115062_M0_D2_chipB.CEL ...
GSM115075_M2_D3_chipB.CEL (15 total)
varLabels: ScanDate
phenoData
sampleNames: GSM115061_M0_D1_chipB.CEL GSM115062_M0_D2_chipB.CEL ...
GSM115075_M2_D3_chipB.CEL (15 total)
varLabels: sample
featureData: none
experimentData: use 'experimentData(object)'
Annotation: hgu133b

microarray expressionSet hgu133 R Bioconductor