Question

Meta-analysis of microarray data from different platforms

5.3 years ago

j_jamal96 ▴ 20

Hi all. For a meta-analysis, I am trying to find differentially expressed genes (DEGs) across individual studies that used different platforms: 3D-Gene Human Oligo chip 25k V2.1, Affymetrix Human Genome U133 Plus 2.0 Array (two studies), and Agilent-014850 Whole Human Genome Microarray 4x44K G4112F (two studies). I would be very grateful if anyone could tell me whether such an analysis is feasible and, if so, which program would be most suitable.

microarray meta analysis R Transcriptome • 1.4k views

updated 5.3 years ago by Kevin Blighe 87k • written 5.3 years ago by j_jamal96 ▴ 20

score 2 · Answer 1 · 2019-01-15

If the condition under study is the same in each experiment, process each experiment independently and then perform a meta-analysis on the results. One difficulty you'll face is that each array type requires different packages for processing.
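As a rough illustration of the 'meta-analysis on the results' step, here is a minimal sketch that combines per-study p-values with Fisher's method in base R. Fisher's method is just one option that I am assuming here (it ignores the direction of change, so check the fold-changes separately), and the gene names and p-values are made up.

```r
# Hypothetical per-study results: one p-value per gene, named by gene symbol
study_pvals <- list(
  studyA = c(TP53 = 0.01, MYC = 0.20, EGFR = 0.03),
  studyB = c(TP53 = 0.04, MYC = 0.50, EGFR = 0.001),
  studyC = c(TP53 = 0.02, MYC = 0.70)  # EGFR not measured in this study
)

# Fisher's method: -2 * sum(log(p)) follows a chi-squared distribution with
# 2k degrees of freedom under the null, where k is the number of p-values
fisher_combine <- function(p) {
  p <- p[!is.na(p)]
  stat <- -2 * sum(log(p))
  pchisq(stat, df = 2 * length(p), lower.tail = FALSE)
}

# Build a genes x studies matrix; genes missing from a study become NA
genes <- Reduce(union, lapply(study_pvals, names))
pmat <- sapply(study_pvals, function(p) p[genes])
rownames(pmat) <- genes

# Combine across studies, then adjust for multiple testing
combined <- apply(pmat, 1, fisher_combine)
combined_adj <- p.adjust(combined, method = "BH")
```

In a real analysis, the p-values would come from the per-study differential expression results, matched by gene rather than by probe.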

If the studies using the Affymetrix Human Genome U133 Plus 2.0 Array have exactly the same experimental set-up, you could feasibly combine them and normalise them together. You may then still want to adjust for batch effects afterwards (if you don't know what I am talking about, then ignore this). The same applies to the studies using the Agilent-014850 Whole Human Genome Microarray 4x44K G4112F.
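For the batch adjustment mentioned here, a minimal sketch with limma's removeBatchEffect() on simulated data. The matrix, sample names, study labels, and the size of the batch shift are all hypothetical; in practice 'expr' would be the jointly normalised log2 expression matrix.

```r
library(limma)

set.seed(1)
# Simulated log2 expression: 100 genes x 8 samples from two hypothetical studies
expr <- matrix(rnorm(800, mean = 8), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("s", 1:8)))
batch <- factor(rep(c("study1", "study2"), each = 4))
group <- factor(rep(c("control", "case"), times = 4),
                levels = c("control", "case"))

# Add an artificial shift to the second study to mimic a batch effect
expr[, batch == "study2"] <- expr[, batch == "study2"] + 1.5

# Remove the batch effect while protecting the group effect;
# the adjusted matrix is intended for plots such as PCA or clustering
design_group <- model.matrix(~ group)
expr_adj <- removeBatchEffect(expr, batch = batch, design = design_group)

# For differential expression, keep the unadjusted data and
# include batch as a covariate in the design matrix instead
design_de <- model.matrix(~ batch + group)
```

The limma documentation recommends the second route (batch in the design) for the actual statistical testing, and the first only for visualisation.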

Note that the 3D-Gene Human Oligo chip 25k V2.1 is a lower-density array than the others, so there will not be extensive overlap between it and the others, i.e., the arrays target different sets of genes.
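To see how much the platforms actually overlap, you can intersect the gene lists once each array is annotated. A minimal base R sketch, with made-up gene lists standing in for the real annotations:

```r
# Hypothetical gene symbols covered by each platform after annotation
platform_genes <- list(
  toray_3dgene  = c("TP53", "MYC", "EGFR"),
  affy_u133p2   = c("TP53", "MYC", "EGFR", "BRCA1", "KRAS"),
  agilent_4x44k = c("TP53", "EGFR", "BRCA1", "KRAS", "PTEN")
)

# Genes present on every platform
common <- Reduce(intersect, platform_genes)

# Fraction of each platform's genes that are shared by all platforms
shared_fraction <- sapply(platform_genes, function(g) mean(g %in% common))
```

Only the genes in 'common' can be tested in every study; genes covered by a subset of platforms can still be meta-analysed, but with fewer studies contributing.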

For the Agilent arrays, a general pipeline to read in and normalise:

library("limma")

#Read the targets file into a data frame
#readTargets will by default look for the 'FileName' column in the specified file
targetinfo <- readTargets("Targets.txt", sep="\t")

#Converts the data to an RGList (two-colour [red-green] array), with values for R, Rb, G, Gb
project <- read.maimages(targetinfo, source="agilent")

#Perform background correction on the fluorescent intensities
project.bgcorrect <- backgroundCorrect(project, method="normexp", offset=16)

#Normalize the data with the 'loess' method
#LOESS performs local regression on subsets of the data, resulting in the generation of a 'regression curve' through it
project.bgcorrect.norm <- normalizeWithinArrays(project.bgcorrect, method="loess")

#For replicate probes in each sample, replace values with the average
#ID is used to identify the replicates
project.bgcorrect.norm.avg <- avereps(project.bgcorrect.norm, ID=project.bgcorrect.norm$genes$ProbeName)

The file, Targets.txt, may look like:

FileName                                         WT_KO  Time
SampleFiles/251486810768_GE2-v5_95_Feb07_1_1.txt    WT  4Wk_TAC
SampleFiles/251486810768_GE2-v5_95_Feb07_1_2.txt    KO  4Dy_Rev
SampleFiles/251486810768_GE2-v5_95_Feb07_1_3.txt    KO  7Dy_Rev
SampleFiles/251486810942_GE2-v5_95_Feb07_1_1.txt    WT  4Wk_TAC
SampleFiles/251486810942_GE2-v5_95_Feb07_1_2.txt    WT  4Dy_Rev

--------------------------------------------------------

For the Affymetrix arrays:

library("limma") #provides readTargets
library("oligo")

#Read in the data into a dataframe
targetinfo <- readTargets("Targets.txt", sep="\t")
CELFiles <- list.celfiles("SampleFiles/", full.names = TRUE)

#Raw intensity data
project <- read.celfiles(CELFiles)

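#Background correct, normalize, and summarise to expression values with RMA
#target="core" applies to Gene ST / Exon ST arrays, such as the HuGene-2_0-st files in the example Targets.txt
#For 3' IVT arrays such as the Human Genome U133 Plus 2.0, the 'target' argument does not apply and should be omitted
project.bgcorrect.norm.avg <- rma(project, background=TRUE, normalize=TRUE, target="core")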

Targets.txt:

FileName                                    SampleID    Group
SampleFiles/1_CS0911a_(HuGene-2_0-st).CEL   CS0911a     KN92
SampleFiles/10_CS0812d_(HuGene-2_0-st).CEL  CS0812d     KN92_WNT3A
SampleFiles/11_CS0812e_(HuGene-2_0-st).CEL  CS0812e     KN93_WNT3A

---------------------------------------------------

After you have normalised the data, in each case, you can perform differential expression analysis using the limma package - again, this will be performed independently in each study.
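As a starting point for the differential expression step, a minimal limma sketch on simulated data. The expression matrix, the group labels, and the spiked-in genes are hypothetical; in practice 'expr' would be the normalised log2 expression values from one study.

```r
library(limma)

set.seed(42)
# Simulated log2 expression: 100 genes x 6 samples (3 control, 3 case)
expr <- matrix(rnorm(600, mean = 8), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("sample", 1:6)))
group <- factor(rep(c("control", "case"), each = 3),
                levels = c("control", "case"))

# Spike in a difference for the first 5 genes so there is something to detect
expr[1:5, group == "case"] <- expr[1:5, group == "case"] + 3

# Design matrix with control as the reference level
design <- model.matrix(~ group)

# Fit a linear model per gene, then moderate the variances with empirical Bayes
fit <- lmFit(expr, design)
fit <- eBayes(fit)

# Full results table for case vs control, with Benjamini-Hochberg adjusted p-values
results <- topTable(fit, coef = "groupcase", number = Inf, adjust.method = "BH")
```

The 'results' table from each study (logFC, P.Value, adj.P.Val) is what would then feed into the meta-analysis.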

It is your role to learn what each step above is doing, and it is your role to learn how to use limma to perform differential expression analysis. You can also learn how to annotate your data with gene names (which will be required, possibly using biomaRt) and to perform the end meta-analysis, if that is your aim.
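For the annotation step, a minimal sketch with biomaRt. The helper name 'annotate_probes' is my own invention, and I am assuming the Ensembl attribute 'affy_hg_u133_plus_2' for the U133 Plus 2.0 probe IDs; for the other arrays, look up the matching attribute with listAttributes() on the mart object. It needs an internet connection to query Ensembl.

```r
# Hypothetical helper: map probe IDs to HGNC gene symbols via Ensembl BioMart
annotate_probes <- function(probe_ids, probe_attribute = "affy_hg_u133_plus_2") {
  if (!requireNamespace("biomaRt", quietly = TRUE)) {
    stop("Please install the biomaRt package from Bioconductor")
  }

  # Connect to the human gene dataset in Ensembl
  mart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")

  # Retrieve one row per probe-to-gene mapping
  biomaRt::getBM(
    attributes = c(probe_attribute, "hgnc_symbol"),
    filters    = probe_attribute,
    values     = probe_ids,
    mart       = mart
  )
}

# Example call (requires network access):
# annot <- annotate_probes(c("1007_s_at", "1053_at"))
```

Be aware that a probe can map to several genes, or to none, so the returned table may have more or fewer rows than the number of probes supplied.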

What I have written here is a rough guide to help you get started.

Good luck

Kevin