Question

gcRMA normalize and run differential expression for signal intensity data without CEL files

4.9 years ago

shawn.w.foley ★ 1.3k

Hello!

I have some Illumina Bead Array data from a publication, and I'm struggling to analyze them properly. The data are deposited in GEO, and the arrays were run on Illumina HumanWG-6_V2_0_R2 and Illumina HumanWG-6 v2.0 expression beadchip. All of my experience is with Affy, so this may be a case of user error.

As opposed to a .cel file, these data are uploaded as .txt files. For example:

ID_REF  VALUE   Detection Pval
ILMN_1762337    6.061683178     0.6073781
ILMN_2055271    6.506861687     0.03162055
ILMN_1736007    6.121051788     0.4624506
ILMN_2383229    5.960764885     0.8155468

I've generated density plots of the signals in the VALUE column across multiple samples, and the distributions diverge widely between samples, which suggests that these are raw, un-normalized data (the manuscript and GEO entry were unclear).

My goal is to perform gcrma normalization, then analyze these data via limma. The issue is that after importing the data with read.table, running gcrma fails with this error:

Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘indexProbes’ for signature ‘"data.frame", "character"’

What's the best way to import, normalize, and analyze array data from a .txt file using limma?

Thank you!

microarray limma gcrma illumina • 1.9k views

updated 4.9 years ago by Kevin Blighe 87k • written 4.9 years ago by shawn.w.foley ★ 1.3k

score 1 · Answer 1 · 2019-05-24

4.9 years ago

Kevin Blighe 87k

Hey, limma has dedicated functions for reading Agilent microarrays. Indeed, Affymetrix raw signal files are CEL, whereas Agilent's are mostly TXT.

I have put a full pipeline here: A: build the expression matrix step by step from GEO raw data

Please take time to consider the parameters used in each line of my code - it is possible that you will have to modify some parameters depending on whether your arrays are one- or two-colour.

By the way, the concept of performing RMA normalisation on Agilent arrays just does not apply in the same way as it does for Affymetrix. What do I mean? RMA and gcRMA were developed for Affymetrix arrays; RMA is implemented in the oligo and affy packages, and gcRMA in the gcrma package. The way that we normalise Agilent arrays has some resemblance to the RMA method, but the two are not identical.
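One step that RMA shares with the usual non-Affymetrix workflows is quantile normalisation between arrays. As a rough illustration only, here is a minimal base-R sketch of that step on a probes x samples matrix. The function name `quantile_normalise` and the toy data are hypothetical and not part of the linked pipeline; in practice limma's `normalizeBetweenArrays(x, method = "quantile")` is the usual choice and handles ties and missing values more carefully than this sketch does.

```r
# Minimal quantile normalisation of a probes x samples matrix (base R only).
# Assumes no missing values; ties are broken by order of appearance.
quantile_normalise <- function(x) {
  # Rank of every value within its own sample (column)
  ranks <- apply(x, 2, rank, ties.method = "first")
  # Reference distribution: mean of the sorted values at each rank position
  target <- rowMeans(apply(x, 2, sort))
  # Replace each value with the reference value at its rank
  out <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(x)
  out
}

# Toy data (hypothetical, not from the GEO series in the question):
# two samples whose log2 intensity distributions clearly differ
set.seed(1)
expr <- cbind(
  sampleA = rnorm(1000, mean = 7, sd = 1),
  sampleB = rnorm(1000, mean = 8, sd = 2)
)
norm <- quantile_normalise(expr)

# After normalisation both samples share the same quantiles
print(apply(norm, 2, quantile))
```

The within-sample ordering of probes is unchanged; only the distribution each sample is mapped onto is shared, which is why divergent density plots line up afterwards.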

Kevin

ADD COMMENT • link 4.9 years ago by Kevin Blighe 87k

Thank you for sharing that very useful workflow, and for clarifying some of the differences between Illumina and Affy arrays.

The problem I'm running into now is that read.maimages is looking for columns for two colors. Would the VALUE column in the OP correspond to the raw intensity of a single-color array, or has this file from GEO already undergone some normalization? Which step of the workflow would it be equivalent to? I've tried to run:

project <- read.maimages(targetinfo, columns='ID',annotation='HumanWG-6_V3_0_R3_11282955_A.txt')
project.bgcorrect <- backgroundCorrect(project, method="normexp", offset=16)

And have gotten:

Array 1Error in normexp.fit(x, method = normexp.method) : 
  Not enough data: need at least 4 non-missing corrected intensities

Sorry for the basic questions; I've been scouring the internet and can't seem to find the answers I'm looking for. Thank you for the help!

written 4.9 years ago by shawn.w.foley ★ 1.3k
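For per-sample GEO tables in the layout shown in the question (ID_REF, VALUE, Detection Pval), one possible starting point is to assemble a probes x samples matrix with base R before any normalisation. The sketch below is an assumption about how such files could be combined, not a confirmed fix for the error above: the sample names `GSM_A` and `GSM_B` are hypothetical, and the values for the second sample are made up. limma also provides `read.ilmn()` and `neqc()` for Illumina summary data, which may be a better fit when the original BeadStudio/GenomeStudio output files are available.

```r
# Two samples in the GEO per-sample table layout. The first echoes the rows
# shown in the question; the second is made up and deliberately reordered.
sample1 <- "ID_REF\tVALUE\tDetection Pval
ILMN_1762337\t6.061683178\t0.6073781
ILMN_2055271\t6.506861687\t0.03162055
ILMN_1736007\t6.121051788\t0.4624506
ILMN_2383229\t5.960764885\t0.8155468"

sample2 <- "ID_REF\tVALUE\tDetection Pval
ILMN_2383229\t6.10\t0.70
ILMN_1762337\t6.25\t0.40
ILMN_2055271\t6.80\t0.01
ILMN_1736007\t6.02\t0.55"

# For files on disk, read.delim(path, check.names = FALSE) works the same way
read_sample <- function(txt) {
  read.delim(text = txt, check.names = FALSE, stringsAsFactors = FALSE)
}
tables <- list(GSM_A = read_sample(sample1), GSM_B = read_sample(sample2))

# Align every sample on the probe IDs of the first one, then bind the columns
ids <- tables[[1]]$ID_REF
expr <- sapply(tables, function(tb) tb$VALUE[match(ids, tb$ID_REF)])
rownames(expr) <- ids

# Same layout for the detection p-values
detp <- sapply(tables, function(tb) tb[["Detection Pval"]][match(ids, tb$ID_REF)])
rownames(detp) <- ids

# Rough scale check: values confined to a narrow range around 6 to 16 would
# hint at log2-scale data rather than raw scanner intensities
print(range(expr))
```

A numeric matrix like `expr` can be passed to `normalizeBetweenArrays()` and then to `lmFit()` in limma. Whether VALUE is already log-transformed or normalised still has to be confirmed from the GEO record's data-processing description.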