How to exclude probes from affy microarray before normalization?
4.6 years ago
peter pfand ▴ 110

Dear community,

I have several Clariom D Human Arrays (Affymetrix) for which I want to remove some probes (the same set of probes in all of them).

I thought it might be possible to build a library package in R with pdInfoBuilder. To do this, I downloaded the Analysis r1 files from the Affy website and filtered the set of probes out of the .pgf and .clf files. Then I created the library using the clf, pgf, mps and annotation (probeset and transcript) files. However, when I read my arrays (CEL files) with the oligo package (Bioconductor), I still observe signal for the probes I removed. How is that possible?

Is there any other way to exclude such probes?

EDIT:

I carried out the following steps:

1. To make the new library:

library(pdInfoBuilder)
pkg <- new("AffyHTAPDInfoPkgSeed",
           pgfFile = pgfFile, clfFile = clfFile,
           probeFile = csvProbAnno, transFile = csvTransAnno,
           coreMps = coreMpsFile, extendedMps = extendedMpsFile,
           fullMps = fullMpsFile,
           biocViews = "AnnotationData", genomebuild = "hg38",
           organism = "Human", species = "Homo sapiens",
           url = "", chipName = 'clariom.d.human')
makePdInfoPackage(pkg, destDir=".")

2. Then I compressed (.tar.gz) the directory where the new package is located and installed it with:

install.packages('pd.clariom.d.human', repos=NULL)

3. To read the CEL files:

library(oligo)
library(pd.clariom.d.human)
raw_data <- read.celfiles(list.celfiles())

Thanks

microarrays affy probes bioconductor • 2.1k views

What are the commands that you are running?

4.6 years ago
peter pfand ▴ 110

These are the steps I carried out, with no success:

To make the new library:

library(pdInfoBuilder)
pkg <- new("AffyHTAPDInfoPkgSeed", pgfFile = pgfFile, clfFile = clfFile,
           probeFile = csvProbAnno, transFile = csvTransAnno,
           coreMps = coreMpsFile, extendedMps = extendedMpsFile,
           fullMps = fullMpsFile,
           biocViews = "AnnotationData", genomebuild = "hg38", organism = "Human",
           species = "Homo sapiens", url = "", chipName = 'clariom.d.human')
makePdInfoPackage(pkg, destDir=".")


Then I compressed (.tar.gz) the directory where the new package is located and installed it with:

install.packages('pd.clariom.d.human', repos=NULL)


To read the CEL files:

library(oligo)
library(pd.clariom.d.human)


Please move this into the main question rather than leaving it in a comment; it is more useful there.

4.6 years ago

I don't know makePdInfoPackage, so I can't comment on that. However, if you read the CEL files like this

raw_data <- read.celfiles(list.celfiles())


it will fetch the default annotations. See the Details section of the read.celfiles help page:

The function guesses which annotation package to use from the header of the CEL file. The user can also provide the name of the annotation package to be used (via the pkgname argument). If the annotation package cannot be loaded, the function returns an error. If the annotation package is not available from Bioconductor, one can use the pdInfoBuilder package to build one.
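A minimal sketch of the pkgname route mentioned in that help text, assuming oligo and the custom annotation package are installed. The helper name read_with_custom_pd and its default arguments are made up for illustration. Because the custom build has the same name as the official package, the sketch also prints where the package was loaded from and when it was built, so you can check that the modified copy is the one actually in use.

```r
# Sketch only: read CEL files against an explicitly named annotation package
# and report which installed copy of that package is being used.
# `read_with_custom_pd` is a hypothetical helper, not part of oligo.
read_with_custom_pd <- function(cel_dir = ".", pkgname = "pd.clariom.d.human") {
  # list.celfiles() forwards extra arguments such as full.names to list.files()
  cel_files <- oligo::list.celfiles(cel_dir, full.names = TRUE)

  # pkgname overrides the package guessed from the CEL header
  raw_data <- oligo::read.celfiles(cel_files, pkgname = pkgname)

  # The name alone cannot tell a custom build from the official one,
  # so also show the install location and build stamp.
  message("Annotation slot: ", BiocGenerics::annotation(raw_data))
  message("Loaded from:     ", find.package(pkgname))
  message("Built:           ", utils::packageDescription(pkgname, fields = "Built"))

  raw_data
}
```

Usage would be something like raw_data <- read_with_custom_pd("path/to/cel/files"), run from a session where the custom package has been installed.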

I would not create a separate package just to remove probes. Instead, I would get the quantification matrix (the exprs data) and remove the undesired probes afterwards. See

http://kasperdanielhansen.github.io/genbioconductor/html/oligo.html
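As a rough sketch of that suggestion: once you have the expression matrix, dropping rows by ID is plain matrix subsetting. The helper name filter_expression_matrix, the toy IDs and the toy values below are invented for illustration; with real data the matrix would come from something like exprs(rma(raw_data, target = "core")), in which case the rows are probesets or transcript clusters rather than individual probes.

```r
# Sketch only: remove unwanted rows from an expression matrix by row name.
# `filter_expression_matrix` is a hypothetical helper.
filter_expression_matrix <- function(expr, unwanted_ids) {
  keep <- !(rownames(expr) %in% unwanted_ids)
  expr[keep, , drop = FALSE]
}

# Toy stand-in for a summarized expression matrix (made-up IDs and values).
# With real arrays this might be:
#   expr <- Biobase::exprs(oligo::rma(raw_data, target = "core"))
expr <- matrix(
  c(7.1, 7.3,
    5.0, 5.2,
    9.4, 9.1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("TC01", "TC02", "TC03"), c("sample1", "sample2"))
)

filtered <- filter_expression_matrix(expr, unwanted_ids = "TC02")
```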


That's the first thing I tried out, indeed. I read the CEL files and removed the probes from the raw data (HTAFeatureSet) object but I got an error when normalizing. That's why I uninstalled the package (pd.clariom.d.human) and installed a modified (by me) version without the probes.


You can normalize first and then remove them, since each gene/probe is independent.


When I normalize, summarization takes place at gene/probeset level, so at that step I can't remove the probes, just either probesets or genes.
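To make the distinction concrete, here is a toy illustration in base R of dropping individual probes before summarizing to probeset level. It is not oligo's implementation: it uses stats::medpolish on made-up log2 intensities as a stand-in for the median-polish step of RMA, and the probe IDs, probeset names and the helper summarize_probesets are all hypothetical. With real arrays, one possible route (untested here) would be to extract the probe-level matrix and the probe-to-probeset mapping, drop the unwanted probes from both, and pass the result to a matrix-level summarization routine such as oligo's basicRMA.

```r
# Toy log2 probe-level intensities: 6 probes x 3 samples (made-up numbers).
probe_int <- matrix(
  c( 8.0, 9.0, 10.0,   # p1, setA
     8.5, 9.5, 10.5,   # p2, setA
    14.0, 9.0, 10.0,   # p3, setA (the probe we want to exclude)
     6.0, 7.0,  8.0,   # p4, setB
     6.2, 7.2,  8.2,   # p5, setB
     5.8, 6.8,  7.8),  # p6, setB
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("p", 1:6), paste0("s", 1:3))
)

# Hypothetical probe -> probeset mapping.
probe_to_set <- c(p1 = "setA", p2 = "setA", p3 = "setA",
                  p4 = "setB", p5 = "setB", p6 = "setB")

# Drop unwanted probes first, then summarize each probeset by median polish.
summarize_probesets <- function(probe_int, probe_to_set, unwanted = character()) {
  keep <- !(rownames(probe_int) %in% unwanted)
  probe_int <- probe_int[keep, , drop = FALSE]
  sets <- probe_to_set[rownames(probe_int)]

  per_set <- lapply(split(seq_len(nrow(probe_int)), sets), function(idx) {
    fit <- stats::medpolish(probe_int[idx, , drop = FALSE], trace.iter = FALSE)
    fit$overall + fit$col   # one summarized value per sample
  })
  do.call(rbind, per_set)
}

with_all_probes <- summarize_probesets(probe_int, probe_to_set)
without_p3      <- summarize_probesets(probe_int, probe_to_set, unwanted = "p3")
```

The point of the sketch is only that the exclusion has to happen on the probe-level matrix, before the per-probeset summary is computed; after summarization only whole probesets or genes can be removed.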