removing Control probesets
1
0
7.6 years ago
fi1d18 ★ 4.1k

Hello,

I have some CEL files and a CDF file for them. How can I remove the control probesets (probesets whose names start with AFFX)? That is, how can I identify these probesets and list them in a text file so that they can be excluded, without needing R?

gene R • 2.4k views
0

Hi Alexander and thanks a lot,

I am not very familiar with R and don't really understand how the code works; I am just copying and pasting it from tutorials...

I entered your code as shown below:

> library(affy)

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB

The following object is masked from 'package:stats':

xtabs

The following objects are masked from 'package:base':

anyDuplicated, append, as.data.frame, as.vector, cbind, colnames,
do.call, duplicated, eval, evalq, Filter, Find, get, intersect,
is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax,
pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rep.int,
rownames, sapply, setdiff, sort, table, tapply, union, unique,
unlist, unsplit

Welcome to Bioconductor

Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.

>
> exp <- rma(data)

Background correcting
Normalizing
Calculating Expression
> control <- grep("AFFX",rownames(exp))
> exp <- exp[-control,]
>


Where is the resulting normalized text file, so that I can use it later? I need a normalized text file.

Now that the control probesets have been removed, can I use these CEL files as input to RMAExpress for further normalization?

1

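You won't need to normalise the data again: rma() has already background-corrected, normalised and summarised it. Note that the filtering happened on the R object, not on the CEL files themselves. You can get the normalised text file with: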

write.exprs(exp, file="output.txt", sep="\t")
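
For a sense of what the exported file looks like, here is a small sketch that uses base R's write.table on a made-up matrix instead of a real ExpressionSet. The values, the probeset IDs and the temporary file location are invented for illustration.

```r
# Toy stand-in for the normalised expression matrix (values are made up)
e <- matrix(c(7.1, 8.3, 6.9, 8.0), nrow = 2,
            dimnames = list(c("1007_s_at", "1053_at"), c("array1", "array2")))

# Tab-delimited text file: probeset IDs in the first column, one column per array
out_file <- file.path(tempdir(), "output.txt")
write.table(e, file = out_file, sep = "\t", quote = FALSE, col.names = NA)

# Read it back to check the round trip
e_back <- as.matrix(read.delim(out_file, row.names = 1, check.names = FALSE))
```

The file opens directly in a spreadsheet program or any other tool that accepts tab-delimited text.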

0

Thank you very much

4
7.6 years ago

Hi Sarah, CEL files are usually binary, so they will generally not be readable in a text editor -- you will need software such as R or Matlab to read them. Is there a particular reason you do not want to use R?

Once the data are loaded into R, removing the control probesets shouldn't be too difficult; since all of their names start with "AFFX", you can match them with a regular expression:

# load the affy library
library(affy)

# read the CEL files in the working directory and run RMA
data <- ReadAffy()
exp <- rma(data)

# remove all probesets whose names start with "AFFX"
control <- grepl("^AFFX", featureNames(exp))
exp <- exp[!control,]
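
To see the filtering step in isolation, here is a minimal sketch on a made-up matrix rather than real CEL data. It anchors the pattern with ^ so that only names starting with AFFX match, and it uses logical indexing, which still behaves when nothing matches (negative indexing with an empty grep result would drop every row). The probeset IDs and the file name control_probesets.txt are invented for illustration.

```r
# Toy expression matrix; probeset IDs are made up for illustration
m <- matrix(1:8, nrow = 4,
            dimnames = list(c("AFFX-BioB-5_at", "1007_s_at",
                              "AFFX-CreX-3_at", "1053_at"),
                            c("array1", "array2")))

# TRUE for names that start with "AFFX" (the ^ anchors the match)
is_control <- grepl("^AFFX", rownames(m))

# Logical indexing is safe even when nothing matches
m_filtered <- m[!is_control, , drop = FALSE]

# Save the excluded IDs to a text file, one per line
control_file <- file.path(tempdir(), "control_probesets.txt")
writeLines(rownames(m)[is_control], control_file)
```

The text file written at the end is the kind of exclusion list the original question asks about.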

0

Sorry Alexander, what code could I add to the script above to median-centre these arrays and set the standard deviation to 1 per array?

1
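# exprs() returns a matrix of expression levels (probesets in rows, arrays in columns)
e <- exprs(exp)

# median centring per array: subtract each column's median
e <- sweep(e, 2, apply(e, 2, median))

# set the SD of each array to 1 by dividing each column by its SD
e.scaled <- sweep(e, 2, apply(e, 2, sd), "/")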
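
Because it is easy to centre along the wrong dimension here, the following sketch checks a per-array version on a made-up matrix (probesets in rows, arrays in columns). It uses only base R, the numbers are invented, and it assumes "per array" means per column.

```r
# Toy matrix: 4 probesets (rows) x 2 arrays (columns); values are made up
e <- matrix(c(1, 2, 3, 10,
              5, 7, 9, 11), nrow = 4,
            dimnames = list(paste0("probe", 1:4), c("array1", "array2")))

# Subtract each array's median from its column
centred <- sweep(e, 2, apply(e, 2, median), "-")

# Divide each column by its standard deviation
scaled <- sweep(centred, 2, apply(centred, 2, sd), "/")

# Per-array checks: medians should be 0 and SDs should be 1
print(apply(scaled, 2, median))
print(apply(scaled, 2, sd))

# apply() over rows (MARGIN = 1) returns its result transposed:
# one column per probeset, so the dimensions are 2 x 4 rather than 4 x 2
print(dim(apply(e, 1, function(x) x - median(x))))
```

Dividing by the SD after centring leaves the median at zero, so the order of the two steps matters.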

0

Thank you for your valuable help

0

Sorry to disturb you again, but may I ask where I can find code like this? Whenever I combine scattered pieces of code from different tutorials, I run into errors and get frustrated with R.

1

I know the feeling; I only started learning R recently too. It really is an invaluable tool to know your way around, and I'd recommend spending some time familiarising yourself with its data types and methods if possible. I don't have any one resource that I use; I tend to go to the manual pages or simply Google things. I cannot stress enough how much an underlying understanding of the language helps when combining bits and pieces of code from the web, though.

An online course that started recently might be of interest if you want to learn R: https://www.coursera.org/course/rprog

1

Thanks again