Question

Microarray Data in R, Annotation of Transcript clusters

0


4.0 years ago

gonzalezb549 • 0

This is the error I keep getting

Error in .testForValidKeys(x, keys, keytype, fks) : 
  None of the keys entered are valid keys for 'PROBEID'. Please use the keys method to see a listing of valid arguments.

This is what the featureNames look like for the filtered set:

> View(dorsey_manfiltered)
> featureNames(dorsey_manfiltered)
   [1] "1007_s_at"    "1053_at"      "117_at"       "121_at"       "1255_g_at"    "1294_at"      "1316_at"     
   [8] "1320_at"      "1405_i_at"    "1431_at"      "1438_at"      "1487_at"      "1494_f_at"    "1552256_a_at"
  [15] "1552257_a_at" "1552258_at"   "1552261_at"   "1552263_at"   "1552264_a_at" "1552266_at"   "1552269_at"  
  [22] "1552271_at"   "1552272_a_at" "1552274_at"   "1552275_s_at" "1552276_a_at" "1552277_a_at" "1552278_a_at"

dorsey_medians <- rowMedians(Biobase::exprs(dorsey_eset))
# I guess the threshold would be 1; the graph looks different, though
man_threshold <- 1
hist_res <- hist(dorsey_medians, 100, col = "red", freq = FALSE, 
                 main = "Histogram of the median intensities", 
                 border = "antiquewhite4",
                 xlab = "Median intensities")
abline(v = man_threshold, col = "coral4", lwd = 2)

#Transcripts that do not have intensities larger 
#than the threshold in at least as many arrays as the smallest experimental group are excluded.

#In order to do so, we first have to get a list with the number of samples (=arrays) 
#(no_of_samples) in the experimental groups:
no_of_samples <- 
  table(paste0(pData(dorsey_eset)$FactorValue..sample.type., "_", 
               pData(dorsey_eset)$FactorValue..age.))
no_of_samples 

samples_cutoff <- min(no_of_samples)

idx_man_threshold <- apply(Biobase::exprs(dorsey_eset), 1,
                           function(x){
                             sum(x > man_threshold) >= samples_cutoff})
#After flagging the transcripts whose intensities are not greater than the threshold in 
#at least 1 array, we can see how many genes are filtered out (54675)
table(idx_man_threshold)
#subset the expression set to include only the features that pass the filtering
dorsey_manfiltered <- subset(dorsey_eset, idx_man_threshold)

featureNames(dorsey_manfiltered)
library(org.Hs.eg.db)
library(AnnotationDbi)
#Before we continue with the linear models for microarrays and 
#differential expression, we first add “feature data”, i.e. annotation information to the transcript cluster 
#identifiers stored in the featureData of our ExpressionSet:
anno_dorsey <- AnnotationDbi::select(hugene10sttranscriptcluster.db,
                                       keys = (featureNames(dorsey_manfiltered)),
                                       columns = c("SYMBOL", "GENENAME"),
                                       keytype = "PROBEID")

anno_dorsey <- subset(anno_dorsey, !is.na(SYMBOL))
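For readers following the filtering step, here is a minimal base-R sketch of the same threshold logic on a toy matrix. All values and names (`toy_exprs`, `toy_threshold`, `toy_cutoff`) are made up for illustration, and `rowSums()` is used as an equivalent of the `apply()` call above.

```r
# Toy expression matrix: 4 probes (rows) x 4 arrays (columns); values are invented.
toy_exprs <- matrix(
  c(0.2, 0.4, 0.1, 0.3,   # probe_a: never above the threshold
    2.5, 3.1, 0.5, 0.2,   # probe_b: above the threshold in 2 arrays
    4.0, 4.2, 3.9, 4.1,   # probe_c: above the threshold in all 4 arrays
    1.5, 0.3, 0.2, 0.1),  # probe_d: above the threshold in 1 array
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("probe_", letters[1:4]), paste0("array_", 1:4))
)

toy_threshold <- 1
toy_cutoff <- 2  # assumption: the smallest experimental group has 2 arrays

# Keep a probe if it exceeds the threshold in at least `toy_cutoff` arrays.
keep <- rowSums(toy_exprs > toy_threshold) >= toy_cutoff
table(keep)

toy_filtered <- toy_exprs[keep, , drop = FALSE]
rownames(toy_filtered)
```

Only `probe_b` and `probe_c` pass in this toy case; the real data would use `samples_cutoff` and `man_threshold` as defined above.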

RStudio affy microarray • 1.3k views

ADD COMMENT • link updated 4.0 years ago by Kevin Blighe 89k • written 4.0 years ago by gonzalezb549 • 0

0


Can you show how you normalised the data, i.e., the rma() or gcrma() command? Please also confirm the array type and version.

It's possible that you need hugene10stprobeset.db
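One way to narrow this down is to measure how many of the feature IDs a candidate annotation package actually recognises as keys. The sketch below uses a hypothetical helper, `fraction_valid_keys()`, and made-up key lists standing in for two packages; the commented lines show how it might be applied to the real objects, assuming the relevant Bioconductor packages are installed.

```r
# Hypothetical helper (not part of any package): share of feature IDs that
# appear among the keys of an annotation package.
fraction_valid_keys <- function(ids, valid_keys) {
  mean(ids %in% valid_keys)
}

# Toy illustration: three IDs from the output above, checked against two
# invented key lists.
ids <- c("1007_s_at", "1053_at", "117_at")
matching_keys <- c("1007_s_at", "1053_at", "117_at", "121_at")
numeric_style_keys <- c("7892501", "7892502")  # illustrative numeric-style IDs

fraction_valid_keys(ids, matching_keys)       # all IDs found
fraction_valid_keys(ids, numeric_style_keys)  # no IDs found

# With the real objects (needs Bioconductor packages; shown as comments only):
# Biobase::annotation(dorsey_eset)  # platform recorded in the ExpressionSet, if set
# fraction_valid_keys(featureNames(dorsey_manfiltered),
#                     AnnotationDbi::keys(hugene10sttranscriptcluster.db,
#                                         keytype = "PROBEID"))
```

A fraction of zero for a given package would be consistent with the "None of the keys entered are valid keys" error.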

ADD REPLY • link 4.0 years ago by Kevin Blighe 89k

0


I used rma(). The data are in an ExpressionSet.

ADD REPLY • link 4.0 years ago by gonzalezb549 • 0

score 0 · Answer 1 · 2021-07-23

0


4.0 years ago

gonzalezb549 • 0

I think I was using the wrong .db package; I should have used "hgu133plus2.db". Thank you, your response was the clue that led me to the solution.
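For anyone landing here with the same error, the corrected call would look roughly like this. It is a sketch rather than my exact script: it uses three probe IDs from the output above instead of the full `featureNames(dorsey_manfiltered)`, assumes the `hgu133plus2.db` and `AnnotationDbi` packages are installed from Bioconductor, and skips the lookup if they are not.

```r
# A few probe IDs copied from the featureNames() output in the question.
probe_ids <- c("1007_s_at", "1053_at", "117_at")

anno <- NULL
# Run the lookup only if the annotation package is available.
if (requireNamespace("hgu133plus2.db", quietly = TRUE) &&
    requireNamespace("AnnotationDbi", quietly = TRUE)) {
  anno <- AnnotationDbi::select(hgu133plus2.db::hgu133plus2.db,
                                keys = probe_ids,
                                columns = c("SYMBOL", "GENENAME"),
                                keytype = "PROBEID")
  # Drop probes without a gene symbol, as in the original script.
  anno <- subset(anno, !is.na(SYMBOL))
}
anno
```

In the full workflow, `keys = featureNames(dorsey_manfiltered)` would replace `probe_ids`. Some probes can map to more than one gene, so the result may have more rows than keys.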

ADD COMMENT • link 4.0 years ago by gonzalezb549 • 0

0


Yes, a quick check reveals that these probes are from the Affymetrix Human Genome U133 Plus 2.0 Array.

ADD REPLY • link 4.0 years ago by Kevin Blighe 89k