Question

subsetting limma EList in a dataframe

5.9 years ago

salvatore.raieli2 ▴ 90

Hi everyone,

I downloaded this dataset from ArrayExpress and followed the limma User's Guide. I could not find an annotation package for this array, only a file with the probe names and Entrez gene IDs. I want to extract from the EList a data frame with the probe names and the expression values, which would be much more convenient for me. How can I do this?

library(limma)  # provides read.maimages, backgroundCorrect, normalizeBetweenArrays

URL <- "https://www.ebi.ac.uk/arrayexpress/files/E-MTAB-1781/"
SDRF.file <- "E-MTAB-1781.sdrf.txt"
Data.file <- "E-MTAB-1781.raw.1.zip"
# URL already ends in "/", so paste0() gives the full address of each file
download.file(paste0(URL, SDRF.file), SDRF.file)
download.file(paste0(URL, Data.file), Data.file)
unzip(Data.file)
SDRF <- read.delim(SDRF.file, check.names=FALSE, stringsAsFactors=FALSE)
x <- read.maimages(SDRF[,"Array Data File"], source="agilent", green.only=TRUE, other.columns="gIsWellAboveBG")

y <- backgroundCorrect(x, method="normexp")
y <- normalizeBetweenArrays(y, method="quantile")
Control <- y$genes$ControlType==1L
IsExpr <- rowSums(y$other$gIsWellAboveBG > 0) >= 4
yfilt <- y[!Control & IsExpr, ]
names(yfilt$genes)

Thank you in advance for your help,

Salvo

R Limma Microarray • 2.1k views

updated 5.9 years ago by Kevin Blighe 87k • written 5.9 years ago by salvatore.raieli2 ▴ 90

score 2 · Answer 1 · 2018-06-12

Dear Salvo, good evening, here is the answer:

To see what is inside any object in R, two useful functions are str() and summary():

summary(yfilt)
        Length Class      Mode     
E       191555 -none-     numeric  
targets      1 data.frame list     
genes        5 data.frame list     
source       1 -none-     character
other        1 -none-     list     


yfilt$E[1:5,1:5]
     US22502540_252038210041_1_1 US22502540_252038210165_1_4
[1,]                    8.637449                    8.700608
[2,]                   10.442242                   10.184019
[3,]                    8.696243                    8.865778
[4,]                    9.774359                   10.186498
[5,]                    8.965035                    9.074906
     US22502540_252038210040_1_3 US22502540_252038210040_1_4
[1,]                    8.705305                    8.646266
[2,]                   11.342835                   10.393071
[3,]                    8.639361                    8.596739
[4,]                   10.002712                    9.921155
[5,]                    9.069770                    9.071458
     US22502540_252038210087_1_4
[1,]                    8.605702
[2,]                   10.916877
[3,]                    8.742966
[4,]                    9.978770
[5,]                    8.711117


yfilt$genes[1:5,]
   Row Col ControlType         ProbeName    SystematicName
12   1  12           0 UKv4_A_23_P314216 UKv4_A_23_P314216
13   1  13           0 UKv4_A_24_P126851 UKv4_A_24_P126851
14   1  14           0  UKv4_A_32_P77762  UKv4_A_32_P77762
15   1  15           0  UKv4_A_23_P71864  UKv4_A_23_P71864
16   1  16           0  UKv4_A_32_P48198  UKv4_A_32_P48198

So, the E component contains the normalised expression values, and the genes component contains the probe annotation for those rows, including ProbeName. The two have the same number of rows:

nrow(yfilt$genes)
[1] 38311

nrow(yfilt$E)
[1] 38311
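Because E and genes line up row by row, the data frame asked for in the question can be built by binding the ProbeName column to the expression matrix. The following is only a minimal sketch: `toy` is a hypothetical stand-in (a plain list with made-up values and array names) so that the example runs without downloading the data, and `expr_df` is a name chosen here for illustration. With the real data, the same data.frame() call should work with yfilt in place of toy.

```r
# Toy stand-in for yfilt: a plain list with the same E and genes components.
# (Hypothetical values; with the real data, skip this and use yfilt directly.)
toy <- list(
  E = matrix(c(8.6, 10.4, 8.7, 8.7, 10.2, 8.9), nrow = 3,
             dimnames = list(NULL, c("array1", "array2"))),
  genes = data.frame(ProbeName = c("probeA", "probeB", "probeC"),
                     ControlType = 0L, stringsAsFactors = FALSE)
)

# Rows of E and genes must correspond before they are bound together.
stopifnot(nrow(toy$E) == nrow(toy$genes))

# One row per probe: the probe name, then one expression column per array.
# check.names = FALSE keeps the array names exactly as they are in E.
expr_df <- data.frame(ProbeName = toy$genes$ProbeName, toy$E,
                      check.names = FALSE, stringsAsFactors = FALSE)
head(expr_df)
```

ProbeName is kept as an ordinary column rather than as row names because probe names on an array are not necessarily unique, and data frame row names must be. If replicate probes need to be collapsed first, limma's avereps() is one option.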

See you later,

Kevin