17 months ago
parinv

I am learning R and trying some basic analyses on datasets. I normalized my data and converted it from probe level to gene level. But now, while working with the SummarizedExperiment package, I am running into a lot of errors. Can anyone suggest a correct workflow and R script for SummarizedExperiment? Thank you.

It would help if you gave some examples of your errors or of what you were trying to achieve.

I tried running these commands:

nrows <- 20962
ncols <- 93
counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
colData <- DataFrame(sample="1:20962", row.names= paste("sample", letters[5:1]))
sumex <- SummarizedExperiment(assays=SimpleList(counts=counts),
colData=colData)


And it shows this error:

Error in validObject(.Object) :
invalid class “SummarizedExperiment” object:
nb of cols in 'assay' (20962) must equal nb of rows in 'colData' (1)

You didn't generate colData properly: you have only one row in there. You should remove the quotes from 1:20962. But I don't think that is the only error.

I have only very basic knowledge of R. Can you suggest how I can modify this?

colData <- DataFrame(sample=1:93)
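
A quick way to see why the quoted version gave a single row is to compare row counts directly. This is only a sketch: it uses base R's data.frame as a stand-in for DataFrame (an assumption made so it runs without Bioconductor), and the sizes 20962 and 93 are taken from the question.

```r
# "1:20962" in quotes is one character string, so it makes one row
quoted <- data.frame(sample = "1:20962")
nrow(quoted)    # 1

# Without quotes it is an integer sequence with 20962 elements
unquoted <- data.frame(sample = 1:20962)
nrow(unquoted)  # 20962

# colData needs one row per sample (column of the matrix), i.e. 93 here
per_sample <- data.frame(sample = 1:93)
nrow(per_sample)  # 93
```

So removing the quotes alone is not enough: the sequence also has to run over the 93 samples rather than the 20962 genes.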

17 months ago
Asaf

To make some sense of the mess I made in the comments:

You were trying to reproduce the example from the SummarizedExperiment vignette. Let's see what each statement does:

library(SummarizedExperiment)
nrows <- 200  # number of genes (rows)
ncols <- 6    # number of samples (columns)


This line constructs a count matrix with random values from the uniform distribution. The number of rows is the number of genes, columns are samples:

counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)


This one defines the genes' chromosomal locations (you skipped it, which is fine):

rowRanges <- GRanges(rep(c("chr1", "chr2"), c(50, 150)),
                     IRanges(floor(runif(200, 1e5, 1e6)), width=100),
                     strand=sample(c("+", "-"), 200, TRUE),
                     feature_id=sprintf("ID%03d", 1:200))


Here they build the metadata table with one row per sample (6 in total) and give each row a single variable, Treatment, which is either ChIP or Input:

colData <- DataFrame(Treatment=rep(c("ChIP", "Input"), 3),
                     row.names=LETTERS[1:6])


Now build the container:

SummarizedExperiment(assays=list(counts=counts),
                     rowRanges=rowRanges, colData=colData)
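
Once the container is built, it can help to look inside it and confirm that the dimensions line up. The sketch below rebuilds the vignette object and stores it in a variable called se (that name and the set.seed call are my additions, not part of the vignette), then uses the standard accessors.

```r
library(SummarizedExperiment)

set.seed(1)  # only so the random values are reproducible
nrows <- 200
ncols <- 6
counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
rowRanges <- GRanges(rep(c("chr1", "chr2"), c(50, 150)),
                     IRanges(floor(runif(200, 1e5, 1e6)), width=100),
                     strand=sample(c("+", "-"), 200, TRUE),
                     feature_id=sprintf("ID%03d", 1:200))
colData <- DataFrame(Treatment=rep(c("ChIP", "Input"), 3),
                     row.names=LETTERS[1:6])
se <- SummarizedExperiment(assays=list(counts=counts),
                           rowRanges=rowRanges, colData=colData)

dim(se)                    # genes x samples
colData(se)                # the per-sample metadata table
head(assay(se, "counts"))  # first rows of the count matrix

# Subsetting by a colData variable keeps counts and metadata in sync
chip <- se[, se$Treatment == "ChIP"]
ncol(chip)
```

The number of columns reported by dim(se) should always match nrow(colData(se)); that is exactly the check that failed in the question.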


The part you're failing on is generating the metadata. If you have your own data, put it in a count matrix and build the metadata table with the correct dimensions: one row for each column (sample) of the matrix.
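
For the dimensions in the question (20962 genes, 93 samples), a minimal sketch could look like the following. The random matrix is only a placeholder for your own normalized gene-level data, and the sample names sample1 to sample93 are made up for illustration.

```r
library(SummarizedExperiment)

nrows <- 20962  # genes
ncols <- 93     # samples

# Placeholder values; replace with your own gene-level matrix
counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
colnames(counts) <- paste0("sample", seq_len(ncols))  # hypothetical names

# One metadata row per sample, with row names matching the matrix columns
colData <- DataFrame(sample=seq_len(ncols),
                     row.names=colnames(counts))

sumex <- SummarizedExperiment(assays=list(counts=counts),
                              colData=colData)
dim(sumex)
```

With real data you would add further columns to colData (condition, batch and so on), still with one row per sample.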

Needless to say, there are packages that can generate a SummarizedExperiment for you for some data sets.