Question

summarized experiment in R

4.3 years ago

parinv ▴ 80

I am learning R and trying some basic analysis with datasets. I normalized the data and converted it from probe level to gene level. But now, while working with the SummarizedExperiment package, I am getting a lot of errors. Can anyone please suggest a correct workflow and R script for SummarizedExperiment? Thank you.

R • 3.6k views

ADD COMMENT • link 2.4 years ago by parinv ▴ 80

It would help if you gave some examples of your errors or described what you were trying to achieve.

ADD REPLY • link 4.3 years ago by Asaf 10k

I tried running these commands:

nrows <- 20962
ncols <- 93
counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
colData <- DataFrame(sample="1:20962", row.names= paste("sample", letters[5:1]))
sumex <- SummarizedExperiment(assays=SimpleList(counts=counts),
                            colData=colData)

And it shows this error:

Error in validObject(.Object) : 
  invalid class “SummarizedExperiment” object: 
    nb of cols in 'assay' (20962) must equal nb of rows in 'colData' (1)

ADD REPLY • link updated 4.3 years ago by zx8754 11k • written 4.3 years ago by parinv ▴ 80

Don't create an answer to provide more info. This makes the question look like it's been answered. Use the 'Add reply' button to reply to a comment.

ADD REPLY • link 4.3 years ago by Jean-Karim Heriche 27k

You didn't generate colData properly: you have only one row in there. You should remove the quotes from 1:20962. But I think that is not the only error.

ADD REPLY • link 4.3 years ago by Asaf 10k

I have only very basic knowledge of R. Can you suggest how I can modify them?

ADD REPLY • link 4.3 years ago by parinv ▴ 80

colData <- DataFrame(sample=1:93)
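
A minimal sketch of why the quotes matter, assuming the S4Vectors package (which provides DataFrame()) is installed. The names quoted and unquoted are hypothetical and used only for illustration: a quoted "1:93" is a single character string, so it yields one row, while the unquoted 1:93 is an integer vector with 93 elements.

```r
library(S4Vectors)  # provides DataFrame()

# Hypothetical names, for illustration only
quoted   <- DataFrame(sample="1:93")  # one character string, so one row
unquoted <- DataFrame(sample=1:93)    # integer vector 1..93, so 93 rows

nrow(quoted)
nrow(unquoted)
```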

ADD REPLY • link 4.3 years ago by Asaf 10k

Hello parinv,

May I ask how you generated the gene expression levels from the probe levels? I would like to know which package you used for this purpose.

Thank you!

ADD REPLY • link 2.5 years ago by ryme ▴ 30

Hi ryme,

I first used the summarize function to summarize the probe-level data and then used the probe2gene function from the EnrichmentBrowser package.

Hope this is useful.

ADD REPLY • link 2.4 years ago by parinv ▴ 80

This past thread should be helpful: "Human Exon array probeset to gene-level expression". Also: "How to combine expression values of multiple probes for one gene?"

ADD REPLY • link 2.5 years ago by GenoMax 142k

You can also consider an answer by Asaf at the end of the thread.

ADD REPLY • link 2.4 years ago by parinv ▴ 80

score 4 · Accepted Answer · 2020-02-05

To make some sense of the mess I made in the comments:

You were trying to reproduce the example from the SummarizedExperiment vignette. Let's see what each statement does. The first two set the number of genes (rows) and samples (columns):

library(SummarizedExperiment)  # its dependencies supply GRanges(), IRanges() and DataFrame()
nrows <- 200
ncols <- 6

This line constructs a count matrix with random values drawn from the uniform distribution. Rows are genes and columns are samples:

counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)

This one defines the genes' chromosomal locations (you skipped it, which is fine):

rowRanges <- GRanges(rep(c("chr1", "chr2"), c(50, 150)),
                     IRanges(floor(runif(200, 1e5, 1e6)), width=100),
                     strand=sample(c("+", "-"), 200, TRUE),
                     feature_id=sprintf("ID%03d", 1:200))

Here they build the metadata table with one row for each of the 6 samples. Each row has one variable, Treatment, which is either ChIP or Input:

colData <- DataFrame(Treatment=rep(c("ChIP", "Input"), 3),
                     row.names=LETTERS[1:6])

Now build the container:

SummarizedExperiment(assays=list(counts=counts),
                     rowRanges=rowRanges, colData=colData)
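
Put together, the statements above can be run as one script. This is only a sketch: the set.seed() call, the object names se and chip, and the inspection steps at the end are additions for illustration and are not part of the example as quoted here.

```r
library(SummarizedExperiment)

set.seed(1)  # assumption: fixed seed so the random example is reproducible
nrows <- 200
ncols <- 6
counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
rowRanges <- GRanges(rep(c("chr1", "chr2"), c(50, 150)),
                     IRanges(floor(runif(200, 1e5, 1e6)), width=100),
                     strand=sample(c("+", "-"), 200, TRUE),
                     feature_id=sprintf("ID%03d", 1:200))
colData <- DataFrame(Treatment=rep(c("ChIP", "Input"), 3),
                     row.names=LETTERS[1:6])

# Store the container so it can be inspected (se is a hypothetical name)
se <- SummarizedExperiment(assays=list(counts=counts),
                           rowRanges=rowRanges, colData=colData)

dim(se)                    # genes x samples
head(assay(se, "counts"))  # first rows of the count matrix
colData(se)                # sample metadata table

# Subset columns (samples) using a colData variable
chip <- se[, se$Treatment == "ChIP"]
dim(chip)
```

Subsetting with se[, ...] keeps the assay and colData in sync, which is the main reason to use the container instead of separate objects.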

The part you're failing on is generating the metadata. If you have your own data, put it in a count matrix and generate the metadata table with the correct dimensions: colData needs exactly one row per column (sample) of the count matrix.
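
For the dimensions in the question (20962 genes, 93 samples), a hedged sketch could look like the following. The random matrix is only a placeholder for real data, and the sample names in sample_ids are hypothetical; replace both with your own.

```r
library(SummarizedExperiment)

nrows <- 20962   # genes (rows), taken from the question
ncols <- 93      # samples (columns), taken from the question
counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)  # placeholder for real data

# Hypothetical sample names; use your real sample identifiers here
sample_ids <- paste0("sample", seq_len(ncols))
colnames(counts) <- sample_ids

# One row of metadata per sample, i.e. per column of counts
colData <- DataFrame(sample=seq_len(ncols), row.names=sample_ids)

# Check the dimensions before building the container
stopifnot(ncol(counts) == nrow(colData))

sumex <- SummarizedExperiment(assays=SimpleList(counts=counts),
                              colData=colData)
dim(sumex)
```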

Needless to say, there are packages that can generate a SummarizedExperiment for you for some data sets.