Question: How to subset an ExpressionSet based on a vector of sample names
14 months ago, nkabo wrote:

I have an ExpressionSet object with 37 samples covering two conditions, HCC and cirrhosis, and I would like to subset it according to sample names that I have specified in two vectors. The result should be two ExpressionSets, one for HCC and one for cirrhosis. I have tried several methods, but I could not keep the samples and the features together in the subset.

expset_forall is the ExpressionSet, and names_HCC and names_cirr are character vectors containing the sample names.

This is expset_forall:

ExpressionSet (storageMode: lockedEnvironment)

assayData: 20962 features, 37 samples 

protocolData
  sampleNames: GSM437457.CEL.gz GSM437458.CEL.gz ... GSM437493.CEL.gz (37 total)...

I have tried:

eset_forHCC = expset_forall[, sampleNames(expset_forall) %in% names_HCC]

This gives the error "incorrect number of dimensions".

Then I tried:

eset_forHCC = exprs(expset_forall[expset_forall@protocolData$sampleNames == names_HCC, ])
dim(eset_forHCC)
[1]  0 37

Finally, I tried to reach the sample names through pData:

levels(pData(expset_forall)$sampleNames)

This returns NULL.

For eset_forHCC, I expect this output:

ExpressionSet (storageMode: lockedEnvironment)

assayData: 20962 features, 17 samples

element names: exprs, se.exprs

protocolData

sampleNames: GSM437458.CEL.gz GSM437459.CEL.gz ... GSM437493.CEL.gz (17 total)

For eset_forcirr, I expect this output:

ExpressionSet (storageMode: lockedEnvironment)

assayData: 20962 features, 17 samples

element names: exprs, se.exprs

protocolData

sampleNames: GSM437460.CEL.gz GSM437459.CEL.gz ... GSM437491.CEL.gz (17 total)

Once you have the ExpressionSet's data in a data frame, you can subset it with the base function subset() or with filter() from the dplyr package.

written 14 months ago by sangram_keshari

Does:

eset_forHCC= exprs(expset_forall)[,sampleNames(expset_forall) %in% names_HCC]

not work?

modified 14 months ago • written 14 months ago by benformatics

Thank you for your reply. It works, but it gives a matrix, and I need an ExpressionSet.

written 14 months ago by nkabo
eset_forHCC = expset_forall[, sampleNames(expset_forall) %in% names_HCC]

This is the correct and recommended way to get the subset. Could you recheck it?

Also, check that the output of sampleNames(expset_forall) %in% names_HCC is as intended.

written 14 months ago by Santosh Anand
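
One way to check that the `%in%` output is as intended is to count the matches and list any requested names that have no exact match. The sketch below uses only base R. The vectors are hypothetical stand-ins, since the thread does not show the full contents of names_HCC, and the missing ".CEL.gz" suffix is an assumed example of a mismatch, not a diagnosis of the original problem.

```r
# Hypothetical stand-in for sampleNames(expset_forall)
all_samples <- c("GSM437457.CEL.gz", "GSM437458.CEL.gz", "GSM437459.CEL.gz")

# Hypothetical names_HCC; the first entry lacks the ".CEL.gz" suffix on purpose
names_HCC <- c("GSM437458", "GSM437459.CEL.gz")

# Logical index: TRUE for each sample whose name appears in names_HCC
keep <- all_samples %in% names_HCC

# Number of samples that would be kept
n_kept <- sum(keep)

# Requested names that match no sample exactly
unmatched <- setdiff(names_HCC, all_samples)

print(n_kept)
print(unmatched)
```

If n_kept is smaller than length(names_HCC), the entries in unmatched show which names need fixing before the subset is taken.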

Thank you for your reply. I checked it again and it works fine, but it gives an expression matrix. I would like to have it as an ExpressionSet.

written 14 months ago by nkabo
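
The difference between the two suggestions in this thread can be sketched on a toy object. This assumes the Biobase package is installed. toy_eset and toy_names_HCC are hypothetical stand-ins for expset_forall and names_HCC, and the expression values are random. Indexing the ExpressionSet itself with `[` should return an ExpressionSet, while indexing the result of exprs() returns a plain matrix, which may be why a matrix was still obtained.

```r
library(Biobase)

# Toy data: 5 features x 4 samples, all values hypothetical
mat <- matrix(rnorm(20), nrow = 5,
              dimnames = list(paste0("probe", 1:5),
                              paste0("GSM", 1:4, ".CEL.gz")))
toy_eset <- ExpressionSet(assayData = mat)

# Hypothetical vector of sample names to keep
toy_names_HCC <- c("GSM1.CEL.gz", "GSM3.CEL.gz")

# Logical index over the samples (columns)
keep <- sampleNames(toy_eset) %in% toy_names_HCC

# Subsetting the object itself keeps the ExpressionSet class
sub_eset <- toy_eset[, keep]

# Subsetting the extracted expression values gives a matrix instead
sub_mat <- exprs(toy_eset)[, keep]

print(class(sub_eset))
print(class(sub_mat))
print(dim(sub_eset))
```

The same pattern with names_cirr in place of names_HCC would give the second subset.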