Question: How to remove certain samples from SummarizedExperiment dataset? (BioConductor)
2.0 years ago, BPors wrote:

Hi,

I am having a problem with my SummarizedExperiment dataset. I have RNA-seq data and I want to analyze gene expression from it. However, I want to remove certain samples from the dataset and I have not been able to do so. The code I have tried so far:

> library(SummarizedExperiment)

> data <- readRDS("ABC.rds")

> colData(data)[1:5, 1:2]

> data

Output is:

class: RangedSummarizedExperiment
dim: 20115 424
assays(2): counts logCPM
rownames(20115): 1 2 ... 102724473 103091865
rowRanges metadata column names(3): symbol txlen txgc
colnames(424): TCGA.KL.AAAAA TCGA.KL.BBBBBB ... TCGA.KL.ZZZZZ
colData names(549): type bcr_patient_uuid

And the output of the colData call is:

TCGA.KL.AAAAA na

TCGA.KL.BBBBB na

TCGA.KL.CCCCC na

TCGA.KL.DDDD na

When I do batch identification with the following code:

> TSS <- substr(colnames(data), 6, 7)

> table(TSS)

Output is:

> TSS

KJ KJ1 KJ2 KJ3
30 0 1 16

I want to remove the samples (for example, TCGA.KL.AAAAAA or any other) that have KJ1 or KJ2 as their batch. However, since the dataset is structured very differently from TSS, removing KJ1 and KJ2 from TSS does not remove the related samples from the dataset:

> TSS <- TSS[!(TSS %in% c('KJ1','KJ2'))]

Output is:

KJ KJ3
30 16

However, the dataset itself still has the same dimensions (20115 x 424). I want it to have fewer samples because I am removing some batches. How can I remove the samples associated with specific batches?
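
For reference, filtering the TSS vector only changes that vector; the SummarizedExperiment object is never touched. A minimal sketch of one possible approach follows, assuming the batch label really can be read from the column names with substr() as above. Samples are the columns of the object, so a logical vector placed in the second index of `[` should keep only the wanted ones. The toy object, its sample names, and the batch codes K1/K2 are made up for illustration and are not taken from ABC.rds.

```r
library(SummarizedExperiment)

# Toy stand-in for the real object (hypothetical data, not ABC.rds):
# 4 genes x 6 samples, with made-up TCGA-style column names.
counts <- matrix(
  1:24, nrow = 4,
  dimnames = list(
    paste0("gene", 1:4),
    c("TCGA.KJ.A", "TCGA.K1.B", "TCGA.KJ.C",
      "TCGA.K2.D", "TCGA.K3.E", "TCGA.KJ.F")
  )
)
toy <- SummarizedExperiment(assays = list(counts = counts))

# Batch label per sample, extracted the same way as in the question.
tss <- substr(colnames(toy), 6, 7)

# TRUE for samples to keep, FALSE for samples in the unwanted batches.
keep <- !(tss %in% c("K1", "K2"))

# Subset the object itself; rows (genes) are left alone, columns are filtered.
toy_filtered <- toy[, keep]

dim(toy)
dim(toy_filtered)
table(substr(colnames(toy_filtered), 6, 7))
```

In this toy case the object goes from six columns to four while the number of rows stays the same. Applied to the real data, the equivalent call would be along the lines of data[, !(TSS %in% c('KJ1','KJ2'))], using the full, unfiltered TSS vector so that it still has one entry per column.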
