Splitting data frame based on median value
3.4 years ago

Hi everyone, I am working on a TCGA cancer cohort where I have RNA counts and clinical information merged into one big file. I want to split this file into higher- and lower-expression groups based on the median value of one gene. I tried two R scripts, but neither works as I expected. The first splits the data frame but keeps only the gene count matrix without the matched clinical information, which is not what I wanted. The second is so memory-intensive that it takes ages to run and then ends with an error.

First:

med<-median(df2$gene)
upper_median<-df[which(df2$gene >= med)]
lower_median<-df[which(df2$gene < med)]

Second:

med<-median(df2$gene)
upper<-split(df, which(df$gene >= med), drop = TRUE)
lower<-split(df, which(df$gene < med), drop = TRUE)

Any idea what I am missing or doing wrong??

Thank you very much! Imran

R RNA-Seq • 533 views
3.4 years ago

I managed to solve the issue. Sharing the script in case anyone has the same problem. The index needs a trailing comma inside the brackets so that rows are selected rather than columns:

med<-median(df2$gene)
upper_median<-df[which(df2$gene >= med),]
lower_median<-df[which(df2$gene < med),]
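
For anyone who wants to check the behaviour on something small, here is a minimal sketch on a toy data frame. The column names (sample, gene, stage) and the values are made up for illustration and are not from the TCGA file. It shows the row subsetting from the script above, plus a split() variant. My understanding is that split() expects a grouping vector with one value per row, so passing which(...) (a vector of row positions) probably creates one group per index, which would explain the memory use in the second attempt.

```r
# Toy stand-in for the merged table: one expression column plus clinical columns.
# Column names and values are hypothetical.
df <- data.frame(
  sample = paste0("S", 1:6),
  gene   = c(5, 12, 7, 30, 18, 2),
  stage  = c("I", "II", "I", "III", "II", "I"),
  stringsAsFactors = FALSE
)

med <- median(df$gene)  # add na.rm = TRUE if the column may contain NAs

# Row index followed by a comma: keeps whole rows with all columns.
upper_median <- df[which(df$gene >= med), ]
lower_median <- df[which(df$gene <  med), ]

# split() variant: the grouping vector has one label per row.
groups <- split(df, ifelse(df$gene >= med, "upper", "lower"))
```

Here groups$upper and groups$lower hold the same rows as upper_median and lower_median, with the clinical columns still attached.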