Problem with gene survival analysis using the "survival" package in R
2.9 years ago
modarzi ▴ 140

Hello,

My data belong to a subtype of TCGA breast cancer, and I have to do survival analysis on this data set. I know the "survival" package, but I don't know how to bring gene profiles into the survival analysis. Surely, for OS analysis, I can use "survfit" with "Time" and "OS status" to get what I need, but I really don't know how to use my transcriptome profile as input. In other words, I want to know how to do gene survival analysis with the "survival" package or another good package. While researching this, I found "RTCGA", a Bioconductor package, but I could not customize the arguments of "survivalTCGA" for survival analysis of a breast cancer subtype. I would deeply appreciate it if you shared your comments with me.

Best Regards,

TCGA Survival Analysis Gene Survival Analysis R • 1.0k views
2.9 years ago
chris86 ▴ 370

I do survival analysis with the (patient) consensus cluster as an independent variable explaining survival time. This is an example:

library(survival)
coxph(Surv(time = Time, event = Death) ~ as.factor(consensuscluster), data = myresults, ties = "exact")


Perhaps this helps. I'd cluster your data first, which assigns a cluster to every patient, then go from there. The genes associated with each cluster can then be analysed separately. In my experiments on a single platform, good class discovery/clustering packages have been CLEST and M3C.
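As a rough illustration of the "cluster first" step, here is a minimal sketch on simulated data. It uses plain hierarchical clustering from base R as a stand-in, not CLEST or M3C; the matrix `expr`, the choice of Euclidean distance with Ward linkage, and `k = 2` are all assumptions made for the example.

```r
# Toy expression matrix: 30 samples (rows) x 50 genes (columns); values are simulated.
set.seed(1)
expr <- matrix(rnorm(30 * 50), nrow = 30,
               dimnames = list(paste0("Sample", 1:30), paste0("gene", 1:50)))
# Shift the first 15 samples on the first 10 genes so that two groups exist.
expr[1:15, 1:10] <- expr[1:15, 1:10] + 3

# Cluster patients on scaled expression.
# Assumption: Euclidean distance + Ward linkage; CLEST or M3C would replace this step.
hc <- hclust(dist(scale(expr)), method = "ward.D2")

# One cluster label per patient, named by sample.
consensuscluster <- cutree(hc, k = 2)
table(consensuscluster)
```

The result is a named vector with one label per sample, which is the shape needed for the grouping variable in the Cox model above.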


Dear Dr. Chris

Hello,

Thanks for your comment. Let me explain more about my samples and patient data. My gene profiles are a data frame of N samples by M genes, which I call "mydata", with the structure below:

                   gene1     gene2    gene3 …….    geneM
Sample 1
Sample 2
Sample 3
…..
Sample N


And my patient data, which I call "myclinicaldata", have the structure below:

                   Time     OS Status
Sample 1            100          1
Sample 2            200          0
Sample 3            150          1
…..
Sample N            400          0


OK. I clustered "mydata" by genes, and now I have 20 clusters, which I call cluster01, cluster02, cluster03, ..., cluster20; each cluster contains some genes. I would like to bring each cluster into my survival analysis.

Once again, I would deeply appreciate it if you shared an R solution for my scenario.

Best Regards


Well, you basically take the output of your clustering algorithm, stick it together with your survival data in the correct order, and then put it into the equation above.
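A minimal sketch of that "stick it together in the correct order" step, on simulated data. The objects `myclinicaldata` and `clusters`, the column name `Death`, and all values are hypothetical; the point is only that labels are aligned by sample name rather than by row position.

```r
library(survival)

set.seed(2)
samples <- paste0("Sample", 1:40)

# Hypothetical clinical table, rows named by sample (simulated values).
myclinicaldata <- data.frame(
  Time  = round(rexp(40, rate = 1 / 300)) + 1,
  Death = rbinom(40, 1, 0.6),
  row.names = samples
)

# Hypothetical cluster labels, deliberately stored in a different sample order.
clusters <- setNames(sample(rep(1:2, each = 20)), sample(samples))

# Align the labels to the clinical rows by sample name, not by position.
idx <- match(rownames(myclinicaldata), names(clusters))
stopifnot(!anyNA(idx))  # every clinical sample must have a cluster label

myresults <- myclinicaldata
myresults$consensuscluster <- unname(clusters[idx])

# Cox model with the cluster as the explanatory variable.
fit <- coxph(Surv(time = Time, event = Death) ~ as.factor(consensuscluster),
             data = myresults)
summary(fit)
```

Using `match()` on sample names avoids silently pairing a patient's survival time with another patient's cluster when the two tables are sorted differently.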

20 clusters sounds a bit high to me. How many patients do you have? I usually get under 10 clusters with 100-800 patients. What method did you use?


All the values in my last message were made up, just to present my problem. But since you ask, I will describe the real data. I used WGCNA to construct my network, and my experiment has 53 patients and 56000 genes. With this method I get 62 modules (clusters). Honestly, that is more modules than I expected, and I don't know why. I would be happy if you shared your experience with me.

2.9 years ago
raunakms ★ 1.1k

The main strategy here is to first use the gene's information to stratify the patients into groups (for example, high vs. low gene expression, or mutated vs. non-mutated gene) and only then perform the survival analysis. Then call survfit() as follows:

survfit(Surv(OS_time, VitalStatus == 1) ~ Patient_group, data=df)


Here, OS_time is the overall survival time; VitalStatus is the vital status (Dead = 1, Alive = 0); Patient_group is the grouping of patients based on the gene of interest; and df is your data frame containing the data.
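To make the stratification step concrete, here is a minimal sketch on simulated data. The gene column `GENE_X` is hypothetical, and splitting at the median is only one common choice of cutoff, assumed here for illustration.

```r
library(survival)

set.seed(3)
n <- 60
# Simulated data frame; "GENE_X" is a hypothetical gene of interest.
df <- data.frame(
  OS_time     = round(rexp(n, rate = 1 / 500)) + 1,
  VitalStatus = rbinom(n, 1, 0.5),           # Dead = 1, Alive = 0
  GENE_X      = rnorm(n, mean = 8, sd = 2)   # e.g. log2 expression
)

# Stratify patients at the median expression of the gene (assumed cutoff).
df$Patient_group <- factor(ifelse(df$GENE_X > median(df$GENE_X), "High", "Low"),
                           levels = c("Low", "High"))

# Kaplan-Meier curves per group, then a log-rank test for the difference.
km <- survfit(Surv(OS_time, VitalStatus == 1) ~ Patient_group, data = df)
lr <- survdiff(Surv(OS_time, VitalStatus == 1) ~ Patient_group, data = df)

print(km)
print(lr)
plot(km, col = c("blue", "red"), xlab = "Time", ylab = "Overall survival")
legend("topright", legend = levels(df$Patient_group),
       col = c("blue", "red"), lty = 1)
```

`survfit()` only estimates the curves; `survdiff()` is what tests whether the High and Low groups differ.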


Indeed, I go through a working tutorial here, but with GEO data: Survival analysis with gene expression

7 weeks ago

Hi! Now you can do this task very easily with my recently developed tool, 'geneSA' (https://github.com/huynguyen250896/geneSA). Its output automatically reports the genes that are significantly associated with survival outcome. Give it a try ;)