Question: Gene-based survival analysis with the "survival" package in R
0
gravatar for modarzi
5 months ago by
modarzi70
modarzi70 wrote:

Hello,

My data come from a subtype of TCGA breast cancer, and I need to run a survival analysis on this data set. I know the "survival" package, but I don't know how to bring gene profiles into the analysis. For an overall survival (OS) analysis I can use "survfit" with "Time" and "OS status" and get what I need, but I don't know how to feed my transcriptome profile into the survival analysis. In other words, I want to know how to do gene-based survival analysis with the "survival" package or another good package. While looking into this, I found the Bioconductor package "RTCGA", but I could not customize the arguments of "survivalTCGA" to run a survival analysis on a breast cancer subtype. I would deeply appreciate it if you shared your comments with me.

Best Regards,

Mohammad Darzi

chris86 (London, United Kingdom) wrote 5 months ago:

I do survival analysis with the (patient) consensus cluster as an independent variable explaining survival time. This is an example:

library(survival)
coxph(Surv(time = Time, event = Death) ~ as.factor(consensuscluster), data = myresults, ties = "exact")

Perhaps this helps. I'd cluster your data first, which assigns a cluster to every patient, and go from there. The genes associated with each cluster can then be analysed separately. In my experiments on a single platform, CLEST and M3C have been good class discovery/clustering packages.


Dear Dr. Chris

Hello,

Thanks for your comment. Let me explain more about my sample and patient data. My gene profiles are in a data frame with N samples and M genes, which I call "mydata", with the structure below:

                   gene1     gene2    gene3 …….    geneM  
Sample 1 
Sample 2
Sample 3
…..
Sample N

My patient data, which I call "myclinicaldata", have the structure below:

                   Time     OS Status    
Sample 1            100          1
Sample 2            200          0
Sample 3            150          1
…..
Sample N            400          0

OK. I clustered "mydata" by genes and now have 20 clusters, which I called cluster01, cluster02, cluster03, ..., cluster20; each cluster contains some genes. I would like to bring each cluster into my survival analysis.

Once again, I would deeply appreciate it if you shared an R solution for my scenario.

Best Regards

Reply written 5 months ago by modarzi

Well, you basically take the output of your clustering algorithm, join it to your survival data in the correct sample order, and then put it into the equation above.
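
A minimal sketch of that joining step, assuming sample IDs are available as row names of the clinical table and as names of the cluster vector. The data are simulated, and the names clinical and cluster_assignment are hypothetical; the event column is called Death only to match the formula above.

```r
library(survival)

set.seed(1)
n <- 60
sample_ids <- paste0("Sample", seq_len(n))

# Simulated clinical table (hypothetical values): follow-up time and death indicator
clinical <- data.frame(
  Time  = round(rexp(n, rate = 1 / 300)) + 1,
  Death = rbinom(n, size = 1, prob = 0.6),
  row.names = sample_ids
)

# Simulated per-patient cluster labels, deliberately stored in a different sample order
cluster_assignment <- setNames(sample(1:3, n, replace = TRUE), sample(sample_ids))

# Align by sample ID rather than by position before combining
myresults <- clinical
myresults$consensuscluster <- factor(cluster_assignment[rownames(clinical)])
stopifnot(!anyNA(myresults$consensuscluster))

# Cox model with the cluster label as the explanatory variable
fit <- coxph(Surv(time = Time, event = Death) ~ consensuscluster, data = myresults)
summary(fit)
```

Indexing the cluster vector by rownames(clinical) is what keeps the two tables in the same order; binding them by position would silently mismatch patients if the clustering output is sorted differently.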

Twenty clusters sounds a bit high to me. How many patients do you have? I usually get fewer than 10 clusters with 100-800 patients. What method did you use?

Reply written 5 months ago by chris86

The values in my last message were not real; they were only there to present my problem. Since you ask, here are the real numbers. I used WGCNA to construct my network, and my experiment has 53 patients and 56,000 gene types. With this method I get 62 modules (clusters). Honestly, that is more modules than I expected, and I don't know why. I would be happy if you shared your experience with me.

Reply written 5 months ago by modarzi
raunakms (Vancouver, BC, Canada) wrote 5 months ago:

The main strategy here is to first use the information on the gene to stratify the patients into groups (for example, high vs. low gene expression, or mutated vs. non-mutated gene), and only then perform the survival analysis. Then call survfit() as follows:

survfit(Surv(OS_time, VitalStatus == 1) ~ Patient_group, data=df)

Here, OS_time is the overall survival time; VitalStatus is the vital status (Dead = 1, Alive = 0); Patient_group is the grouping of patients derived from the gene of interest; and df is the data frame containing these columns.
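
As a hedged illustration of the stratification step, the sketch below splits patients at the median expression of one gene and compares the two groups. The data are simulated, gene1 is a placeholder name, and the median cut-off is an assumption rather than something prescribed above; other cut-offs (for example tertiles) are equally possible.

```r
library(survival)

set.seed(42)
n <- 80

# Simulated data frame (hypothetical values): one row per patient
df <- data.frame(
  OS_time     = round(rexp(n, rate = 1 / 400)) + 1,
  VitalStatus = rbinom(n, size = 1, prob = 0.5),  # Dead = 1, Alive = 0
  gene1       = rnorm(n, mean = 8, sd = 2)        # expression of the gene of interest
)

# Stratify patients at the median expression (an assumed cut-off)
df$Patient_group <- factor(
  ifelse(df$gene1 > median(df$gene1), "High", "Low"),
  levels = c("Low", "High")
)

# Kaplan-Meier curves per group
km <- survfit(Surv(OS_time, VitalStatus == 1) ~ Patient_group, data = df)
print(km)

# Log-rank test for a difference between the two curves
lr <- survdiff(Surv(OS_time, VitalStatus == 1) ~ Patient_group, data = df)
print(lr)

plot(km, col = c("blue", "red"), xlab = "Time", ylab = "Overall survival probability")
legend("topright", legend = levels(df$Patient_group), col = c("blue", "red"), lty = 1)
```

survfit() only estimates and plots the curves; survdiff() (or a coxph() fit on Patient_group) is what provides a test of whether the groups differ.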


Indeed. I go through a worked tutorial, using GEO data, here: Survival analysis with gene expression

Reply written 5 months ago by Kevin Blighe