Multivariate logistic regression for gene expression
9 months ago
silsie645 ▴ 20

I am working on a project in which I need to perform univariate and multivariate regression analyses for a specific set of 14 cancer-related genes. I ran the univariate analysis with:

#testing each gene with the data
Res1 <- RegParallel(
  data = coxdata,
  formula = 'Surv(OverallSurvival_months, Death) ~ [*]',
  FUN = function(formula, data) coxph(formula = formula, data = data, ties = 'breslow', singular.ok = TRUE),
  FUNtype = 'coxph',
  variables = colnames(coxdata)[9:ncol(coxdata)],
  blocksize = 2000, cores = 2, nestedParallel = FALSE, conflevel = 95)
Res2 <- RegParallel(
  data = coxdata,
  formula = 'Surv(TumorFreeSurvival_months, Rezidiv) ~ [*]',
  FUN = function(formula, data) coxph(formula = formula, data = data, ties = 'breslow', singular.ok = TRUE),
  FUNtype = 'coxph',
  variables = colnames(coxdata)[9:ncol(coxdata)],
  blocksize = 2000, cores = 2, nestedParallel = FALSE, conflevel = 95)

#filtering by logrank<0.01
Res1 <- Res1[order(Res1$LogRank, decreasing = FALSE),]
final1 <- subset(Res1, LogRank < 0.01)
probe1 <- gsub('^X', '', final1$Variable)

Res2 <- Res2[order(Res2$LogRank, decreasing = FALSE),]
final2 <- subset(Res2, LogRank < 0.01)
probe2 <- gsub('^X', '', final2$Variable)

#annotating top hits with biomart
mart <- useMart('ENSEMBL_MART_ENSEMBL', host='useast.ensembl.org')
mart <- useDataset("hsapiens_gene_ensembl", mart)
annotLookup1 <- getBM(mart = mart, attributes = c('affy_hg_u133a', 'ensembl_gene_id', 'gene_biotype', 'external_gene_name'), filters = 'affy_hg_u133a', values = probe1, uniqueRows = TRUE)
annotLookup2 <- getBM(mart = mart, attributes = c('affy_hg_u133a', 'ensembl_gene_id', 'gene_biotype', 'external_gene_name'), filters = 'affy_hg_u133a', values = probe2, uniqueRows = TRUE)

#extract OS data for downstream analysis
survplotdata1<-coxdata[,c('OverallSurvival_months','Death','X205027_s_at')]
colnames(survplotdata1)<- c ('OverallSurvival_months','Death','TPL2')

#set Z-scale cut-offs for high and low expression
highExpr<- 1.0
lowExpr<- -1.0
survplotdata1$TPL2 <- ifelse(survplotdata1$TPL2 >= highExpr, 'High', ifelse(survplotdata1$TPL2 <= lowExpr, 'Low', 'Mid'))
#relevelling factors to have mid as ref level
survplotdata1$TPL2 <- factor(survplotdata1$TPL2, levels = c('Mid', 'Low', 'High'))
ggsurvplot(survfit(Surv(OverallSurvival_months,Death)~TPL2,data=survplotdata1),data=survplotdata1,risk.table=TRUE,pval=TRUE,ggtheme=theme_pubr(), risk.table.y.text.col=TRUE,risk.table.y.text=FALSE,xlab='Time (months)')


I realised that only one of my plots had a p-value < 0.05. Can I still perform the multivariate logistic regression, and if so, how do I do it?
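
To make the question concrete, here is a rough sketch of what I understand a multivariable model to be: all candidate genes entered into one formula instead of one gene at a time. It is only an illustration under assumptions. The data frame `toy` and the columns `geneA`, `geneB` and `geneC` are simulated and hypothetical (they stand in for `coxdata` and my probe columns), and the sketch shows both a Cox model with `coxph()` from the survival package and a plain logistic model with `glm()`, since the two answer different questions (time to event versus event yes/no).

```r
library(survival)

# Simulated stand-in for coxdata (hypothetical values, not real results)
set.seed(1)
n <- 200
toy <- data.frame(
  OverallSurvival_months = rexp(n, rate = 0.05),  # follow-up time
  Death = rbinom(n, 1, 0.6),                      # 1 = event, 0 = censored
  geneA = rnorm(n),                               # Z-scaled expression (assumed)
  geneB = rnorm(n),
  geneC = rnorm(n)
)

# Hypothetical gene list; in practice this would be the genes of interest
genes <- c('geneA', 'geneB', 'geneC')
rhs <- paste(genes, collapse = ' + ')

# Multivariable Cox model: all genes in one formula
cox_formula <- as.formula(paste('Surv(OverallSurvival_months, Death) ~', rhs))
cox_fit <- coxph(cox_formula, data = toy, ties = 'breslow')
summary(cox_fit)
exp(coef(cox_fit))  # hazard ratios, each adjusted for the other genes

# Multivariable logistic model: ignores follow-up time, models Death as 0/1
logit_formula <- as.formula(paste('Death ~', rhs))
logit_fit <- glm(logit_formula, data = toy, family = binomial)
summary(logit_fit)
exp(coef(logit_fit))  # odds ratios (the first entry is the intercept)
```

With real data, `genes` would hold the column names of the selected probes, and the continuous expression values could be used directly instead of the High/Mid/Low categories.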

r microarray • 291 views