Unreliable results of R while retrieving DEG
0
0
Entering edit mode
5.3 years ago
786 ▴ 50

Hi I was working on GSE55650 and received different number of differentially expressed genes just by changing the group format. Although the control and diseased were same in both types of groups but still get different number of genes. I just don't understand why is it so?Can anyone help me in understanding this?I'll be very thankful.

First situation:
groups = pData(phenoData(gse55650dat[[1]]))$Disease_states groups=as.character(groups) groups=sort(groups) groups[c(1:6)]="control" groups[c(7:12)]="diabetic" groups[c(13:17)]="control" groups[c(18:23)]="diabetic"  It gives 21 number of differentially expressed genes BH-adjusted p < 0.05 as the cut-off. Second situation: groups = pData(phenoData(gse55650dat[[1]]))$source_name_ch1
groups=as.character(groups)
groups[groups=="Control myoblasts"]="non.diabetic"
groups[groups=="Diabetes myoblasts"]="diabetic"
groups[groups=="Control myotubes"]="non.diabetic"
groups[groups=="Diabetes myotubes"]="diabetic"


It gives 14 number of differentially expressed genes at same p value criteria. Why this difference appears?Please help me understanding this it'll be highly appreciated.

R affy limma GEOquery • 1.1k views
0
Entering edit mode

can you please post the groups vector before and after the renaming? i.e. for both cases you tried, I would like to see the groups vector after the as.character(groups) thing and then after the whole renaming step :)

0
Entering edit mode
groups = pData(phenoData(gse55650dat[[1]]))$source_name_ch1 groups=as.character(groups) groups[groups=="Control myoblasts"]="non.diabetic" groups[groups=="Diabetes myoblasts"]="diabetic" groups[groups=="Control myotubes"]="non.diabetic" groups[groups=="Diabetes myotubes"]="diabetic" f = factor(groups, levels=c("non.diabetic","diabetic")) design_gse55650 = model.matrix(~0+f) colnames(design_gse55650) = levels(f) cont.matrix = makeContrasts(diabetic-non.diabetic, levels=design_gse55650) fit = lmFit(gse55650eset, design_gse55650) fit2= contrasts.fit(fit, cont.matrix) fit2 = eBayes(fit2) fit2$genes$Symbol = getSYMBOL(fit2$genes$ID, "hgu133plus2") fit2$genes$GeneName = unlist(mget(fit$genes\$ID, hgu133plus2GENENAME))

results = topTable(fit2, adjust ="BH", number = nrow(gse55650eset))

0
Entering edit mode

Thanks but I mean can you actually print the groups vector? :)