Hello everyone, I'm new to bioinformatics analysis. I am analyzing datasets that I downloaded from the GEO database. My goal is to identify genes that are differentially expressed in cancer patients undergoing different treatments, using normal tissue expression data as controls. I am using the ExAtlas tool to perform this analysis. I have many doubts about the statistical parameters I should use, especially when performing the ANOVA. Could any of you give me a general overview of what the following terms mean, and which thresholds and error models you would recommend to make this analysis reliable? Thank you so much.
ANOVA Parameters:
- Cutoff (probes with maximum value below cutoff are ignored)
- Threshold z-value used to remove outliers (0 - 1000)
- Proportion of probes with high error variances to ignore in error models (0 - 0.5)
- Bayesian degrees of freedom (used with error models 3 or 5) (4 - 1000)
- Error model number (see the list below)
List of error models:
- 1 = Actual error variance for each probe
- 2 = Average error variance for probes with similar expression level
- 3 = Bayesian correction of error variance (Baldi & Long 2001)
- 4 = Maximum between actual and expected average error variances
- 5 = Maximum between actual and Bayesian error variances
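
To make the difference between the five error models more concrete, here is a minimal Python sketch of how each one could pick the error variance for a single probe. It is only an illustration under stated assumptions, not ExAtlas's actual code: the function names, the `window` parameter, and the nearest-neighbour averaging used for the "expected" variance are all hypothetical, and the Bayesian step uses the regularized variance formula from Baldi & Long (2001), which the tool may apply differently.

```python
from statistics import mean


def bayesian_variance(s2, n, prior_var, nu0):
    """Regularized variance in the style of Baldi & Long (2001).

    s2        -- actual (sample) error variance of the probe
    n         -- number of replicates behind s2
    prior_var -- expected variance for probes with similar expression
    nu0       -- "Bayesian degrees of freedom" (weight given to the prior)
    """
    return (nu0 * prior_var + (n - 1) * s2) / (nu0 + n - 2)


def local_average_variance(levels, variances, i, window=3):
    """Hypothetical stand-in for the "expected" variance (model 2):
    the mean variance of the `window` probes whose expression level
    is closest to that of probe i (probe i itself included)."""
    order = sorted(range(len(levels)), key=lambda j: abs(levels[j] - levels[i]))
    return mean(variances[j] for j in order[:window])


def error_variance(model, s2, expected, n, nu0):
    """Return the error variance that a given model number would use."""
    bayes = bayesian_variance(s2, n, expected, nu0)
    choices = {
        1: s2,                  # actual variance of the probe
        2: expected,            # average variance at similar expression
        3: bayes,               # Bayesian-corrected variance
        4: max(s2, expected),   # never below the expected variance
        5: max(s2, bayes),      # never below the Bayesian variance
    }
    return choices[model]
```

For example, a probe with three replicates whose actual variance happens to be very small (`s2=0.01`) while similar probes average `expected=0.04` keeps the tiny variance under model 1 but is pulled toward the expected value under models 3, 4 and 5. That is why the models that borrow information across probes tend to produce fewer false positives when there are few replicates, and a larger `nu0` pulls the estimate more strongly toward the expected variance.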