I have been given a dataset from an Affymetrix microarray experiment and I'm trying to analyze it. I have read a lot of papers and tutorials so that I can take it step by step.
The data are in a data.frame like this:
| Gene/treat | Control_1 | Control_2 | Control_3 | TreatA_1 | TreatA_2 | TreatA_3 |
|------------|:---------:|----------:|----------:|---------:|---------:|---------:|
| gene1 | 2.65 | 3.01 | 2.20 | 3.65 | 4.01 | 3.25 |
| gene2 | 1.54 | 1.27 | 2.01 | 2.65 | 3.11 | 2.90 |
| gene3 | 1.34 | 1.00 | 2.50 | 1.65 | 2.01 | 2.24 |
The values shown are random placeholders; the real table has many more than three genes and more treatments (TreatB_1, TreatB_2, TreatB_3, etc.).
So far I have done the normalization step using Bioconductor's `normalize.quantiles()` function.
For further analysis (to create MA plots and test for differential expression), a summarization step is needed. As I understand from what I have read, one common way to do this is the median polish method used in the RMA algorithm (but I don't have the CEL files, so I can't run RMA itself).
If I am right, applying that method gives a final matrix with one value per treatment, and after that you can create MA plots and move on to the differential expression step.
I have searched all over the net for a way to summarize my table but couldn't find a solution. For example, the `medpolish` function in R gives me the overall median, the row and column effects, and the residuals of the additive model behind median polish, but I am not sure how to combine these values to get the correct expression value of a gene for each array.
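To make clear what I mean, here is a minimal sketch of how the `medpolish` components fit together. It assumes a probe-level matrix for a single probe set (rows are probes, columns are arrays), which is not what my table contains; the probe names and values are made up. As far as I understand, the RMA-style summary for each array would be the overall term plus that array's column effect.

```r
# Toy probe-level matrix for ONE probe set: rows = probes, columns = arrays.
# Values are invented (log2 scale) purely for illustration.
probes <- matrix(
  c(2.65, 3.01, 2.20, 3.65,
    2.55, 2.95, 2.30, 3.70,
    2.80, 3.10, 2.10, 3.50),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("probe", 1:3),
                  c("Control_1", "Control_2", "Control_3", "TreatA_1"))
)

# Fit the additive model by median polish (stats::medpolish, base R)
fit <- medpolish(probes, trace.iter = FALSE)

# Additive model: value = overall + row effect + column effect + residual
fitted_vals   <- fit$overall + outer(fit$row, fit$col, "+")
reconstructed <- fitted_vals + fit$residuals  # gives back the input matrix

# Assumed RMA-style summary: one expression value per array for this probe set
expr_per_array <- fit$overall + fit$col
print(expr_per_array)
```

If that reading is right, the row effects and residuals are simply discarded and only `overall + col` is kept per array.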
Can someone help me, or give me a hint or an example, on how to get a summarized matrix that looks like the table at the end of this post?
Also, if you think that I am approaching this the wrong way, I would be thankful if you told me.
| Gene/treat | Control_Sum | TreatA_Sum |
|------------|:-----------:|-----------:|
| gene1 | 2.45 | 3.31 |
| gene2 | 1.24 | 1.47 |
| gene3 | 1.54 | 2.00 |
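For illustration, the shape of the output I am after could be produced with something like the sketch below. It collapses the replicates of each group with a plain row-wise median, which is my own stand-in and not median polish, and it assumes the values are on a log2 scale and that the group name is the column name without the trailing replicate number. The numbers it produces will not match the table above, since those are placeholders.

```r
# Example data from the first table (placeholder values)
expr <- data.frame(
  Control_1 = c(2.65, 1.54, 1.34),
  Control_2 = c(3.01, 1.27, 1.00),
  Control_3 = c(2.20, 2.01, 2.50),
  TreatA_1  = c(3.65, 2.65, 1.65),
  TreatA_2  = c(4.01, 3.11, 2.01),
  TreatA_3  = c(3.25, 2.90, 2.24),
  row.names = c("gene1", "gene2", "gene3")
)

# Group label = column name without the trailing "_<replicate number>"
groups <- sub("_[0-9]+$", "", colnames(expr))

# One column per group: row-wise median over that group's replicates
summarized <- sapply(unique(groups), function(g) {
  apply(as.matrix(expr[, groups == g, drop = FALSE]), 1, median)
})
colnames(summarized) <- paste0(unique(groups), "_Sum")
print(summarized)

# MA values for TreatA vs Control, assuming log2-scale input
M <- summarized[, "TreatA_Sum"] - summarized[, "Control_Sum"]        # log ratio
A <- (summarized[, "TreatA_Sum"] + summarized[, "Control_Sum"]) / 2  # mean log intensity
plot(A, M, main = "MA plot: TreatA vs Control")
abline(h = 0, lty = 2)
```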