I have a dataset of cell viability z-scores derived from siRNAs targeting 792 genes in a panel of 4 p53 WT and 6 p53 mutant breast cell lines.
What I am trying to do is work out which genes are differentially required between the p53 WT and p53 mutant cell lines.
I want to find the top 50 genes that lead to a loss of viability in p53 mutant cell lines compared to the p53 WT cell lines and display these in a heatmap that clusters p53 WT cell lines together and p53 mutant cell lines together.
The data is in a CSV file and looks like this when opened in Excel:
Row 1 is p53 mutational status
Row 2 is cell line name
Row 3 onwards is gene name followed by the z-scores for each cell line
I have been trying to do this using heatmaps in R, but as I have no background in this I am getting nowhere. I have tried to make a data matrix, but the problem seems to be that I have two header rows (p53 mutation status in row 1, cell line name in row 2).
Getting rid of the cell line names (row 2) might make it simpler, so that columns are labelled either "p53 WT" or "p53 mutant".
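One possible way around the two header rows, without throwing away the cell line names, is to read the headers and the body of the file separately. The sketch below is only an illustration: the file contents, gene names, cell line names and values are made up, and it assumes the first column holds the gene name with an empty cell above it in rows 1 and 2.

```r
# Hypothetical stand-in for the real file, in the layout described above.
# All names and values here are invented for illustration.
csv_text <- c(
  ",p53 WT,p53 WT,p53 mutant,p53 mutant",
  ",LineA,LineB,LineC,LineD",
  "GENE1,0.1,-0.2,-2.5,-3.0",
  "GENE2,-1.0,-1.2,0.3,0.1",
  "GENE3,0.5,0.4,0.6,0.2"
)
path <- tempfile(fileext = ".csv")
writeLines(csv_text, path)

# Read only the two header rows; drop the first (gene name) column.
hdr <- read.csv(path, header = FALSE, nrows = 2, stringsAsFactors = FALSE)
status    <- unlist(hdr[1, -1], use.names = FALSE)  # "p53 WT" / "p53 mutant"
cell_line <- unlist(hdr[2, -1], use.names = FALSE)  # cell line names

# Read the z-scores, skipping the two header rows.
body <- read.csv(path, header = FALSE, skip = 2, stringsAsFactors = FALSE)

# Numeric matrix: genes in rows, cell lines in columns.
z <- as.matrix(body[, -1])
rownames(z) <- body[[1]]
colnames(z) <- cell_line
```

This keeps `status` as a separate vector that lines up with the columns of `z`, so the mutation status can still be used for grouping while the cell line names stay as column labels.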
Any comments on how best to determine the top 50 differentially required genes would be gratefully received. If anyone could guide me through how to do this in R, that would be super.
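To make the question concrete, here is a rough sketch of one way the ranking and heatmap could be done in base R. It is not a recommended analysis, just a starting point under some assumptions: the data is simulated random noise standing in for the real z-scores, genes are ranked by the difference in mean z-score (mutant minus WT), and a per-gene Welch t-test with Benjamini-Hochberg adjustment is included only as a rough guide, since 4 versus 6 cell lines gives little statistical power.

```r
set.seed(1)  # simulated stand-in data only; these are not real results
n_genes <- 792
status <- rep(c("p53 WT", "p53 mutant"), times = c(4, 6))
z <- matrix(rnorm(n_genes * length(status)), nrow = n_genes,
            dimnames = list(paste0("GENE", seq_len(n_genes)),
                            paste0("Line", seq_along(status))))

is_mut <- status == "p53 mutant"

# Difference in mean z-score per gene; negative means lower viability
# in the p53 mutant lines than in the p53 WT lines.
diff_mean <- rowMeans(z[, is_mut]) - rowMeans(z[, !is_mut])

# Welch t-test per gene, then Benjamini-Hochberg adjustment.
p_val <- apply(z, 1, function(x) t.test(x[is_mut], x[!is_mut])$p.value)
p_adj <- p.adjust(p_val, method = "BH")

res <- data.frame(gene = rownames(z), diff_mean, p_val, p_adj,
                  stringsAsFactors = FALSE)
res <- res[order(res$diff_mean), ]  # most negative difference first
top50 <- head(res$gene, 50)

# Put WT columns first and mutant columns second. Colv = NA stops
# heatmap() from reordering the columns, so the groups stay together.
col_order <- order(is_mut)
out_file <- file.path(tempdir(), "top50_heatmap.pdf")
pdf(out_file)
heatmap(z[top50, col_order], Colv = NA, scale = "none",
        ColSideColors = ifelse(is_mut[col_order], "firebrick", "steelblue"),
        margins = c(8, 6))
dev.off()
```

Note that fixing the column order groups the cell lines by mutation status rather than letting the clustering decide. Leaving out `Colv = NA` would cluster the columns instead, which would show whether the top 50 genes actually separate the p53 WT lines from the mutant lines.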
Many thanks in advance.