I need to combine two datasets obtained on two different platforms (Illumina and Affymetrix). The combined dataset contains gene expression for 11 cell types. For my purpose, I do not need the genes that are differentially expressed between one cell type and the others; instead, I need to find the upregulated genes of each cell type. To do this, I ranked the ~20,000 genes by expression within each sample and selected the genes that ranked in the top 20% in at least 80% of the replicates of a given cell type (all cell types have >=5 replicates). However, I am not sure how to estimate the statistical significance (e.g., FDR) of the selected genes. Any advice is appreciated. Also, does anybody know of a method that suits my purpose?
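
To make the selection step concrete, here is a minimal sketch of it in Python with NumPy. The function name `select_top_ranked_genes`, its parameters, and the genes-by-samples matrix layout are only illustrative, not taken from any package. It also assumes that the best rank means the highest expression and that "80% of the replicates" means at least 80%.

```python
import numpy as np

def select_top_ranked_genes(expr, cell_types, top_frac=0.20, rep_frac=0.80):
    """Return {cell type: indices of selected genes}.

    expr       : 2-D array, genes x samples (hypothetical layout)
    cell_types : one label per sample (column of expr)
    top_frac   : fraction of genes counted as "top ranked" within a sample
    rep_frac   : minimum fraction of replicates in which a gene must be top ranked
    """
    expr = np.asarray(expr, dtype=float)
    cell_types = np.asarray(cell_types)
    n_genes = expr.shape[0]

    # Rank genes within each sample; rank 0 is the most highly expressed gene.
    order = np.argsort(-expr, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")

    # Number of genes that make up the top fraction (at least one).
    n_top = max(1, int(round(top_frac * n_genes)))
    in_top = ranks < n_top

    selected = {}
    for ct in np.unique(cell_types):
        cols = cell_types == ct
        n_rep = int(cols.sum())
        # Count the replicates of this cell type in which each gene is top ranked.
        hits = in_top[:, cols].sum(axis=1)
        # Small tolerance guards against floating-point error in rep_frac * n_rep.
        selected[str(ct)] = np.flatnonzero(hits >= rep_frac * n_rep - 1e-9)
    return selected
```

Because the ranking is done within each sample, the selection itself does not depend on the two platforms sharing an intensity scale. It gives no p-value or FDR, though, which is the part I am stuck on.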
Thank you very much.