I don't think clustering or other subgroup discovery methods would really be appropriate to perform on a combined data set. You could try applying the LIMMA/voom transformation to the RNAseq data: it corrects for total library size, log-transforms the counts, and attempts to capture the mean-variance relationship. Then, for each data set separately, you could z-scale each gene's expression values. This might put the two platforms on a comparable scale while accounting for the mean-variance relationship within the RNAseq data. Perhaps limit your examination to genes that are above some detection threshold in the RNAseq data (>10 raw reads in 50% of samples, or something similar). You could then try clustering or subgroup discovery; however, if your clustering solution consistently aligns with the two platforms, then you know some platform bias still exists in the data. Alternatively, perform a separate clustering analysis for each data set.
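To make the filtering and per-data-set z-scaling concrete, here is a minimal sketch in plain Python. It does not perform the voom step itself; the `rnaseq_log` values simply stand in for already transformed RNAseq data, and all names, thresholds and toy numbers are assumptions for illustration only.

```python
from statistics import mean, stdev

def passes_detection(raw_counts, min_reads=10, min_fraction=0.5):
    """True if a gene has more than min_reads raw reads in at least
    min_fraction of the samples (thresholds are illustrative)."""
    detected = sum(1 for c in raw_counts if c > min_reads)
    return detected >= min_fraction * len(raw_counts)

def zscale(values):
    """Centre one gene's values on 0 and scale to unit sample SD."""
    m, s = mean(values), stdev(values)
    if s == 0:
        # A constant gene carries no information after scaling.
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]

def zscale_dataset(expr):
    """expr maps gene -> list of log-scale values within ONE data set."""
    return {gene: zscale(vals) for gene, vals in expr.items()}

# Hypothetical toy data: raw RNAseq counts, log-scale RNAseq values
# (a stand-in for voom output), and log-scale microarray intensities.
rnaseq_counts = {"geneA": [50, 80, 0, 120], "geneB": [0, 3, 12, 1]}
rnaseq_log = {"geneA": [5.1, 5.8, 0.5, 6.4], "geneB": [0.5, 1.6, 3.2, 0.9]}
microarray_log = {"geneA": [7.2, 7.9, 6.8, 8.1], "geneB": [4.0, 4.1, 4.3, 3.9]}

# Filter on the RNAseq raw counts, then scale each platform separately.
kept = [g for g, counts in rnaseq_counts.items() if passes_detection(counts)]
rnaseq_z = zscale_dataset({g: rnaseq_log[g] for g in kept})
microarray_z = zscale_dataset({g: microarray_log[g] for g in kept})
```

Scaling each platform on its own is the point here: z-scaling the pooled samples would let the platform offset dominate each gene's mean and variance.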
If you're interested in finding differentially expressed genes, then one acceptable approach might be to model each data set separately using methods appropriate for each data type, and then combine the resulting test statistics using a meta-analytic method. On the other hand, a per-gene meta-analysis might require both the microarray and RNAseq test statistics (e.g. p-value and effect size) to be produced by the same statistical test. In that case, you might consider normalizing the RNAseq data following the LIMMA voom approach. Supposedly this renders the data suitable for parametric analyses, i.e. it might be appropriate to use the same statistical model as used for the microarray, which would facilitate the meta-analysis.
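As one possible way to combine per-gene results from the two platforms, here is a rough sketch of Stouffer's weighted Z method in plain Python. This is only one of several meta-analytic options, not a recommendation specific to this problem; the function names and the idea of weighting by something like the square root of the sample size are assumptions for illustration.

```python
from math import copysign, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and SD 1

def signed_z(p_two_sided, effect):
    """Convert a two-sided p-value (0 < p <= 1) and the sign of the
    effect size into a signed z-score."""
    z = _STD_NORMAL.inv_cdf(1.0 - p_two_sided / 2.0)
    return copysign(z, effect)

def stouffer(z_scores, weights):
    """Weighted Stouffer combination of per-study z-scores for one gene.
    Returns the combined z and its two-sided p-value."""
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    combined_z = numerator / sqrt(sum(w * w for w in weights))
    p = 2.0 * (1.0 - _STD_NORMAL.cdf(abs(combined_z)))
    return combined_z, p

# Hypothetical gene: up in both platforms, p = 0.05 in each.
# Weights might be sqrt(sample size) per study; equal weights here.
z_array = signed_z(0.05, effect=+0.8)
z_rnaseq = signed_z(0.05, effect=+1.1)
combined_z, combined_p = stouffer([z_array, z_rnaseq], weights=[1.0, 1.0])
```

Keeping the sign of the effect matters: a gene that goes up on one platform and down on the other should not come out as significant, which is what happens if unsigned p-values are combined naively.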
I found this article https://peerj.com/articles/1621/ quite interesting. They get good results with quantile normalization [targeted, meaning that they adapt a target dataset (RNA-seq) to a reference dataset (microarray)] and TDM (Training Distribution Matching) methods. To check the code they use, just go to the Supplementary Information.
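To give a feel for what "targeted" quantile normalization means, here is a generic sketch in plain Python that maps one sample's values onto the distribution of a reference set by rank. It is not the article's code (see their Supplementary Information for that); the function name, the tie handling and the toy numbers are assumptions.

```python
def targeted_quantile_normalize(target, reference):
    """Map each value in `target` to the value at the same quantile of
    `reference`. Ranks are preserved; only the distribution changes.
    Ties in `target` are broken by order of appearance (a simplification)."""
    if not target or not reference:
        raise ValueError("target and reference must be non-empty")
    ref_sorted = sorted(reference)
    n, m = len(target), len(ref_sorted)
    # Indices of target values from smallest to largest.
    order = sorted(range(n), key=lambda i: target[i])
    out = [0.0] * n
    for rank, idx in enumerate(order):
        # Quantile of this value within the target, between 0 and 1.
        q = rank / (n - 1) if n > 1 else 0.5
        # Fractional position in the sorted reference, then interpolate.
        pos = q * (m - 1)
        lo = int(pos)
        hi = min(lo + 1, m - 1)
        frac = pos - lo
        out[idx] = ref_sorted[lo] * (1.0 - frac) + ref_sorted[hi] * frac
    return out

# Hypothetical example: three RNA-seq values mapped onto five
# microarray reference values.
rnaseq_sample = [3.0, 100.0, 10.0]
microarray_reference = [1.0, 2.0, 3.0, 4.0, 5.0]
normalized = targeted_quantile_normalize(rnaseq_sample, microarray_reference)
```

The ordering of genes within the RNA-seq sample is unchanged, but the values now follow the microarray distribution, which is the sense in which the target is adapted to the reference.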