I would like to use public omics datasets (ChIP-seq, RNA-seq, and ATAC-seq) from different studies to do an integrative analysis as follows:
- Normalise samples, within each type of omics, from different public datasets.
- Convert the normalised values to a common scale so that ChIP-seq, RNA-seq and ATAC-seq signals can be compared with each other.
- Feed the normalised, rescaled values into a machine learning model to predict one feature (e.g. RNA expression) from the other features (e.g. TF or histone-mark ChIP-seq).
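
To make the plan concrete, here is a minimal Python sketch of the three steps on simulated data. Everything in it is a placeholder I picked for illustration, not a recommendation: the matrices are random numbers rather than real datasets, quantile normalisation stands in for the within-assay step, z-scoring stands in for the common scale, and a closed-form ridge regression stands in for the machine learning model. The function names (`quantile_normalise`, `gene_signal`, `ridge_fit`) are hypothetical.

```python
import numpy as np

def quantile_normalise(x):
    """Give every sample (column) of a genes x samples matrix the same distribution."""
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)  # rank of each value within its column
    reference = np.sort(x, axis=0).mean(axis=1)        # mean of the sorted columns
    return reference[ranks]

def gene_signal(counts):
    """Step 1: log-transform, quantile-normalise, then average the samples per gene."""
    return quantile_normalise(np.log1p(counts)).mean(axis=1)

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression without intercept (inputs are assumed centred)."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

# Simulated stand-ins for per-gene signal from 4 samples per assay (NOT real data).
rng = np.random.default_rng(0)
n_genes = 500
latent = rng.normal(size=n_genes)            # shared "activity" driving all assays
depth = np.array([1.0, 2.0, 0.5, 3.0])       # per-sample scaling, e.g. sequencing depth

def simulate(weight):
    noise = rng.normal(scale=0.3, size=(n_genes, depth.size))
    return np.exp(weight * latent[:, None] + noise) * depth

chip, atac, rna = simulate(1.0), simulate(0.5), simulate(1.0)

# Step 1: normalise within each assay.
chip_n = quantile_normalise(np.log1p(chip))
X = np.column_stack([gene_signal(chip), gene_signal(atac)])
y = gene_signal(rna)

# Step 2: put all assays on a common scale (z-scores from the training genes only).
train, test = slice(0, 400), slice(400, None)
Xz = (X - X[train].mean(axis=0)) / X[train].std(axis=0)
yz = (y - y[train].mean()) / y[train].std()

# Step 3: predict RNA from ChIP-seq and ATAC-seq, then check on held-out genes.
w = ridge_fit(Xz[train], yz[train])
r = np.corrcoef(Xz[test] @ w, yz[test])[0, 1]
print("weights:", w, "held-out Pearson r:", r)
```

With real data, the samples would come from different studies, so I assume a batch-correction step would also be needed somewhere between steps 1 and 2; the sketch ignores that entirely.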
Does anyone have experience with this type of analysis? I would like to hear about preferred approaches, problems, caveats, etc. that I need to take care of before I start working on it.