I am starting out with bulk ATAC-seq data as BED files that include read counts. I want to use this data with a package called MOFA, which requires these preprocessing steps:
Normalisation: For count-based data such as RNA-seq or ATAC-seq we recommend size factor normalisation + variance stabilisation (i.e. a log transformation).
Feature selection: It is strongly recommended that you select highly variable features (HVGs) per assay before fitting the model. This ensures a faster training and a more robust inference procedure. Also, for data modalities that have very different dimensionalities we suggest a stronger feature selection for the bigger views, with the aim of reducing the feature imbalance between data modalities.
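To make concrete what I understand these two steps to mean, here is a minimal sketch in plain Python (not R, and not MOFA's own code). The toy matrix, the peak names, and the helper functions `size_factors`, `log_normalise` and `top_variable` are all made up for illustration. I am assuming simple library-size scaling for the size factors and a log2 transform with a pseudocount of 1 as the variance stabilisation; other size factor methods exist and may be more appropriate for real data.

```python
import math
from statistics import mean, pvariance

# Hypothetical toy matrix: rows are peaks, columns are samples (raw read counts).
counts = {
    "peak_1": [10, 20, 40],
    "peak_2": [100, 200, 400],
    "peak_3": [5, 50, 10],
    "peak_4": [0, 0, 0],
}

def size_factors(counts):
    """Library-size factors: each sample's total count divided by the mean total."""
    totals = [sum(column) for column in zip(*counts.values())]
    average_total = mean(totals)
    return [total / average_total for total in totals]

def log_normalise(counts, factors, pseudocount=1.0):
    """Divide each sample by its size factor, then apply log2(x + pseudocount)."""
    return {
        peak: [math.log2(c / f + pseudocount) for c, f in zip(row, factors)]
        for peak, row in counts.items()
    }

def top_variable(log_counts, n):
    """Return the n peaks with the highest variance across samples."""
    ranked = sorted(log_counts, key=lambda peak: pvariance(log_counts[peak]), reverse=True)
    return ranked[:n]

factors = size_factors(counts)
log_counts = log_normalise(counts, factors)
selected = top_variable(log_counts, n=2)  # assumed cutoff, chosen only for the toy data
print(selected)
```

With real data, `counts` would be a peaks-by-samples matrix built from the BED files, and `n` would be a few thousand peaks rather than 2.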
I am finding a lot of information on how to do this with single-cell data in R, but not for bulk data. Are there any tutorials for how to do these steps with bulk data?