I want to find the master transcription factors (the transcription factors that govern the highest number of genes) per cell type in my animal. I will have single cell ATAC-seq and single cell RNA-seq data. I was thinking of using the RNA-seq data to group cells into metacells, then finding the most highly accessible peaks in those metacells using chromVAR. Would I use the consensus peak set per metacell as input, and the consensus peak set from all cells in all metacells as background? I am still a little confused about how chromVAR works after reading the Nature Methods paper and the online vignette.
Do you have single cell mRNA expression and genome accessibility measured in the same cells? If not, you'll need to think carefully about how you cluster your cells and how you map clusters in one space to those in the other (i.e. how do you decide that a cluster of cells in RNA-seq space is biologically equivalent to some cluster of cells in ATAC-seq space?).
You may want to have a look at the SnapATAC package, which seems to outperform chromVAR at clustering single cells in ATAC-seq space (see this manuscript). Basically, what I would suggest is to first cluster your cells in ATAC-seq space, then call peaks and perform differential accessibility analysis among the clusters, and finally perform your motif analysis on those differentially accessible features.
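To make the last step concrete, here is a minimal sketch of a motif enrichment test on differentially accessible peaks, assuming you already have the peak lists and motif matches. It uses a plain hypergeometric test against the full consensus peak set as background; this is a generic approach, not what chromVAR or SnapATAC do internally. The peak IDs, motif names and the `motif_enrichment` helper are all made up for illustration.

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N peaks from M total, n of which carry the motif."""
    upper = min(n, N)
    total = comb(M, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total

def motif_enrichment(da_peaks, all_peaks, motif_hits):
    """Rank motifs by enrichment in differentially accessible (DA) peaks.

    da_peaks   : peaks that are more accessible in the cluster of interest
    all_peaks  : consensus peak set across all clusters (the background)
    motif_hits : hypothetical mapping motif name -> peaks containing that motif
    """
    background = set(all_peaks)
    da = set(da_peaks) & background
    results = []
    for motif, peaks in motif_hits.items():
        with_motif = set(peaks) & background
        k = len(with_motif & da)  # DA peaks that carry the motif
        p = hypergeom_sf(k, len(background), len(with_motif), len(da))
        results.append((motif, k, p))
    return sorted(results, key=lambda r: r[2])  # smallest p-value first

# Toy data: 10 consensus peaks, 4 of them DA in one cluster.
all_peaks = [f"p{i}" for i in range(1, 11)]
da_peaks = ["p1", "p2", "p3", "p4"]
motif_hits = {
    "GATA1": ["p1", "p2", "p3", "p9"],
    "CTCF": ["p4", "p5", "p6", "p7", "p8"],
}

ranked = motif_enrichment(da_peaks, all_peaks, motif_hits)
for motif, k, p in ranked:
    print(motif, k, round(p, 3))
```

With real data you would run this once per cluster and correct the p-values for multiple testing. Exact binomial coefficients get slow with hundreds of thousands of peaks, so a dedicated statistics library is a better fit at that scale.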
To connect these data to mRNA expression (single cell or otherwise), I would annotate your ATAC-seq peaks with target genes using some sensible joint network of cis-regulatory elements and mRNA expression, or Hi-C data from the relevant cell type.
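One simple way to build such a peak-to-gene network, sketched here under strong assumptions: correlate each peak's accessibility with the expression of nearby genes across matched clusters, keep the well-correlated pairs, then count how many linked genes each TF motif reaches. The matched cluster order, the candidate peak-gene pairs (e.g. from a distance window), the `min_r=0.8` cutoff and all names below are hypothetical choices for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def link_peaks_to_genes(accessibility, expression, candidates, min_r=0.8):
    """Keep candidate peak-gene pairs whose profiles correlate across clusters."""
    links = {}
    for peak, genes in candidates.items():
        for gene in genes:
            if pearson(accessibility[peak], expression[gene]) >= min_r:
                links.setdefault(peak, set()).add(gene)
    return links

def rank_tfs_by_targets(motif_hits, links):
    """Count distinct linked genes per TF motif, largest target set first."""
    targets = {}
    for tf, peaks in motif_hits.items():
        genes = set()
        for peak in peaks:
            genes |= links.get(peak, set())
        targets[tf] = genes
    return sorted(targets.items(), key=lambda kv: len(kv[1]), reverse=True)

# Toy data: values are per-cluster averages, clusters in the same order in both tables.
accessibility = {"peak1": [1, 2, 3, 4], "peak2": [4, 3, 2, 1], "peak3": [1, 1, 5, 5]}
expression = {
    "geneA": [2, 4, 6, 8],
    "geneB": [8, 6, 4, 2],
    "geneC": [0, 1, 9, 10],
    "geneD": [5, 5, 5, 5],
}
candidates = {"peak1": ["geneA", "geneD"], "peak2": ["geneB"], "peak3": ["geneC", "geneA"]}
motif_hits = {"TF_X": ["peak1", "peak3"], "TF_Y": ["peak2"]}

links = link_peaks_to_genes(accessibility, expression, candidates)
ranking = rank_tfs_by_targets(motif_hits, links)
for tf, genes in ranking:
    print(tf, len(genes), sorted(genes))
```

This only works if the clusters really are equivalent between the two assays, which is the mapping problem raised above. Correlation across a handful of clusters is also noisy, so treat the target counts as a rough ranking rather than a definitive list of master regulators.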