I'm analyzing a dataset with enzymatic methyl-seq data (EM-seq).
A colleague gave me the data already pre-processed, and I asked her how to proceed with the statistical analysis. Given the results we obtained, my boss is now asking me to confirm the analysis plan.
The package used (DSS) falls back on a t-test, which has two drawbacks:
- It does not allow creating a "volcano plot" (plotting the statistic against the corresponding p-value just gives a continuous line).
- It doesn't "just" use M-values as my boss is used to (so raw heatmaps or similar approaches are not possible).
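For reference, by M-value I mean the logit-style transform of the methylation proportion that is common for arrays. A sketch of how I understand it would carry over to sequencing counts, where $m$ and $u$ are the methylated and unmethylated read counts at a CpG, $\beta$ is the methylation proportion, and the small offset $\alpha$ is my own assumption to avoid taking the log of zero:

```latex
\beta = \frac{m}{m + u},
\qquad
M = \log_2\!\left(\frac{\beta}{1 - \beta}\right)
  \approx \log_2\!\left(\frac{m + \alpha}{u + \alpha}\right)
```

With $\alpha = 0$ the two expressions for $M$ are identical. Whether such a transform is appropriate for low-coverage sites in EM-seq is part of what I am unsure about.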
I found several Bioconductor packages for methylation analysis. Most of them seem designed for arrays, and I am not sure whether they have been adapted to EM-seq (the way limma was designed for microarrays but was adapted to RNA-seq via the voom approach).
Other packages are for whole-genome bisulfite sequencing (WGBS) but seem to be used for arrays too.
Searching the literature, I found relatively few papers that use EM-seq, and they give very little detail on how the data were analyzed.
What are the current recommended practices for analyzing EM-seq data? I would appreciate pointers to reviews comparing analysis methods for EM-seq and/or WGBS. If there are no such reviews, any good, recent paper using these methods would be fine too (bonus points if it is in the context of viral infections).
Many thanks for the links. The dataset I got is already in the DSS object for R processing. I'll check the methylKit package. I wasn't sure whether these packages make assumptions that do not hold for EM-seq.