I'm trying to reproduce the ATAC analysis here. I have a few questions about alignment and peak-calling.
In the "ATAC-seq alignment" section in Method Details, they use Trimmomatic to remove adapters. However, I don't know the adapter sequence. Would it still be possible for me to remove adapters?
In "ATAC-seq replicate correlation and peak calling", it seems like they did peak calling on samples from each time point separately. However, in "ATAC-seq peak fuzzy clustering", they cluster the peaks by RPKM across the time points (i.e. the same peak had an RPKM value for each time point). How do I do peak calling across multiple time points, so that there is the same set of peaks for each time point?
Thank you for your help -- sorry I'm new to ATAC-seq!
Thank you for the help!
1) That makes sense, I'll try it
How do I do this and have a separate score column for each time point?
You want to use the MACS2 score columns? You can retain these with the bedtools merge option -c (together with -o collapse). This keeps the score columns from any merged regions in a single column, with the scores comma-separated, which you can then parse out afterwards.

To keep track of the source timepoint (some regions may be merged from 3 timepoints, some from a different set of timepoints, and some from only one), I would add a column with the timepoint name to the individual files before merging, and retain this column during merging to use as the keys.

Easier, though, would be to generate the merged regions, then run any counting/scoring over those regions with all timepoints. It probably depends on what exactly you want to calculate. Example of what I sort of have in mind:
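As a rough illustration of the merge-and-collapse idea, here is a minimal pure-Python sketch (not bedtools itself). It tags each peak with its timepoint, merges overlapping or book-ended peaks across all timepoints, and then builds one score column per timepoint. The timepoint labels, coordinates, and scores are made up, the function names are hypothetical, and taking the maximum score when one timepoint contributes several peaks to a merged region is an assumption, as is filling absent timepoints with 0.

```python
from collections import defaultdict


def merge_peaks(peaks_by_timepoint):
    """Merge overlapping/book-ended peaks across timepoints.

    peaks_by_timepoint: dict mapping a timepoint label to a list of
    (chrom, start, end, score) tuples.
    Returns a list of [chrom, start, end, {timepoint: [scores]}].
    """
    # Tag every peak with its timepoint, then sort by chrom and start.
    tagged = sorted(
        (chrom, start, end, score, tp)
        for tp, peaks in peaks_by_timepoint.items()
        for chrom, start, end, score in peaks
    )
    merged = []
    for chrom, start, end, score, tp in tagged:
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start <= last[2]:
            # Overlaps (or touches) the current region: extend it.
            last[2] = max(last[2], end)
        else:
            # Start a new merged region.
            last = [chrom, start, end, defaultdict(list)]
            merged.append(last)
        # Collapse: remember which timepoint each score came from.
        last[3][tp].append(score)
    return merged


def score_table(merged, timepoints, missing=0.0):
    """One row per merged region, one score column per timepoint."""
    rows = []
    for chrom, start, end, scores in merged:
        row = [chrom, start, end]
        for tp in timepoints:
            vals = scores.get(tp)
            # Assumption: keep the max score per timepoint.
            row.append(max(vals) if vals else missing)
        rows.append(row)
    return rows


# Made-up example peaks for three hypothetical timepoints.
peaks = {
    "0h": [("chr1", 100, 200, 5.0), ("chr1", 500, 600, 3.0)],
    "6h": [("chr1", 150, 250, 8.0)],
    "24h": [("chr1", 550, 650, 2.0), ("chr2", 10, 50, 4.0)],
}
table = score_table(merge_peaks(peaks), ["0h", "6h", "24h"])
for row in table:
    print("\t".join(str(x) for x in row))
```

This only carries the per-timepoint peak scores over to the merged regions. If you want something like RPKM per region, counting reads from each timepoint's alignments over the merged regions is the more direct route.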