5.0 years ago
nanoide
Hi there, I'm working with ATAC-seq data and have applied the pyMakeVplot script (https://github.com/jinxu9/ATACseq/blob/master/libs/pyMakeVplot.py) to generate a V-plot centered at TSSs.
So this is what I get:
My question is about the interpretation, specifically the 2 small peaks flanking the enrichment at the TSS, before the profile flattens out. Is it correct to say these point to nucleosomes surrounding the TSSs?
Thanks!
I am confused. This does not look like a V-plot. This is a V-plot:
What you have is a simple profile plot that shows the read counts around the TSS, whereas a V-plot is supposed to show fragment size as a function of distance from a genomic feature (e.g. the TSS). Is this script really from the ATAC-seq author (Buenrostro), or has it been adapted without changing the header lines that indicate the original author? Correct me if I am wrong.
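To make the distinction concrete, here is a minimal sketch of the counting behind a V-plot: each fragment contributes one point at (distance of its midpoint from the feature, fragment length). The function name `vplot_counts`, its parameters and the toy fragments are all hypothetical, strand is ignored, and this is not taken from the linked script.

```python
from collections import Counter

def vplot_counts(fragments, tss, flank=1000, max_len=500):
    """Count fragments by (midpoint offset from TSS, fragment length).

    fragments: iterable of (start, end) half-open coordinates on one chromosome.
    tss: position of the feature on the same chromosome (strand ignored here).
    """
    counts = Counter()
    for start, end in fragments:
        length = end - start
        midpoint = start + length // 2
        offset = midpoint - tss
        # Keep only fragments near the feature and within the size range.
        if abs(offset) <= flank and 0 < length <= max_len:
            counts[(offset, length)] += 1
    return counts

# Toy data: hypothetical fragments, not real ATAC-seq output.
fragments = [(950, 1050), (980, 1020), (1100, 1300), (700, 880), (5000, 5100)]
counts = vplot_counts(fragments, tss=1000)
for (offset, length), n in sorted(counts.items()):
    print(offset, length, n)
```

Plotting offset on the x axis and length on the y axis, with the count as colour, gives the V-plot. Summing over the length axis collapses it into a plain profile plot like the one in the question.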
Thank you for your edit and your clarification. It seems I was indeed mistaken in calling this a V-plot; I guess I was misled by the title and header of the script. Apologies for that. I think in other places and pipelines this is referred to as a "TSS enrichment" plot, with distance on the x axis and insertions/tags on the y axis. I don't know whether the script is from J. Buenrostro; I do not own the GitHub link I provided. I have edited the post title to avoid confusion.
Either way, the profile I'm observing, i.e. enrichment at TSSs and then those smaller peaks, seems to be observed for other eukaryotes as well. Any comment on what these smaller peaks point to?
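For reference, a "TSS enrichment" style profile of the kind described here can be sketched as a count of insertion sites by distance from the TSS. This is only an illustration under simplifying assumptions: both fragment ends are treated as insertion sites, strand and the usual Tn5 offset correction are ignored, and `tss_insertion_profile` and the toy coordinates are hypothetical.

```python
from collections import Counter

def tss_insertion_profile(fragments, tss_sites, flank=1000):
    """Count insertion sites by signed distance from each TSS.

    fragments: iterable of (start, end) half-open coordinates on one chromosome.
    tss_sites: iterable of TSS positions on the same chromosome.
    """
    profile = Counter()
    for start, end in fragments:
        # Treat both ends of the fragment as insertion sites.
        for cut in (start, end - 1):
            for tss in tss_sites:
                offset = cut - tss
                if -flank <= offset <= flank:
                    profile[offset] += 1
    return profile

# Toy data: hypothetical fragments around a single TSS at position 1000.
profile = tss_insertion_profile([(990, 1010), (1000, 1150)], [1000], flank=50)
for offset in sorted(profile):
    print(offset, profile[offset])
```

With real data the x axis would be the offset and the y axis the (usually normalised) count aggregated over all annotated TSSs.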
Thanks
Could indeed be the spacing region between two flanking nucleosomes. The question is what you want to do with that information. Any particular analysis you want to perform?
Thank you for your response! For now just following the first steps of a pipeline and doing quality control. So this was just for descriptive purposes. Thanks for your time and sharing your expertise
You're very welcome. I think the most relevant quality criteria are 1) to check for the nucleosomal pattern in the aligned data (e.g. with CollectInsertSizeMetrics from Picard), and 2) to check signal-to-noise by calculating FRiP (fraction of reads in peaks), which simply measures how many reads overlap called peaks. For a good ATAC-seq sample from human or mouse I can typically call tens of thousands of peaks (for samples made with the most recent OmniATAC protocol, sometimes even around 150k peaks), with FRiP somewhere from above 0.2 up to 0.6 (the latter in very good samples). Also check visually in a genome browser for good separation between peaks and noise.
Thanks for the very useful advice. Regards
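The FRiP check mentioned above boils down to a simple ratio, sketched below. This is a toy illustration rather than a production tool: it assumes one chromosome, one position per read, and half-open peak intervals, and the function name `frip` and the example values are hypothetical.

```python
def frip(read_positions, peaks):
    """Fraction of reads whose position falls inside any peak interval.

    read_positions: list of read positions on one chromosome.
    peaks: list of (start, end) half-open peak intervals on that chromosome.
    """
    if not read_positions:
        return 0.0
    in_peaks = 0
    for pos in read_positions:
        # A read counts once, even if peaks overlap each other.
        if any(start <= pos < end for start, end in peaks):
            in_peaks += 1
    return in_peaks / len(read_positions)

# Toy data: hypothetical read positions and peaks.
reads = [105, 150, 300, 420, 990, 1005, 2000, 2500, 3000, 4000]
peaks = [(100, 200), (400, 450), (1000, 1100)]
score = frip(reads, peaks)
print(score)
```

On real data the linear scan over peaks would be too slow; an interval index or an existing tool that counts reads over a peak file would be the practical choice.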
This is slightly off-topic, but what made you choose this particular ATAC-seq tool for your analysis? It seems quite outdated.
It was used in a pipeline I inherited, and it seems similar to the step that checks enrichment at TSSs in other pipelines, so I just kept using it. In any case, it sounds like a good idea to switch to a more up-to-date one. Any suggestions? Thanks
Ah I see, no worries! Was just generally curious. Like ATpoint recommended, I too would suggest something such as Picard for general QC needs.