High levels of duplication in ATAC gDNA control?
4 months ago
a_bis ▴ 40

Hi, I've been analysing an ATAC-seq dataset and I'm currently deduplicating the reads (after removing mitochondrial reads). Although I get something on the order of 15% reduction in reads when deduplicating my samples, I've gotten about a 90% reduction in reads after deduplicating the gDNA control. The gDNA control was phenol-chloroform-extracted DNA from the relevant cells, incubated with Tn5 in the same way as the permeabilised nuclei for the experimental samples.

Has anyone had this level of duplication in a gDNA control in an ATAC-seq experiment? Is it expected? Should I even bother deduplicating the gDNA control? Any advice will be much appreciated. Thanks in advance.
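
For reference, the "reduction in reads" quoted above is the fraction of fragments that share their coordinates with another fragment. Below is a minimal Python sketch of that calculation. It assumes paired-end fragments count as duplicates when chromosome, start, end, and strand all match, which only approximates what coordinate-based deduplication tools do. The `Fragment` record and the toy data are made up for illustration.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Fragment(NamedTuple):
    """Hypothetical minimal record for one aligned paired-end fragment."""
    chrom: str
    start: int
    end: int
    strand: str


def duplication_fraction(fragments: Iterable[Fragment]) -> float:
    """Fraction of fragments removed if one copy per coordinate key is kept."""
    counts = Counter(fragments)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    unique = len(counts)
    return 1.0 - unique / total


# Toy data: 10 fragments, 9 of which share identical coordinates.
control = [Fragment("chr1", 100, 250, "+")] * 9 + [Fragment("chr1", 400, 560, "-")]
print(f"{duplication_fraction(control):.0%} of fragments removed")
```

Running the same calculation on the samples and on the control gives directly comparable numbers, provided both were deduplicated with the same criterion.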

atac-seq deduplication • 642 views

This is the first time I've heard of such a control, and we have been doing the assay pretty much since its early days. What is its purpose? Isn't it mainly mitochondrial DNA?


It's to give an estimate of what the background cleavage of protein-free DNA is. Then peaks can be called against that control. May I ask then if you use controls for your ATAC experiments, and if so what they are? Thanks!


We do not run controls, and I have not seen a study that does. Honestly, it makes some sense as a way to assess coverage bias at individual loci. But you would need to sequence the control quite deeply to get coverage across the genome, and standard peak callers like macs2 would then downsample it again towards the depth of the ATAC-seq samples, which is typically only about 25 million reads, so there is imo little point in even doing the controls. After all, we (and probably most people) are interested in differential analysis rather than in "defining" open chromatin, so the controls are not really necessary.
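
To make the downsampling point concrete: when treatment and control depths differ, MACS2 by default linearly scales the larger dataset towards the smaller one (its `--scale-to` option changes this). Here is a rough Python sketch of that arithmetic, using made-up read counts and a hypothetical helper name. It is not MACS2's own code.

```python
def control_scaling_factor(treatment_reads: int, control_reads: int) -> float:
    """Ratio of treatment depth to control depth.

    A value below 1 means the control is the deeper library and its
    signal would be scaled down by this factor to match the treatment.
    """
    if treatment_reads <= 0 or control_reads <= 0:
        raise ValueError("read counts must be positive")
    return treatment_reads / control_reads


# Hypothetical depths: a 25 million read ATAC-seq sample
# and a 200 million read gDNA control.
factor = control_scaling_factor(25_000_000, 200_000_000)
print(factor)  # 0.125
```

With these assumed numbers the control signal is multiplied by 0.125 before being compared with the sample.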


Thanks, good point. I will try peak calling with and without the control file to see how different the outputs are.
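
One way to quantify how different the two outputs are is the Jaccard index of the two peak sets: base pairs covered by both sets divided by base pairs covered by either. Below is a minimal pure-Python sketch, assuming peaks are half-open `(chrom, start, end)` intervals as in BED/narrowPeak files. The function names and toy peaks are hypothetical. On real files, `bedtools jaccard` reports a comparable statistic.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Peak = Tuple[str, int, int]  # (chrom, start, end), half-open coordinates
Merged = Dict[str, List[Tuple[int, int]]]


def merge_by_chrom(peaks: Iterable[Peak]) -> Merged:
    """Sort peaks per chromosome and merge overlapping or touching intervals."""
    grouped: Merged = defaultdict(list)
    for chrom, start, end in peaks:
        grouped[chrom].append((start, end))
    merged: Merged = {}
    for chrom, intervals in grouped.items():
        intervals.sort()
        out: List[Tuple[int, int]] = []
        for start, end in intervals:
            if out and start <= out[-1][1]:
                out[-1] = (out[-1][0], max(out[-1][1], end))
            else:
                out.append((start, end))
        merged[chrom] = out
    return merged


def covered_bp(merged: Merged) -> int:
    """Total number of base pairs covered by a set of merged intervals."""
    return sum(end - start for ivs in merged.values() for start, end in ivs)


def jaccard(peaks_a: Iterable[Peak], peaks_b: Iterable[Peak]) -> float:
    """Shared base pairs divided by base pairs covered by either peak set."""
    a, b = list(peaks_a), list(peaks_b)
    union = covered_bp(merge_by_chrom(a + b))
    if union == 0:
        return 0.0
    # Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
    shared = covered_bp(merge_by_chrom(a)) + covered_bp(merge_by_chrom(b)) - union
    return shared / union


# Toy peak sets standing in for calls made with and without the control.
with_control = [("chr1", 100, 200), ("chr1", 500, 600)]
without_control = [("chr1", 150, 250), ("chr1", 500, 600), ("chr2", 10, 60)]
print(jaccard(with_control, without_control))
```

A value near 1 would mean the control barely changes the calls. A low value would suggest inspecting which peaks appear or disappear.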

"Although I get something on the order of 15% reduction in reads when deduplicating my samples"

"I've gotten about a 90% reduction in reads after deduplicating the gDNA control"

I don't understand. I don't do ATAC-seq, but I think it would help if you explained what the controls are and how you prepared them. If, at the same depth, you are getting much higher duplication levels from supposedly randomly fragmented DNA than from reads that are supposed to be clustered around peaks, your experiment is probably not valid.


I was trying to figure this out, thanks for pointing it out. I believe it has to do with the number of PCR cycles used in library prep, which for a technical reason differed between the samples and the control. I was just gauging what others might think about this result.
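
A rough way to reason about this is the Lander-Waterman style model that underlies common library-size estimates. If a library contains C distinct molecules and N read pairs are sampled from it at random, the expected number of distinct molecules observed is C(1 - e^(-N/C)). The Python sketch below applies that model to made-up numbers. It assumes every molecule is equally likely to be sequenced, which real libraries violate, so treat it as an illustration rather than a diagnosis.

```python
import math


def expected_duplicate_fraction(distinct_molecules: float, read_pairs: float) -> float:
    """Expected fraction of read pairs that are duplicates under uniform sampling.

    Expected distinct molecules observed = C * (1 - exp(-N / C)).
    """
    if distinct_molecules <= 0 or read_pairs <= 0:
        raise ValueError("both arguments must be positive")
    observed = distinct_molecules * (1.0 - math.exp(-read_pairs / distinct_molecules))
    return 1.0 - observed / read_pairs


# Hypothetical libraries sequenced to the same depth of 25 million read pairs.
depth = 25e6
for label, complexity in [("high complexity", 80e6), ("low complexity", 2.5e6)]:
    print(label, round(expected_duplicate_fraction(complexity, depth), 2))
```

With these toy values the model gives roughly 14% duplicates for the high-complexity library and 90% for the low-complexity one. In this simplified picture, duplication depends on sequencing depth relative to the number of distinct molecules that went into amplification. A control that needed more PCR cycles because it started from fewer distinct molecules would be consistent with the numbers in the question.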
