Question: High PCR duplication rate in ATAC-seq library
7 weeks ago by
ttsutsui1028 wrote:

Dear all,

I am new to ATAC-seq. I have tried ATAC-seq on THP-1 cells (50,000 cells per sample), exactly following the Greenleaf lab protocol.

The prepared libraries showed a ladder-like pattern on the Bioanalyzer. The first peak is at ~180 bp and the second at ~360 bp... Because of this, I thought my library preparation worked fine. But in the end, I got around 40-50% PCR duplicates in each sample.

I think this is due to a problem with my library preparation. So far, I am planning to 1) titrate the cell number (5,000, 15,000, 50,000 and 150,000 cells per sample); and 2) reduce the number of additional PCR cycles.

Does anyone have any suggestions for my next library preparation?


7 weeks ago by
ATpoint wrote:

While not truly bioinformatics-related, here are my thoughts, having done ATAC-seq in both primary cells and cell lines from human and mouse: Is this duplication rate with or without reads aligned to chrM included? If included, this result is normal and expected, as mitochondrial DNA gets tagmented during the library prep as well (chrM is only about 17 kb, so standard Illumina sequencing gives very high coverage of a tiny genome, and therefore many duplicates).

A simple yet powerful addition to the original protocol is to add Tween-20 at 0.1% to both the lysis and the tagmentation buffer. The lysis buffer is then 10 mM NaCl, 10 mM Tris, 3 mM MgCl2, 0.1% NP-40 and 0.1% Tween-20, and the tagmentation mix is 25 µl tagment buffer, 5 µl 1% Tween-20 and 2.5 µl transposase, filled up to 50 µl with water. Doing this, we typically reduce the mtDNA percentage from around 50-80% in cell lines to about 15-20% without affecting library quality, and even increase the signal-to-noise ratio. A reference for this modification to the standard protocol is here.
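To double-check the tagmentation recipe above, here is a small sketch of the arithmetic. It assumes the final reaction volume is 50 µl, with water making up the remainder. The helper names `final_pct` and `tagment_mix` and the optional overage factor are made up for illustration and are not part of any protocol.

```shell
# Final concentration (%) after dilution: stock_pct * stock_vol / total_vol.
final_pct() {
  awk -v c="$1" -v v="$2" -v t="$3" 'BEGIN { printf "%.2f\n", c * v / t }'
}

# Volumes (in µl) for n tagmentation reactions of 50 µl each.
# Second argument is an optional overage factor (defaults to 1, i.e. none).
tagment_mix() {
  awk -v n="$1" -v f="${2:-1}" 'BEGIN {
    s = n * f
    printf "tagment buffer\t%.1f\n", 25.0 * s
    printf "1%% Tween-20\t%.1f\n",   5.0 * s
    printf "transposase\t%.1f\n",    2.5 * s
    # Water fills the rest of the assumed 50 µl reaction volume.
    printf "water\t%.1f\n",          (50 - 25 - 5 - 2.5) * s
  }'
}

final_pct 1 5 50    # Tween-20 concentration in the tagmentation mix
tagment_mix 4 1.1   # four reactions with 10% overage (overage is an assumption)
```

The first call confirms that 5 µl of a 1% stock in 50 µl gives the 0.1% Tween-20 mentioned above.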

Check the duplication rate in the files without chrM reads (hope you had chrM in your reference genome index!). Possible code:

# in.bam must be coordinate-sorted and indexed (samtools index in.bam); -b forces BAM output.
samtools idxstats in.bam | cut -f 1 | grep -vx 'chrM' | xargs samtools view -b -o without_chrM.bam in.bam

% of Mitochondrial reads can be checked with:

function mtDNA {

  # Column 3 of `samtools idxstats` is the number of mapped reads per contig.
  mtReads=$(samtools idxstats "$1" | awk '$1 == "chrM" {print $3}')
  totalReads=$(samtools idxstats "$1" | awk '{SUM += $3} END {print SUM}')
  echo '[mtDNA Content]:' $(bc <<< "scale=2;100*$mtReads/$totalReads")'%'

}; export -f mtDNA
mtDNA in.bam
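In the same spirit, the duplication rate of the chrM-free file can be estimated by counting reads that carry the duplicate flag (0x400, i.e. 1024). This is only a sketch: it assumes duplicates have already been marked (not removed) by a tool such as Picard MarkDuplicates or samtools markdup, and the function names `dup_percent` and `dup_rate` are made up for illustration.

```shell
# Percentage of duplicates given two counts: duplicate reads and total reads.
dup_percent() {
  awk -v dup="$1" -v total="$2" 'BEGIN {
    if (total + 0 == 0) { print "NA"; exit }
    printf "%.2f\n", 100 * dup / total
  }'
}

# Duplication rate of a duplicate-marked BAM.
# -F 2308 skips unmapped, secondary and supplementary alignments;
# -f 1024 keeps only reads flagged as duplicates; -c prints the count.
dup_rate() {
  local bam=$1
  local total dup
  total=$(samtools view -c -F 2308 "$bam")
  dup=$(samtools view -c -F 2308 -f 1024 "$bam")
  echo "[Duplication rate]: $(dup_percent "$dup" "$total")%"
}
```

Running `dup_rate in.bam` and `dup_rate without_chrM.bam` side by side shows how much of the duplication comes from mitochondrial reads.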

As for your proposed modifications, I do not recommend any of them. ATAC-seq, if done properly, is highly reliable and in our hands always works perfectly fine, provided the cells are in good condition and viable, without a large percentage of dead cells. That should not be an issue for cell lines. Experimenting with cell numbers and the like is not necessary, unless you work with an organism with quite different properties such as flies or worms, as the standard numbers have been extensively tested and validated. We did ATAC-seq in THP-1 cells a while back (unpublished), and both the standard protocol and the one I proposed above work perfectly fine. I recommend the one with Tween-20. We routinely do 11 PCR cycles for all libraries. Hope that helps.


Thank you for your suggestion.

I had forgotten to think about the number of mitochondria per cell. I was so stupid! Actually, the mtDNA rates were around 30-45% in my libraries. I will try removing the mtDNA reads from my fastq to see how mtDNA affects the PCR duplicate ratio.

In addition, thank you for your advice on the new protocol. This is the first time I have heard about this modification. Next time I will compare the Tween-20 method and the regular method.

Again, thank you very much.

written 6 weeks ago by ttsutsui1028

The Tween-method is pretty much the standard by now (or at least it should be IMHO). There are even modifications of it, see "OmniATAC" for example.

written 6 weeks ago by ATpoint