ATAC-seq 150bp reads
4.2 years ago
ta_awwad ▴ 340

Hello everyone, I recently performed an ATAC-seq experiment following the regular protocol with double-sided bead clean-up. However, when we aligned the reads with bowtie2, we got a 40% alignment rate for reasons unknown to us. We tried clipping the reads to 75 bp, which increased the alignment rate to 60%. Does anyone have an explanation?

next-gen ATAC-seq ChIP-Seq sequencing • 5.5k views

Have you tried with 50 bp to see if there is further improvement? 150 bp is awfully long for ATAC-seq.


I tried 35 bp and it worked well. The point is that I would like to analyse allele-specific open chromatin, and in order to cover SNPs we decided to re-sequence our samples at 150 bp.


Sounds like trimming adapters with any of the popular tools will do the trick then.

4.2 years ago
ATpoint 81k

Two things:

1) 150 bp is quite long for ATAC-seq. Many fragments (that is, the DNA inserts), mainly those from nucleosome-free regions, are only about 80 bp long, so the reads run into the adapter and you will need to trim them for adapter contamination. You should see strong adapter content when using fastqc for quality control. The adapter sequence to trim away is CTGTCTCTTATACACATCT. I personally use cutadapt, but any trimmer that handles paired-end data will do. Since many reads will end up shorter than the original 150 bp, you might even consider trimming everything to about 75 bp before the actual adapter trimming. The reason is that longer reads map more uniquely, so with a mix of untrimmed 150 bp reads and reads of e.g. 75 bp after adapter removal, you might create a mapping bias towards the longer fragments. I never investigated whether this has an impact on downstream analysis, but trimming to a common length might avoid it.
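
As a quick sanity check alongside fastqc, something like the sketch below could estimate how many reads still carry the adapter. It assumes plain 4-line FASTQ records on stdin and only counts exact, full-length matches of the adapter, so it gives a lower bound (partial adapters at the read end are missed). The function name `adapter_fraction` is made up for illustration.

```shell
ADAPTER="CTGTCTCTTATACACATCT"

# Reads FASTQ on stdin; prints "<reads with adapter> <total reads> <percent>".
# $1: adapter sequence to search for (exact match only).
adapter_fraction() {
  awk -v adapter="$1" '
    NR % 4 == 2 {                          # sequence line of each 4-line record
      total++
      if (index($0, adapter) > 0) hits++
    }
    END {
      if (total == 0) { print "0 0 0.0"; exit }
      printf "%d %d %.1f\n", hits, total, 100 * hits / total
    }'
}

# Usage on the first 100,000 reads of a gzipped file (file name from this thread):
#   gzip -dc ATAC_R1.fastq.gz | head -n 400000 | adapter_fraction "$ADAPTER"
```

Running it before and after trimming should show the percentage dropping to near zero if the trimmer did its job.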

2) Does your reference contain chrM (the mitochondrial genome)? ATAC-seq, especially older protocols such as the original one from 2013, also transposes the mitochondrial genome and can give chrM contamination of up to 80%. The more recent protocols do not do that, and there it should be around 5% in my hands. Still, many reads may go unmapped if chrM is not in the reference. Did your bead purification narrow down the original size distribution? That is, did you do right-sided size exclusion? Did you run a Bioanalyzer, and if so, what was the size of the longest fragments after purification?
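
If chrM is in the reference, the share of mapped reads landing on it can be read off `samtools idxstats`. A minimal sketch, assuming a coordinate-sorted, indexed BAM and a mitochondrial contig named chrM (other references may call it MT or chrMT); `chrm_fraction` is a hypothetical helper name.

```shell
# Reads `samtools idxstats` output on stdin
# (tab-separated columns: name, length, mapped, unmapped)
# and prints the percentage of mapped reads on the mitochondrial contig.
# $1: contig name, defaults to chrM (an assumption, adjust to your reference).
chrm_fraction() {
  awk -F '\t' -v mito="${1:-chrM}" '
    $1 != "*"  { mapped += $3 }            # skip the unplaced "*" line
    $1 == mito { m = $3 }
    END {
      if (mapped > 0) printf "%.1f\n", 100 * m / mapped
      else print "NA"
    }'
}

# Usage (BAM name is a placeholder):
#   samtools idxstats atac.sorted.bam | chrm_fraction chrM
```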

Speaking of protocols, I strongly recommend the recent Omni-ATAC protocol if you do not already use it. It is superior to all other protocols and, if done properly, gives you excellent data quality. I get FRiP scores (fraction of reads in peaks) of up to 60% from both cell lines and freshly FACS-sorted samples (human/mouse).

Edit: These are some typical insert sizes for ATAC-seq. You can see that the actual insert size is often smaller than 150 bp. Also check the original paper; I think it is figure 1.

[Image: typical ATAC-seq insert size distribution]


Many, many thanks, ATpoint.


I'm trying to trim the adapters with Trimmomatic using the following shell command:

$trimmomatic PE   -phred33 -threads 20 ILLUMINACLIP:/adapter.fa:2:40:15    LEADING:5    TRAILING:5   SLIDINGWINDOW:5:20  MINLEN:100 ATAC_R1.fastq.gz ATAC_R2.fastq.gz  -baseout ATAC1_100

But it is not working!


"Not working" is not an error message. What is not working? Did you check adapter content with fastqc before and after trimming? Did you check whether the adapter sequence I posted is in that FASTA file? As I explained, many reads will be short, so a MINLEN of 100 might be too strict.

Simply run this command and then align the reads.

ADAPTER="CTGTCTCTTATACACATCT"
cutadapt -a "${ADAPTER}" -A "${ADAPTER}" -o read_trim_1.fq -p read_trim_2.fq read_1.fq read_2.fq

Trimmomatic gives this message:

Using templated Output files: ATAC1_100_1P ATAC1_100_1U ATAC1_100_2P ATAC1_100_2U
Exception in thread "main" java.lang.RuntimeException: Unknown trimmer: ATAC_R1.fastq.gz
        at org.usadellab.trimmomatic.trim.TrimmerFactory.makeTrimmer(TrimmerFactory.java:73)
        at org.usadellab.trimmomatic.Trimmomatic.createTrimmers(Trimmomatic.java:59)
        at org.usadellab.trimmomatic.TrimmomaticPE.run(TrimmomaticPE.java:552)
        at org.usadellab.trimmomatic.Trimmomatic.main(Trimmomatic.java:80)

I checked the FASTQ files; they still contain the adapter sequence. I will try cutadapt.
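
The "Unknown trimmer: ATAC_R1.fastq.gz" exception suggests Trimmomatic read the FASTQ name as a trimming step: in PE mode the two input files are expected before the list of steps, and in the command above they come after MINLEN. A sketch of the same call with the inputs moved forward, wrapped in a hypothetical function `trim_atac_pe`. It assumes a `trimmomatic` wrapper on the PATH (with the plain jar it would be a `java -jar` call instead) and otherwise keeps the original settings.

```shell
# $1, $2: R1/R2 FASTQ files; $3: output base name; $4: adapter FASTA
trim_atac_pe() {
  # Options first, then the two inputs, then the trimming steps in order.
  # MINLEN:100 is kept from the original command; as pointed out in the
  # reply above, it may discard many of the short ATAC-seq inserts.
  trimmomatic PE -phred33 -threads 20 \
    -baseout "$3" \
    "$1" "$2" \
    "ILLUMINACLIP:$4:2:40:15" \
    LEADING:5 TRAILING:5 SLIDINGWINDOW:5:20 MINLEN:100
}

# Usage with the names from the original command:
#   trim_atac_pe ATAC_R1.fastq.gz ATAC_R2.fastq.gz ATAC1_100 /adapter.fa
```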


Hi ATpoint, thanks for the insightful answer! I plan to send out my sample for sequencing. Do you think sequencing the ATAC library directly at a short read length, such as 35 bp paired-end, would be better? Or should I sequence long 150 bp reads and then clip them shorter with some tool?

Thank you!


We never do longer than 2x75 (shortest PE on HiSeq3000/Nextseq500) or shorter than 2x50 (shortest PE on Novaseq6000). Longer is a waste of resources and shorter only makes unique mapping more difficult. Technically you can do any combination you like with the number of cycles you have available.

4.2 years ago

Hi! I had more or less the same problem last week. This might be due to the default behavior of Trimmomatic when there is adapter contamination (as would be expected to happen in 150 bp reads from ATAC-seq libraries, as explained by ATpoint).

I found the following explanation:

"Trimmomatic's default behaviour is to drop the reverse reads when it trims adapters, so you get forward reads only surviving as a result. The reasoning behind this is that when you read into the adapter sequences it means that the insert is shorter than one of the reads, so the reverse read doesn't add any extra information, it is just the reverse complement of the forward read."

Therefore, all the reads that contain some adapter sequence (I guess about 60% in your case) will be left as unpaired after trimming. Since I'm using BWA, I don't exactly know how Bowtie would handle these unpaired reads by default, but from what you say, it might be possible that they are not mapped.

When you said that you were clipping the sequences to 75 bp, did you mean that you clipped them before trimming? This would reduce the number of sequences with adapter contamination, and therefore would increase the number of pairs with both mates surviving after trimming. If Bowtie2 only considers PE reads for mapping (I don't know this), then this would explain why your mapping improves with the clipping.

It is possible to change the default behavior of Trimmomatic in order to keep the reverse reads after trimming. For this you would need to do the following:

"add TRUE as the last parameter to ILLUMINACLIP"

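Going by the Trimmomatic manual, the ILLUMINACLIP fields are, in order: adapter FASTA, seed mismatches, palindrome clip threshold, simple clip threshold, minimum adapter length, keepBothReads. So keepBothReads is the sixth field, and the minimum adapter length has to be filled in before it. A sketch that reuses the thresholds from the command posted earlier in the thread and the documented default of 8 for the minimum adapter length; the variable names are only for illustration.

```shell
ADAPTER_FA="/adapter.fa"   # path as written in the original command

# fasta : seedMismatches : palindromeClipThreshold : simpleClipThreshold
#       : minAdapterLength : keepBothReads
CLIP_STEP="ILLUMINACLIP:${ADAPTER_FA}:2:40:15:8:TRUE"
echo "$CLIP_STEP"

# It would take the place of the ILLUMINACLIP step in the PE call,
# after the two input files, for example:
#   trimmomatic PE -phred33 -baseout ATAC1_100 \
#     ATAC_R1.fastq.gz ATAC_R2.fastq.gz "$CLIP_STEP" <other steps>
```

Whether keeping the reverse mates changes the mapping rate would still need checking on the actual data.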
4.2 years ago
igor 13k

If your fragments are shorter than your reads, Bowtie would consider those discordant. From the manual:

"Bowtie 2's default behavior is to consider overlapping and containing as being consistent with concordant alignment. By default, dovetailing is considered inconsistent with concordant alignment."

You can set the --dovetail flag to override that behavior.
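
A minimal sketch of a paired-end call with that flag, wrapped in a hypothetical function `align_atac`. The index prefix and file names are placeholders, and all other Bowtie 2 options are left at their defaults.

```shell
# $1: bowtie2 index prefix; $2, $3: trimmed R1/R2 FASTQ; $4: output SAM
align_atac() {
  # --dovetail lets pairs where one mate extends past the other
  # count as concordant alignments.
  bowtie2 --dovetail -x "$1" -1 "$2" -2 "$3" -S "$4"
}

# Usage, with the cutadapt output names from earlier in the thread
# and a hypothetical index prefix:
#   align_atac genome_index read_trim_1.fq read_trim_2.fq atac.sam
```

Comparing the concordant alignment rate in the Bowtie 2 summary with and without the flag should show whether dovetailing pairs explain the low rate.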
