Question

ATAC-seq bowtie2 alignment and pre-alignment adapter clipping


3.2 years ago

cwwong13 ▴ 40

I am looking at the ENCODE ATAC-seq pipeline and its source code https://github.com/dnanexus-rnd/atac-seq-pipeline

I found that the default for multimapping reads is 4 (which then becomes 5 after the m+1 step described in their code: https://github.com/ENCODE-DCC/atac-seq-pipeline/blob/master/src/encode_task_bowtie2.py).

I also checked the bcbio code, and I think they use no multimapping (which should be the default for bowtie2).

I wonder whether we should set the multimapping to 0 instead?

================================

The second question is whether it is necessary to trim adapters before aligning with bowtie2. Someone suggested that these aligners can do soft clipping, so I am not sure whether we should do the trimming.

Similarly, bcbio's tutorial example YAML file does not include any trimming step.

If we do need to trim the adapter, what adapter sequence should we supply? Is it the index sequence?

What is the best practice when analyzing ATAC-seq data (I am mapping to mm10 and hg38)?

Thank you!

ATAC-seq alignment bowtie2 • 1.5k views


score 1 · Answer 1 · 2021-02-08


3.2 years ago

ATpoint 81k

I can only speak for myself, but I always trim adapters. They can be present in up to (iirc) 20% of reads because many of the fragments are short, so standard sequencing often picks them up. For the adapter sequence, see "A: What sequences should I use for ATAC-Seq adapter trimming using trimmomatic 0.36". I also discard multimappers, as one would need an elaborate way of actually dealing with them during downstream analysis. Just naively dividing them across all possible regions will most likely give plenty of false positives.
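To illustrate what trimming does to a short-fragment read, here is a minimal Python sketch. It assumes the Nextera sequence CTGTCTCTTATACACATCT discussed in this thread and uses exact matching only; the function name `trim_adapter` and the `min_overlap` parameter are made up for illustration. Real trimmers such as Trimmomatic or cutadapt also tolerate mismatches and use base qualities, so treat this as a toy model rather than a replacement for them.

```python
# Assumption: the Nextera transposase adapter sequence mentioned in this thread.
NEXTERA_ADAPTER = "CTGTCTCTTATACACATCT"


def trim_adapter(read, adapter=NEXTERA_ADAPTER, min_overlap=3):
    """Remove a 3' adapter from a read using exact matching only.

    Hypothetical helper for illustration; it does not handle mismatches,
    indels, or base qualities.
    """
    # Case 1: the full adapter occurs in the read (fragment much shorter
    # than the read length). Keep only the bases before it.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]

    # Case 2: the read ends with only the first k bases of the adapter.
    # Try the longest possible overlap first.
    max_len = min(len(adapter) - 1, len(read))
    for k in range(max_len, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]

    # No adapter found: return the read unchanged.
    return read


if __name__ == "__main__":
    insert = "ACGTACGTACGTAAGG"
    print(trim_adapter(insert + NEXTERA_ADAPTER + "CCGAGC"))  # full adapter plus trailing bases
    print(trim_adapter(insert + NEXTERA_ADAPTER[:8]))         # partial adapter at the 3' end
```

Both calls should give back only the genomic insert, which is the part one would want the aligner to see.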



Two follow-up questions:

1. You suggested using CTGTCTCTTATACACATCT as the adapter. But when I look at the "NexteraPE-PE.fa provided by Trimmomatic" in the discussion you pointed me to, I find the following adapter sequences in NexteraPE-PE.fa:

Using Long Clipping Sequence: 'CTGTCTCTTATACACATCTCCGAGCCCACGAGAC'
Using Long Clipping Sequence: 'CTGTCTCTTATACACATCTGACGCTGCCGACGA'

Should we use this "long version" of the adapter sequence, or will CTGTCTCTTATACACATCT alone do the job?

2. You mentioned discarding multimappers. Do you mean you set the bowtie2 -k to 0, or do you suggest discarding them using samtools? Looking at the bowtie2 manual after posting the original question, it seems that we can only know that multimapping is happening when we use the -k option (with a number larger than 1), because bowtie2 will randomly report one aligned region when there is a tie (though I am not sure whether the XS:i: tag will still be present in the default mode).

Thanks!

ADD REPLY • link 3.2 years ago by cwwong13 ▴ 40
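As a rough sketch for the multimapper question above: per the bowtie2 manual, the XS:i tag is written when more than one alignment was found for a read, so comparing it with AS:i is one way to spot ties without -k. The Python below classifies a single SAM record on that basis. The function name, the returned labels, and the `min_mapq=10` cutoff are assumptions for illustration, not a recommendation from this thread; in practice a MAPQ filter such as `samtools view -q` is the more common route.

```python
def classify_alignment(sam_line, min_mapq=10):
    """Classify one SAM record as unmapped, multimapper, low_mapq, or unique.

    Hypothetical helper: the labels and the min_mapq default are
    illustrative assumptions.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])   # column 2: bitwise FLAG
    mapq = int(fields[4])   # column 5: mapping quality

    # FLAG bit 0x4 means the read is unmapped.
    if flag & 0x4:
        return "unmapped"

    # Optional tags start at column 12 and look like NAME:TYPE:VALUE.
    tags = {}
    for tag in fields[11:]:
        name, _type, value = tag.split(":", 2)
        tags[name] = value

    as_score = tags.get("AS")  # score of the reported alignment
    xs_score = tags.get("XS")  # score of the best other alignment, if any

    # A second alignment scoring at least as well as the reported one
    # indicates a tie, i.e. the read placement is ambiguous.
    if as_score is not None and xs_score is not None:
        if int(xs_score) >= int(as_score):
            return "multimapper"

    if mapq < min_mapq:
        return "low_mapq"
    return "unique"


if __name__ == "__main__":
    tie = "r1\t0\tchr1\t100\t1\t50M\t*\t0\t0\tACGT\tIIII\tAS:i:-5\tXS:i:-5"
    print(classify_alignment(tie))
```

The example record has equal AS:i and XS:i values, so the sketch labels it as a multimapper.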