ATAC-Seq Data looking like RNA-Seq data...?
Entering edit mode
9 months ago

Hi new user here although I've been lurking trying to find answers to my issue!

I'm relatively new to bioinformatics but have been working on analyzing data from an ATAC-Seq experiments in Arabidopsis.

I'm having issues with my data not piling up around TSS like ATAC-Seq data should.... I've tried everything I can and asked 4 colleagues and no one has any idea what happened to my data.

Here's the basic pipeline for my analysis:

Trimming reads: in=read1.fq.gz in2=read2.fq.gz out=Read1.trimmed.fq.gz out2=Read2.fq.gz ftr=49 tossbrokenreads
Input:                      71468752 reads      10720312800 bases.
FTrimmed:                   71468752 reads (100.00%)    7146875200 bases (66.67%)
Total Removed:              0 reads (0.00%)     7146875200 bases (66.67%)
Result:                     71468752 reads (100.00%)    3573437600 bases (33.33%)


./bowtie -X 1000 -t -p 10 -v 2 -m 3 -y -a  tair10 -1  Read1.trimmed.fq.gz -2  Read2.trimmed.fq.gz  -S aligned.sam > aligned.txt


reads processed: 44774133
reads with at least one reported alignment: 33483782 (74.78%)
reads that failed to align: 11290351 (25.22%)
reads with alignments suppressed due to -m: 2648269 (5.91%)
Reported 41506698 paired-end alignments

Sorted, removed chloroplastic and mitochondrial reads, and then removed PCR duplicates with Samtools

Converted .bam files to .bed files with bamToBed

Shifted reads for Tn5 integration sites:

awk 'BEGIN {OFS = "\t"} ; {if ($6 == "+") print $1, $2 + 4, $2 + 5, $6; else print $1, $3 - 6, $3 - 5, $6}' FILE.rmdup.bed > FILE.Tn5.bed

Called peaks using MACS2:

macs2 callpeak -t FILE.Tn5.bed -f BED -g 150000000 --nomodel --keep-dup all --extsize 150 --shift -75 --qvalue 0.01 -n FILE --outdir /macsoutput

I've included a picture of the TSS graph generated using ChipSeeker in R.

Does anyone have ANY idea why this happened? It looks like what I would expect from RNA Seq data...

enter image description here

sequence ATAC-Seq TSS • 312 views
Entering edit mode

I'm definitely no ATAC-seq expert, but I'm not familiar with the "shifting" step, what is the purpose of this step? I've always had pretty good luck just following this pipeline from ENCODE:

Entering edit mode

The ends of the reads are where Tn5 bound chromatin and inserted sequencing adapters, giving rise to the observed sequences. Thus, sequencing reads flank accessible chromatin. Tn5 binds 9 bp of DNA. That’s why we adjust the coordinates +4 for positive strand reads, and -5 for negative strand reads.


Login before adding your answer.

Traffic: 2111 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6