Question: ATAC-Seq Data looking like RNA-Seq data...?
0
gravatar for katywithawhy11
5 weeks ago by
Colorado State University
katywithawhy110 wrote:

Hi new user here although I've been lurking trying to find answers to my issue!

I'm relatively new to bioinformatics but have been working on analyzing data from an ATAC-Seq experiments in Arabidopsis.

I'm having issues with my data not piling up around TSS like ATAC-Seq data should.... I've tried everything I can and asked 4 colleagues and no one has any idea what happened to my data.

Here's the basic pipeline for my analysis:

Trimming reads:

bbduk.sh in=read1.fq.gz in2=read2.fq.gz out=Read1.trimmed.fq.gz out2=Read2.fq.gz ftr=49 tossbrokenreads
Result:
Input:                      71468752 reads      10720312800 bases.
FTrimmed:                   71468752 reads (100.00%)    7146875200 bases (66.67%)
Total Removed:              0 reads (0.00%)     7146875200 bases (66.67%)
Result:                     71468752 reads (100.00%)    3573437600 bases (33.33%)

Alignment:

./bowtie -X 1000 -t -p 10 -v 2 -m 3 -y -a  tair10 -1  Read1.trimmed.fq.gz -2  Read2.trimmed.fq.gz  -S aligned.sam > aligned.txt

Result:

reads processed: 44774133
reads with at least one reported alignment: 33483782 (74.78%)
reads that failed to align: 11290351 (25.22%)
reads with alignments suppressed due to -m: 2648269 (5.91%)
Reported 41506698 paired-end alignments

Sorted, removed chloroplastic and mitochondrial reads, and then removed PCR duplicates with Samtools

Converted .bam files to .bed files with bamToBed

Shifted reads for Tn5 integration sites:

awk 'BEGIN {OFS = "\t"} ; {if ($6 == "+") print $1, $2 + 4, $2 + 5, $6; else print $1, $3 - 6, $3 - 5, $6}' FILE.rmdup.bed > FILE.Tn5.bed

Called peaks using MACS2:

macs2 callpeak -t FILE.Tn5.bed -f BED -g 150000000 --nomodel --keep-dup all --extsize 150 --shift -75 --qvalue 0.01 -n FILE --outdir /macsoutput

I've included a picture of the TSS graph generated using ChipSeeker in R.

Does anyone have ANY idea why this happened? It looks like what I would expect from RNA Seq data...

enter image description here

https://imgur.com/Mdp0XU2

tss atac-seq sequence • 104 views
ADD COMMENTlink modified 5 weeks ago by science_lizard0 • written 5 weeks ago by katywithawhy110

I'm definitely no ATAC-seq expert, but I'm not familiar with the "shifting" step, what is the purpose of this step? I've always had pretty good luck just following this pipeline from ENCODE:

https://www.encodeproject.org/documents/0eb389f9-d23d-4053-b25b-1e2826ee5a86/@@download/attachment/ATACpipelineV7.pdf

ADD REPLYlink written 5 weeks ago by science_lizard0

The ends of the reads are where Tn5 bound chromatin and inserted sequencing adapters, giving rise to the observed sequences. Thus, sequencing reads flank accessible chromatin. Tn5 binds 9 bp of DNA. That’s why we adjust the coordinates +4 for positive strand reads, and -5 for negative strand reads.

ADD REPLYlink written 5 weeks ago by katywithawhy110
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 822 users visited in the last hour