I went through several posts before asking here. I have paired-end ATAC-seq samples and, after the steps below, I always end up with blue peaks in the Picard insert size histogram. I found a similar post on Biostars without a solution, and a Galaxy tutorial (https://galaxyproject.github.io/training-material/topics/epigenetics/tutorials/atac-seq/tutorial.html) that shows the same blue peaks and explains that they are reverse-forward (RF) oriented read pairs. I am not sure whether I am making an error in the pipeline or whether the problem is at the sequencing level. Could someone take a look at the script to see where the mistake might be and whether anything should be done differently? This is my first time working with ATAC-seq data. GRCh38 was used as the reference genome. I see this question on many forums but no solution, so could an ATAC-seq expert help so that all of us learners benefit? Thanks so much. Below are the steps I followed:
# remove the Nextera adapter sequences
cutadapt -q 10 -a CTGTCTCTTATA -A CTGTCTCTTATA --minimum-length 25 -o sample_P1.fastq -p sample_P2.fastq sample_R1_001.fastq sample_R2_001.fastq
#mapping using Bowtie2
bowtie2 -p 4 -q --very-sensitive -x genome -1 sample_P1.fastq -2 sample_P2.fastq -S sample_aln_unsorted.sam
# remove chrM reads, keep mapped proper pairs, and convert SAM to BAM
grep -v 'chrM' sample_aln_unsorted.sam | samtools view -b -h -F 4 -f 0x2 - >sample_aln_unsorted.bam
# sort the BAM by coordinate
java -jar picard.jar SortSam I=sample_aln_unsorted.bam O=sample_aln_sorted.bam SORT_ORDER=coordinate
## mark and remove duplicates
java -jar picard.jar MarkDuplicates INPUT=sample_aln_sorted.bam OUTPUT=sample_aln_sorted.rem.bam M=sample.txt REMOVE_DUPLICATES=true
##index
samtools index sample_aln_sorted.rem.bam
## insert size metrics
java -jar picard.jar CollectInsertSizeMetrics I=sample_aln_sorted.rem.bam O=sample_metrics.txt H=insert_size_sample_histogram.pdf
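
To check how many pairs in the final BAM are actually RF oriented, rather than relying only on the histogram colours, the pair orientations could be tallied directly from the SAM FLAG and TLEN fields. The following is a minimal sketch, not Picard's own code: `count_orientations` is a hypothetical helper name, it assumes standard SAM columns (FLAG in column 2, TLEN in column 9), and it uses a simplified rule that only approximates the classification Picard reports (forward-strand mate with TLEN > 0 counts as FR, otherwise RF; both mates on the same strand count as TANDEM).

```shell
# Hypothetical helper: tally pair orientations from SAM records on stdin.
# Only one mate per pair is counted, so each pair contributes once.
count_orientations() {
  awk '
    /^@/ { next }                                  # skip header lines
    {
      flag = $2; tlen = $9
      paired        = int(flag / 1)    % 2
      unmapped      = int(flag / 4)    % 2
      mate_unmapped = int(flag / 8)    % 2
      rev           = int(flag / 16)   % 2
      mate_rev      = int(flag / 32)   % 2
      first         = int(flag / 64)   % 2
      secondary     = int(flag / 256)  % 2
      supplementary = int(flag / 2048) % 2

      # keep only primary alignments of pairs with both mates mapped
      if (!paired || unmapped || mate_unmapped) next
      if (secondary || supplementary) next

      # same strand for both mates: tandem, counted on the first mate only
      if (rev == mate_rev) { if (first) tandem++; next }

      # opposite strands: look only at the forward-strand mate
      if (rev) next
      if (tlen > 0) fr++; else rf++
    }
    END { printf "FR=%d RF=%d TANDEM=%d\n", fr, rf, tandem }
  '
}

# Usage, assuming samtools is on PATH and the BAM from the steps above exists:
# samtools view sample_aln_sorted.rem.bam | count_orientations
```

If the RF count is a small fraction of FR, the blue peaks are probably a minor population of very short or overlapping fragments rather than a sign that the whole library is mis-oriented.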