Hi, I'm totally new here and totally new to bioinformatics (I technically started learning it last week).

The story goes like this: I got my ATAC-seq FASTQ data last Monday and started turning these FASTQ files into peaks. I learned how to trim, align, and visualize my data, plus a little QC afterward. I used Trimmomatic on my university's Galaxy (ILLUMINACLIP with the Nextera paired-end adapters) to trim my FASTQ files and got 4 output files: 2 paired and 2 unpaired. I aligned only the paired FASTQ files with Bowtie2 (-X 2000) and got a mapping rate of 90%. I converted the BAM files (the default Bowtie2 output in our Galaxy) into bedGraph and then TDF for visualization in IGV. I plotted the distribution of reads around the TSSs of my annotated genome and saw strong enrichment near the TSS.

OK, then a weird thing happened. I plotted the insert size distribution using Picard and got this plot:
It appears that I lost all the inserts smaller than 120 bp, which are actually the nucleosome-free regions that I need most. Then I guessed that some of my data must not have been mapped, so I went back to my FASTQ files and found that the unpaired files generated by Trimmomatic are huge. For example, each of the paired files is 8 GB, while one unpaired R1 file is 4.5 GB and the other is 10 MB. I wonder whether my nucleosome-free regions just lie in these unpaired data, and how can I combine the unpaired data with the paired data generated by Trimmomatic?
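
To test this guess, I think I could first look at the read lengths in the big unpaired R1 file. Below is a minimal Python sketch of that check, not something I have run on my real data. The function names and the file name in the usage comment are made up for illustration, and it assumes standard 4-line FASTQ records (plain or gzipped).

```python
import gzip
from collections import Counter


def read_length_counts(fastq_path):
    """Count how many reads of each length a FASTQ file contains.

    Assumes every record is exactly 4 lines (header, sequence, '+', quality).
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    counts = Counter()
    with opener(fastq_path, "rt") as handle:
        for line_index, line in enumerate(handle):
            # The sequence is the second line of every 4-line record.
            if line_index % 4 == 1:
                counts[len(line.rstrip("\r\n"))] += 1
    return counts


def fraction_shorter_than(counts, cutoff):
    """Return the fraction of reads whose length is below `cutoff` bases."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    short = sum(n for length, n in counts.items() if length < cutoff)
    return short / total


# Hypothetical usage (the file name is an assumption, not my real file):
# counts = read_length_counts("sample_R1_unpaired.fastq.gz")
# print(fraction_shorter_than(counts, 120))
```

If most of the unpaired R1 reads turn out to be much shorter than the sequenced read length, that would at least be consistent with short inserts ending up there after adapter trimming, though it would not prove it.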