Hi! I'm trying to do the alignment for an ATAC-seq experiment using bowtie2. We did paired-end 100 pb sequencing. By checking with FastQC, of course I have adapter contamination. So I started trimming the reads as follows:
TrimmomaticPE $(DATAPATH)$(FILE1) $(DATAPATH)$(FILE2) $(SCRATCH_JOBDIR)/$(NAME)_output1_paired_trimmed.fastq $(SCRATCH_JOBDIR)/$(NAME)_output1_unpaired_trimmed.fastq $(SCRATCH_JOBDIR)/$(NAME)_output2_paired_trimmed.fastq $(SCRATCH_JOBDIR)/$(NAME)_output2_unpaired_trimmed.fastq ILLUMINACLIP:/home/ecorrales/Programs/Trimmomatic-0.36/adapters/NexteraPE-PE.fa:2:30:10:8:TRUE TRAILING:10 MINLEN:25
Afterwards I don't see any longer adapter contamination, and only few (0.11%) of the reads are removed. Then I proceeded to do the mapping with bowtie2:
bowtie2 -X 2000 --very-sensitive -x $(GENOMEDIR) -1 $(SCRATCH_JOBDIR)/$(NAME)_output1_paired_trimmed.fastq -2 $(SCRATCH_JOBDIR)/$(NAME)_output2_paired_trimmed.fastq -S $(SCRATCH_JOBDIR)/$(NAME)_output.sam
I then checked the insert size distribution and there were no inserts smaller than 100 bp. I assumed that this could be related to those fragments where the insert size was smaller than the read length. So I tried trimming the reads to different lengths before mapping. Indeed all the inserts smaller than the length that I trimmed to were not mapped.
In the plot below I show the insert size distribution after trimming the reads to 50 bp or 75 bp (or 100 bp for no trimming):
I thought that adding the --dovetail parameter to the mapping would do the trick, but nothing changed. Is there any other parameter that I could be missing?