I am struggling with the alignment quality of my mapped ChIP-seq data. The topic has been discussed before, but only for paired-end data, and mine is single-end.
I am using public single-end ChIP-seq data from GEO (GSE55062) for H3K27ac, IgG and some other targets. The corresponding paper does not mention any trimming; its ChIP-seq analysis starts directly with “reads were aligned to the HG19 reference genome using Bowtie2 with all default settings.” Because FastQC on the downloaded raw FASTQ files showed only moderate sequence quality scores, I decided to trim the reads before mapping (if I skip this step, the alignment rates are around 1%):
java -jar trimmomatic-0.36.jar SE -threads 4 -phred33 IgG.fastq IgG-trimmed.fastq TRAILING:25 SLIDINGWINDOW:4:25
./bowtie2 -U IgG.fastq -x index/hg19 -p 6 -S IgG-mapped.sam
29853240 reads; of these:
29853240 (100.00%) were unpaired; of these:
17350252 (58.12%) aligned 0 times
7188392 (24.08%) aligned exactly 1 time
5314596 (17.80%) aligned >1 times
41.88% overall alignment rate
However, I only get a fairly poor overall alignment rate of 42% for IgG and 21% for H3K27ac, and the rates for my other files are in the same range. Is there a way to increase the alignment rate? Did I miss anything important in my steps that leads to these low rates? Do I need an additional quality-improving step before aligning? The paper does not mention any quality control before mapping, but it reports alignment rates above 50%.
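For reference, the overall rate in the Bowtie2 summary is just the reads aligned exactly once plus the reads aligned more than once, divided by the total. A minimal sketch that recomputes it from the IgG counts pasted above (the variable names are only for illustration and are not part of any tool):

```shell
# Counts copied from the Bowtie2 summary for IgG shown above
total=29853240    # all unpaired reads
unique=7188392    # aligned exactly 1 time
multi=5314596     # aligned >1 times

# Overall alignment rate = (unique + multi) / total, as a percentage
rate=$(awk -v u="$unique" -v m="$multi" -v t="$total" \
    'BEGIN { printf "%.2f", 100 * (u + m) / t }')

echo "overall alignment rate: ${rate}%"
```

This only confirms that the reported percentage follows from the three counts; it does not say anything about why so many reads fail to align.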
Thanks for your help,