Question

Problem mapping paired-end Illumina reads

5.3 years ago

biostart ▴ 370

Hello, could you please advise on the following:

We have ChIP-seq data with paired-end Illumina reads. For some of the samples, only about 11% of reads could be mapped with Bowtie. When remapping these samples with Bowtie2, up to 85% of reads could be mapped, but the pairing was lost: for most mapped reads, no mapped mate is available. What could have gone wrong, and how can we fix it? Thanks!

ChIP-Seq alignment paired-end • 2.0k views

ADD COMMENT • link updated 5.3 years ago by swbarnes2 14k • written 5.3 years ago by biostart ▴ 370

Try BWA with just one file, or even a subset of your reads. BWA will estimate the insert size from the mapping, and its output may help you understand what went wrong with your Bowtie mapping. If you want to stick with Bowtie / Bowtie2, you can then use the BWA-estimated mean and SD of the insert size as input for Bowtie.
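As a rough illustration of the insert-size estimate described above, the sketch below computes the mean and standard deviation of the template length (TLEN, column 9 of a SAM record) using only first mates flagged as properly paired. It is not what BWA does internally, and the function name `insert_size_stats` is hypothetical. Note that Bowtie and Bowtie2 take fragment-length bounds (`-I`/`-X`) rather than a mean and SD, so the estimates would still have to be turned into a window, for example the mean plus a few SDs for `-X`.

```python
import statistics


def insert_size_stats(sam_lines):
    """Return (mean, sd, n) of |TLEN| over properly paired first mates.

    sam_lines: any iterable of SAM text lines (an open file works).
    """
    sizes = []
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:  # not a complete alignment record
            continue
        flag = int(fields[1])
        tlen = int(fields[8])
        # 0x2 = properly paired, 0x40 = first mate (count each pair once)
        if flag & 0x2 and flag & 0x40 and tlen != 0:
            sizes.append(abs(tlen))
    if len(sizes) < 2:
        raise ValueError("need at least two properly paired reads")
    return statistics.mean(sizes), statistics.stdev(sizes), len(sizes)
```

It could be called as `insert_size_stats(open("subset.sam"))`, where `subset.sam` is an assumed file name for an alignment of a subset of the reads.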

ADD REPLY • link 5.3 years ago by h.mon 35k

I can estimate the average DNA fragment length as ~150 bp based on the reads that aligned successfully.

ADD REPLY • link 5.3 years ago by biostart ▴ 370

score 0 · Answer 1 · 2019-01-06

5.3 years ago

swbarnes2 14k

In my limited experience with Bowtie, the default settings allow only a very narrow range of acceptable insert sizes. Try changing these in your command line to be more generous.
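For reference, both Bowtie and Bowtie2 control this window with `-I`/`--minins` and `-X`/`--maxins`; as far as I know the default maximum is 250 in Bowtie and 500 in Bowtie2. A minimal sketch of building a Bowtie2 paired-end command with a wider window is below. The helper name `bowtie2_paired_cmd`, the index name, and the file names are all placeholders, not values from this thread.

```python
import shlex


def bowtie2_paired_cmd(index, r1, r2, out_sam, min_ins=0, max_ins=1000):
    """Build a bowtie2 paired-end command with an explicit fragment-length window."""
    if min_ins > max_ins:
        raise ValueError("min_ins must not exceed max_ins")
    return [
        "bowtie2",
        "-x", index,          # basename of the bowtie2 index
        "-1", r1,             # mate 1 FASTQ
        "-2", r2,             # mate 2 FASTQ
        "-I", str(min_ins),   # minimum fragment length
        "-X", str(max_ins),   # maximum fragment length
        "-S", out_sam,        # SAM output file
    ]


# Placeholder index and file names, for illustration only.
cmd = bowtie2_paired_cmd("genome_index", "sample_R1.fastq", "sample_R2.fastq", "sample.sam")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` if bowtie2 is on the PATH.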

ADD COMMENT • link 5.3 years ago by swbarnes2 14k

I tried changing to "-X 1000", but it did not help.

ADD REPLY • link 5.3 years ago by biostart ▴ 370

The other possibility is that the reads in your two fastq files are out of sync. Do the files have exactly the same number of lines?
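Equal line counts do not guarantee that the mates are in the same order, so it can also help to compare read names record by record. A minimal sketch follows, assuming plain 4-line FASTQ records (no wrapped sequences) and read names that match once an optional `/1` or `/2` suffix is removed; the function names are hypothetical.

```python
import gzip
from itertools import zip_longest


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file as text."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def read_ids(handle):
    """Yield the read name of each 4-line record, without '@' or a /1, /2 suffix."""
    for i, line in enumerate(handle):
        if i % 4 != 0:  # only the header line of each record
            continue
        fields = line.split()
        if not fields:  # tolerate a blank trailing line
            continue
        name = fields[0][1:]  # drop the leading '@'
        if name.endswith("/1") or name.endswith("/2"):
            name = name[:-2]
        yield name


def first_mismatch(r1_handle, r2_handle):
    """Return None if the files are in sync, else (record_index, id1, id2)."""
    pairs = zip_longest(read_ids(r1_handle), read_ids(r2_handle))
    for n, (a, b) in enumerate(pairs):
        if a != b:  # also catches one file ending before the other
            return n, a, b
    return None
```

Usage would look like `first_mismatch(open_fastq("sample_R1.fastq.gz"), open_fastq("sample_R2.fastq.gz"))`, with the file names being placeholders.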

ADD REPLY • link 5.3 years ago by swbarnes2 14k

The number of reads is the same, but their quality seems to differ. I mapped each of the two paired fastq files separately with Bowtie: for one file, 50% of reads had at least one reported alignment, whereas for the second file only 31% did. I guess this explains how I end up with an even smaller percentage of aligned pairs in paired-end mode. So this is unrelated to the insert size... But the question is how to fix it.
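To put a rough number on that guess: if the two mates aligned independently of each other (a simplifying assumption, since in real data mates from the same fragment are correlated), the expected fraction of pairs with both mates aligned would be the product of the two single-end rates, about 15.5% here. Paired-end mode additionally requires a concordant orientation and distance, so an observed rate somewhat below that would not be surprising.

```python
def expected_pair_rate(rate_r1, rate_r2):
    """Fraction of pairs with both mates aligned, assuming independent mates."""
    return rate_r1 * rate_r2


# Single-end alignment rates reported above for the two files.
print(f"{expected_pair_rate(0.50, 0.31):.1%}")
```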

ADD REPLY • link 5.3 years ago by biostart ▴ 370

Have a look at the quality of read 1 and read 2 with FastQC or a similar tool (e.g. fastp). Do the quality values of the second read drop off markedly along the read? If so, try trimming, or try BWA as suggested above. Bad R2 is pretty common, especially on >150 bp reads from some Illumina sequencers.
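If you want a quick look without running FastQC, the sketch below computes the mean Phred score at each read position and reports where it first falls below a threshold. It assumes 4-line FASTQ records with Phred+33 encoding; the function names and thresholds are illustrative only. A single cut point based on the mean is cruder than what dedicated trimmers do per read, so treat it as a diagnostic rather than a trimming recipe.

```python
def mean_quality_by_position(fastq_lines, offset=33):
    """Return the mean Phred quality at each read position (Phred+33 assumed)."""
    totals, counts = [], []
    for i, line in enumerate(fastq_lines):
        if i % 4 != 3:  # the quality string is the 4th line of each record
            continue
        qual = line.rstrip("\r\n")
        for pos, ch in enumerate(qual):
            if pos == len(totals):  # first read reaching this position
                totals.append(0)
                counts.append(0)
            totals[pos] += ord(ch) - offset
            counts[pos] += 1
    return [t / c for t, c in zip(totals, counts)]


def trim_point(means, threshold=20):
    """Return the first position whose mean quality is below threshold."""
    for pos, q in enumerate(means):
        if q < threshold:
            return pos
    return len(means)  # quality never drops below the threshold
```

Running it on R1 and R2 separately, e.g. `mean_quality_by_position(open("sample_R2.fastq"))` with a placeholder file name, shows whether R2 degrades earlier than R1.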

ADD REPLY • link 5.3 years ago by colindaven 6.3k

The FastQC reports for read 1 and read 2 of the problematic samples look similarly bad; I am posting the images below. Any idea how to correct this?

[Image: FastQC quality score]