Question: Bwa-mem alignment on duplicated regions
14 months ago by
annecarol710 wrote:

I have aligned ChIP-seq data using BWA-MEM. The reference genome has a duplicated region on two different chromosomes. When I count the reads aligned to each copy of the region, I get different numbers. If BWA-MEM assigns reads randomly between duplicated regions, shouldn't I get the same number of reads for each copy?

bwa mem -B 40 -O 60 -E 10 -L 50 -M -R '@RG\tID:WTCHG_261086_294\tSM:WTCHG_261086_294' -t 10 /home/plxacb/Fixed_fasta/Ade_fasta/Cen8Ade6.fa WTCHG_261086_294.unmapped_ecoli_1.fastq WTCHG_261086_294.unmapped_ecoli_2.fastq | samtools view -bS - > WTCHG_261086_294.strict_bwamem.bam

samtools view WTCHG_261086_294.strict_bwamem.sorted.bam Not76_Chr2_ref:125168-133320 | wc -l


samtools view WTCHG_261086_294.strict_bwamem.sorted.bam Not76_Chr4_ref:1619693-1627846 | wc -l


modified 14 months ago by harold.smith.tarheel3.7k • written 14 months ago by annecarol710

To be pedantic, your Chr2 region is 1 bp shorter than the Chr4 region (end minus start gives 8152 vs 8153). It shouldn't make much difference, but who knows...

written 14 months ago by dariober8.0k
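The one-base difference noted above can be checked with plain shell arithmetic. This is only a sketch: `region_len` is a hypothetical helper, not part of samtools, and it assumes the usual samtools convention that region strings are 1-based and inclusive, so each length is end - start + 1.

```shell
# Hypothetical helper: inclusive length of a 1-based region (start, end).
region_len() {
  echo $(( $2 - $1 + 1 ))
}

# Coordinates copied from the two samtools view commands in the question.
chr2_len=$(region_len 125168 133320)
chr4_len=$(region_len 1619693 1627846)

echo "chr2: ${chr2_len} bp, chr4: ${chr4_len} bp, difference: $(( chr4_len - chr2_len )) bp"
```

Counted inclusively, the two lengths come out one higher than the end-minus-start figures, but the 1 bp difference between the regions is the same.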

When I remove this extra 1 bp from chr4 the result is 271688, only 8 reads fewer.

written 14 months ago by annecarol710
14 months ago by
harold.smith.tarheel3.7k wrote:

It looks like you're using paired-end reads. BWA MEM disambiguates multi-mapping reads if the mate is uniquely aligned. Your results suggest that there are more mates mapped to chr4 than chr2, which would explain the discrepancy.

written 14 months ago by harold.smith.tarheel3.7k
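One way to probe this explanation is to split the reads in each region by mapping quality. The sketch below assumes that BWA-MEM reports MAPQ 0 for reads that fit two or more locations equally well (its usual behaviour, but worth confirming on your own data). `count_by_mapq` is a hypothetical helper that reads SAM records on standard input.

```shell
# Hypothetical helper: summarise SAM records on stdin by mapping quality.
# Column 5 of a SAM record is MAPQ.
count_by_mapq() {
  awk -F '\t' '
    $1 ~ /^@/ { next }            # skip header lines, if present
    $5 == 0   { ambiguous++ }     # equally good placement exists elsewhere
    $5 >  0   { confident++ }     # placement is preferred, e.g. thanks to the mate
    END { printf "mapq0=%d mapq_gt0=%d\n", ambiguous, confident }
  '
}

# Usage with the files from the question (requires samtools):
#   samtools view WTCHG_261086_294.strict_bwamem.sorted.bam Not76_Chr2_ref:125168-133320 | count_by_mapq
#   samtools view WTCHG_261086_294.strict_bwamem.sorted.bam Not76_Chr4_ref:1619693-1627846 | count_by_mapq
```

If the MAPQ 0 counts are similar for both copies and the difference sits in the MAPQ > 0 reads, that would be consistent with mate-based disambiguation.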

That makes sense, but the regions are 8 kb long and flanked by identical sequences where my precipitated protein doesn't bind, so I would not expect any of the reads to align uniquely.

written 14 months ago by annecarol710
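Whether any mates actually land outside the duplicated block can be checked directly from the alignments. The following is a sketch only: `mate_location` is a hypothetical helper that reads SAM records on standard input and uses the standard SAM columns RNEXT (7) and PNEXT (8) to classify each read by where its mate was placed. Mates on another chromosome, unmapped mates, and unpaired reads are all counted as "elsewhere".

```shell
# Hypothetical helper: for SAM records on stdin, count reads whose mate
# starts inside the given 1-based inclusive interval versus anywhere else.
mate_location() {
  awk -F '\t' -v s="$1" -v e="$2" '
    $1 ~ /^@/ { next }                                  # skip header lines
    $7 == "=" && $8 >= s && $8 <= e { inside++; next }  # mate on same chromosome, within region
    { outside++ }                                       # other chromosome, unmapped, or beyond region
    END { printf "mate_inside=%d mate_elsewhere=%d\n", inside, outside }
  '
}

# Usage with the files from the question (requires samtools):
#   samtools view WTCHG_261086_294.strict_bwamem.sorted.bam Not76_Chr4_ref:1619693-1627846 | mate_location 1619693 1627846
```

A noticeably larger `mate_elsewhere` count for one copy would point to mates anchoring reads there; near-zero counts for both would support the view that the flanks offer no unique anchors.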
