What does % properly aligned pairs means when using BWA alignment? Why removing -P?
1
0
Entering edit mode
4.9 years ago
mlope921 • 0

Hi! Thank you for taking the time to read this post 😊 I am currently in the process of aligning my trimmed reads to a reference genome. For that, I am using bwa alignment. After the alignment was completed, I ran flagstats (samtools) to get a sense of the % of alignment to the reference genome. This is the command line I used for that:

bwa mem -P -M Gac-HiC_revised_genome_assembly.fa MUI004_output_forward_paired.fq MUI004_output_reverse_paired.fq > bwa_alignment_P_MUI004.sam

And when running flagstats, this was my output

52248219 + 0 in total (QC-passed reads + QC-failed reads)
1105309 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
27363819 + 0 mapped (52.37% : N/A)
51142910 + 0 paired in sequencing
25571455 + 0 read1
25571455 + 0 read2
0 + 0 properly paired (0.00% : N/A)
20761748 + 0 with itself and mate mapped
5496762 + 0 singletons (10.75% : N/A)
3606028 + 0 with mate mapped to a different chr
1105347 + 0 with mate mapped to a different chr (mapQ>=5)*

I saw that other people here have posted about this 0.0% properly paired issue, and they suggested to remove -P out of the command line. Once I removed P, the new output was this:

52248219 + 0 in total (QC-passed reads + QC-failed reads)
1105309 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
37815854 + 0 mapped (72.38% : N/A)
51142910 + 0 paired in sequencing
25571455 + 0 read1
25571455 + 0 read2
32281574 + 0 properly paired (63.12% : N/A)
34482372 + 0 with itself and mate mapped
2228173 + 0 singletons (4.36% : N/A)
1994818 + 0 with mate mapped to a different chr
939713 + 0 with mate mapped to a different chr (mapQ>=5)*

My questions are:

  • What does properly paired means in this context?
  • Why removing “-P” caused a change from 0 to 63.12% ?
  • Are these values “good”? I am aligning trimmed DNA reads from one species to the reference genome of a second species (although they are closely related)… should my percentage of properly aligned paired then be higher? Or my results are within what is expected for a situation such as this one? Is there a command that I could use to increase the number of properly aligned pairs?

Beforehand, I appreciate your help in this regard. I am new to using shell commands and bioinformatics, so any help is greatly appreciated!

alignment bwa stats • 2.1k views
ADD COMMENT
0
Entering edit mode

I did trim the reads using Trimmomatic and It kept over 95% of the pairs, so I don't know if this is due to the trimming. More over, as I mentioned in the second portion of my post, once that I removed -P from the bwa parameters, it did change from 0% to 63.12% of pairs properly aligned ... I guess that a fundamental part of all of this is that I don't quite understand what "properly paired" means in this context

ADD REPLY
0
Entering edit mode

95% of the pairs might mean that your read1 and read2 files are out of sync. Though if lots of your reasd are properly paired, that's likely not your problem. But I think ATpoint has the right answer here.

ADD REPLY
2
Entering edit mode
4.9 years ago
ATpoint 81k

The definition of a proper pair depends on the aligner and is not standardized. For BWA mem I think it means that the two reads face each other in F/R orientation and are aligned within 10kb of each other.

-P            skip pairing; mate rescue performed unless -S also in use

This option disables the paired-end component of bwa mem. THerefore of course you won't get any proper pairs. Do not use this option.

ADD COMMENT

Login before adding your answer.

Traffic: 2802 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6