Question: What does % properly aligned pairs means when using BWA alignment? Why removing -P?
0
gravatar for mlope921
5 weeks ago by
mlope9210
mlope9210 wrote:

Hi! Thank you for taking the time to read this post 😊 I am currently in the process of aligning my trimmed reads to a reference genome. For that, I am using bwa alignment. After the alignment was completed, I ran flagstats (samtools) to get a sense of the % of alignment to the reference genome. This is the command line I used for that:

bwa mem -P -M Gac-HiC_revised_genome_assembly.fa MUI004_output_forward_paired.fq MUI004_output_reverse_paired.fq > bwa_alignment_P_MUI004.sam

And when running flagstats, this was my output

52248219 + 0 in total (QC-passed reads + QC-failed reads)
1105309 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
27363819 + 0 mapped (52.37% : N/A)
51142910 + 0 paired in sequencing
25571455 + 0 read1
25571455 + 0 read2
0 + 0 properly paired (0.00% : N/A)
20761748 + 0 with itself and mate mapped
5496762 + 0 singletons (10.75% : N/A)
3606028 + 0 with mate mapped to a different chr
1105347 + 0 with mate mapped to a different chr (mapQ>=5)*

I saw that other people here have posted about this 0.0% properly paired issue, and they suggested to remove -P out of the command line. Once I removed P, the new output was this:

52248219 + 0 in total (QC-passed reads + QC-failed reads)
1105309 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
37815854 + 0 mapped (72.38% : N/A)
51142910 + 0 paired in sequencing
25571455 + 0 read1
25571455 + 0 read2
32281574 + 0 properly paired (63.12% : N/A)
34482372 + 0 with itself and mate mapped
2228173 + 0 singletons (4.36% : N/A)
1994818 + 0 with mate mapped to a different chr
939713 + 0 with mate mapped to a different chr (mapQ>=5)*

My questions are:

  • What does properly paired means in this context?
  • Why removing “-P” caused a change from 0 to 63.12% ?
  • Are these values “good”? I am aligning trimmed DNA reads from one species to the reference genome of a second species (although they are closely related)… should my percentage of properly aligned paired then be higher? Or my results are within what is expected for a situation such as this one? Is there a command that I could use to increase the number of properly aligned pairs?

Beforehand, I appreciate your help in this regard. I am new to using shell commands and bioinformatics, so any help is greatly appreciated!

bwa stats alignment • 146 views
ADD COMMENTlink modified 5 weeks ago by h.mon26k • written 5 weeks ago by mlope9210

I did trim the reads using Trimmomatic and It kept over 95% of the pairs, so I don't know if this is due to the trimming. More over, as I mentioned in the second portion of my post, once that I removed -P from the bwa parameters, it did change from 0% to 63.12% of pairs properly aligned ... I guess that a fundamental part of all of this is that I don't quite understand what "properly paired" means in this context

ADD REPLYlink written 5 weeks ago by mlope9210

95% of the pairs might mean that your read1 and read2 files are out of sync. Though if lots of your reasd are properly paired, that's likely not your problem. But I think ATpoint has the right answer here.

ADD REPLYlink written 5 weeks ago by swbarnes25.8k
0
gravatar for ATpoint
5 weeks ago by
ATpoint17k
Germany
ATpoint17k wrote:

The definition of a proper pair depends on the aligner and is not standardized. For BWA mem I think it means that the two reads face each other in F/R orientation and are aligned within 10kb of each other.

-P            skip pairing; mate rescue performed unless -S also in use

This option disables the paired-end component of bwa mem. THerefore of course you won't get any proper pairs. Do not use this option.

ADD COMMENTlink written 5 weeks ago by ATpoint17k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1252 users visited in the last hour