Question: bwa mem fastq order
4 months ago, guffrann wrote:

Hello everyone,

I have recently started using bwa. While using it, I tried changing the order of the FASTQ files. So instead of

R1.fastq R2.fastq > result1.sam

I used

R2.fastq R1.fastq > result2.sam

The VCF files generated from those SAM files differed from each other. The differences were small, but I am still wondering why the results differ at all.

Thank You

4 months ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

Don't do that. BWA expects the FASTQ files in the order 1st-in-pair FASTQ, then 2nd-in-pair FASTQ.

If you switch the FASTQ files, your reads will still be mapped, but the read pairs won't be flagged as 'proper pair' anymore. See: What Does The "Proper Pair" Bitwise Flag Mean In A Sam File?

I was wrong, please don't upvote. See ATpoint's comment below.


I do not agree with this. Whether reads are properly paired is not influenced by the order in which the two mate files are given. See, e.g., a selection of 100k human reads:

## bwa mem (...) test1.fq test2.fq | samtools flagstat -
200104 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
104 + 0 supplementary
0 + 0 duplicates
198498 + 0 mapped (99.20% : N/A)
200000 + 0 paired in sequencing
100000 + 0 read1
100000 + 0 read2
196240 + 0 properly paired (98.12% : N/A)
197768 + 0 with itself and mate mapped
626 + 0 singletons (0.31% : N/A)
546 + 0 with mate mapped to a different chr
352 + 0 with mate mapped to a different chr (mapQ>=5)

## bwa mem (...) test2.fq test1.fq | samtools flagstat -
200102 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
102 + 0 supplementary
0 + 0 duplicates
198496 + 0 mapped (99.20% : N/A)
200000 + 0 paired in sequencing
100000 + 0 read1
100000 + 0 read2
196240 + 0 properly paired (98.12% : N/A)
197768 + 0 with itself and mate mapped
626 + 0 singletons (0.31% : N/A)
560 + 0 with mate mapped to a different chr
351 + 0 with mate mapped to a different chr (mapQ>=5)
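
The two flagstat reports above can also be compared programmatically rather than by eye. The following is a minimal sketch, not part of the original comment: the helper names (`parse_flagstat`, `diff_flagstat`) are made up for illustration, and the parser assumes the plain-text `samtools flagstat` layout exactly as pasted above.

```python
import re

# Each flagstat line looks like "<passed> + <failed> <label>".
LINE = re.compile(r"^(\d+) \+ (\d+) (.+)$")
# Trailing percentage annotation such as " (99.20% : N/A)".
PERCENT = re.compile(r"\s*\([^()]*%[^()]*\)$")

# Reports copied from the comment above (R1 first, then R2 first).
RUN_1 = """
200104 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
104 + 0 supplementary
0 + 0 duplicates
198498 + 0 mapped (99.20% : N/A)
200000 + 0 paired in sequencing
100000 + 0 read1
100000 + 0 read2
196240 + 0 properly paired (98.12% : N/A)
197768 + 0 with itself and mate mapped
626 + 0 singletons (0.31% : N/A)
546 + 0 with mate mapped to a different chr
352 + 0 with mate mapped to a different chr (mapQ>=5)
"""

RUN_2 = """
200102 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
102 + 0 supplementary
0 + 0 duplicates
198496 + 0 mapped (99.20% : N/A)
200000 + 0 paired in sequencing
100000 + 0 read1
100000 + 0 read2
196240 + 0 properly paired (98.12% : N/A)
197768 + 0 with itself and mate mapped
626 + 0 singletons (0.31% : N/A)
560 + 0 with mate mapped to a different chr
351 + 0 with mate mapped to a different chr (mapQ>=5)
"""


def parse_flagstat(text):
    """Map each flagstat label to its QC-passed count."""
    counts = {}
    for line in text.strip().splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # skip anything that is not a count line
        label = PERCENT.sub("", match.group(3))
        counts[label] = int(match.group(1))
    return counts


def diff_flagstat(first, second):
    """Return {label: (count_in_first, count_in_second)} for differing labels."""
    a, b = parse_flagstat(first), parse_flagstat(second)
    return {k: (a[k], b.get(k)) for k in a if a[k] != b.get(k)}


if __name__ == "__main__":
    for label, (x, y) in diff_flagstat(RUN_1, RUN_2).items():
        print(f"{label}: {x} vs {y}")
```

On these two reports the properly-paired count does not show up in the diff; only the totals, supplementary, mapped, and different-chromosome counts differ slightly.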

FASTQ files carry no first/second-in-pair information; that is assigned after alignment. Switching the order should only change the strand information. Beyond that, the OP should give details about the variant calling: please provide the full command line. The differences may come from low-quality alignments and/or low-complexity regions.
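
To make the flag argument concrete, here is a small sketch that decodes SAM FLAG values and shows what relabelling the mates does to them. It is an illustration only: it assumes the aligner labels reads from the first input file as read1 (0x40) and reads from the second as read2 (0x80), and that the alignments themselves stay the same. The flag values 99 and 147 are textbook examples of a proper pair, not values taken from the data above, and the function names are hypothetical.

```python
# FLAG bits as defined in the SAM specification (subset relevant here).
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
}


def decode_flag(flag):
    """Return the set of bit names that are set in a SAM FLAG value."""
    return {name for bit, name in FLAG_BITS.items() if flag & bit}


def swap_mate_label(flag):
    """Swap the read1/read2 bits of a paired read, leaving all other bits alone.

    Assumption: this mimics what happens to an otherwise identical alignment
    when the two FASTQ files are given in the opposite order.
    """
    is_read1 = bool(flag & 0x40)
    is_read2 = bool(flag & 0x80)
    if flag & 0x1 and is_read1 != is_read2:
        return flag ^ 0xC0  # toggle both 0x40 and 0x80
    return flag


if __name__ == "__main__":
    for flag in (99, 147):
        swapped = swap_mate_label(flag)
        print(flag, sorted(decode_flag(flag)))
        print(swapped, sorted(decode_flag(swapped)))
```

Under these assumptions the proper-pair bit (0x2) survives the swap; what changes is which mate is called read1, and therefore which strand "read1" is reported on.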

Edit: I would still recommend that you keep the correct FASTQ order at all times to avoid any potential issues. There is no reason to mess around with this.

Written 4 months ago (modified 4 months ago) by ATpoint

Thank you so much. Can I ask what "OP" means? And which command line do you want me to provide?

Written 4 months ago by guffrann

First, thanks a lot for your quick response. I'll try to learn about proper pairs. I have read a little about them already, but do you know how they change the outcome?

Written 4 months ago by guffrann