Question: Using paired-end data as single-end for mapping with bwa mem. Good, bad or ugly?
resug wrote, 7 weeks ago:

Hi Biostars,

I am trying to align my paired-end reads to my assembly with bwa mem (to be used for polishing with Racon afterwards). However, due to downstream application requirements (Racon), my paired-end reads have to have the same ID, which bwa does not accept on its paired-end mode. So, in order to use all my reads, I treated my paired-end reads as single reads for alignment with bwa mem (by deleting the ID content after the space in the '@' header line and merging both FASTQ files).

Now I am wondering whether this approach was a good call. What, if anything, would be problematic here? How significant would the difference in alignment quality be when the reads are treated this way rather than properly as paired-end? Could this alignment cause spurious polishing later when using Racon?

I would very much appreciate your thoughts. Thanks!

GenoMax (United States) wrote, 7 weeks ago:

my paired-end reads have to have the same ID, which bwa does not accept on its paired-end mode.

That should be easy to change with reformat.sh from the BBMap suite.

reformat.sh in1=your_R1.fq in2=your_R2.fq out1=Fixed_R1.fq out2=Fixed_R2.fq addcolon=t

addcolon=t              Append ' 1:' and ' 2:' to read names, if not already present.  Please include the flag 'int=t' if the reads are interleaved.

Thanks, GenoMax. For Racon, no two reads should have the same identifier up to the first whitespace, so Racon would accept this happily. However, BWA-mem would not accept it because it requires the identifiers to be the same until the first whitespace in PE mode. This is the discrepancy I am dealing with when using Racon for polishing with a SAM file generated by BWA-mem. So, to be compatible, I ran my PE reads through BWA-mem as single reads (though I am not sure how good this alignment is; maybe it's just fine). Alternatively, it would be great to know how to run BWA-mem in PE mode with identifiers that differ before the first whitespace. Thanks again.

Reply by resug, written 7 weeks ago
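
The two naming constraints described in this reply can be checked before running either tool. The following is only a rough sketch, assuming plain 4-line FASTQ records; the helper names `fastq_ids`, `ids_unique` and `mates_match` are made up for illustration and are not part of bwa, Racon or BBMap. IDs are compared verbatim, so any special handling a tool applies to suffixes such as /1 and /2 is not modelled here.

```shell
# Sketch only: assumes 4-line FASTQ records (no wrapped sequence lines).

# Print each record's ID: the header line without '@', cut at the first whitespace.
fastq_ids() {
    awk 'FNR % 4 == 1 { id = $1; sub(/^@/, "", id); print id }' "$@"
}

# Uniqueness constraint (as described for Racon in the reply above):
# prints any ID that occurs more than once and returns non-zero if one is found.
ids_unique() {
    ! fastq_ids "$@" | sort | uniq -d | grep .
}

# Pairing constraint (as described for bwa mem PE mode in the reply above):
# record i of the first file and record i of the second file must carry the same ID.
# Keeps all IDs of the first file in memory, so it is meant for spot checks.
mates_match() {
    awk 'FNR % 4 != 1 { next }
         { id = $1; sub(/^@/, "", id) }
         NR == FNR { r1[FNR] = id; n1++; next }
         { n2++; if (r1[FNR] != id) bad = 1 }
         END { exit (bad || n1 != n2) }' "$1" "$2"
}
```

With the placeholder file names used elsewhere in this thread, `mates_match your_R1.fq your_R2.fq && echo "mates agree"` checks the pairing, and `ids_unique your_R1.fq your_R2.fq || echo "duplicate IDs"` shows which names would collide once both files are merged.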

However BWA-mem would not accept it because it requires the identifiers to be the same until the first whitespace in PE.

What addcolon= does is add ' 1:' or ' 2:' (the start of a standard Illumina-style 1:N:0 comment) after the first whitespace. So these reads should work for both.

You could also do the following as an alternative to addcolon=. This will create old-style Illumina read headers.

addslash=t              Append ' /1' and ' /2' to read names, if not already present.

This should make the read names unique without relying on the whitespace.

Reply by GenoMax, written 7 weeks ago (modified 7 weeks ago)
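
For illustration, the idea of making mate names differ without depending on whitespace can also be sketched in plain awk. This is not how reformat.sh works internally and is no substitute for it: `tag_mate` is a hypothetical helper, it assumes 4-line FASTQ records, and it attaches the suffix directly to the ID token (before any whitespace). Whether bwa mem and Racon then treat the names the way you need is something to verify on a small subset first.

```shell
# Sketch only: tag_mate is a hypothetical helper, not part of BBMap.
# Reads FASTQ on stdin and appends "/<mate>" directly to each read ID,
# leaving anything after the first whitespace in the header untouched.
# Headers whose ID already ends in /1 or /2 are passed through unchanged.
tag_mate() {
    awk -v mate="$1" '
        FNR % 4 == 1 && $1 !~ /\/[12]$/ { sub(/^[^ \t]+/, "&/" mate) }
        { print }
    '
}
```

Usage would look like `tag_mate 1 < your_R1.fq > tagged_R1.fq` and `tag_mate 2 < your_R2.fq > tagged_R2.fq`, where the output file names are placeholders.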