Question

Bwa Mem: How To Specify The Fastq Phred Format?

3

10.3 years ago

Giovanni M Dall'Olio 28k

Hello,

I have a dataset of short reads in which some fastq files are in the Illumina 1.5 format and others are in the Illumina 1.8 format. My plan is to align these reads using bwa mem, and later do SNP calling on the alignments.

The main difference between these two formats is that the phred scores are encoded in a different way (e.g. see http://en.wikipedia.org/wiki/FASTQ_format ). Thus, when I used bwa aln on the Illumina 1.5 format, I had to use the -I option to specify that the phred scores were encoded differently. I used to run something like:

bwa aln -I reference seq_illumina15.fastq.gz
bwa aln    reference seq_illumina18.fastq.gz

However, for bwa mem there is no documentation about a -I option, or about how to specify which version of the fastq format is used (http://bio-bwa.sourceforge.net/bwa.shtml ). What, then, is the correct way to tell bwa mem how the phred scores are encoded?

bwa fastq format • 7.3k views

Updated 15 months ago by Ram 43k • written 10.3 years ago by Giovanni M Dall'Olio 28k

Ram · Answer 1 · 2013-12-16

5

10.3 years ago

Devon Ryan 104k

See this thread on the mailing list. The short answer is that there is no option to tell bwa mem this; it assumes phred+33.

Edit: Just to add some more information and a reply from Heng Li, have a look through this thread as well (Heng Li's reply is the 5th one). Basically, Heng doesn't expect bwa mem to support phred+64 since the format isn't being used anymore. He happened to add a converter to seqtk, so that's one option (there are others out there).
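As an illustration of what such a converter does, here is a minimal Python sketch that is independent of seqtk. It assumes strict four-line FASTQ records and true phred+64 input (Illumina 1.3 to 1.7, not the older Solexa scale); the function names are hypothetical and chosen only for this example.

```python
def phred64_to_phred33(quality: str) -> str:
    """Re-encode one FASTQ quality string from phred+64 to phred+33."""
    converted = []
    for char in quality:
        score = ord(char) - 64  # decode, assuming an ASCII offset of 64
        if score < 0:
            # A character below '@' cannot be phred+64; the file is
            # probably phred+33 already (or Solexa-scaled).
            raise ValueError(f"{char!r} is below the phred+64 range")
        converted.append(chr(score + 33))  # re-encode with an offset of 33
    return "".join(converted)


def convert_fastq_lines(lines):
    """Yield FASTQ lines with every fourth line (the qualities) re-encoded.

    Assumes each record spans exactly four lines (no wrapped sequences).
    """
    for index, line in enumerate(lines):
        text = line.rstrip("\n")
        if index % 4 == 3:  # lines 4, 8, 12, ... hold the quality strings
            text = phred64_to_phred33(text)
        yield text + "\n"
```

Any iterable of text lines can be passed to `convert_fastq_lines`; for gzipped input, a file handle opened with `gzip.open(path, "rt")` should work. The `ValueError` is a deliberate guard against running the conversion twice on the same file.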

0

Thank you very much for the answer. So, I will have to convert the Illumina 1.5 files to 1.8 (or, in other words, phred+33 to phred+64) before running bwa mem.

10.3 years ago by Giovanni M Dall'Olio 28k

0

I guess you mean phred+64 to phred+33? Phred+33 is the newer encoding (Illumina 1.8 and Sanger), just to prevent confusion.

Updated 15 months ago by Ram 43k • written 8.9 years ago by Alexander Vowinkel • 0
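When it is unclear which encoding a given file uses, the ASCII range of its quality characters can give a hint. The sketch below is only a heuristic under stated assumptions: Illumina-style data, where phred+33 scores top out at 'J' (ASCII 74) and no +64 encoding uses characters below ';' (ASCII 59). The function name is hypothetical.

```python
def guess_phred_offset(quality_strings):
    """Guess the ASCII offset (33 or 64) from a sample of quality strings.

    Returns None when the sample is empty or the range is ambiguous.
    """
    codes = [ord(char) for quality in quality_strings for char in quality]
    if not codes:
        return None
    if min(codes) < 59:
        # Characters below ';' never occur in +64 encodings.
        return 33
    if max(codes) > 74:
        # Characters above 'J' exceed the Illumina 1.8 phred+33 range.
        return 64
    # Everything falls in the overlap of both encodings: cannot decide.
    return None
```

Sampling the quality lines of the first few thousand reads is usually enough for a guess like this, but a sample containing only mid-range characters will return `None`, and data from other platforms can break the assumed upper bound.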