Question

Convert Illumina Reads To Sanger Score

4

13.2 years ago

Khader Shameer 18k

What tool do you use to convert Illumina paired-end reads (Illumina's FASTQ is encoded in ASCII-64) to Sanger scores (ASCII-33)?

I am looking at two methods included in maq (both written by lh3). Do you use one of these methods, or do you recommend any other tools?

http://maq.sourceforge.net/fq_all2std.pl
maq ill2sanger P1_R1.fastq P1_R1_sanger.fastq

illumina short next-gen sequencing • 16k views

ADD COMMENT • link updated 2.1 years ago by Ram 44k • written 13.2 years ago by Khader Shameer 18k

3

fq_all2std.pl is outdated...

ADD REPLY • link 13.2 years ago by lh3 33k

0

What does that mean? Is the script not suitable?

ADD REPLY • link updated 4.7 years ago by Ram 44k • written 8.7 years ago by Shicheng Guo ★ 9.5k

3

I think converting to the sanger scale should be the first step.

ADD REPLY • link 13.2 years ago by lh3 33k

0

@lh3: Thanks for the info. I am wondering whether I should do the data cleanup before or after converting the Illumina scores to Sanger scores. Please let me know your thoughts.

ADD REPLY • link 13.2 years ago by Khader Shameer 18k

0

Thanks @lh3 !!

ADD REPLY • link 13.2 years ago by Khader Shameer 18k

0

If you run a batch of files in mixed formats (some Illumina, some Sanger) through maq or seqret to bring them all to ASCII-33, will these two tools skip the files that are already in Sanger (ASCII-33) format and convert only the ASCII-64 files?

ADD REPLY • link updated 2.1 years ago by Ram 44k • written 9.3 years ago by bioinfo ▴ 840
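
One way to approach the mixed-batch question above is to guess each file's encoding before deciding whether to convert it. The following is a minimal sketch, not something maq or seqret provides: `guess_offset` is a hypothetical helper, and the thresholds are assumptions taken from the usual ASCII ranges (characters below ';', ASCII 59, only occur in ASCII-33 data; characters above 'J', ASCII 74, are rarely seen in ASCII-33 Illumina output). A file whose qualities all fall between those bounds stays ambiguous.

```python
def guess_offset(quality_lines):
    """Guess the ASCII offset (33 or 64) from an iterable of quality strings.

    Returns 33, 64, or None when the observed characters fit both encodings.
    The thresholds are heuristic assumptions, not a guarantee.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        return None  # nothing to inspect
    if min(codes) < 59:
        return 33    # characters this low cannot be ASCII-64 data
    if max(codes) > 74:
        return 64    # characters this high are unusual for ASCII-33 data
    return None      # ambiguous: inspect more reads or check the run metadata


# Hypothetical quality strings, for illustration only.
print(guess_offset(["IIII#II"]))      # contains '#', so ASCII-33
print(guess_offset(["hhhhBBhh"]))     # contains 'h', so ASCII-64
```

Files reported as 33 could then be left alone and only those reported as 64 passed to the converter; ambiguous files need a manual check.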

score 9 · Answer 1 · 2011-05-19

9

13.2 years ago

Louis Letourneau ▴ 820

We use emboss seqret:


$EMBOSS_HOME/seqret fastq-illumina::phred64Data.fastq fastq::phred33Data.fastq

ADD COMMENT • link 13.2 years ago by Louis Letourneau ▴ 820
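
For readers who want to see what the conversion amounts to, here is a minimal Python sketch. It is not how seqret or maq implement it; the function names are made up for illustration. It assumes unwrapped four-line FASTQ records and Phred scores with offset 64 (Illumina 1.3+), and it refuses characters below '@' because those would indicate older Solexa-scaled data that needs a different formula.

```python
import io


def phred64_to_phred33(qual):
    """Shift one quality string from ASCII offset 64 to ASCII offset 33."""
    shifted = []
    for ch in qual:
        q = ord(ch) - 64
        if q < 0:
            # Negative scores suggest Solexa scaling, not Phred+64.
            raise ValueError(f"character {ch!r} is not valid Phred+64")
        shifted.append(chr(q + 33))
    return "".join(shifted)


def convert_fastq(src, dst):
    """Copy FASTQ records from src to dst, re-encoding every fourth line.

    Assumes each record is exactly four lines (no wrapped sequences).
    """
    for i, line in enumerate(src):
        line = line.rstrip("\n")
        if i % 4 == 3:  # the quality line of each record
            line = phred64_to_phred33(line)
        dst.write(line + "\n")


# Usage on an in-memory record; real files would be opened with open().
src = io.StringIO("@r1\nACGT\n+\nhhhB\n")
dst = io.StringIO()
convert_fastq(src, dst)
print(dst.getvalue(), end="")
```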

0

Is there documentation for which FASTQ dialects seqret supports and how to reference those dialects in a seqret command? I'm not seeing it in the EMBOSS documentation.

ADD REPLY • link 11.4 years ago by Daniel Standage 4.1k

0

Just found it: http://emboss.sourceforge.net/docs/themes/SequenceFormats.html

ADD REPLY • link 11.4 years ago by Daniel Standage 4.1k

score 3 · Answer 2 · 2011-05-06

3

13.2 years ago

Farhat ★ 2.9k

Galaxy's FASTQ groomer will do this job if you don't mind the web interface.

ADD COMMENT • link 13.2 years ago by Farhat ★ 2.9k

0

Thanks Farhat. I am looking at a non-Galaxy solution at the moment.

ADD REPLY • link 13.2 years ago by Khader Shameer 18k

Ram · Answer 3 · 2011-05-06

2

13.2 years ago

brentp 24k

I haven't used this particular tool, but here is a tool built with Jim Kent's libraries to do the conversion 64to33 (or 33to64).

It comes with a makefile and all the includes necessary so it should be quite fast.

EDIT: there's also a very nice C API in the Kent tools: https://github.com/jstjohn/KentLib/blob/master/lib/fastq.c

The function signature looks like:

inline void phred64ToPhred33( char * p64, int l)

So it should be easy to use.

ADD COMMENT • link 13.2 years ago by brentp 24k

0

Thanks Brent. I noticed that you mentioned a potential issue with paired-end reads that is not handled by fastx_toolkit (Filtering Paired End Reads). Does this tool take care of that?

ADD REPLY • link updated 4.8 years ago by Ram 44k • written 13.2 years ago by Khader Shameer 18k

0

If you're just converting, not filtering, that won't be an issue. If you filter after the conversion (for whatever reason), then yes, you'll probably have to figure out how to make sure you keep either both reads of a pair or neither.

ADD REPLY • link 13.2 years ago by brentp 24k
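
The "both or neither" point can be sketched as follows. This is an illustrative outline only, not part of the Kent tools or fastx_toolkit: the helper names and the mean-quality threshold are hypothetical, and it assumes both files hold unwrapped four-line records with mates in the same order.

```python
import io
from itertools import islice


def read_records(handle):
    """Yield FASTQ records as 4-tuples of lines (unwrapped records assumed)."""
    while True:
        record = tuple(line.rstrip("\n") for line in islice(handle, 4))
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        yield record


def mean_quality(record, offset=33):
    """Mean score of a record's quality line for the given ASCII offset."""
    qual = record[3]
    if not qual:
        return 0.0
    return sum(ord(ch) - offset for ch in qual) / len(qual)


def filter_pairs(r1, r2, min_mean_q=20, offset=33):
    """Yield (mate1, mate2) only when both mates pass, keeping files in sync."""
    for rec1, rec2 in zip(read_records(r1), read_records(r2)):
        if (mean_quality(rec1, offset) >= min_mean_q
                and mean_quality(rec2, offset) >= min_mean_q):
            yield rec1, rec2


# Usage with two in-memory files; pair "b" is dropped because mate 2 fails.
r1 = io.StringIO("@a/1\nACGT\n+\nIIII\n@b/1\nACGT\n+\nIIII\n")
r2 = io.StringIO("@a/2\nACGT\n+\nIIII\n@b/2\nACGT\n+\n####\n")
for mate1, mate2 in filter_pairs(r1, r2):
    print(mate1[0], mate2[0])
```

Deciding per pair rather than per read is what keeps the two output files the same length and in the same order.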

0

Thanks. I have posted another question on filtering. I am not sure whether filtering the reads before or after the QC makes much difference.

ADD REPLY • link updated 4.8 years ago by Ram 44k • written 13.2 years ago by Khader Shameer 18k

0

I'm not so familiar with C. If I want to use this script to convert fastq illumina quality score to fastq sanger quality score, what command should I run?

ADD REPLY • link 12.1 years ago by Chai_AF ▴ 80

Ram · Answer 4 · 2011-06-17

2

13.1 years ago

Weronika ▴ 300

You can use the HTSeq package in Python: http://www-huber.embl.de/users/anders/HTSeq/doc/sequences.html#sequences. It will read FASTQ files with any of the common quality encodings, but always write using the Sanger (Phred) encoding.

ADD COMMENT • link updated 4.9 years ago by Ram 44k • written 13.1 years ago by Weronika ▴ 300

score 1 · Answer 5 · 2011-05-19

A quick and dirty solution would be to make a FASTA file with Ns, or any other rarely occurring homopolymer sequence, of length equal to your read length. Align your FASTQ file against this reference with any quality-aware aligner like BWA or Bowtie to get a BAM file (you can parallelize it for speed). By definition, quality scores in a BAM file are recorded as Sanger scores. All the reads will be reported only once, as unaligned reads. Now you can use Picard to get the FASTQ back from the BAM file with Sanger scores!

Note: there is a difference in the way quality scores are recorded in Illumina FASTQ files before and after version 1.3 of Casava.

http://en.wikipedia.org/wiki/FASTQ_format
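
To make the note above concrete: as described on the linked Wikipedia page, the older files store Solexa-scaled scores rather than Phred scores, so a plain offset shift is not enough for them. A minimal sketch of the standard Solexa-to-Phred relation, Q_phred = 10 * log10(10^(Q_solexa / 10) + 1), follows. The function names are hypothetical, and rounding to the nearest integer is an assumption about how one would want to re-encode.

```python
import math


def solexa_to_phred(q_solexa):
    """Convert a Solexa-scaled score to the equivalent Phred score."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


def solexa64_char_to_phred33_char(ch):
    """Re-encode one Solexa+64 quality character as a Phred+33 character."""
    q_solexa = ord(ch) - 64
    q_phred = int(round(solexa_to_phred(q_solexa)))  # rounding is an assumption
    return chr(q_phred + 33)


# The two scales differ most at the low end and converge at the high end.
for q in (-5, 0, 10, 40):
    print(q, round(solexa_to_phred(q), 2))
```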

score 1 · Answer 6 · 2012-07-10

1

12.1 years ago

Obi Griffith 20k

For anyone dealing with the various FASTQ encoding schemes and looking to do some sanity checks on their method of conversion, or just wanting to look up the Phred score for a specific ASCII code under one of the different schemes: I have found this blog entry extremely useful. It provides "Sanger (And Illumina 1.3+ (And Solexa)) Phred Score (Q) ASCII Glyph Base Error Conversion Tables".

ADD COMMENT • link 12.1 years ago by Obi Griffith 20k
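
A lookup table of this kind can also be generated locally for sanity checks. This is a small sketch, not the blog's table: `phred_table` is a hypothetical helper that lists, for each Phred score Q, the ASCII-33 character, the ASCII-64 character, and the base error probability 10^(-Q/10).

```python
def phred_table(scores=range(0, 42, 10)):
    """Return (Q, ASCII-33 char, ASCII-64 char, error probability) rows."""
    rows = []
    for q in scores:
        rows.append((q, chr(q + 33), chr(q + 64), 10 ** (-q / 10)))
    return rows


# Print a few rows; pass range(0, 42) for the full 0-41 table.
print("Q   +33  +64  P(error)")
for q, c33, c64, p_err in phred_table():
    print(f"{q:<3} {c33:<4} {c64:<4} {p_err:.4f}")
```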

0

damn, the link you posted has been vandalized!

ADD REPLY • link 10.6 years ago by Giovanni M Dall'Olio 28k

0

Is nothing sacred? I emailed the author to let him know.

ADD REPLY • link 10.6 years ago by Obi Griffith 20k

1

Fixed. Site is back up.

ADD REPLY • link 10.6 years ago by Obi Griffith 20k