What To Do With An Error In The Fastq Illumina Quality Scores
3
5
11.5 years ago
Allpowerde ★ 1.2k

I have a FASTQ* file with reads from an Illumina machine and am trying to do quality-control filtering with the FASTX-Toolkit, but I am running into problems with the quality scores (see this post for a nice discussion about the scores).

While fastx_quality_stats and fastx_trimmer run without complaining, fastq_quality_filter rejects the file:

fastq_quality_filter: Error: invalid quality score data on line 148 (quality_tok = "Z]aaaaa]O]aabaaaaa]" Petra*read2.fasta


The particular read looks like this :

@Petra_4_1_1_10_1327/1
AGTATTTTTGAATCTCATCATCGTCACTTCACTAAG
+Petra_4_1_1_10_1327/1
Z]aaaaa]O]aabaaaaa][FW__a\FW_X[M


Does anyone have a suggestion (other than deleting this read) or some experience?

*Well, it is labeled .FASTA but looks like a FASTQ file.

next-gen sequencing quality fastq fastx
1

This may be a stupid question, but are you sure about which encoding is being used for quality scores? I'd assume the latest Illumina encoding, but can you be sure?
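One rough way to check is to look at the range of characters on the quality lines. The sketch below is only a heuristic, and the function name and return labels are made up for illustration: characters below ASCII 59 can only come from an offset-33 (Sanger-style) encoding, characters in 59 to 63 only fit the old Solexa +64 scale, and anything from 64 upward is consistent with Illumina 1.3+ (Phred+64). A file of uniformly high-quality offset-33 reads could still be misclassified, so treat the answer as a hint.

```python
def guess_quality_encoding(quality_lines):
    """Heuristically guess the quality encoding from FASTQ quality lines.

    Returns one of the illustrative labels "phred33", "solexa64" or "phred64".
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        raise ValueError("no quality characters given")
    lowest = min(codes)
    if lowest < 59:
        # Below ';' (ASCII 59): impossible in any offset-64 encoding.
        return "phred33"
    if lowest < 64:
        # 59..63 would be negative scores, which only the Solexa scale allows.
        return "solexa64"
    # 64 and above: consistent with Illumina 1.3+ (not proof of it).
    return "phred64"


# The read from the question: the lowest character is 'F' (ASCII 70).
print(guess_quality_encoding([r"Z]aaaaa]O]aabaaaaa][FW__a\FW_X[M"]))
```

Running it over all quality lines in the file (every fourth line) gives a more reliable hint than a single read.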

4
11.5 years ago

Which version of the FASTX-Toolkit do you have installed? This seems to work with the latest version (0.0.13):

% cat test.fastq
@Petra_4_1_1_10_1327/1
AGTATTTTTGAATCTCATCATCGTCACTTCACTAAG
+Petra_4_1_1_10_1327/1
Z]aaaaa]O]aabaaaaa][FW__a\FW_X[M
mothra:fastq % fastq_quality_filter -q 20 -p 50 -i test.fastq
@Petra_4_1_1_10_1327/1
AGTATTTTTGAATCTCATCATCGTCACTTCACTAAG
+Petra_4_1_1_10_1327/1
Z]aaaaa]O]aabaaaaa][FW__a\FW_X[M


You might have a version from before support for the latest Illumina pipeline 1.3+ scores was introduced. An alternative to upgrading the FASTX-Toolkit is to use Galaxy to convert the scores into Solexa format.
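If neither upgrading nor Galaxy is convenient, re-encoding the quality lines by hand is another option. The following is a minimal sketch, assuming the reads really are Illumina 1.3+ (Phred+64) and that the target is Sanger-style Phred+33 rather than the Solexa format mentioned above; it is not how Galaxy or the FASTX-Toolkit do the conversion internally, and the function name is hypothetical.

```python
def phred64_to_phred33(quality):
    """Re-encode one quality string from ASCII offset 64 to offset 33."""
    converted = []
    for ch in quality:
        score = ord(ch) - 64  # decode assuming Phred+64
        if score < 0:
            # A character below '@' means the input is not Phred+64.
            raise ValueError(f"{ch!r} is below the Phred+64 range")
        converted.append(chr(score + 33))  # encode as Phred+33
    return "".join(converted)


# 'h' is Phred 40 at offset 64, which is 'I' at offset 33.
print(phred64_to_phred33("hh"))
```

Only the fourth line of each FASTQ record should be passed through this function; the header, sequence and '+' lines stay unchanged.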

3
9.7 years ago
Zhilong Jia ★ 2.0k

0
11.5 years ago
Allpowerde ★ 1.2k

It must have something to do with the wrong file extension. Since Brad demonstrated that the FASTX-Toolkit does not really have a problem with this particular read, I changed the file extension from the incorrect .fasta to .fastq and now it works like a charm.

So the above error message should have been: fastq_quality_filter: Error: invalid input for the specified file type X.fasta

0

Glad you got this working. I can't replicate your error by copying the filename to test.fasta in my example above; it still works fine. Perhaps this is an issue that recent versions take care of; it would be useful to leave a comment with the version you were having problems with for future folks who find this thread.

0

Hmm, that is interesting. I'm using the latest version, and I know it is not an issue with my OS, because another machine (not sure which version it uses) returned the exact same error when given this file. Could it be some line-ending issue that was resolved when I copied the file from .fasta to .fastq?
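One way to test the line-ending idea would be to count the endings in the original and the copied file and compare. This is just a sketch with a made-up function name; it distinguishes Windows-style CRLF from Unix-style LF and does not look for other oddities such as bare carriage returns.

```python
def count_line_endings(path):
    """Count lines ending in CRLF versus plain LF in a file."""
    counts = {"crlf": 0, "lf": 0}
    with open(path, "rb") as handle:
        # In binary mode, iteration splits on b"\n" only, so a trailing
        # b"\r" is preserved and can be detected.
        for line in handle:
            if line.endswith(b"\r\n"):
                counts["crlf"] += 1
            elif line.endswith(b"\n"):
                counts["lf"] += 1
    return counts
```

If the .fasta file reports a non-zero "crlf" count and the .fastq copy does not, the copy step changed the line endings, which would support this explanation.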