Invalid quality score value when using fastq_quality_filter from FASTX_toolkit
4.9 years ago
kelvinfrog75 ▴ 10

Hey, I am using the fastq_quality_filter tool from the FASTX-Toolkit. Here is my command:

fastq_quality_filter -q 20 -i input_path -o output_path


However, I got this error message "Invalid quality score value (char '.' ord 46 quality value -18) on line 4".

My sequences were generated on an Illumina instrument, and I think the library prep was TruSeq. Below are the first few records in the FASTQ file. Does anyone know why I am getting the invalid quality score error?

@NS500216:139:H2JLWAFXX:1:11101:9276:1046 1:N:0:31
CACCTATCCCAACGCTGCCCATGCCGTCCGCCCGGCCGTCGCCGATGCCCGGCAGCCGCAACACGCCCTTCCCGGTGACCGCCTCGTGCGTCAACCCGCCCCCTCCCCGGGAACCTGGGCGTTCTGGCGACGCGACAGCCGGGATNTGGCN
+
FF....7A<<..A)F7F.FFF.AFAAA.<F<.))A<FF<<AF<AAF<<7<F<FFFF.FFFAF7FA.F...FFAAA)FAF)7)F.F)F<FFFFF7<F.<F..FF.F)).7F<F.F.7FFFA<).F.FFFFFF<.F<FFF.<7FFFF#<AAA#
@NS500216:139:H2JLWAFXX:1:11101:24009:1049 1:N:0:31
CAGGTGGATTGGGGGAGCAAGGGTGAGTCAGCCACGGTGTGCATGGACGGCAACAATGCGAACGCGCCGAAGAAGGAATGCAAGTCGGGCGAGGAGTAATCGCTAGACTGGCTGTTTGGCGACATCGGCGGCGCGTCCGCCANTGAGN
+
FF7F<7<<7)<..F.F.FFF7<7)FFF.)..F..FA).AFFFAFFFFFF.7FFF.FFFF)F<7F.<7FFFFAA.F.FFF<AFFFA.<7.<FFF<FFF.F<FAA.F7AF7.F.AFFAF.FF7A7FFAF<FAAFAF<<F<F.FF#<AA<#
@NS500216:139:H2JLWAFXX:1:11101:21209:1051 1:N:0:31
ACCGCGACGATCTGCTGCGCTTGCAGGACAAGGAGCAGCGCACCCTCCGGCCGATGGTGGTGCCGTTCAACCTGAAATGAGGAGGGCAGGACCGTGCCGCTGAAGCAGTAGCAAGAGCGCGTGCTGCGGGAGGTCAAGCACTTCCNTGAAN
+
AAAFFAFFA7AF.7<)F.FA.)A.7FFAF<7FFAF)<)FFF.A<<F<AFAFFF)F<7A<F)F7.F.FFFFF<.FFFA<FAF7AF<FFFFFFFFAFAFFAF)FFAFFFFFF.FAFFFFFFAFFFF<FFFFFA.F.F7AAFFFA.FF#AAAA#
@NS500216:139:H2JLWAFXX:1:11101:16640:1054 1:N:0:31
ATGCCCCTCTATGTTACGGCGTTCGATATTGTCAGCGGTCGCCTCCTTCTCTTTGGCGAAGACCCTCGCGCACCAGTGGCCGAGGCTGTGTTGGCTAGTTCATCCATCCCAGGCAGCCATCCTCCTCTGAATTATCACGGACTCCNGCTTN
+


Add option -Q33 if you want to keep using FASTX_toolkit.
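To see why -Q33 matters, the arithmetic in the error message can be reproduced by hand. The short Python sketch below (the function name decode_quality is made up for illustration, it is not part of any toolkit) subtracts an offset from a character's ASCII code. With an offset of 64 the '.' character gives -18, the value in the error message; with an offset of 33 it gives a valid score of 13.

```python
def decode_quality(char, offset):
    """Return the Phred score encoded by one FASTQ quality character.

    `offset` is 33 for Sanger / Illumina 1.8+ files and 64 for the
    older Illumina encoding (hypothetical helper, for illustration only).
    """
    return ord(char) - offset


# '.' has ASCII code 46, as reported in the error message.
print(decode_quality(".", 64))  # -18
print(decode_quality(".", 33))  # 13

# Start of the first quality line from the question, decoded as Phred+33.
quality_line = "FF....7A<<..A)F7F"
scores = [decode_quality(c, 33) for c in quality_line]
print(scores)
```

Every character in the posted quality lines decodes to a non-negative score under the 33 offset, which is consistent with the file being Phred+33.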

I second using BBDuk.sh instead.

4.9 years ago

IIRC, the FASTX-Toolkit assumes input data is encoded using the old ASCII-64 (Phred+64) quality scores, which is basically never the case any more. I suggest you use a more modern program such as BBDuk for quality-score trimming or filtering; it's faster, will do a better job, and will not break the pairing order of your reads, which FastX will. To do that operation with input files named read1.fq and read2.fq, you would type:

bbduk.sh in=read#.fq out=filtered#.fq maq=20


Not that I would recommend doing that, by the way. 20 is usually much too high a threshold for quality-filtering, and I think quality-trimming is a better operation anyway in most cases. Using very high thresholds will increase bias.


Thanks! I will check out BBDuk. What score would you recommend for quality filtering? Does quality-trimming also use a score? If so, what score would you recommend?


If you are aligning to a reference then you could omit Q-score based filtering altogether (or if you must, filter Q10 and below). For de novo assembly work you may want to be more stringent (Q20 or more).


Hey, I want to ask whether those Q-scores refer to quality filtering or to trimming. Also, if I want to do trimming with BBDuk, what would the command look like? Do I need to use both qtrim and trimq? Thanks.


Yes. "qtrim" tells it which side to trim on, and "trimq" tells it the quality threshold. A sample command would be:

bbduk.sh in=read#.fq out=trimmed#.fq qtrim=rl trimq=12


That will trim the left and right ends of each read to Q12 (the remaining portion of the read will have average quality scores of at least 12). Both quality-filtering and quality-trimming use quality scores, but filtering throws away the entire read, while trimming just removes the low-quality bases from the ends and keeps the rest of the read.
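The difference between the two operations can be sketched in a few lines of Python. This is a deliberately simplified illustration, not BBDuk's actual algorithm: the function names are hypothetical, the filter uses a plain arithmetic mean of the scores, and the trimmer just strips bases below the threshold from each end. Quality strings are assumed to be Phred+33.

```python
PHRED_OFFSET = 33  # assumption: Sanger / Illumina 1.8+ encoding


def to_scores(qual):
    """Convert a FASTQ quality string to a list of integer Phred scores."""
    return [ord(c) - PHRED_OFFSET for c in qual]


def passes_filter(qual, min_mean_q):
    """Filtering: keep or discard the whole read based on its mean score."""
    q = to_scores(qual)
    return bool(q) and sum(q) / len(q) >= min_mean_q


def trim_ends(seq, qual, min_q):
    """Trimming: drop low-quality bases from both ends, keep the middle."""
    q = to_scores(qual)
    start, end = 0, len(q)
    while start < end and q[start] < min_q:
        start += 1
    while end > start and q[end - 1] < min_q:
        end -= 1
    return seq[start:end], qual[start:end]


seq, qual = "ACGTACGT", "#)FFFF<#"
print(passes_filter(qual, 20))   # True: mean score is 23.375
print(trim_ends(seq, qual, 12))  # ('GTACG', 'FFFF<')
```

In this toy read the filter keeps all eight bases, including the poor ones at the ends, while the trimmer keeps only the five good bases in the middle.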


Great, I do think BBDuk is better than the FASTX-Toolkit. Thanks.


@Brian: This would not remove any adapter contamination, if present, correct?


Correct. And it's best to remove adapters prior to quality-trimming. In the bbmap directory there is a subdirectory "docs" which has a subdirectory "guides". In that there is BBDukGuide.txt and PreprocessingGuide.txt. The preprocessing guide contains my recommended procedures for preprocessing raw Illumina reads prior to use, including the best order (note that many of the steps are optional and depend on your experiment, but that's the order you would do them if you wanted to). The BBDuk guide has sample command lines for typical operations like quality-trimming or adapter-trimming.