Question: Fastx Invalid Quality Score Value
7.2 years ago, sckinta (United States) wrote:

I have multiple RNA-seq libraries to process, so I wrote a bash pipeline and submitted the jobs in batch. Most of the libraries ran well and gave me the results I wanted, but two of them failed at the quality filter step (fastx_clipper). One reported "fastx_clipper: Invalid quality score value (char '#' ord 35 quality value -29) on line 4". The other reported "Invalid quality score value (char ',' ord 44 quality value -20) on line 4".

I have not specified any quality score encoding myself. Here is the relevant part of the code:

        tar xjf StHe51G3_reads.tar.bz2
        PairFiles=(1 2)
        # Array of all trimmed output files (kept separate from the
        # per-iteration scalar TrimmedFile so the two do not clobber each other)
        TrimmedFiles=()
        cd StHe51G3_reads
        for PairIndex in "${PairFiles[@]}"
        do
                RawFile="StHe51G3_read${PairIndex}.fastq"
                TrimmedFile="StHe51G3_read${PairIndex}_trimmed.fastq"
                # Clip each adapter in turn, then trim low-quality read ends
                fastx_clipper -a 'AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT' -n -v -i "$RawFile" |
                fastx_clipper -a 'CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT' -n -v -i - |
                fastx_clipper -a 'AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT' -n -v -i - |
                fastx_clipper -a 'AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTG' -n -v -i - |
                fastx_clipper -a 'TTTTTTTTTTAATGATACGGCGACCACCGAGATCTACAC' -n -v -i - |
                fastx_clipper -a 'TTTTTTTTTTCAAGCAGAAGACGGCATACGA' -n -v -i - |
                fastq_quality_trimmer -t 20 -l 25 -v -i - -o "$TrimmedFile"
                TrimmedFiles+=("$TrimmedFile")
                rm "$RawFile"
        done

I have checked the quality strings in the FASTQ files to see which encoding they use. It should be "Illumina 1.3+ Phred+64", since the majority of them contain quality characters like "^_`abcdefg", which cannot be "Sanger Phred+33". According to https://en.wikipedia.org/wiki/FASTQ_format#Encoding, no single encoding covers both the "Illumina 1.3+ Phred+64" and the "Sanger Phred+33" ranges. So where do "#" and "," (Sanger Phred+33 characters) come from, given that all the libraries were sequenced on the same platform?
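For reference, the negative values in the two error messages are what you get when the character code is decoded with an offset of 64. A minimal sketch of that arithmetic follows; `qual_value` is a hypothetical helper written for illustration and is not part of the FASTX-Toolkit.

```shell
# Decode one FASTQ quality character under a given ASCII offset.
# qual_value is a hypothetical helper, not a FASTX-Toolkit command.
qual_value() {
    local char=$1
    local offset=$2
    local ord
    # A leading single quote makes printf emit the character's ASCII code
    ord=$(printf '%d' "'$char")
    echo $(( ord - offset ))
}

qual_value '#' 64   # -29, the value in the first error message
qual_value ',' 64   # -20, the value in the second error message
qual_value '#' 33   # 2, a valid score if the file is Phred+33
qual_value ',' 33   # 11
```

Both characters decode to valid, non-negative scores only under the offset of 33.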

Can anyone help?


Add the -Q33 option.

written 7.2 years ago by Rm

Duplicate of: FASTQ quality check

written 7.2 years ago by Pierre Lindenbaum
7.2 years ago, Istvan Albert (University Park, USA) wrote:

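Adding the comment as an answer: use the

-Q33

option. The negative quality values in the error messages (35 - 64 = -29 and 44 - 64 = -20) show that the tools are decoding the quality characters with an offset of 64. -Q33 tells them to use the Phred+33 offset instead.

See also: FASTQ quality check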
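To check which offset a file most likely uses before choosing the flag, a rough heuristic is to look for the lowest quality character: anything with an ASCII code below 59 can only occur in a Phred+33 file. The sketch below assumes plain four-line FASTQ records (no wrapped sequence lines). `guess_offset` is a hypothetical helper, and the sample record is made up for illustration.

```shell
# guess_offset FILE: print 33 or 64 based on the lowest quality character.
# Hypothetical helper; assumes each record is exactly four lines.
guess_offset() {
    awk '
        BEGIN {
            # Lookup table from printable character to ASCII code
            for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i
            min = 127
        }
        NR % 4 == 0 {
            # Every fourth line is a quality string
            n = length($0)
            for (i = 1; i <= n; i++) {
                c = ord[substr($0, i, 1)]
                if (c != "" && c < min) min = c
            }
        }
        END {
            if (min == 127) print "unknown"
            else if (min < 59) print 33   # codes below 59 exist only in Phred+33
            else print 64                 # likely Phred+64, but not guaranteed
        }
    ' "$1"
}

# Made-up one-record file for demonstration
tmp=$(mktemp)
printf '@r1\nACGT\n+\n##,I\n' > "$tmp"
guess_offset "$tmp"   # prints 33
rm -f "$tmp"
```

If this prints 33 for a library, -Q33 would need to go on every fastx_clipper call and on the final fastq_quality_trimmer call in the pipeline, since each tool parses the quality line itself. A result of 64 is weaker evidence, because a Phred+33 file containing only high scores can look the same.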


It works. Thank you:)

written 7.2 years ago by sckinta