I'm trying to check the sequencing quality of a FASTQ file from a HiSeq2000. I used the fastx_quality_stats script from FASTX-Toolkit (Version 0.0.13), but I got the following error:
$ fastx_quality_stats -i 6_1.fastq -o 6_1.stats
fastx_quality_stats: Invalid quality score value (char '#' ord 35 quality value -29) on line 4
The FASTQ file really does contain a "#" character at the start of the quality line of the first record:
@HWI-ST621:210:C03D4ACXX:4:1101:1475:1957 1:N:0:ATCACG
NACTACAATTTACAGATAACTTTAAATTAAATTTTGGAATCAAATATAAAGATTGAAAATGAATTTTGAATATATGAAAATCCATTTAAAGAGTTTGGTAC
+
#1=DDDFFHHDHHIIIJJEHIJJJJJIIIJFIGGJJJFICGIGGGIIJIEIIIIJIJIIIIHIIIJIGGIJIIIJGHIEHJJJHHHHHHHFFF;B@CA;;@
Is the "#" character an invalid quality score value? I was told that this FASTQ file had been checked with the quality trim program of the NGS Cell package from CLCBio, and that the sequencing quality was good. So is the "#" character invalid only for FASTX-Toolkit?
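The numbers in the error message can be reproduced by hand: "#" is ASCII 35, and 35 minus an offset of 64 gives -29, which suggests the tool is decoding the quality line with a +64 offset. Below is a minimal Python sketch of that arithmetic. The function name `phred_score` is hypothetical, and the idea that this file might really use a +33 offset is an assumption to verify, not something the error message confirms.

```python
def phred_score(char: str, offset: int) -> int:
    """Convert one FASTQ quality character to a Phred score for a given ASCII offset."""
    return ord(char) - offset


# '#' is ASCII 35, matching "ord 35" in the error message.
print(ord("#"))

# Decoded with a +64 offset, '#' gives the negative value reported in the error.
print(phred_score("#", 64))

# Decoded with a +33 offset (an assumption about this file), '#' is a low but valid score.
print(phred_score("#", 33))
```

If the +33 reading turns out to be the right one, the character itself is not invalid; only the offset used to decode it would be wrong.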
I also used the Popoolation toolbox (Version 1.2.2) for quality trimming of the same FASTQ files, and got the following results:
$ trim-fastq.pl --input1 6_1.fastq --input2 6_2.fastq --output trimmed
......................................................
FINISHED: end statistics
Read-pairs processed: 53675033
Read-pairs trimmed in pairs: 0
Read-pairs trimmed as singles: 0

FIRST READ STATISTICS
First reads passing: 0
5p poly-N sequences trimmed: 632578
3p poly-N sequences trimmed: 0
Reads discarded during 'remaining N filtering': 0
Reads discarded during length filtering: 53675033
Count sequences trimed during quality filtering: 53675033
Read length distribution first read
length count

SECOND READ STATISTICS
Second reads passing: 0
5p poly-N sequences trimmed: 628623
3p poly-N sequences trimmed: 801
Reads discarded during 'remaining N filtering': 0
Reads discarded during length filtering: 53675033
Count sequences trimed during quality filtering: 53675033
Read length distribution second read
length count
As you can see, every read was trimmed during quality filtering and then discarded by the length filter, so no reads passed.
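One possibility worth ruling out before blaming the sequencing run is that the trimmer is decoding the quality characters with the wrong ASCII offset. The sketch below is a rough heuristic, not a definitive test: it assumes that characters below ASCII 59 cannot occur in +64-encoded data, and it uses only the first few characters of the quality line shown in the question. The function name `guess_offset` is hypothetical.

```python
def guess_offset(quality_strings):
    """Heuristically guess the ASCII offset (33 or 64) from observed quality characters.

    Assumption: +64-encoded files never contain characters below ';' (ASCII 59),
    so any lower character points to a +33 offset. A result of 64 is only a guess,
    because high-quality +33 data can also lack low characters.
    """
    lowest = min(ord(ch) for q in quality_strings for ch in q)
    return 33 if lowest < 59 else 64


# Start of the quality line from the first record in the question.
quality = "#1=DDDFFHHDHHIIIJJ"

print(guess_offset([quality]))

# Highest score in this sample under each offset.
print(max(ord(ch) - 64 for ch in quality))
print(max(ord(ch) - 33 for ch in quality))
```

For this sample the highest score is 10 when decoded with a +64 offset and 41 with a +33 offset. If a trimmer assumed +64, every base would look low quality, which would be consistent with all reads being trimmed away; checking each tool's documentation for an option that selects the quality encoding would confirm or rule this out.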
I've worked with GAII and HiSeq2000 sequence data before, but this is the first time I have seen this. I wonder whether the problem is caused by bad sequencing quality or by a mistake on my part.
I would appreciate any help.