Dear all, I am relatively new to RNAseq analysis so I am really hoping someone can help me with this issue. I am using STAR aligner for mapping my paired-end RNAseq reads. For the first 10 samples, this worked seamlessly but for the last 2, I keep getting the following error message:
> *EXITING because of FATAL ERROR in reads input: quality string length is not equal to sequence length
> @GWNJ-0957:375:GW1902221898:2:2211:26494:23442
> CGAAGACAGACCGAAGATGATCCAAGTAGCTAAGGAACTCAAGCGGATTGAAACATCACTGAAAGGTGTTAGCTAAATACCTCTTCTCTGTTCTTGGACTG
> AAAFFJJFFJJFFF-AAFAAAJFJFFFJJF<<AJ-
> SOLUTION: fix your fastq file*
The sequence length is 100 bp and I am doing the alignment on trimmed reads. I would really appreciate any help you can provide to fix this issue. Many thanks, Midhat
Looks like you may have mangled fastq record(s) in some way. What pre-processing did you do with these files?
The fastq files had adapter contamination, so I trimmed them using Trimmomatic `CROP:101` before using them for alignment. The other reads that STAR successfully aligned were trimmed in exactly the same way.
Inspect the read using this command:
zgrep -A 3 '@GWNJ-0957:375:GW1902221898:2:2211:26494:23442' file.fq.gz
(or plain `grep` if the files are not compressed) to see if the record is indeed malformed. I am not sure what Trimmomatic `CROP:101` did if your reads are only 100 bp long.
I'll check the files as per your suggestion. The reads were originally 150 bp. They were about 100 bp after cropping with Trimmomatic.
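If the single read looks odd, it may help to scan the whole file, since STAR only reports the first bad record it meets. The following is a minimal sketch, not a definitive tool: `check_fastq` is a hypothetical helper name, it assumes standard four-line FASTQ records (no wrapped sequence lines), and it relies on `gzip -cdf` passing uncompressed input through unchanged, which is the behaviour of GNU gzip.

```shell
# check_fastq FILE
# Hypothetical helper: prints every record whose quality string length
# differs from its sequence length. Assumes 4 lines per FASTQ record.
check_fastq() {
    # -c: write to stdout, -d: decompress, -f: pass plain text through as-is
    gzip -cdf "$1" | awk '
        NR % 4 == 1 { header = $0 }            # line 1 of a record: read name
        NR % 4 == 2 { seqlen = length($0) }    # line 2: sequence
        NR % 4 == 0 && length($0) != seqlen {  # line 4: quality string
            printf "record %d: %s (seq=%d, qual=%d)\n", NR / 4, header, seqlen, length($0)
        }
        END {
            # A line count that is not a multiple of 4 means a truncated file
            if (NR % 4 != 0) printf "file ends mid-record: %d lines\n", NR
        }
    '
}
```

Run it as `check_fastq file.fq.gz` on both mate files. If a record was cut off in the middle of the file, every record after it will be out of phase and reported too, so the first line of output is the one to look at. A file that ends mid-record usually points to an interrupted write or copy rather than to the trimming step itself.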
Unless you had a reason to do so, you may have thrown away 33% of good data by doing a hard crop like that.