3.2 years ago
Marco Pannone
Hello everybody
I have paired-end fastq files from RNA-seq which I am aligning with hisat2. Among all my 18 pairs of fastq files, I encountered an error for only one of them, which shows the message below:
Error: Read V300055969L2C002R0400381671/1 has more read characters than quality values.
libc++abi.dylib: terminating with uncaught exception of type int
(ERR): hisat2-align died with signal 6 (ABRT)
When I look up read V300055969L2C002R0400381671/1 in that fastq file, it looks like this:
+
FFFFFFBFF>FEFFFAGFFG=GGGFDGDGGGEDFEFFFCFCGCGFFFGGFCFFGEEG2BGG<GFFEFFAF4FFF=GF8FEFG@FFFGGFFFGGF<FGEGF
@V300055969L2C002R0400381671/1
CATGGAAAAGGTTTTCAGCCCTAGTGGGTTTTGCTGGTTGAACTGGAGGCTGCCCAGAGGAGACAGTGAGGCTCCATTTACGACTCAGCGATCCAAGAGA
+
This is the first time I have encountered an error like this, and I am not sure what causes it. I also tried re-downloading the file, but nothing changed.
Hope some of you can explain to me what is possibly wrong. I would be very grateful!
Thanks!
What is the output of
gzip -cd your.fastq.gz | grep -A 3 "@V300055969L2C002R0400381671/1"
? The error tells you what is wrong; let's see whether this is true with the above command. If so, there is some corruption in the file, which could be repaired with repair.sh from the BBMap suite.
Thanks for the reply! This is the output I got executing the command you wrote:
Looks quite malformed, you'll want to contact whoever uploaded it.
Each record should be the read name (line 1) followed by the nucleotide sequence (line 2), a "+" (line 3) and the quality line (line 4). Lines 2 and 4 must have identical length. Something is wrong there. As suggested, contact the facility that created the files.
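To see where the file first goes wrong, the four-line rule described above can be checked with a short script. This is only a rough sketch, not part of hisat2 or BBMap: the function names `find_malformed_records` and `scan_fastq_gz` are made up for illustration, and it assumes a plain four-line-per-record fastq with no wrapped sequence lines.

```python
import gzip
from itertools import zip_longest


def find_malformed_records(lines):
    """Yield (record_number, header, reason) for each broken 4-line record.

    `lines` is any iterable of text lines, e.g. an open file handle.
    Hypothetical helper for illustration only.
    """
    stripped = (line.rstrip("\r\n") for line in lines)
    # Passing the same iterator four times groups the lines in fours;
    # a short final group is padded with None.
    groups = zip_longest(stripped, stripped, stripped, stripped)
    for number, (header, seq, plus, qual) in enumerate(groups, start=1):
        if seq is None or plus is None or qual is None:
            yield number, header, "truncated record (fewer than 4 lines)"
        elif not header.startswith("@"):
            yield number, header, "line 1 does not start with '@'"
        elif not plus.startswith("+"):
            yield number, header, "line 3 does not start with '+'"
        elif len(seq) != len(qual):
            reason = f"sequence length {len(seq)} != quality length {len(qual)}"
            yield number, header, reason


def scan_fastq_gz(path, limit=5):
    """Return up to `limit` problems found in a gzipped fastq file."""
    problems = []
    with gzip.open(path, "rt") as handle:
        for problem in find_malformed_records(handle):
            problems.append(problem)
            if len(problems) >= limit:
                break
    return problems


# Example call (the file name is a placeholder):
# for problem in scan_fastq_gz("your.fastq.gz"):
#     print(problem)
```

Only the first reported problem is really informative: once a line is missing or duplicated, every later record is read out of frame and will be flagged too.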
Thanks a lot for your answers!