hisat2 alignment error
3.3 years ago
Marco Pannone ▴ 790

Hello everybody

I have paired-end FASTQ files from RNA-seq that I am aligning with hisat2. Of my 18 pairs of FASTQ files, only one produces an error, with the message below:

Error: Read V300055969L2C002R0400381671/1 has more read characters than quality values.
libc++abi.dylib: terminating with uncaught exception of type int
(ERR): hisat2-align died with signal 6 (ABRT)

When I look up read V300055969L2C002R0400381671/1 in that FASTQ file, it looks like this:

+
FFFFFFBFF>FEFFFAGFFG=GGGFDGDGGGEDFEFFFCFCGCGFFFGGFCFFGEEG2BGG<GFFEFFAF4FFF=GF8FEFG@FFFGGFFFGGF<FGEGF
@V300055969L2C002R0400381671/1
CATGGAAAAGGTTTTCAGCCCTAGTGGGTTTTGCTGGTTGAACTGGAGGCTGCCCAGAGGAGACAGTGAGGCTCCATTTACGACTCAGCGATCCAAGAGA
+

This is the first time I have encountered an error like this, and I am not sure what the cause is. I also tried re-downloading the file, but nothing changed.

I hope some of you can explain what might be wrong. I would be very grateful!

Thanks!

hisat2 RNA-Seq alignment software error • 1.6k views

What is the output of gzip -cd your.fastq.gz | grep -A 3 "@V300055969L2C002R0400381671/1" ? The error tells you what is wrong; let's see whether this is true with the above command. If so, there is some corruption in the file, which could possibly be repaired with repair.sh from the BBMap suite.


Thanks for the reply! This is the output I got executing the command you wrote:

@V300055969L2C002R0400381671/1
CATGGAAAAGGTTTTCAGCCCTAGTGGGTTTTGCTGGTTGAACTGGAGGCTGCCCAGAGGAGACAGTGAGGCTAATAG@FD1E0
@V3AATCCTAGGCCTTTEDFFFCCTACCTCAAATGTTGCTTGCTTG70038109FF
@TATCTGTTATTGGTTAAGCTCAACAAGGCTTGGFBGFFFFFF@F5969L2GTTGGAACGCCTAATCAACAACCGTCTCCATTCTTTBFF8FF;FG82DCTTAGAAFFFTATAG=@6FFF9FFF;DFFGGGGG063215TFACCCTATE?E381670/1GTTTCA6DDBE@DDF>DEFD=EEE2D:A@DFEFEFDFFFAEEFFF8FFBFFFFFTACTTCTGGTAA5GCTCTGEGGGFFEDDDDDDDDDDDDDDDDDDDDDDDDDDF@FFFCGGAGCCCCTAATTG02R04009L2C00><FEFDDFF?FD>F?E@DF<FGEGFATCTTGGCCCCTTACTTTAACF3EFD@GFFFEFFFGFAACTTCCA

Looks quite malformed; you'll want to contact whoever uploaded it.


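It should actually be the read name (line 1) followed by the nucleotide sequence (line 2), a "+" (line 3) and the quality line (line 4). Lines 2 and 4 must have identical length. In the output posted here, the sequence line contains quality characters and the following lines mix in fragments of other read names, so the record is corrupted. As suggested, contact the facility that created the files.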
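
To see how many records are affected before contacting the facility, the four-line rule described above can be checked with a short script. This is only a minimal sketch, not part of hisat2 or BBMap: the helper names open_fastq and find_bad_records are made up for illustration, and it assumes a strict four-lines-per-record FASTQ (no wrapped sequence lines).

```python
import gzip


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode (hypothetical helper)."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def find_bad_records(lines):
    """Yield (record_number, header, problem) for every malformed record.

    Assumes strict 4-line records: header, sequence, '+', quality.
    """
    it = (line.rstrip("\r\n") for line in lines)
    record_number = 0
    while True:
        header = next(it, None)
        if header is None:
            break  # clean end of input
        seq = next(it, None)
        plus = next(it, None)
        qual = next(it, None)
        record_number += 1
        if qual is None:
            # Input ended in the middle of a record.
            yield (record_number, header, "truncated record")
            break
        if not header.startswith("@"):
            yield (record_number, header, "header does not start with '@'")
        elif not plus.startswith("+"):
            yield (record_number, header, "third line does not start with '+'")
        elif len(seq) != len(qual):
            yield (
                record_number,
                header,
                f"sequence length {len(seq)} != quality length {len(qual)}",
            )
```

To use it, open the file with open_fastq("your.fastq.gz") and loop over find_bad_records(handle), printing each tuple. Once one record is damaged, the four-line grouping may drift, so the first reported problem is the most reliable one; a damaged gzip stream may also raise an error while reading, which is itself a sign of corruption.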


Thanks a lot for your answers!
