Question: Removing reads with mismatching lengths of bases and qualities from paired-end FASTQ files
Peter Chung120 wrote:

I have paired-end FASTQ files. When I ran Trim Galore, it reported that the lengths of bases and qualities do not match. I searched for a solution and found a recommendation to use BBTools reformat.sh to discard reads that have mismatching lengths of bases and qualities:

reformat.sh in=pair_1.fq.gz in2=pair_2.fq.gz out=fixed_1.fq.gz out2=fixed_2.fq.gz tossbrokenreads=t

It failed with this error:

Set INTERLEAVED to false
Input is being processed as paired
pigz: abort: read error on pair_1.fq.gz (Input/output error)
java.lang.AssertionError: 
There appear to be different numbers of reads in the paired input files.
The pairing may have been corrupted by an upstream process.  It may be fixable by running repair.sh.
    at stream.ConcurrentGenericReadInputStream.pair(ConcurrentGenericReadInputStream.java:497)
    at stream.ConcurrentGenericReadInputStream.readLists(ConcurrentGenericReadInputStream.java:362)
    at stream.ConcurrentGenericReadInputStream.run0(ConcurrentGenericReadInputStream.java:206)
    at stream.ConcurrentGenericReadInputStream.run(ConcurrentGenericReadInputStream.java:182)
    at java.lang.Thread.run(Thread.java:745)

So I tried to repair the files with BBTools repair.sh:

repair.sh in1=pair_1.fq.gz in2=pair_2.fq.gz out1=fixed_1.fq.gz out2=fixed_2.fq.gz outs=singletons.fq repair

Set INTERLEAVED to false
Started output stream.
pigz: abort: read error on pair_1.fq.gz (Input/output error)
java.lang.Exception: 
Mismatch between length of bases and qualities for read 107893745 (id=ST-E00126:1085:HF3YVCCX2:1:2106:16620:58339 1:N:0:TAAGCTCC+AGATCTCG).
# qualities=24, # bases=150

AAFFFFJJJJJJJJJJJJJJJJJJ
GTGTAGGACATCCATTTTATCAAGTTTCTGCTACAAGAAATGAAAAAATGAGACACTTGATTACTACAGGCAGACCAACCAAAGTCTTTGTTCCACCTTTTAAAACTAAATCGCATTTTCACAGAGTTGAACAGTGTGTTAGGAATATTA

This can be bypassed with the flag 'tossbrokenreads' or 'nullifybrokenquality'
    at shared.KillSwitch.kill(KillSwitch.java:96)
    at stream.Read.validateQualityLength(Read.java:214)
    at stream.Read.validate(Read.java:104)
    at stream.Read.<init>(Read.java:76)
    at stream.Read.<init>(Read.java:50)
    at stream.FASTQ.quadToRead_slow(FASTQ.java:809)
    at stream.FASTQ.toReadList(FASTQ.java:646)
    at stream.FastqReadInputStream.fillBuffer(FastqReadInputStream.java:107)
    at stream.FastqReadInputStream.nextList(FastqReadInputStream.java:93)
    at stream.ConcurrentGenericReadInputStream$ReadThread.readLists(ConcurrentGenericReadInputStream.java:680)
    at stream.ConcurrentGenericReadInputStream$ReadThread.run(ConcurrentGenericReadInputStream.java:656)

But this error points back to tossbrokenreads again, so I seem to be going in circles.

Has anyone experienced this? Please advise. Thanks.

repair bbtools reformat fastq • 138 views
modified 22 days ago by Istvan Albert ♦♦ 86k • written 23 days ago by Peter Chung120
Istvan Albert ♦♦ 86k wrote:

You have several different errors. Note that the first error says:

  • There appear to be different numbers of reads in the paired input files.

whereas the second error is:

  • Mismatch between length of bases and qualities for read 107893745

Your data seem to have multiple, overlapping problems.

In addition, when you ran repair.sh you did not toss the broken reads.
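
Both logs also contain the line "pigz: abort: read error on pair_1.fq.gz (Input/output error)", so it may be worth checking first whether each compressed file can be read to the end. A minimal sketch, assuming gzip is installed; check_gz is a hypothetical helper name, not part of BBTools:

```shell
# check_gz FILE: print OK if FILE decompresses cleanly from start to end,
# FAILED otherwise (truncated file, corrupt data, or a read error).
# Hypothetical helper for illustration only.
check_gz() {
    if gzip -t "$1" 2>/dev/null; then
        echo "OK"
    else
        echo "FAILED"
    fi
}

# Usage with the file names from the question:
#   check_gz pair_1.fq.gz
#   check_gz pair_2.fq.gz
```

A FAILED result for pair_1.fq.gz would be consistent with both errors above: a file that cannot be read to the end yields fewer reads than its mate file and may end in a cut-off record. "Input/output error" is the operating system's message for a failed read, so a storage or network-filesystem problem is also a possibility, not only a damaged file.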

written 22 days ago by Istvan Albert ♦♦ 86k

I second this. repair.sh should be able to take care of the problem. Can you try the following?

repair.sh in1=pair_1.fq.gz in2=pair_2.fq.gz out1=fixed_1.fq.gz out2=fixed_2.fq.gz outs=singletons.fq repair tossbrokenreads=t

In the "repaired" broken reads, the Q scores will be replaced with ?:

@NB511934:132:HPTKHPGX2:1:11101:1446:1079
CGAGCNCGTAAGGATTTTTCAGTG
+
?????!??????????????????
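
To see how many records are affected before and after the repair, the lengths of the base and quality lines can be compared directly. A rough sketch, assuming gzip and awk are available and that every record occupies exactly four lines (no wrapped sequences); count_mismatched is a hypothetical helper name:

```shell
# count_mismatched FILE.gz: print the number of 4-line FASTQ records whose
# sequence line and quality line have different lengths.
# Hypothetical helper; assumes strict 4-line records.
count_mismatched() {
    # "|| true" keeps a decompression failure from aborting the pipeline,
    # so whatever could be read is still counted.
    { gzip -dc "$1" 2>/dev/null || true; } |
        awk 'NR % 4 == 2 { seqlen = length($0) }
             NR % 4 == 0 && length($0) != seqlen { bad++ }
             END { print bad + 0 }'
}

# Usage with a file name from the thread:
#   count_mismatched fixed_1.fq.gz
```

If decompression stops early, the count only covers the part of the file that could be read.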
written 22 days ago by GenoMax95k

Thanks for the reply.

I think it is a download error, but it is unlikely that I can re-download the data. I can repair the read 2 FASTQ file successfully when it is treated as single-end, so I think the error comes from the read 1 FASTQ file.

Set INTERLEAVED to false
Started output stream.
pigz: abort: read error on pair_1.fq.gz (Input/output error)
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Arrays.java:3520)
    at shared.KillSwitch.copyOfRange(KillSwitch.java:377)
    at fileIO.ByteFile1.nextLine(ByteFile1.java:174)
    at fileIO.ByteFile2$BF1Thread.run(ByteFile2.java:274)
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Arrays.java:3664)
    at java.lang.String.<init>(String.java:207)
    at java.lang.String.substring(String.java:1969)
    at java.lang.String.subSequence(String.java:2003)
    at java.util.regex.Pattern.split(Pattern.java:1216)
    at java.lang.String.split(String.java:2380)
    at java.lang.String.split(String.java:2422)
    at jgi.SplitPairsAndSingles.repair(SplitPairsAndSingles.java:693)
    at jgi.SplitPairsAndSingles.process3_repair(SplitPairsAndSingles.java:518)
    at jgi.SplitPairsAndSingles.process2(SplitPairsAndSingles.java:284)
    at jgi.SplitPairsAndSingles.process(SplitPairsAndSingles.java:220)
    at jgi.SplitPairsAndSingles.main(SplitPairsAndSingles.java:37)

This program ran out of memory. Try increasing the -Xmx flag and using tool-specific memory-related parameters.

I already tried -Xmx32G but still get the error.

Is there anything I can do, and which step should I do first? Thanks.
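
If the data cannot be downloaded again, one possible first step is to salvage whatever can still be decompressed from pair_1.fq.gz, keep only complete and self-consistent records, and then re-pair against read 2. This is a sketch under assumptions, not a tested recovery procedure: it assumes gzip and awk are available, strict four-line records, and that gzip writes out the data it decoded before the failure point. salvage_fastq and the output file name are hypothetical:

```shell
# salvage_fastq IN.gz OUT.fq: write every record that can still be
# decompressed from IN.gz and that is complete (4 lines) with equal-length
# sequence and quality lines. A cut-off final record is dropped.
# Hypothetical helper for illustration only.
salvage_fastq() {
    # "|| true" lets the pipeline continue when gzip stops at the damage.
    { gzip -dc "$1" 2>/dev/null || true; } |
        awk '{ rec[NR % 4] = $0 }
             NR % 4 == 0 && length(rec[2]) == length(rec[0]) {
                 print rec[1]; print rec[2]; print rec[3]; print rec[0]
             }' > "$2"
}

# Usage with the file name from the question (output name is an assumption):
#   salvage_fastq pair_1.fq.gz salvaged_1.fq
```

The salvaged file could then be passed to repair.sh as in1 in place of pair_1.fq.gz. Reads after the failure point are lost, so their mates from read 2 would end up in the singletons file.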

written 22 days ago by Peter Chung120