bbduk truncated fastq file according to FastQC - help?
12 months ago
hannahf • 0

I was able to successfully remove adapters from my 2 x 150 bp PE reads using bbduk.sh, but I kept seeing a long string of Gs in my R2 sequences (~0.1% of my R2 reads).

I reran bbduk on the original fastq files to remove the string of Gs as well, and this worked for all of my PE files except one pair.

When I ran this trimmed set of PE reads through FastQC (after bbduk), I received an error that said:

Failed to process file EA_Pool-POW_1-1a_S28_L001_R1_CLEANEST.fastq
uk.ac.babraham.FastQC.Sequence.SequenceFormatException: Ran out of data in the middle of a fastq entry.  Your file is probably truncated
at uk.ac.babraham.FastQC.Sequence.FastQFile.next(FastQFile.java:125)
at uk.ac.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:77)


Has anyone encountered this issue before with output fastqs from bbduk?

Should I just not worry about the ~0.1% of GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG kmers in my R2 reads?

Thank you!

bbduk metagenome PE fastqc • 493 views
12 months ago

I'd start by rerunning the trimming for that pair; the run may simply have stopped before it finished writing the output, which would leave a truncated fastq.
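
If you want to confirm the truncation before and after rerunning, here is a minimal Python sketch that walks a fastq file four lines at a time and reports the first incomplete or malformed record. The function name `check_fastq` is made up for illustration, and it assumes plain 4-line records (no wrapped sequences), optionally gzipped.

```python
import gzip
import sys


def check_fastq(path):
    """Return (complete_records, problem); problem is None if the file looks intact.

    Assumes 4-line FASTQ records (no line-wrapped sequences).
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    records = 0
    try:
        with opener(path, "rt") as handle:
            while True:
                header = handle.readline()
                if not header:
                    return records, None  # clean end of file
                seq = handle.readline()
                plus = handle.readline()
                qual = handle.readline()
                if not qual:
                    return records, f"incomplete record after {records} complete reads"
                if not header.startswith("@") or not plus.startswith("+"):
                    return records, f"malformed record {records + 1}"
                if len(seq.rstrip("\r\n")) != len(qual.rstrip("\r\n")):
                    return records, f"sequence/quality length mismatch in record {records + 1}"
                records += 1
    except EOFError:
        # Raised by gzip when the compressed stream itself is cut short.
        return records, "compressed stream ended unexpectedly"


if __name__ == "__main__":
    # Usage: python check_fastq.py R1.fastq R2.fastq
    for fastq_path in sys.argv[1:]:
        count, problem = check_fastq(fastq_path)
        status = "OK" if problem is None else f"PROBLEM: {problem}"
        print(f"{fastq_path}\t{count} reads\t{status}")
```

Running it on both files of the pair also shows whether R1 and R2 still have the same number of reads, which they should after paired trimming.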

Honestly, a string of G's isn't going to align to anything, so it's not going to do anything bad if you leave them in.
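
If you'd like to put a number on it for each file, a rough Python sketch like the one below counts the fraction of reads that contain a long run of Gs. The helper names and the 20-base threshold are arbitrary choices for illustration, not anything bbduk or FastQC uses, and it again assumes 4-line fastq records.

```python
import gzip


def read_sequences(path):
    """Yield the sequence line of each record in a 4-line FASTQ file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for index, line in enumerate(handle):
            if index % 4 == 1:  # second line of every record is the sequence
                yield line.strip()


def polyg_fraction(sequences, min_run=20):
    """Fraction of sequences containing at least min_run consecutive Gs.

    min_run is an arbitrary illustrative threshold.
    """
    run = "G" * min_run
    total = 0
    hits = 0
    for seq in sequences:
        total += 1
        if run in seq.upper():
            hits += 1
    return hits / total if total else 0.0
```

For example, `polyg_fraction(read_sequences("sample_R2.fastq"))` (with a placeholder file name) gives the poly-G fraction for one R2 file, which you can compare against the ~0.1% you saw.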


That was my thought, and it's such a low % of the R2 reads. Thanks for the input!


If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one answer if they work. This will help future users who come across this post find the right answer.