I have been trying to recover corrupted FASTQ files. Decompression failed with this error:
invalid compressed data--crc error.
I got around the CRC error by using gzrecover, and then used seqkit sana to fix sequence inconsistencies. Now the issue is that when I run FastQC, it complains that some sequences lack a "+" line under the sequence. I thought about using sed, but I am not sure how to add the missing "+" where it belongs.
Any help will be appreciated.
I ran ValidateFastq and found an issue:
INFO [2021-05-21 16:13:40,878] [ValidateFastq$$anonfun$main$1] - 107300000 reads processed
Exception in thread "main" htsjdk.samtools.SAMException: Quality header must start with +: GCCCTGAAAAACAACAGTAATGATATTGTAAATGCTATTATGGAATTAACAATGTAACTATTTGACAGCGAAGACAACTCCCCCTTTCCCC at line 429343625 in fastq /Volumes/Aura/rec.test.fastq
I should be able to add "+" right below this line by