FASTQ header editing
7.2 years ago

Hi everyone,

This is my first post here; sorry if this question is out of place, but I am really new to bioinformatics and scripting. I am working on de novo genome assemblies of some marine invertebrates, and as part of my pipeline I have error-corrected some FASTQ files using Rcorrector. According to its documentation, this software adds the following information to the FASTQ headers:

In the header line for each read, Rcorrector will append some information.

"cor": some bases of the sequence are corrected
"unfixable_error": the errors could not be corrected
"l:INT m:INT h:INT": the lowest, median and highest kmer count of the kmers from the read

So, I have some FASTQ files with the following headers:

@HWI-ST169:272:C0RCGACXX:1:1306:13471:25027 1:N:0: l:185516 m:185516 h:185516 unfixable_error
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
+
?@@FF<D@6DFDDIGFHFB@B?@BBB6BBBBDBDDD

@HWI-ST169:272:C0RCGACXX:1:2107:18438:124552 1:N:0: l:185516 m:185516 h:185516 unfixable_error
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
+
=@;::)<AFD)0?AFFFDB6;?B637:BBBB6BBBB

@HWI-ST169:272:C0RCGACXX:1:1204:15681:165032 1:N:0: l:185516 m:185516 h:185516 unfixable_error
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

This is just a sample of the headers; other reads carry the other tags mentioned above. How can I remove the additional information appended by Rcorrector, for example: l:185516 m:185516 h:185516 unfixable_error

I want to use Meraculous for de novo genome assembly, but I am getting an error saying that the FASTQ header is not valid. I guess this error is related to the additional information in the FASTQ headers.

Hope someone here can help me.

Thanks in advance,

Felipe

7.2 years ago
cut -d ' ' -f1,2 in.fastq > out.fastq
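
A header-only variant of the command above, as a minimal sketch. The cut command works because, in standard FASTQ, the sequence, "+" and quality lines contain no spaces, so cut passes them through unchanged. The awk version below instead rewrites only the first line of each record. It assumes strict four-line FASTQ records with no blank lines between them; the function name strip_rcorrector_tags and the file names are placeholders chosen for illustration, not part of Rcorrector or Meraculous.

```shell
# Hypothetical helper: keep only the first two whitespace-separated fields
# of each FASTQ header (lines 1, 5, 9, ...); all other lines pass through.
# Assumes strict 4-line records with no blank lines in between.
strip_rcorrector_tags() {
    awk 'NR % 4 == 1 { print (NF >= 2 ? $1 " " $2 : $1); next } { print }' "$@"
}

# Usage (file names are placeholders):
#   strip_rcorrector_tags in.fastq > out.fastq
```

If the files contain wrapped sequences or blank lines between records, the line-number test no longer identifies headers and the cut command is the safer choice.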

Thanks Pierre, your command works very smoothly. Now I can check whether Meraculous accepts the new FASTQ headers in my files. Great!

You can now accept my answer (green mark on the left) to close this question.
