Change header fastq files
3.8 years ago

Hi all,

My original fastq files have the following header format:

@000e86e5-6b4a-4e28-9e4d-67c4e41987f5 runid=99d90dd5f9459ef0df31d28d69ecc5852c85f135 read=1061 ch=119 start_time=2019-08-17T03:58:38Z flow_cell_id=ABB818 protocol_group_id=xxx_B2_D2_H12_13112019 sample_id=xxx_B2_D2_H12_13112019
GTTGTGCTGTTCGGTTTCGTTTGATTGTTTGCTCGGTTGCTCGCCCTAAGCGACAAGAAAGTTGTCGGTGTCTTGTGCGTTTTTCGTGCGCCAACCTGTGTTGCTTCAAGCTGG
+ 
)0/%$$#'&320')))(&%#$$%#$%'#$%$#&,$&%$$#(/-..)'%&%&'&$$$(1//56<868'0.-,-&&$('04BBC;@7.<1*)(&$%$$(%&%&'$"#%#$%*++'&$$#(

After running minimap2, to remove human reads, they look like this:

@000e86e5-6b4a-4e28-9e4d-67c4e41987f5
GTTGTGCTGTTCGGTTTCGTTTGATTGTTTGCTCGGTTGCTCGCCCTAAGCGACAAGAAAGTTGTCGGTGTCTTGTGCGTTTTTCGTGCGCCAACCTGTGTTGCTTCAAGCTGG
+
)0/%$$#'&320')))(&%#$$%#$%'#$%$#&,$&%$$#(/-..)'%&%&'&$$$(1//56<868'0.-,-&&$('04BBC;@7.<1*)(&$%$$(%&%&'$"#%#$%*++'&$$#(

The header changed and now contains only the read ID. I need to replace it with the original one, so that I end up with fastq files without human reads but with the full header. How can I do that? Are there any scripts or tools for this?

Thanks

sequence Assembly sequencing • 1.3k views

Most aligners drop everything in a fastq header after the first whitespace, and that looks like what is happening here. You can replace the spaces in your headers with underscores before running minimap2, which should keep the full header. Something like sed 's/ /_/g' your.fastq > new.fastq should do that.
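
If you would rather touch only the header lines and leave the sequence, separator and quality lines exactly as they are, here is a minimal Python sketch of the same idea. It assumes plain, uncompressed fastq with exactly four lines per record and no blank lines in between; the function names underscore_headers and convert_file are just placeholders I made up.

```python
def underscore_headers(lines):
    """Yield FASTQ lines, replacing spaces and tabs with '_' on header lines only.

    Assumes strict 4-line records: header, sequence, '+', quality.
    """
    for i, line in enumerate(lines):
        if i % 4 == 0:  # first line of each 4-line record is the header
            body = line.rstrip("\n")
            line = body.replace(" ", "_").replace("\t", "_") + "\n"
        yield line


def convert_file(src_path, dst_path):
    """Write a copy of src_path to dst_path with whitespace-free headers."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(underscore_headers(src))
```

Calling convert_file("your.fastq", "new.fastq") and then running minimap2 on new.fastq should carry the whole header through as the read name.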


How about removing all those spaces in the fastq headers with tr before running minimap2?
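
If rerunning minimap2 is not an option, the original headers could also be put back afterwards by looking up each read ID in the original file. A rough Python sketch, assuming both files are uncompressed fastq with exactly four lines per record and that all original headers fit in memory (load_headers and restore_headers are hypothetical names, not from any existing tool):

```python
def load_headers(original_lines):
    """Map read ID -> full header line, read from the original FASTQ."""
    headers = {}
    for i, line in enumerate(original_lines):
        if i % 4 == 0:  # header line of each 4-line record
            header = line.rstrip("\n")
            # Read ID = text after '@' up to the first whitespace
            read_id = header[1:].split()[0]
            headers[read_id] = header
    return headers


def restore_headers(filtered_lines, headers):
    """Yield filtered FASTQ lines with each header swapped for the original one.

    Headers whose read ID is not found in the mapping are left unchanged.
    """
    for i, line in enumerate(filtered_lines):
        if i % 4 == 0:
            current = line.rstrip("\n")
            read_id = current[1:].split()[0]
            line = headers.get(read_id, current) + "\n"
        yield line
```

You would open the original and the filtered file, build the mapping with load_headers on the first, and write the output of restore_headers on the second to a new file.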
