3.8 years ago
Hi all,
My original fastq files have the following header format:
@000e86e5-6b4a-4e28-9e4d-67c4e41987f5 runid=99d90dd5f9459ef0df31d28d69ecc5852c85f135 read=1061 ch=119 start_time=2019-08-17T03:58:38Z flow_cell_id=ABB818 protocol_group_id=xxx_B2_D2_H12_13112019 sample_id=xxx_B2_D2_H12_13112019
GTTGTGCTGTTCGGTTTCGTTTGATTGTTTGCTCGGTTGCTCGCCCTAAGCGACAAGAAAGTTGTCGGTGTCTTGTGCGTTTTTCGTGCGCCAACCTGTGTTGCTTCAAGCTGG
+
)0/%$$#'&320')))(&%#$$%#$%'#$%$#&,$&%$$#(/-..)'%&%&'&$$$(1//56<868'0.-,-&&$('04BBC;@7.<1*)(&$%$$(%&%&'$"#%#$%*++'&$$#(
After running minimap2, to remove human reads, they look like this:
@000e86e5-6b4a-4e28-9e4d-67c4e41987f5
GTTGTGCTGTTCGGTTTCGTTTGATTGTTTGCTCGGTTGCTCGCCCTAAGCGACAAGAAAGTTGTCGGTGTCTTGTGCGTTTTTCGTGCGCCAACCTGTGTTGCTTCAAGCTGG
+
)0/%$$#'&320')))(&%#$$%#$%'#$%$#&,$&%$$#(/-..)'%&%&'&$$$(1//56<868'0.-,-&&$('04BBC;@7.<1*)(&$%$$(%&%&'$"#%#$%*++'&$$#(
The header changed and now contains only the read ID. I need to replace it with the original header, so that I end up with fastq files that have no human reads but keep the full header. How can I replace it? Are there any scripts or tools for this?
Thanks
Most aligners drop everything in a fastq header after the first whitespace, and that looks like what is happening here. You can replace the spaces in your headers with an underscore (_) before aligning, which should keep the full header. Something like this should do it:
sed 's/ /_/g' your.fastq > new.fastq

How about removing all those spaces in the fastq header with tr before using minimap2?
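If the alignment has already been run and only the filtered fastq is at hand, another option is to look up each read ID in the original file and put the full header back. Below is a minimal Python sketch of that idea, not a tested tool. It assumes strict 4-line fastq records (no wrapped sequence lines), uncompressed input, and that all original headers fit in memory. The function names load_headers and restore_headers are made up for this example.

```python
import sys


def load_headers(original_path):
    """Map read ID -> full header line from the original fastq.

    Assumes strict 4-line records, so every 4th line (0-based) is a header.
    """
    headers = {}
    with open(original_path) as fh:
        for i, line in enumerate(fh):
            if i % 4 == 0:
                header = line.rstrip("\n")
                # The read ID is the text after '@' up to the first whitespace.
                read_id = header[1:].split(None, 1)[0]
                headers[read_id] = header
    return headers


def restore_headers(filtered_path, headers, out):
    """Write the filtered fastq to `out`, swapping in the original headers."""
    with open(filtered_path) as fh:
        for i, line in enumerate(fh):
            if i % 4 == 0:
                read_id = line[1:].split(None, 1)[0]
                # Keep the short header unchanged if the ID is not found.
                line = headers.get(read_id, line.rstrip("\n")) + "\n"
            out.write(line)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name):
    #   python restore_headers.py original.fastq filtered.fastq > restored.fastq
    restore_headers(sys.argv[2], load_headers(sys.argv[1]), sys.stdout)
```

Headers are picked by line position rather than by a leading @, because quality strings can also start with @.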