Question: Change header fastq files
4 weeks ago, riccardo.scotti82 wrote:

Hi all,

My original fastq files have the following header format:

@000e86e5-6b4a-4e28-9e4d-67c4e41987f5 runid=99d90dd5f9459ef0df31d28d69ecc5852c85f135 read=1061 ch=119 start_time=2019-08-17T03:58:38Z flow_cell_id=ABB818 protocol_group_id=xxx_B2_D2_H12_13112019 sample_id=xxx_B2_D2_H12_13112019
GTTGTGCTGTTCGGTTTCGTTTGATTGTTTGCTCGGTTGCTCGCCCTAAGCGACAAGAAAGTTGTCGGTGTCTTGTGCGTTTTTCGTGCGCCAACCTGTGTTGCTTCAAGCTGG
+ 
)0/%$$#'&320')))(&%#$$%#$%'#$%$#&,$&%$$#(/-..)'%&%&'&$$$(1//56<868'0.-,-&&$('04BBC;@7.<1*)(&$%$$(%&%&'$"#%#$%*++'&$$#(

After running minimap2, to remove human reads, they look like this:

@000e86e5-6b4a-4e28-9e4d-67c4e41987f5
GTTGTGCTGTTCGGTTTCGTTTGATTGTTTGCTCGGTTGCTCGCCCTAAGCGACAAGAAAGTTGTCGGTGTCTTGTGCGTTTTTCGTGCGCCAACCTGTGTTGCTTCAAGCTGG
+
)0/%$$#'&320')))(&%#$$%#$%'#$%$#&,$&%$$#(/-..)'%&%&'&$$$(1//56<868'0.-,-&&$('04BBC;@7.<1*)(&$%$$(%&%&'$"#%#$%*++'&$$#(

The header changed and now contains only the read ID. I need to replace it with the original one, so that I end up with fastq files without human reads but with the full header. How can I replace it? Is there any script or tool for this?

Thanks

sequencing sequence assembly

Most aligners drop everything in a fastq header after the first whitespace, and that looks like what is happening here. If you replace the spaces in your headers with an _ before mapping, the full header should be kept. Something like sed 's/ /_/g' your.fastq > new.fastq should do that.

written 4 weeks ago by genomax
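
For readers who prefer a script, the same idea can be sketched in Python. This is only an illustration, not something posted in the thread: the function name underscore_headers is made up here, and it assumes plain 4-line fastq records (no wrapped sequence lines). Unlike a global substitution, it touches only the header line of each record.

```python
import io


def underscore_headers(src, dst, sep="_"):
    """Copy FASTQ text from src to dst, joining header fields with sep.

    Assumes every record is exactly 4 lines (header, sequence, '+', quality).
    """
    for i, line in enumerate(src):
        if i % 4 == 0:  # first line of each record is the header
            line = sep.join(line.split()) + "\n"
        dst.write(line)


# Small in-memory demonstration; real use would pass open file handles.
demo_in = io.StringIO("@id1 runid=abc read=1\nACGT\n+\n!!!!\n")
demo_out = io.StringIO()
underscore_headers(demo_in, demo_out)
print(demo_out.getvalue().splitlines()[0])  # @id1_runid=abc_read=1
```

Note that the example header in the question already contains underscores (in protocol_group_id and sample_id), so turning underscores back into spaces after mapping would not be reversible. If the original spacing has to be recovered later, a separator that never occurs in the headers may be a safer choice for sep.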

How about removing all those spaces in the fastq header with tr before running minimap2?

written 4 weeks ago by Pierre Lindenbaum
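
Both suggestions above change the headers before mapping. If the filtered files already exist, another option is to look up each read ID in the original fastq and write the full header back. The following is a minimal sketch of that lookup, not a tested tool: the function names are hypothetical, it assumes 4-line fastq records, and it keeps all original headers in memory, which may be a concern for very large runs.

```python
import io


def read_id(header):
    """Return the header text up to the first whitespace (e.g. '@000e86e5-...')."""
    parts = header.split(None, 1)
    return parts[0] if parts else ""


def load_headers(original):
    """Map read ID -> full header line, read from the original FASTQ."""
    headers = {}
    for i, line in enumerate(original):
        if i % 4 == 0:  # header line of a 4-line record
            headers[read_id(line)] = line.rstrip("\n")
    return headers


def restore_headers(filtered, headers, out):
    """Copy the filtered FASTQ to out, swapping in the full header when known."""
    for i, line in enumerate(filtered):
        if i % 4 == 0:
            # Fall back to the short header if the ID is not in the original.
            line = headers.get(read_id(line), line.rstrip("\n")) + "\n"
        out.write(line)


# In-memory demonstration; real use would open the two FASTQ files instead.
original = io.StringIO(
    "@r1 runid=x ch=1\nACGT\n+\n!!!!\n"
    "@r2 runid=x ch=2\nGGCC\n+\n####\n"
)
filtered = io.StringIO("@r2\nGGCC\n+\n####\n")
restored = io.StringIO()
restore_headers(filtered, load_headers(original), restored)
print(restored.getvalue().splitlines()[0])  # @r2 runid=x ch=2
```

The sequence and quality lines are copied through unchanged, so only the header line of each surviving read is affected.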