Hi, please can anyone help me with this? I have a multi-FASTA file that I want to turn into a BLAST database, but the header of each sequence is not quite in the correct format. The file contains several hundred thousand sequences, so I really don't want to start again.
The correct format should be:
>unique-id|my sequence name|etc|etc
I currently have:
>notunique___|unique-id|my sequence name|etc|etc
I'm pretty sure this should be doable with the sed command, but I have no clue how to do it myself. I want to either just delete 'notunique___|' or replace '>notunique___|' with a new '>'.
The 'notunique' part is a mix of letters and numbers, and it is not always the same number of characters long.
Any help would be much appreciated.
Thank you, James
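
One possible approach, offered as a sketch rather than something tested on the real data: because the unwanted part always sits between the '>' and the first '|', sed can delete everything up to and including that first '|' and put the '>' back, whatever the length of the prefix. This assumes the unwanted prefix itself never contains a '|' and that every header carries the prefix. The file names example.fasta and example_fixed.fasta are made up for illustration.

```shell
# Build a tiny example input (hypothetical data) so the command can be tried safely.
printf '%s\n' \
  '>ab12___|seq001|my sequence name|etc|etc' \
  'ACGTACGT' \
  '>x9y8z7w___|seq002|another name|etc|etc' \
  'TTGACC' > example.fasta

# On lines starting with '>', replace '>' plus everything up to and including
# the first '|' with a bare '>'. Sequence lines do not start with '>' and are
# left untouched. Output goes to a new file; the original is not modified.
sed 's/^>[^|]*|/>/' example.fasta > example_fixed.fasta

cat example_fixed.fasta

# Optional sanity check: list any IDs (first field of each header) that occur
# more than once. No output here means the new IDs are unique.
grep '^>' example_fixed.fasta | cut -d'|' -f1 | sort | uniq -d
```

Writing to a new file rather than editing in place keeps the original available for comparison. A header that contains no '|' at all would pass through unchanged, so it may be worth checking a few headers in the output before building the database.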