Add words at beginning and end of the same line for the FASTA header line with sed
19 months ago

I have the following FASTA record:

>A_1000
ACTTTCGATCTCTTGTAGATCTGTTCTC...CAC
ACTTTCGATCTCTTGTAGATCTGTTCTC...CAC


I would like to convert the first line as follows:

>INITWORD/A_1000/FINALWORD
ACTTTCGATCTCTTGTAGATCTGTTCTC...CAC
ACTTTCGATCTCTTGTAGATCTGTTCTC...CAC


I found a similar question whose answer adds words to the beginning and end of the header, which is what I need (https://stackoverflow.com/questions/68541730/add-words-at-beginning-and-end-of-a-fasta-header-line-with-sed). However, on my file it puts the FINALWORD on the next line.

I ran the following:

 sed -E 's%^>(.*)%>Initialword/\1/Finalword%' fasta_test.fasta > fasta_test2.fasta


Which returns:

>Initialword/A_0101M/Finalword
ACTTTCGATCTCTTGTAGATCTGTTCTC...CACM
ACTTTCGATCTCTTGTAGATCTGTTCTC...CACM


But in the Fasta file it looks like:

>Initialword/A_0101
/Finalword
ACTTTCGATCTCTTGTAGATCTGTTCTC...CAC


How can I fix this to just add the text to the beginning and end of the header? What is the M at the end of each line in the file?

Thank you


kchambers58178: I formatted the first two blocks, but I am not sure exactly what you want for the last two. Please use the 101010 button in edit mode to apply code formatting to those two so they show up the way you want them to.


I believe I fixed it. Thank you.

19 months ago

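This was resolved by converting the line endings before running sed. The M at the end of each line is most likely a carriage return (often displayed as ^M) left by Windows-style CRLF line endings; because it sits at the end of the header, anything appended after it appears to land on the next line. The following command fixed it: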

dos2unix <input.fasta | sed -E 's%^>(.*)%>Initialword/\1/Finalword%' >output.fasta
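
If dos2unix is not installed, the carriage returns can probably be removed with tr instead. This is only a sketch: the file names crlf_sample.fasta and fixed_sample.fasta and the shortened sequence are made up for illustration, and it assumes the file contains no carriage returns that need to be kept.

```shell
# Build a small CRLF-terminated sample (hypothetical file name and sequence).
printf '>A_1000\r\nACTTTCGATC\r\nACTTTCGATC\r\n' > crlf_sample.fasta

# Inspect the raw bytes: od -c prints each carriage return as \r.
od -c crlf_sample.fasta | head -n 2

# Delete every carriage return, then wrap only the header lines.
tr -d '\r' < crlf_sample.fasta \
  | sed -E 's%^>(.*)%>Initialword/\1/Finalword%' > fixed_sample.fasta

cat fixed_sample.fasta
```

Checking the input with od -c (or cat -v) first is a quick way to confirm that line endings are the cause before changing the sed expression.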


alternative (after dos2unix); the braces restrict both substitutions to header lines:

$ sed -E '/^>/ { s_^>_>initial/_; s_$_/final_; }' test.txt

>initial/A_1000/final
ACTTTCGATCTCTTGTAGATCTGTTCTC...CAC
ACTTTCGATCTCTTGTAGATCTGTTCTC...CAC
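
A note on addressing: in sed, an address such as /^>/ applies only to the single command that follows it unless the commands are grouped in braces. The sketch below compares the two forms on a made-up two-line record (the sequence ACGT is a placeholder, not data from the question).

```shell
# Hypothetical two-line sample: one header, one sequence line.
sample=$(printf '>A_1000\nACGT\n')

# Without braces, the address guards only the first s command,
# so the second one appends /final to every line.
ungrouped=$(printf '%s\n' "$sample" | sed -E '/^>/ s_^>_>initial/_;s_$_/final_')

# With braces, both s commands run on header lines only.
grouped=$(printf '%s\n' "$sample" | sed -E '/^>/ { s_^>_>initial/_; s_$_/final_; }')

printf 'ungrouped:\n%s\n\ngrouped:\n%s\n' "$ungrouped" "$grouped"
```

The grouped form leaves the sequence lines untouched, which is what a FASTA header rewrite needs.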