Remove text and add a string in odd rows in a fasta file using awk
12 days ago
Mgs3

I have a file organized as such:

>Prevalence_Sequence_ID:13|ARO_Name:AxyX|ARO:3004143|Detection_Model:Protein Homolog Model
ATGAAGCAAAGAGTCCCTCTACGCACGTTCGTCCTATCTGCCGTATTAATTCTTATTACTGGTTGCTCGAAACCGGAAACCCAACCAGCCG
ATGAATATCTCGAAATTCTTCATCGACCGGCCGATCTTCGCCGGCGTGCTTTCGATCCTGGTGTTGCTGGCGGGCATACTGGCCATGTTCC


For every odd row, I need to keep only the first and third columns and append the text "|kraken:taxid|32630" at the end. Example below:

>Prevalence_Sequence_ID:13|ARO:3004143|kraken:taxid|32630
ATGAAGCAAAGAGTCCCTCTACGCACGTTCGTCCTATCTGCCGTATTAATTCTTATTACTGGTTGCTCGAAACCGGAAACCCAACCAGCCG
>Prevalence_Sequence_ID:14|ARO:3004143|kraken:taxid|32630
ATGAATATCTCGAAATTCTTCATCGACCGGCCGATCTTCGCCGGCGTGCTTTCGATCCTGGTGTTGCTGGCGGGCATACTGGCCATGTTCC


Is there a simple awk script that I can use? Alternatively, I could keep only the first column if that's easier.

$ awk -F '|' '/^>/ {print fields, "text";next}1' input.fa

12 days ago
Sam

I guess something like

awk -F "|" '{if($1 ~/^>/) {print $1"|"$3"|kraken:taxid|32630"}else{print $0}}' test


This should work (assuming you have a FASTA file and you want to change every header line, not just the odd rows).
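
For reference, here is a minimal sketch of the approach above run on a tiny test file. The second header (with ARO_Name:example) and the shortened sequences are made up for illustration, and reheader is just a hypothetical wrapper name, not something from the question.

```shell
# Build a two-record sample file. The ID:14 header is hypothetical:
# its ARO_Name is invented and the sequences are truncated.
tmp=$(mktemp -d)
cat > "$tmp/input.fa" <<'EOF'
>Prevalence_Sequence_ID:13|ARO_Name:AxyX|ARO:3004143|Detection_Model:Protein Homolog Model
ATGAAGCAAAGAGTCC
>Prevalence_Sequence_ID:14|ARO_Name:example|ARO:3004143|Detection_Model:Protein Homolog Model
ATGAATATCTCGAAAT
EOF

# Header lines (starting with ">") keep fields 1 and 3 and get the taxid tag;
# the trailing "1" is an always-true pattern that prints all other lines unchanged.
reheader() {
  awk -F '|' '/^>/ {print $1 "|" $3 "|kraken:taxid|32630"; next} 1' "$1"
}

reheader "$tmp/input.fa"
```

Matching on /^>/ rather than on the row number (NR % 2) means the command still does the right thing if a sequence is wrapped over several lines.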

12 days ago

Is there a simple awk script that I can use?

Yes: awk -F '|' '/^>/ {printf(TODO);next;} {print;}'

I leave the TODO part as an exercise.
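
In case it helps later readers, one possible way to fill in the TODO is sketched below, together with the "first column only" variant mentioned in the question. The function names keep_two and keep_one and the tag variable are hypothetical, and the test record is shortened.

```shell
# One-record test file (sequence truncated for the example).
tmp=$(mktemp -d)
printf '%s\n' \
  '>Prevalence_Sequence_ID:13|ARO_Name:AxyX|ARO:3004143|Detection_Model:Protein Homolog Model' \
  'ATGAAGCAAAGAGTCC' > "$tmp/input.fa"

# Keep fields 1 and 3. Unlike print, printf needs an explicit "\n".
# The tag is passed in with -v so it is easy to change.
keep_two() {
  awk -F '|' -v tag='kraken:taxid|32630' \
    '/^>/ {printf("%s|%s|%s\n", $1, $3, tag); next} {print}' "$1"
}

# Variant: keep only field 1.
keep_one() {
  awk -F '|' -v tag='kraken:taxid|32630' \
    '/^>/ {printf("%s|%s\n", $1, tag); next} {print}' "$1"
}

keep_two "$tmp/input.fa"
keep_one "$tmp/input.fa"
```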