Hey all!
I have what should be a very easy task to do with my sequences, but I don't know enough about handling FASTA files with bash commands to deal with it :( I'd truly appreciate your help. So, to the point, this is the issue:
I have 2 multi-FASTA files in which the headers look like >SEQID_SPECIES. I want to merge both files into a new multi-FASTA file in which the sequences are concatenated according to species. For example:
Inputs:
>SEQ1_ECOLI
ACGT
>SEQ2_ECOLI
AAAA
Output:
>NEWSEQ_ECOLI
ACGTAAAA
Any ideas on how to do this?
If it is easier, I could also use the sequences concatenated just by order, i.e. the first sequence of file 1 concatenated with the first one of file 2, and so forth. Also, I care neither about the headers of the merged sequences nor about sequences with no match.
Thanks in advance!
This has been asked several times before, e.g.:
How to concatenate two fasta files, using regular expressions
Concatenate Two .Fasta Files Into One
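For what it's worth, here is a minimal Python sketch of both variants described in the question, in case a short script is acceptable instead of pure bash. It is only a sketch under a few assumptions: the species tag is taken to be everything after the last underscore in the header, each species is assumed to occur at most once per file (if it repeats in file 2, the last record wins), and the function names (`read_fasta`, `species_of`, `merge_by_species`, `merge_by_order`) and the `NEWSEQ_` prefix are made up for illustration. Sequences with no match are simply dropped, as requested.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        yield header, "".join(chunks)


def species_of(header):
    """Assumption: the species tag is everything after the last underscore."""
    return header.rsplit("_", 1)[-1]


def merge_by_species(handle1, handle2):
    """Concatenate sequences sharing a species tag; unmatched ones are dropped."""
    # Assumption: one record per species in file 2 (the last one wins otherwise).
    second = {species_of(h): seq for h, seq in read_fasta(handle2)}
    for header, seq in read_fasta(handle1):
        species = species_of(header)
        if species in second:
            yield f"NEWSEQ_{species}", seq + second[species]


def merge_by_order(handle1, handle2):
    """Fallback: pair the i-th record of file 1 with the i-th record of file 2."""
    pairs = zip(read_fasta(handle1), read_fasta(handle2))
    for i, ((_, seq1), (_, seq2)) in enumerate(pairs, start=1):
        yield f"NEWSEQ_{i}", seq1 + seq2


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the two input files.
    file1 = io.StringIO(">SEQ1_ECOLI\nACGT\n")
    file2 = io.StringIO(">SEQ2_ECOLI\nAAAA\n")
    for header, seq in merge_by_species(file1, file2):
        print(f">{header}\n{seq}")
```

With real data, replace the `io.StringIO` objects with `open("file1.fasta")` and `open("file2.fasta")` (hypothetical file names) and redirect the printed output to a new file. Note that `merge_by_order` stops at the end of the shorter file, so any extra records in the longer one are ignored.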