My first post, so I hope I'm posting this in the correct place!
I have ~100k FASTA sequences, some of which share the same FASTA ID (and have identical sequences) but carry different descriptions. I would like to keep one representative record per ID (so, remove the duplicates) and append the descriptions from all of the duplicates to that record's header.
For example, my fasta file might contain the following 3 sequences:
>Contig1
ATGCGAGTAG
>Contig1 Description1
ATGCGAGTAG
>Contig1 Description2
ATGCGAGTAG
And I'm looking to obtain the following single sequence:
>Contig1 Description1 Description2
ATGCGAGTAG
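To make the intended logic concrete, here is a rough Python sketch of what I have in mind; it is untested on my real data and only meant to illustrate the idea. The function name `merge_duplicate_records` is just a placeholder I made up. It assumes the ID is the first whitespace-delimited token of each header line, that records sharing an ID really do have identical sequences (only the first copy's sequence is kept), and that the whole file fits in memory.

```python
def merge_duplicate_records(fasta_text):
    """Collapse records that share an ID, appending their descriptions.

    Hypothetical helper: assumes the ID is the first token of the header
    and that duplicate IDs always carry identical sequences.
    """
    merged = {}          # ID -> {"descriptions": [...], "sequence": [...]}
    current_id = None
    first_copy = False   # True while reading the first record for an ID

    for raw in fasta_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split(None, 1)
            current_id = parts[0] if parts else ""
            description = parts[1] if len(parts) > 1 else ""
            first_copy = current_id not in merged
            if first_copy:
                merged[current_id] = {"descriptions": [], "sequence": []}
            # Keep each distinct description once, in order of appearance.
            if description and description not in merged[current_id]["descriptions"]:
                merged[current_id]["descriptions"].append(description)
        elif current_id is not None and first_copy:
            # Only the first copy's sequence lines are stored.
            merged[current_id]["sequence"].append(line)

    out = []
    for seq_id, record in merged.items():
        out.append(">" + " ".join([seq_id] + record["descriptions"]))
        # Sequence is written unwrapped, on a single line.
        out.append("".join(record["sequence"]))
    return "\n".join(out) + "\n"
```

Calling it on the text of the three example records above should give back the single merged record. For a real file, something like `merge_duplicate_records(open("input.fasta").read())` is what I would try, where `input.fasta` is a placeholder file name.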
Thanks for any help :)