Question: Merging all sequences with identical IDs
6 weeks ago by
hugo.swenson wrote:

Hi!

I am having trouble with multiple genes (FASTA files) that I am supposed to concatenate. All of these genes have identical taxon identifiers, so after concatenating my aligned and trimmed files I end up with multiple duplicate headers in the combined file. Is there any method, preferably in Python, to merge all sequences with an identical header into one sequence (i.e. remove the duplicate header entries and then merge all sequences matching that header into one sequence)?


Please provide an example.

written 6 weeks ago by shenwei356

Ha, just realized, I recommended your tool :)

written 6 weeks ago by thackl
6 weeks ago by
thackl (MIT) wrote:

seqkit concat might do what you want: "concatenate sequences with same ID from multiple files"

https://github.com/shenwei356/seqkit
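
Since the question asks for something in Python, here is a minimal sketch using only the standard library. It is not how seqkit does it, just one way to get a similar result. It assumes the whole header line (everything after ">") is the taxon identifier, and that sequences should be joined in the order they appear in the combined file. The function names read_fasta and merge_by_header are made up for this example.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        yield header, "".join(chunks)


def merge_by_header(records):
    """Join all sequences that share a header, keeping first-seen order."""
    merged = {}
    for header, seq in records:
        merged.setdefault(header, []).append(seq)
    return {header: "".join(parts) for header, parts in merged.items()}


# Small in-memory demo; in practice pass an open file handle to read_fasta.
combined = [
    ">taxA", "ACGT",
    ">taxB", "TTTT",
    ">taxA", "GG", "CC",
    ">taxB", "AA",
]
for header, seq in merge_by_header(read_fasta(combined)).items():
    print(">" + header)
    print(seq)
```

One thing to watch: if a taxon is missing from one of the gene alignments, its merged sequence will be shorter than the others and the concatenated alignment will no longer line up. In that case the missing gene would need to be padded with gaps before merging, which this sketch does not do.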
