How to split protein tandem-domain FASTA sequences into multiple single-domain sequences?
8.3 years ago
astrozheng ▴ 10

I have a dataset of over 3K FASTA sequences for one type of protein domain. Some sequences may contain 2 or 3 tandem domains of the same type, and these tandem copies are not identical in sequence. I am wondering whether there is a way to split every long FASTA sequence that contains more than one domain into its individual domains.

Also, how could I obtain all the diverse single-domain sequences from the long sequences and, at the same time, avoid redundant single-domain sequences?
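
To make the goal concrete, here is a minimal Python sketch of the operation being asked for. It assumes the domain boundaries are already known from some separate domain-detection step (not shown) and are given as 1-based inclusive `(start, end)` pairs per sequence ID. The function names, the `domain_coords` layout, and the toy sequences are all hypothetical. It removes only exact duplicates; near-identical domains would need a clustering step instead.

```python
def parse_fasta(text):
    """Parse FASTA text into {sequence_id: sequence}; the ID is the first word of the header."""
    records = {}
    seq_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            seq_id = line[1:].split()[0]
            records[seq_id] = []
        elif seq_id is not None:
            records[seq_id].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def split_domains(records, domain_coords):
    """Cut each sequence at the given coordinates and keep each distinct domain once.

    domain_coords (assumed format): {sequence_id: [(start, end), ...]},
    with 1-based inclusive positions.
    """
    seen = set()
    unique = []
    for seq_id, seq in records.items():
        for i, (start, end) in enumerate(domain_coords.get(seq_id, []), 1):
            domain = seq[start - 1:end]
            # Skip empty slices and domains already seen in an earlier sequence.
            if domain and domain not in seen:
                seen.add(domain)
                unique.append((f"{seq_id}_dom{i}", domain))
    return unique


# Toy input: protA carries two tandem domains, protB repeats protA's first domain.
fasta_text = ">protA\nMKTAYIAKQRGGGMKTAYLAKQR\n>protB\nMKTAYIAKQR\n"
domain_coords = {"protA": [(1, 10), (14, 23)], "protB": [(1, 10)]}

records = parse_fasta(fasta_text)
for name, domain in split_domains(records, domain_coords):
    print(f">{name}\n{domain}")
```

With this toy input, the two distinct domains from protA are written out and the protB domain is dropped because it duplicates one already seen.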

protein redundancy • 2.2k views

