Let me know if there is any software available for that.
edit:
Basically, I want to reduce the number of sequences prior to phylogenetic tree construction. The first step is to remove >90% identical sequences with cd-hit before multiple sequence alignment. Then I want to remove the sequences which are >90% identical in the alignment. I am looking for a program for that purpose.
I use cd-hit to remove sequences by identity. However it does not work with multiple sequence alignments. So I wanted to know if there is software available for that.
Not a real question
Please edit your question to give more details about what you are trying to achieve and the biological question behind it; then I might re-open it. Hint: you will need at least one or two paragraphs (5-10 sentences) to make a valid question out of this.
I am not able to edit the question since it is closed. Basically, I want to reduce the number of sequences prior to phylogenetic tree construction. The first step is to remove >90% identical sequences with cd-hit before multiple sequence alignment. Then I want to remove the sequences which are >90% identical in the alignment. I am looking for a program for that purpose.
I have re-opened the question and inserted your comments. However, what do you mean by "I use cd-hit to remove sequences by identity. However it does not work with multiple sequence alignments."? I think it's supposed to work with FASTA files, just as you would use it when you apply it before the MSA.
Why would you try to apply it after msa?
Why would you build an MSA and then remove sequences from it afterwards? That makes the whole multiple alignment invalid.
Unclear. You say that you want to use CD-HIT before multiple sequence alignment. Then you say that CD-HIT does not work with multiple sequence alignment.
CD-HIT should output a FASTA file of non-redundant sequences, suitable for input to an aligner.
I want to use the MSA as input for cd-hit to remove >90% identical sequences.
I think that makes no sense.
I agree. The input to CD-HIT is not an MSA. It's a file of sequences in FASTA format. I think you want to do MSA after CD-HIT.
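For what it's worth, here is a minimal Python sketch of the two things discussed in this thread. It is not part of CD-HIT or any other existing tool; the function names (`parse_fasta`, `degap`, `aligned_identity`, `filter_redundant`) are made up for illustration. `degap` strips gap characters from an aligned FASTA so the sequences could be passed to CD-HIT as ordinary FASTA. `filter_redundant` does a simple greedy filter directly on the aligned sequences. It assumes identity is counted only over columns where neither sequence has a gap, which may not match CD-HIT's own definition of identity, so the results are not guaranteed to agree with CD-HIT at the same threshold.

```python
GAP_CHARS = "-."  # assumed gap symbols in the aligned FASTA


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def degap(records):
    """Remove gap characters so the sequences are plain, unaligned FASTA."""
    return [(h, "".join(c for c in s if c not in GAP_CHARS)) for h, s in records]


def aligned_identity(a, b):
    """Fraction of identical residues over columns where neither sequence has a gap."""
    matches = compared = 0
    for x, y in zip(a, b):
        if x in GAP_CHARS or y in GAP_CHARS:
            continue
        compared += 1
        if x.upper() == y.upper():
            matches += 1
    return matches / compared if compared else 0.0


def filter_redundant(records, threshold=0.9):
    """Greedy filter: keep a sequence only if it is not more than
    `threshold` identical to any sequence already kept."""
    kept = []
    for header, seq in records:
        if all(aligned_identity(seq, k) <= threshold for _, k in kept):
            kept.append((header, seq))
    return kept


if __name__ == "__main__":
    alignment = ">a\nACGT-ACGTA\n>b\nACGT-ACGTA\n>c\nTTTTGACGTA\n"
    records = parse_fasta(alignment)
    for header, seq in filter_redundant(records, threshold=0.9):
        print(f">{header}\n{seq}")
```

Note that the greedy result depends on input order, and that removing rows from an alignment can leave columns that contain only gaps, so re-aligning the reduced set before building the tree is probably the safer route.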