I want to remove gaps from a sequence alignment. How should I do that? Are there any tools or software for it, or is it better to write my own program?
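import sys
from Bio import AlignIO, SeqIO
from Bio.Seq import Seq

input_file = sys.argv[1]   # aligned FASTA file to read
output_file = sys.argv[2]  # ungapped multi-FASTA file to write

with open(output_file, "w") as o:
    for record in AlignIO.read(input_file, "fasta"):
        # Strip every "-" so the record holds its unaligned sequence;
        # going through str works regardless of Biopython version
        record.seq = Seq(str(record.seq).replace("-", ""))
        SeqIO.write(record, o, "fasta")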
Save this as a Python script and run it with the input and output file names as its two arguments.
Sorry, but your question is unclear. What do you mean by "remove gaps"? Why would you want to do that? Gaps are part of the alignment result.
If you literally remove all the gaps, it will no longer be an alignment. It will just be a multi-FASTA file. Is that what you want? Or are you looking to remove the positions/columns in the alignment that include gaps? That would preserve the structure of the alignment, at the cost of removing some information.
If you want to remove sites with lots of gaps (and thus inferred to be non-homologous), you can use Gblocks or trimAl.
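For the column-wise option, here is a minimal sketch in plain Python, just to show the idea. It is not what Gblocks or trimAl actually do internally; those tools use more elaborate criteria. The function name `drop_gappy_columns` and the `max_gap_fraction` parameter are made up for this example, and the input is assumed to be a list of equal-length aligned strings.

```python
def drop_gappy_columns(seqs, max_gap_fraction=0.0, gap="-"):
    """Return the aligned sequences with gap-rich columns removed.

    A column is kept only if its fraction of gap characters is at most
    max_gap_fraction. The default of 0.0 drops every column that
    contains any gap at all.
    """
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    # Indices of the columns that pass the gap threshold
    keep = [
        i for i in range(length)
        if sum(s[i] == gap for s in seqs) / len(seqs) <= max_gap_fraction
    ]
    # Rebuild each sequence from the kept columns only
    return ["".join(s[i] for i in keep) for s in seqs]


aligned = ["AC-GT", "ACGGT", "A--GT"]
print(drop_gappy_columns(aligned))       # ['AGT', 'AGT', 'AGT']
print(drop_gappy_columns(aligned, 0.5))  # ['ACGT', 'ACGT', 'A-GT']
```

Because whole columns are removed from every sequence at once, the output is still a valid alignment, unlike the result of stripping gaps from each sequence separately.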