Question: How to remove gaps from sequence alignment?
5 weeks ago, kapoornancy25 wrote:

I want to remove gaps from a sequence alignment. How should I do that? Are there any tools or software for this, or is it better to write my own program?


mothur has a degap.seqs command for this: https://mothur.org/wiki/degap.seqs/

In Python (with Biopython):

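import sys

from Bio import SeqIO
from Bio.Seq import Seq

input_file = sys.argv[1]   # aligned FASTA
output_file = sys.argv[2]  # ungapped FASTA to write

with open(output_file, "w") as out:
    # SeqIO.parse streams the records one at a time
    for record in SeqIO.parse(input_file, "fasta"):
        # Seq.ungap() is deprecated in newer Biopython releases;
        # a plain string replace does the same job on any version
        record.seq = Seq(str(record.seq).replace("-", ""))
        SeqIO.write(record, out, "fasta")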

Save it as a Python script and run it with the input and output file names as arguments.

Reply written 5 weeks ago by cpad0112

Sorry, but your question is unclear. What do you mean by "remove gaps"? Why would you want to do that? Gaps are part of the alignment result.

Reply written 5 weeks ago by liorglic

If you literally remove all the gaps, it will no longer be an alignment. It will just be a multi-FASTA file. Is that what you want? Or are you looking to remove the positions (columns) in the alignment that include gaps? That would preserve the structure of the alignment, at the cost of removing some information.
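
To make the difference between the two operations concrete, here is a minimal pure-Python sketch on a made-up three-sequence alignment. The function names `degap` and `drop_gap_columns` and the toy data are illustrative assumptions, not part of any library.

```python
def degap(seqs, gap="-"):
    """Strip gap characters from each sequence independently.

    The results usually have different lengths, so they no longer
    form an alignment.
    """
    return {name: s.replace(gap, "") for name, s in seqs.items()}


def drop_gap_columns(seqs, gap="-"):
    """Remove every column in which at least one sequence has a gap.

    All sequences keep the same length, so the result is still an alignment.
    """
    names = list(seqs)
    columns = zip(*(seqs[n] for n in names))  # iterate column by column
    kept = [col for col in columns if gap not in col]
    return {n: "".join(col[i] for col in kept) for i, n in enumerate(names)}


# Toy alignment (hypothetical data)
aln = {"seq1": "AC-GT", "seq2": "A--GT", "seq3": "ACTGT"}

ungapped = degap(aln)            # per-sequence gap removal
trimmed = drop_gap_columns(aln)  # column-wise removal
```

In this toy case `degap` gives sequences of length 4, 3 and 5, while `drop_gap_columns` leaves three sequences of equal length.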

Reply written 5 weeks ago by Dave Carlson
5 weeks ago, h.mon wrote:

If you want to remove sites with lots of gaps (and thus inferred to be non-homologous), you can use Gblocks or trimAl.
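
As a rough illustration of the idea only, the sketch below drops columns whose gap fraction exceeds a threshold. It is not the algorithm used by Gblocks or trimAl, which apply their own, more elaborate criteria; the function name `drop_gappy_columns`, the default threshold and the example data are assumptions made for this sketch.

```python
def drop_gappy_columns(seqs, max_gap_fraction=0.5, gap="-"):
    """Keep only columns whose fraction of gaps is <= max_gap_fraction."""
    names = list(seqs)
    rows = [seqs[n] for n in names]
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    n_seqs = len(rows)
    # Indices of the columns that pass the gap-fraction filter
    keep = [
        i
        for i, col in enumerate(zip(*rows))
        if col.count(gap) / n_seqs <= max_gap_fraction
    ]
    return {name: "".join(row[i] for i in keep) for name, row in zip(names, rows)}


# Hypothetical example: column 3 is 2/3 gaps and is dropped,
# column 2 is 1/3 gaps and is kept.
example = {"seq1": "AC-GT", "seq2": "A--GT", "seq3": "ACTGT"}
filtered = drop_gappy_columns(example, max_gap_fraction=0.5)
```

Setting `max_gap_fraction=0.0` removes every column that contains any gap at all; for real analyses the dedicated tools are the safer choice.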
