Align two different regions of protein sequences for phylogenetic tree
8.0 years ago
n00bgenome ▴ 40

I am interested in two protein domains from a single protein (a sigma factor), and I want to build a phylogenetic tree using just those two domains. I have the sequences in a .txt file, but now I am wondering about the alignment. Should I put a series of dashes between the two domains to help the alignment or not? For example, suppose the following is the full sigma factor amino acid sequence: AAAAAAAAABBBBBBBBCCCCCCCC

I then use Python to write a file with the middle section removed, so it's now: AAAAAAAAACCCCCCCC
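A minimal sketch of that removal step, assuming the domain boundaries are already known as 0-based positions. The function name `excise_linker` and the boundary values are hypothetical, chosen only to match the toy sequence above:

```python
def excise_linker(seq, a_end, c_start):
    """Return seq with the residues between the two domains removed.

    a_end   -- index one past the last residue of the first domain (0-based)
    c_start -- index of the first residue of the second domain (0-based)
    """
    if not 0 <= a_end <= c_start <= len(seq):
        raise ValueError("domain boundaries out of range")
    # Keep everything before a_end and everything from c_start onward.
    return seq[:a_end] + seq[c_start:]


# Toy sequence from the question: 9 A's, 8 B's, 8 C's.
full = "A" * 9 + "B" * 8 + "C" * 8
trimmed = excise_linker(full, a_end=9, c_start=17)
print(trimmed)
```

For real data the boundaries would differ per sequence (for example, taken from a domain annotation), so they would have to be looked up for each record rather than hard-coded.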

Now I want to do an alignment, but I'm not sure whether I should add a series of dashes to help it, namely to align something like this: AAAAAAAAA--------CCCCCCCC

The idea is to prevent any unintended alignments that bridge the AAAA and CCCC domains.

Does that sound right?

Phylogeny • 1.4k views

Why not separate the two domains into two files and build two separate trees?


The domains are linked in function. One domain specifies an upstream recognition element, and the other specifies a downstream recognition element. So to group them by total recognition, I need them both.
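One way to keep both domains in a single tree while still aligning them separately is to align each domain on its own and then join the two alignments column by column. This is not something proposed in the thread; the sketch below only illustrates the idea, assuming each per-domain alignment has already been produced by an aligner and loaded into a dict of sequence ID to aligned string. The name `concatenate_alignments` is hypothetical.

```python
def concatenate_alignments(aln_a, aln_c):
    """Join two alignments (dicts of id -> aligned sequence) column-wise.

    A sequence ID missing from one alignment is padded with gaps there,
    so every row of the result has the same length.
    """
    len_a = len(next(iter(aln_a.values())))
    len_c = len(next(iter(aln_c.values())))
    combined = {}
    for seq_id in sorted(set(aln_a) | set(aln_c)):
        left = aln_a.get(seq_id, "-" * len_a)
        right = aln_c.get(seq_id, "-" * len_c)
        combined[seq_id] = left + right
    return combined


# Toy per-domain alignments (made-up data for illustration only).
domain_a = {"sp1": "AAA-A", "sp2": "AAAAA"}
domain_c = {"sp1": "CCCC", "sp2": "CC-C"}
combined = concatenate_alignments(domain_a, domain_c)
for seq_id, row in combined.items():
    print(seq_id, row)
```

Because each domain is aligned independently, no column can pair residues from the first domain with residues from the second.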


Then why are you worried about the intervening sequence? Presumably it is similar for all the proteins, since they are all homologous.


I guess you are right. I was worried that a bridging alignment might be possible since I'm using 3000 species, but each domain is homologous across the species, so they should align correctly. Thanks!


You will find out soon enough :-)
If they are all homologous, they should align correctly.
