Question: What is the standard way of preparing homolog sequences for a phylogenetic analysis?
johnnytam100 wrote (12 months ago):

I have two protein sequences with around 50% identity between them.

I want to study the phylogenetic relationship between them.

I came up with a method myself:

Step 1: BLAST each of the sequences against a protein database separately (possibly with less stringent thresholds)

Step 2: extract the subject sequences that are hits common to both BLAST searches

Step 3: multiple sequence alignment using the common subject sequences and the two query sequences

Step 4: build the phylogenetic tree
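Step 2 above can be sketched in a few lines. This is only a minimal illustration, assuming both searches were saved as BLAST tabular output (`-outfmt 6`) with the default column order, where column 2 is the subject ID and column 11 is the E-value. The function names, the sequence IDs, and all numbers in the example tables are made up for illustration.

```python
def parse_hits(blast_tsv, max_evalue=1e-5):
    """Collect subject IDs from BLAST tabular text (-outfmt 6, default columns).

    Assumption: column 2 is the subject ID and column 11 is the E-value.
    """
    hits = set()
    for line in blast_tsv.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:
            hits.add(fields[1])
    return hits


def common_hits(blast_a, blast_b, max_evalue=1e-5):
    """Subject IDs that pass the E-value cutoff in both searches."""
    return parse_hits(blast_a, max_evalue) & parse_hits(blast_b, max_evalue)


# Hypothetical results for the two queries (IDs and values are invented).
blast_a = "\n".join([
    "queryA\tP1\t62.0\t300\t110\t4\t1\t298\t5\t301\t1e-90\t320",
    "queryA\tP2\t48.0\t290\t140\t6\t3\t290\t1\t288\t3e-40\t160",
    "queryA\tP3\t31.0\t250\t160\t9\t10\t255\t12\t258\t2e-8\t55",
    "queryA\tP4\t25.0\t120\t85\t5\t40\t155\t33\t150\t0.5\t30",
])
blast_b = "\n".join([
    "queryB\tP1\t58.0\t301\t120\t5\t2\t300\t4\t302\t5e-70\t260",
    "queryB\tP3\t35.0\t255\t150\t8\t8\t258\t10\t260\t1e-12\t70",
    "queryB\tP5\t44.0\t280\t150\t6\t5\t282\t3\t280\t1e-30\t130",
    "queryB\tP4\t40.0\t200\t115\t4\t20\t218\t25\t222\t1e-20\t95",
])

print(sorted(common_hits(blast_a, blast_b)))
print(sorted(common_hits(blast_a, blast_b, max_evalue=1.0)))
```

Loosening `max_evalue` is the "less stringent thresholds" knob from Step 1: a hit that is weak for one query but strong for the other only enters the common set once the cutoff is relaxed.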

Could anyone comment on this method? If it is not ideal, what is the standard way of preparing homolog sequences for a phylogenetic analysis?

Thank you.

Tags: tree, homolog, phylogenetic

These are the typical steps. However, you'll need to work out the details, e.g. which species to include, whether to manually tweak the multiple sequence alignment, and which tree-building algorithm to use.

Written 12 months ago by Jean-Karim Heriche

To add to Jean’s answer about which species to include: if you want a rooted tree, you may also want to include a less closely related species or sequence as an outgroup.

Written 12 months ago by Joe

For the outgroup, should it be 'out' in the sense of 1) the BLAST threshold, 2) the functional annotation of the protein, or 3) both?

Written 12 months ago by johnnytam100 (modified 12 months ago)

It should be a more divergent sequence, which would lead to it being the outermost branch in your final tree, i.e. it will form one side of the most basal bifurcation.
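As a rough illustration of "more divergent", the sketch below flags the sequence in an existing alignment with the lowest average percent identity to all the others. This is only a heuristic for spotting a candidate, not a rule for choosing an outgroup. The function names, sequence names, and the toy alignment are invented for the example, and identity is computed over ungapped columns only, which is one of several possible conventions.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity over columns where neither aligned sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def most_divergent(aligned):
    """Name of the sequence with the lowest summed identity to all others.

    Every sequence is compared with the same number of partners, so the
    lowest sum is also the lowest mean.
    """
    totals = {name: 0.0 for name in aligned}
    for n1, n2 in combinations(aligned, 2):
        pid = percent_identity(aligned[n1], aligned[n2])
        totals[n1] += pid
        totals[n2] += pid
    return min(totals, key=totals.get)


# Toy alignment (hypothetical sequences, all rows the same length).
aligned = {
    "query1":  "MKTAYIAKQR",
    "query2":  "MKTAYLAKQR",
    "hit1":    "MKSAYIAKQR",
    "distant": "MRSGFLS-QK",
}

print(most_divergent(aligned))
```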

Written 12 months ago by Joe (modified 12 months ago)

I see! What could we achieve if we play with the sequence alignment step?

Written 12 months ago by johnnytam100

That's too broad a question, really. You need to decide what features you’re looking for. If you wanted to examine conservation of an active site or domain, for instance, you’d want to use local alignments, but if you were interested in conservation across the whole gene, a global alignment would most likely be more informative.
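To make the local versus global distinction concrete, here is a small score-only sketch of the two classic pairwise dynamic-programming schemes: Needleman-Wunsch for global alignment and Smith-Waterman for local alignment. The scoring values (match 2, mismatch -1, linear gap -2) and the two toy sequences are arbitrary choices for illustration. Real protein alignments would normally use a substitution matrix and affine gap penalties, and a multiple sequence alignment tool rather than pairwise code like this.

```python
def align_score(a, b, match=2, mismatch=-1, gap=-2, local=False):
    """Best pairwise alignment score between sequences a and b.

    local=False: global alignment (Needleman-Wunsch), end to end.
    local=True:  local alignment (Smith-Waterman), best-scoring sub-region.
    """
    cols = len(b) + 1
    # First row: zeros for local, cumulative gap penalties for global.
    prev = [0] * cols if local else [j * gap for j in range(cols)]
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0 if local else i * gap] + [0] * (cols - 1)
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(diag, prev[j] + gap, curr[j - 1] + gap)
            if local:
                score = max(score, 0)  # a local alignment may restart anywhere
                best = max(best, score)
            curr[j] = score
        prev = curr
    return best if local else prev[-1]


# Two toy sequences that share only a short motif ("WGHE").
seq1 = "MKTWGHEPLL"
seq2 = "QRSWGHEDNA"

print("global:", align_score(seq1, seq2))
print("local: ", align_score(seq1, seq2, local=True))
```

With these toy inputs the local score reflects only the shared motif, while the global score is dragged down by the unrelated flanking residues. That is the practical difference between asking about a conserved site and asking about conservation along the whole sequence.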

Written 12 months ago by Joe