7.2 years ago by
United States
I would like to better understand the pipeline for building a phylogenetic tree of a gene's evolution across species (whose genomes have been sequenced). What would be the appropriate procedures and software?

I am inexperienced at bioinformatics, and assume it would go something like this:

1) Obtain the gene sequence
2) BLAST the gene sequence to find homologs in other species
3) Build phylogenetic tree

I am trying to determine how three genes (cgMT1, cgMT2, MTF1) evolved across three types of coral. In humans, these genes have been shown to reduce injury from heavy metal exposure. My tentative hypothesis is that the one coral type (of the three) that is most sensitive to bleaching (that is, most prone to losing its symbiosis with dinoflagellates due to UV, pollution, and heavy metal exposure) will either lack one or more of these genes or show a distinct evolutionary history that may have rendered these genes dysfunctional.

However, I do not even know how to retrieve the gene sequences (cgMT1, cgMT2, MTF1) from human databases, from one of these species, or from all species, nor how to compare them across species. I am very lost about how to start, and would just like to develop a short and simple (beginner-level) pipeline to test this hypothesis. I would like this pipeline to be entirely computational.

I know this is a fairly open-ended question (and may not make sense), so please understand that I am new but have really given myself a headache. Thank you for any advice!

7.1 years ago by
Cambridge, UK
Regarding point 2, you probably want to use protein alignments instead of nucleotide (gene-based) alignments, unless your species are phylogenetically very close. Regarding point 3, I would recommend using the strategy Ensembl uses to build its gene trees:
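As a small illustration of point 2 only (this is not the Ensembl strategy), here is a minimal Python sketch that turns a coding sequence into a protein sequence before alignment. It assumes the standard genetic code and an input that starts in frame at the first codon; the function name `translate` and the toy sequence in the usage note are made up for illustration. In practice a library such as Biopython would normally be used instead.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate an in-frame coding sequence, stopping at the first stop codon.

    Codons containing ambiguous bases (e.g. N) are reported as 'X'.
    """
    cds = cds.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":  # stop codon: end of the protein
            break
        protein.append(aa)
    return "".join(protein)
```

For example, `translate("ATGGACCCCAACTGCTCCTGA")` gives `"MDPNCS"`. The translated sequences from each species can then be passed to a protein aligner.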

7.1 years ago by
Some hints for your point 1:

It may help to search for the names of these genes in NCBI, e.g.:

There you can follow the link "Nucleotide: DNA and RNA sequences" and select a sequence based on the organism. For example, if you click on one of the results for human, you end up here:

At the top of that page, click "Display Settings:" and select "FASTA" to get the sequence of this entry.
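If you would rather script this step than click through the website, the same FASTA text can be requested from NCBI's E-utilities `efetch` endpoint. The following is only a rough sketch using the Python standard library: the helper names (`efetch_fasta_url`, `parse_fasta`, `fetch_fasta`) are made up for illustration, and you have to supply a real accession number yourself from the search results.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities endpoint for downloading records.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession, db="nuccore"):
    """Build the efetch URL that returns one record as plain-text FASTA."""
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH + "?" + urlencode(params)


def parse_fasta(text):
    """Parse FASTA text into a dict mapping header line -> sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


def fetch_fasta(accession, db="nuccore"):
    """Download and parse one record (needs network access)."""
    with urlopen(efetch_fasta_url(accession, db)) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

Use `db="nuccore"` for nucleotide records and `db="protein"` for protein records. If you fetch many sequences in a loop, keep the request rate low, since NCBI limits how often the service may be called.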

