Question: Looking for an aligned multi-FASTA file to practice building phylogenetic trees
l.roca (Peru) wrote 2.9 years ago:

Dear All,

I am looking for an aligned multi-FASTA file to practice building a phylogenetic tree. Do you have any idea where I can find a file like that? It does not need to contain more than 10 species, and I would prefer data related to plants (e.g., rbcL).

Thanks.

phylogenetic • 1.4k views
Siva (United States) wrote 2.9 years ago:

You can try TreeBASE, a repository of phylogenetic trees (12,817 trees from 104,593 distinct taxa) and the corresponding multiple sequence alignments (8,233 alignments) from publications. TreeBASE calls a multiple alignment file a matrix. You can do a taxon search (e.g. Arabidopsis thaliana or NCBI taxonomy ID 3702) to find plant-related alignments. You mentioned that you want the alignments in FASTA format, but the website provides them only in NEXUS format. To use data from TreeBASE, you can convert the alignments from NEXUS to FASTA with readseq, which is available on the phylogeny.fr website or can be downloaded and installed locally.
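If you would rather not install anything, the NEXUS-to-FASTA step can be roughly sketched in plain Python. This is only an illustration, not a replacement for readseq: the function name `nexus_to_fasta` and the sample data are made up, and the sketch assumes a simple file in which the MATRIX keyword starts a line, taxon names contain no spaces or quotes, and comments in square brackets are not nested.

```python
import re


def nexus_to_fasta(nexus_text: str) -> str:
    """Convert the MATRIX block of a simple NEXUS file to FASTA text."""
    # Drop NEXUS comments, which are enclosed in square brackets.
    text = re.sub(r"\[[^\]]*\]", "", nexus_text)
    # The matrix runs from the MATRIX keyword to the next semicolon.
    match = re.search(
        r"^\s*matrix\b(.*?);",
        text,
        flags=re.IGNORECASE | re.DOTALL | re.MULTILINE,
    )
    if match is None:
        raise ValueError("no MATRIX block found")
    sequences = {}  # dicts keep insertion order, so taxon order is preserved
    for line in match.group(1).splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # blank line, e.g. between interleaved blocks
        name, chunk = parts[0], "".join(parts[1:])
        # Interleaved files repeat each taxon name; append the new chunk.
        sequences[name] = sequences.get(name, "") + chunk
    return "".join(f">{name}\n{seq}\n" for name, seq in sequences.items())


# Made-up interleaved example with two taxa and eight characters.
SAMPLE_NEXUS = """#NEXUS
BEGIN DATA;
  DIMENSIONS NTAX=2 NCHAR=8;
  FORMAT DATATYPE=DNA GAP=- MISSING=?;
  MATRIX
  [first block]
  taxon_A ACGT
  taxon_B AC-T

  taxon_A GGCC
  taxon_B GGCA
  ;
END;
"""

print(nexus_to_fasta(SAMPLE_NEXUS), end="")
```

Real TreeBASE files can use quoted taxon names and other NEXUS features, so for anything beyond practice data a dedicated converter such as readseq is the safer choice.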

Michael Dondrup (Bergen, Norway) wrote 2.9 years ago:

You can make such files easily with one of the many MSA online apps: https://www.ebi.ac.uk/Tools/msa/

Simply choose sequences of homologs from different species of interest and try the different tools. That way you can also compare the effect of different MSA algorithms and parameters on the resulting phylogenies.

Another quick way is to use TreeFam, which saves you the work of picking homologues yourself. Enter a single sequence of interest or press the Example button; to insert the sequence into the tree, TreeFam calculates a MAFFT alignment, which you can also download.
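Whichever tool produces the file, it can help to confirm that a downloaded multi-FASTA really is an alignment before feeding it to a tree builder: in an aligned file every sequence, gaps included, has the same length. A minimal sketch of that check in plain Python follows; the helper names `read_fasta` and `is_aligned` and the example records are hypothetical, and duplicate sequence names are not handled (a later record overwrites an earlier one).

```python
def read_fasta(text: str) -> dict:
    """Parse FASTA text into a {header: sequence} dict."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the full header line (without '>') as the record name.
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            # Sequences may be wrapped over several lines; join them.
            records[name] += line
    return records


def is_aligned(records: dict) -> bool:
    """Return True if there is at least one record and all lengths match."""
    lengths = {len(seq) for seq in records.values()}
    return len(records) > 0 and len(lengths) == 1


# Made-up records: the first set is aligned, the second is not.
aligned = read_fasta(">sp1\nAC-T\nGG\n>sp2\nACGT\nG-\n")
unaligned = read_fasta(">sp1\nACGT\n>sp2\nAC\n")
print(is_aligned(aligned), is_aligned(unaligned))
```

Equal lengths are a necessary condition only; the check says nothing about whether the alignment is a good one.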

Brice Sarver (United States) wrote 2.9 years ago:

Datasets from papers with a phylogenetic component, including multiple sequence alignments, are frequently posted on Dryad. Alternatively, the source code of most phylogenetics programs includes an example folder with a trial dataset or two.
