Get phylogenetic tree from abundance table
5.1 years ago
David ▴ 210

Hi, I have an abundance table (each row corresponds to a taxon). I would like to get a tree for the table in the default Newick format. The table contains bacteria, archaea and fungi.

How can I get a phylogenetic tree from this table (the original table has around 600 rows)? The idea is to use the tree for further analysis in R. Note that I could also use NCBI taxon IDs, since I have them (not shown in this table).

SampleA  Phylum          Class                Order           Family                Genus            Species
12       Actinobacteria  Actinomycetales      Actinobacteria  Actinomycetaceae      Actinomyces      Actinomyces neuii
3        Actinobacteria  Actinomycetales      Actinobacteria  Actinomycetaceae      Actinomyces      Unknown
34       Actinobacteria  Corynebacteriales    Actinobacteria  Corynebacteriaceae    Corynebacterium  Corynebacterium sp. HMSC064E10
59       Actinobacteria  Corynebacteriales    Actinobacteria  Corynebacteriaceae    Corynebacterium  Corynebacterium aquilae
965      Actinobacteria  Propionibacteriales  Actinobacteria  Propionibacteriaceae  Tessaracoccus    Tessaracoccus sp. NSG39
44       Proteobacteria  Unknown              Unknown         Unknown               Unknown          Unknown

phylogenetic newick ncbi • 7.4k views

Thanks Philipps. Do you know whether it is possible to change the NCBI IDs used as names in the tree? Although I will use the NCBI taxids to retrieve the tree, I would like to replace each taxid in the output tree with an OTU ID.

For instance, each row of my sample corresponds to a specific OTU: OTU_1, OTU_2, and so on.

Is it possible to change that easily?

Finally, I would like to import this tree into phyloseq, and the tree needs to carry these identifiers.

Thanks,


Sorry added the comment at the bottom


Thanks guys for your comments, and sorry for the confusion. Let me try to explain better.

My data is WGS data (not 16S). What I call an OTU (I know this is confusing) corresponds to one species, or to a broader rank (e.g. order, class or phylum) when the resolution is lower. Each OTU comes from a subset of my marker genes.

For instance, I extract marker genes (single-copy genes) from my contigs and assign taxonomy to them. The reason for this is that the resolution at species level is much better. One single-copy gene does not correspond to one species; rather, a subset of marker genes corresponds to one species (e.g. GeneA + GeneB + GeneC = Staphylococcus aureus). So each line of my data frame corresponds to a subset of marker genes but to only one species (or class or phylum, depending on whether the resolution is sufficient).

I need to generate a phylogenetic tree to be used with phyloseq (note that phyloseq can work with any type of WGS data, not just 16S data). If this were 16S data, I would normally align the sequence from each line (OTU); however, each line is a combination of several marker genes.

I hope that is clearer. How would you generate a phylogenetic tree in such a case?

5.1 years ago

There are a few tools that can give you phylogenetic trees for a list of names:

phyloT: just copy and paste the list of species; http://phylot.biobyte.de/

In Python, you can use the ETE Toolkit to download the NCBI taxonomy database and then build a tree from it: http://etetoolkit.org/docs/2.3/tutorial/tutorial_ncbitaxonomy.html
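If you would rather not download the NCBI database, the rank columns already in the table are enough to build the same kind of taxonomy tree. The following is only a minimal sketch using the Python standard library: the function names and the "Unknown" placeholder handling are my own assumptions, it uses just four of the rank columns, and the result is a taxonomy without branch lengths, not an inferred phylogeny. With taxids, the `get_topology` method described in the ETE tutorial linked above should do the equivalent job.

```python
# Ranks used as tree levels, from broadest to most specific.
# (Assumption: a subset of the columns in the question's table.)
RANKS = ["Phylum", "Family", "Genus", "Species"]


def quote(label):
    """Quote a Newick label if it contains spaces or reserved characters."""
    if any(ch in label for ch in " ():;,[]'"):
        return "'" + label.replace("'", "''") + "'"
    return label


def lineage(row, unknown="Unknown"):
    """Return the known part of a row's lineage, stopping at the first unknown rank."""
    names = []
    for rank in RANKS:
        name = row.get(rank, unknown)
        if name == unknown:
            break
        names.append(name)
    return names


def build_tree(rows):
    """Nest all lineages into a dict of dicts: {name: {child_name: {...}}}."""
    root = {}
    for row in rows:
        node = root
        for name in lineage(row):
            node = node.setdefault(name, {})
    return root


def to_newick(node, name="root"):
    """Serialise the nested dict as Newick text (no branch lengths, no final ';')."""
    if not node:
        return quote(name)
    children = ",".join(
        to_newick(child, child_name) for child_name, child in sorted(node.items())
    )
    return "(" + children + ")" + quote(name)


# Three rows taken from the example table in the question.
rows = [
    {"Phylum": "Actinobacteria", "Family": "Actinomycetaceae",
     "Genus": "Actinomyces", "Species": "Actinomyces neuii"},
    {"Phylum": "Actinobacteria", "Family": "Corynebacteriaceae",
     "Genus": "Corynebacterium", "Species": "Corynebacterium aquilae"},
    {"Phylum": "Proteobacteria", "Family": "Unknown",
     "Genus": "Unknown", "Species": "Unknown"},
]

newick = to_newick(build_tree(rows)) + ";"
print(newick)
```

Two caveats with this sketch: a row that is only resolved to genus (such as the "Actinomyces / Unknown" row) ends up on an internal node rather than a tip whenever a species of that genus is also present, and nodes with a single child are kept as they are, which some tree readers may complain about.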

5.1 years ago

There is a small misconception here that has also carried over into the other answer: phylogeny != taxonomy, even though both come as trees. So, should we answer what you say, or what you mean (or what we guess you mean)?

• Do you want a taxonomy? Then you can indeed use phyloT and enter the scientific names as taxa. Despite its name, it gives you the specified part of the NCBI taxonomy, which you can visualize in iTOL or simply export. Not all of your taxa are resolved to species level, so you need to enter the most specific known taxon from each row. Then export your tree in Newick or NEXUS format. In these formats, taxon labels can be edited directly or with programs like FigTree.

• Do you need a Phylogenetic tree? Then you have at least two options:

1. Find a publication or an existing phylogeny that contains all species in your list, or restrict your list to a suitable existing phylogeny.
2. Generate a phylogeny from existing sequence data, e.g. the 16S rDNA sequences of the species in your list, using a multiple sequence alignment followed by phylogenetic inference.
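On editing the labels directly: since Newick is plain text, swapping taxids for OTU IDs can be scripted. Here is a rough sketch, assuming unquoted labels and a taxid-to-OTU mapping that you supply yourself; the function name is hypothetical and the taxids below are made-up placeholders, not real NCBI IDs.

```python
import re


def relabel_newick(newick, mapping):
    """Replace node labels found in `mapping`; leave everything else untouched."""
    # An unquoted label sits directly after '(', ',' or ')' and directly
    # before ':', ',', ')' or ';'. Branch lengths follow ':' and are
    # therefore never matched. Quoted labels are skipped.
    pattern = re.compile(r"(?<=[(,)])([^():;,\[\]']+)(?=[:,);])")
    return pattern.sub(lambda m: mapping.get(m.group(1), m.group(1)), newick)


# Placeholder taxids (not real NCBI IDs) mapped to the OTU IDs of the table rows.
taxid_to_otu = {"1001": "OTU_1", "1002": "OTU_2", "1003": "OTU_3"}

tree_with_taxids = "((1001:1,1002:1)2001:1,1003:2);"
tree_with_otus = relabel_newick(tree_with_taxids, taxid_to_otu)
print(tree_with_otus)
```

Labels that are not in the mapping (here the internal node 2001) are left as they are, so it is worth checking the output for leftover taxids before using the tree.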
5.1 years ago
Josh Herr 5.7k

In addition to the other two answers, I'm going to chime in. I'm not being critical here, but it sounds like you are a little confused.

It looks like what you show above is the taxonomy table from your OTU picking pipeline. This can be "made" into a phylogenetic tree, but you'll have to have your reference sequences and your "new" sequences from your experimental data. As said below, taxonomy doesn't equal phylogeny in many cases.

Most programs for OTU picking (QIIME and mothur, for example) can provide you with an alignment during the OTU picking process and some will construct the tree for you in the standard OTU picking pipeline. Do you already have the information you need?

Also, you mention you will be wanting to import your files into phyloseq in R -- you'll most importantly need your BIOM file (the OTU table), usually converted to JSON format, your reference sequences file, your phylogenetic tree, and your metadata or mapping file. Most OTU picking pipelines will provide you these at the end. There are plenty of ways of wrangling other data file formats into R. You didn't tell us what you did, so it's hard to know what files you have at hand already.
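As far as I know, phyloseq matches the tree to the OTU table by taxon name, so the tip labels and the table's row IDs have to agree. A quick way to check that before importing is sketched below, in Python rather than R to stay with the ETE suggestion above; the function name and the example IDs are hypothetical, and only unquoted tip labels are handled.

```python
import re


def tip_labels(newick):
    """Return the set of unquoted tip labels in a Newick string."""
    # Tips follow '(' or ','. Labels that follow ')' belong to internal
    # nodes and are ignored, as are branch lengths (which follow ':').
    pattern = re.compile(r"(?<=[(,])([^():;,\[\]']+)(?=[:,)])")
    return set(pattern.findall(newick))


# Hypothetical tree and OTU table row IDs.
tree = "((OTU_1:1,OTU_2:1)n1:1,OTU_3:2);"
table_ids = {"OTU_1", "OTU_2", "OTU_4"}

tips = tip_labels(tree)
missing_from_tree = table_ids - tips   # rows with no matching tip
missing_from_table = tips - table_ids  # tips with no matching row

print("In table but not in tree:", sorted(missing_from_tree))
print("In tree but not in table:", sorted(missing_from_table))
```

If either set is non-empty, fix the labels first; otherwise those taxa are likely to be dropped or to raise an error when the objects are combined.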
