UniFrac distances for WGS data with qiime2
5.1 years ago
josmos43 • 0

I am doing my first downstream analysis of metagenomic WGS data. I classified the reads taxonomically with kraken2 (custom database, all of RefSeq) and estimated species abundances with Bracken.

My idea was to use qiime2 to calculate UniFrac distances, since it provides nice plotting features. I obtained the phylogenetic tree in Newick format from the kraken database using this script.

I tried to import this tree to qiime2 using the following commands:

qiime tools import --input-path ncbi_taxonomy.newick --type 'Phylogeny[Unrooted]' --output-path tax_tree.qza
qiime phylogeny midpoint-root --i-tree tax_tree.qza --o-rooted-tree tax_tree_rooted.qza

...and I get the following error:

    Traceback (most recent call last):
      File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/tree/_tree.py", line 2418, in get_max_distance
        self._set_max_distance()
      File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/tree/_tree.py", line 2359, in _set_max_distance
        raise TreeError("No support for single descedent nodes")
    skbio.tree._exception.TreeError: No support for single descedent nodes

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/q2cli/commands.py", line 274, in __call__
        results = action(**arguments)
      File "<decorator-gen-206>", line 2, in midpoint_root
      File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
        output_types, provenance)
      File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 362, in _callable_executor_
        output_views = self._callable(**view_args)
      File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/q2_phylogeny/_util.py", line 13, in midpoint_root
        return tree.root_at_midpoint()
      File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/tree/_tree.py", line 862, in root_at_midpoint
        max_dist, tips = tree.get_max_distance()
      File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/tree/_tree.py", line 2420, in get_max_distance
        return self._get_max_distance_singledesc()
      File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/tree/_tree.py", line 2376, in _get_max_distance_singledesc
        distmtx = self.tip_tip_distances()
      File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/tree/_tree.py", line 2501, in tip_tip_distances
        result = np.zeros((num_tips, num_tips), float)  # tip by tip matrix
    MemoryError
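
The last frame allocates a num_tips × num_tips matrix with `np.zeros(..., float)`, i.e. 8 bytes per entry. A rough sketch of how that allocation scales is below; the tip counts are hypothetical placeholders, since the number of tips in the kraken-derived tree is not stated here.

```python
def tip_matrix_bytes(num_tips, bytes_per_entry=8):
    """Bytes needed for a dense num_tips x num_tips matrix.

    bytes_per_entry=8 matches NumPy's default float (float64).
    """
    return num_tips * num_tips * bytes_per_entry


# Hypothetical tip counts, for illustration only.
for n in (10_000, 100_000, 200_000):
    gb = tip_matrix_bytes(n) / 1e9
    print(f"{n:>7} tips -> {gb:,.1f} GB")
```

At 200,000 tips the matrix alone is 3.2 × 10^11 bytes, about 320 GB, so a tree with that many tips or more would not fit in 300 GB of RAM.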

I verified that the input tree is in valid Newick format. I was running the command on a machine with more than 300 GB of RAM, so I did not expect memory to be an issue. Am I getting something wrong? Does qiime require phylogenetic trees to have certain features? I would be glad if someone could give me ideas or guidelines on how to get my data into qiime.
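
One feature worth checking, given the first `TreeError`, is whether the tree contains internal nodes with exactly one child. Taxonomy-derived trees commonly do (a genus holding a single species, for example). A minimal sketch for counting them in a Newick string follows. The function name `count_single_child_nodes` and the toy tree are made up for illustration, and the sketch assumes plain Newick without quoted labels or bracketed comments.

```python
def count_single_child_nodes(newick):
    """Count internal nodes that have exactly one child.

    Assumes every '(', ',' and ')' in the string is structural,
    i.e. there are no quoted labels or bracketed comments.
    """
    open_nodes = []  # commas seen so far inside each open parenthesis
    single = 0
    for ch in newick:
        if ch == "(":
            open_nodes.append(0)
        elif ch == "," and open_nodes:
            open_nodes[-1] += 1
        elif ch == ")" and open_nodes:
            # Zero commas inside a group means the node has one child.
            if open_nodes.pop() == 0:
                single += 1
    return single


# Toy tree: node "b" has the single child "a".
toy_tree = "((a:1)b:1,(c:1,d:1)e:1)root;"
print(count_single_child_nodes(toy_tree))
```

If the count is non-zero for the real tree, scikit-bio's `TreeNode.prune()` is documented as removing internal nodes with only one child, which may be worth trying before the import. That is untested here on a tree of this size.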
