Question: "out" parameter for _Fasttree module in Bio.Phylo.Applications from Biopython 1.68 doesn't work
russianconcussion20 wrote:

Hi,

When I try to run the FastTree wrapper from Bio.Phylo.Applications, it fails, giving me this message:

ApplicationError: Non-zero return code 1 from '/home/fetz/genome/phylosift_v1.0.1/bin/FastTree -nt -gtr -out 245_hypothetical_protein.tre 245_hypothetical_protein.codon', message 'Unknown or incorrect use of option -out'

My Biopython version is 1.68:

import Bio
Bio.__version__
'1.68'

My code, run in ipython3, is:

from Bio.Phylo.Applications import _Fasttree
# Path to the FastTree binary
fasttree_exe = r"/home/fetz/genome/phylosift_v1.0.1/bin/FastTree"
codonfile = "245_hypothetical_protein.codon"
# Name the output tree after the alignment file
outfile = codonfile.replace(".codon", ".tre")
# Build the command line; calling cmd() runs FastTree
cmd = _Fasttree.FastTreeCommandline(fasttree_exe, nt=True, gtr=True, input=codonfile, out=outfile)
cmd()

The input file, 'codonfile,' is a verified codon alignment from the Bio.codonalign module that has been written to a file. I cannot find any errors in my command construction according to the example given in the API. Can anyone suggest what I am doing wrong? Thank you in advance.

written 19 months ago by russianconcussion20

Taken from FastTree docs:

By default FastTree expects protein alignments, use -nt for nucleotides

and

-gtr -- generalized time-reversible model (nucleotide alignments only)

It seems that both options nt and gtr are for nucleotide alignments and your file is a codon alignment.

Try cmd = _Fasttree.FastTreeCommandline(fasttree_exe, input = codonfile, out = outfile) to see if it works.

[EDIT] This is based only on the assumption that your codon alignment is not a nucleotide alignment file.

modified 19 months ago • written 19 months ago by Rodrigo140

Hi Rodrigo. Actually, my codon alignment is in nucleotides, not amino acids. In a codon alignment, you use a protein alignment to constrain the cognate nucleotide alignment. So I'm sure that the -gtr and -nt options are fine. Here is the head of the codon alignment file:

>ID:AMR59844.1 <unknown description>
ATGTTTCACCGTCCTGGGTTTTCAGCTTTAAACACCGATGTCTGTTGGGCTGAGTACGAG
CGGGTGAAGGAGTTCTTACCCGTAAATCCAAAACACATCAACGTGGGTACTATCGGGCGT
GTTGCGTTCGATGATACTCCGCTGAGTACGTGTATTAAAGCTGCGCTTGGTACGCTGCCG
GCGTCTATAGTGTTTGAG---------GAATTAAAA
>ID:PhiSPFM1_216 <unknown description>
ATGTTTCACCGTCCTGGGTTTTCAGCTTTAAACACCGATGTCTGTTGGGCTGAGTACGAG
CGGGTGAAGGAGTTCTTACCCGTAAATCCTAAACACATCAACGTGGGTACTATCGGGCGT
GTTGCGTTCGATGATACTCCGCTGAGTACGTGTATTAAAGCAGCGCTTGGTACGCTGCCT
GAGCCTCACCACCACGACTGGGAGGCAACTTTACCC

I think it might be a bug in the Biopython wrapper, since the FastTree (version 2.1.3 SSE3) options don't include an "out" option.

modified 19 months ago • written 19 months ago by russianconcussion20

OK, I see; thanks for the info. FastTree actually does have an -out option:

FastTree -out tree protein_alignment

Here tree is your output .tree file and protein_alignment is your .fasta input (in your case, the .codon file). Have you tried running FastTree -gtr -nt -out 245_hypothetical_protein.tree 245_hypothetical_protein.codon?

modified 19 months ago • written 19 months ago by Rodrigo140

Hmmm. It definitely doesn't have an '-out' option in my version of FastTree (2.1.3 SSE3). For example, if I try:

FastTree -nt -gtr test.codon

This works fine and writes a tree to stdout:

FastTree Version 2.1.3 SSE3
Alignment: test.codon
Nucleotide distances: Jukes-Cantor Joins: balanced Support: SH-like 1000
Search: Normal +NNI +SPR (2 rounds range 10) +ML-NNI opt-each=1
TopHits: 1.00*sqrtN close=default refresh=0.80
ML Model: Generalized Time-Reversible, CAT approximation with 20 rate categories
Initial topology in 0.00 seconds
Refining topology: 4 rounds ME-NNIs, 2 rounds ME-SPRs, 2 rounds ML-NNIs
Total branch-length 0.109 after 0.00 sec
ML-NNI round 1: LogLk = -390.459 NNIs 0 max delta 0.00 Time 0.00
Turning off heuristics for final round of ML NNIs
GTR Frequencies: 0.2225 0.2319 0.2693 0.2763
GTR rates(ac ag at cg ct gt) 11.5787 1.2121 8.1377 3.9478 3.5543 1.0000
Switched to using 20 rate categories (CAT approximation)
Rate categories were divided by 0.649 so that average rate = 1.0
CAT-based log-likelihoods may not be comparable across runs
Use -gamma for approximate but comparable Gamma(20) log-likelihoods
ML-NNI round 2: LogLk = -377.465 NNIs 0 max delta 0.00 Time 0.01
Turning off heuristics for final round of ML NNIs (converged)
Optimize all lengths: LogLk = -377.465 Time 0.01
Total time: 0.02 seconds Unique: 2/23 Bad splits: 0/0
((0:0.0,22:0.0):0.05643,(1:0.0,2:0.0,3:0.0,4:0.0,5:0.0,6:0.0,7:0.0,8:0.0,9:0.0,10:0.0,11:0.0,12:0.0,13:0.0,14:0.0,15:0.0,16:0.0,17:0.0,18:0.0,19:0.0,20:0.0,21:0.0):0.05643);

However, if I add the '-out' option to the same command:

FastTree -nt -gtr test.codon -out test.out

I get no tree and a lecture on how to use FastTree:

  FastTree protein_alignment > tree
  FastTree -nt nucleotide_alignment > tree
  FastTree -nt -gtr < nucleotide_alignment > tree
FastTree accepts alignments in fasta or phylip interleaved formats

Common options (must be before the alignment file):
  -quiet to suppress reporting information
  -nopr to suppress progress indicator
  -log logfile -- save intermediate trees, settings, and model details
  -fastest -- speed up the neighbor joining phase & reduce memory usage
        (recommended for >50,000 sequences)
  -n <number> to analyze multiple alignments (phylip format only)
        (use for global bootstrap, with seqboot and CompareToBootstrap.pl)
  -nosupport to not compute support values
  -intree newick_file to set the starting tree(s)
  -intree1 newick_file to use this starting tree for all the alignments
        (for faster global bootstrap on huge alignments)
  -pseudo to use pseudocounts (recommended for highly gapped sequences)
  -gtr -- generalized time-reversible model (nucleotide alignments only)
  -noml to turn off maximum-likelihood
  -nome to turn off minimum-evolution NNIs and SPRs
        (recommended if running additional ML NNIs with -intree)
  -nome -mllen with -intree to optimize branch lengths for a fixed topology
  -cat # to specify the number of rate categories of sites (default 20)
      or -nocat to use constant rates
  -gamma -- after optimizing the tree under the CAT approximation,
      rescale the lengths to optimize the Gamma20 likelihood
  -constraints constraintAlignment to constrain the topology search
       constraintAlignment should have 1s or 0s to indicates splits
  -expert -- see more options
For more information, see http://www.microbesonline.org/fasttree/

I think it might be down to my version of FastTree or a bug in the wrapper.

modified 19 months ago • written 19 months ago by russianconcussion20
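
For anyone stuck on an older FastTree build that rejects -out: the run without -out shown earlier still prints the tree to stdout, so it can be captured and written to a file by hand. Below is a minimal sketch using only the Python standard library. The helper name run_and_save_stdout and the file names in the comments are hypothetical and are not part of Biopython or FastTree.

```python
import subprocess


def run_and_save_stdout(args, out_path):
    """Run a command and write whatever it prints on stdout to out_path."""
    result = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    with open(out_path, "w") as handle:
        handle.write(result.stdout)
    # Anything the program wrote to stderr is returned for inspection
    return result.stderr


# Hypothetical usage, assuming FastTree is on the PATH:
# run_and_save_stdout(["FastTree", "-nt", "-gtr", "test.codon"], "test.tre")
```

Because check=True is set, a failing run raises an exception instead of silently writing an empty tree file.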

The -out option is available in version 2.1.10, so it may just be a matter of updating your FastTree. Also, options must come before the alignment file, so the correct command is FastTree -nt -gtr -out test.out test.codon.

modified 19 months ago • written 19 months ago by Rodrigo140
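
The ordering rule from the usage text ("must be before the alignment file") is easy to get wrong when assembling a command by hand. Here is a small sketch that always puts the alignment last. The function build_fasttree_args is a hypothetical helper, and only the flags mentioned in this thread (-nt, -gtr, -out) are handled.

```python
def build_fasttree_args(exe, alignment, out=None, nt=False, gtr=False):
    """Return an argument list with every option placed before the alignment."""
    args = [exe]
    if nt:
        args.append("-nt")
    if gtr:
        args.append("-gtr")
    if out is not None:
        args.extend(["-out", out])
    args.append(alignment)  # the alignment file always goes last
    return args


args = build_fasttree_args("FastTree", "test.codon", out="test.out", nt=True, gtr=True)
print(" ".join(args))  # FastTree -nt -gtr -out test.out test.codon
```

The resulting list can be passed directly to subprocess.run, which avoids shell quoting problems with unusual file names.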

Oops! Yes, you're right; I wrote the command wrong. Regardless, after updating to FastTree 2.1.10, the Biopython wrapper works! It was my ancient version of FastTree that was causing me trouble. Thank you for your help, Rodrigo!

written 19 months ago by russianconcussion20
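
Since the root cause was the FastTree version, it can help to check the version banner before relying on -out. The sketch below parses a banner line such as the "FastTree Version 2.1.3 SSE3" shown earlier in the thread. The function parse_fasttree_version is hypothetical. The thread only establishes that 2.1.3 lacks -out and 2.1.10 has it, so (2, 1, 10) is used here as a known-good version, not as the release where the option first appeared.

```python
import re


def parse_fasttree_version(banner):
    """Extract the version from a banner line as a tuple of ints, or None."""
    match = re.search(r"version\s+(\d+(?:\.\d+)*)", banner, re.IGNORECASE)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


old = parse_fasttree_version("FastTree Version 2.1.3 SSE3")
print(old)  # (2, 1, 3)
print(old < (2, 1, 10))  # True: older than the known-good version
print("2.1.3" < "2.1.10")  # False: plain string comparison gets this wrong
```

Comparing tuples of integers matters here because a plain string comparison would rank 2.1.3 above 2.1.10.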