Get dS numbers _per branch_ with PAML codeml (or branch length estimates)
1
0
Entering edit mode
7.5 years ago
BlastedBadger ▴ 160

Hi, I am trying to estimate divergence times (of genes) in a gene tree, using dS (synonymous substitution rates). At least I am trying to map substitution numbers to branches, then if it corresponds to time is another question.

I am using complete gene trees from Ensembl (one gene tree at a time), including paralog genes, and the corresponding multiple alignements.

I have run codeml using a _free-ratio branch model_ (model=1, NSSites=0). I have trouble understanding if the output is relevant for what I need:

  • codeml doesn't actually outputs dS _for each_ branch, but rather a distance matrix, right? (in the file 2NG.dS)
  • Plus, the distance matrix in the .mlc and 2NG.dS files is the result from the Nei-Gojobori (1986) method, but (citing the manual) not MLEs [maximum-likelihood estimates] themselves. So are they just rough estimates but codeml can do better?
  • Then what exactly is 2NG.t? Time estimates proportional to dS? [EDIT: found it in the "yn00" section of the manual: "number of nucleotide substitutions per codon", i.e. dS and dN combined]
  • Do I need to run mcmctree or multidivtimes to estimate branch lengths?

PS: I paste here my control file if you need to have a look:

     seqfile = ENSGT00790000122969_genes.phy       * sequence data file name
    treefile = ENSGT00790000122969_genes.nwk       * tree structure file name
     outfile = ENSGT00790000122969.mlc             * main result file name

       noisy = 1   * 0,1,2,3,9: how much rubbish on the screen
     verbose = 1   * 1: detailed output, 0: concise output
     runmode = 0   * 0: user tree;  1: semi-automatic;  2: automatic
                   * 3: StepwiseAddition; (4,5):PerturbationNNI; -2: pairwise

     seqtype = 1   * 1:codons; 2:AAs; 3:codons-->AAs
   CodonFreq = 2   * 0:1/61 each, 1:F1X4, 2:F3X4, 3:codon table
       ndata = 1   * specifies the number of separate data sets in the file
       clock = 0   * 0: no clock, unrooted tree, 1: clock, rooted tree

      aaDist = 0   * 0:equal, +:geometric; -:linear, 1-6:G1974,Miyata,c,p,v,a
                   * 7:AAClasses

       model = 1   * models for codons:
                        * 0:one, 1:b, 2:2 or more dN/dS ratios for branches
                   * models for AAs or codon-translated AAs:
                        * 0:poisson, 1:proportional, 2:Empirical, 3:Empirical+F
                        * 6:FromCodon, 8:REVaa_0, 9:REVaa(nr=189)

     NSsites = 0   * 0:one w; 1:neutral; 2:positive selection; 3:discrete; 4:freqs;
                   * 5:gamma; 6:2gamma; 7:beta; 8:beta&w; 9:betaγ
                   * 10:beta&gamma+1; 11:beta&normal>1; 12:0&2normal>1;
                   * 13:3normal>0

       icode = 0   * 0:universal code; 1:mammalian mt; 2-11:see below
       Mgene = 0   * 0:rates, 1:separate;

   fix_kappa = 0   * 1: kappa fixed, 0: kappa to be estimated
       kappa = 2.05154 * initial or fixed kappa

   fix_omega = 0   * 1: omega or omega_1 fixed, 0: estimate
       omega = 1

       getSE = 0   * 0: don't want them, 1: want S.E.s of estimates
RateAncestor = 1   * (0,1,2): rates (alpha>0) or ancestral states (1 or 2)
  Small_Diff = .5e-6 * small value used in the difference approximation of derivatives

   cleandata = 0   * remove sites with ambiguity data (1:yes, 0:no)?
 fix_blength = 1   * 0: ignore, -1: random, 1: initial, 2: fixed
      method = 0   * 0: simultaneous; 1: one branch at a time
PAML codeml dS substitution rates phylogeny • 3.9k views
ADD COMMENT
0
Entering edit mode

oops, after reading more on the subject on Biostars, I realized that my codeml run was never properly finishing! It gives an error like Species ENSGT00790000122969.a? that I need to fix first. Will come back when it's done...

ADD REPLY
2
Entering edit mode
6.1 years ago
al-ash ▴ 200

Concerning your 1st question (dS for each branch) - you can find it among the large amount of output information in the outfile (ENSGT00790000122969.mlc in your case) in a tabular form (in section labeled "dN & dS for each branch") and also as branch lengths of your tree in section "dS tree:".

ADD COMMENT

Login before adding your answer.

Traffic: 1920 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6