Simulate Phylogeny With ms
11.7 years ago
jianfeng.mao ▴ 30

I am new to ms and simulation. I would like to use ms to simulate a phylogeny, and then use Seq-Gen to simulate DNA sequences based on ms output.

ms runs fine with the population demographic parameters I specified, but I have a few simple questions about interpreting its output.

Could you please point me in the right direction? Thanks a lot in advance.


(1) ms outputs the taxon names as integers. I would like to know which population each of those taxa comes from (in my case there are several meta-populations, and each population may have a different number of haplotypes).

(2) The phylogeny that ms outputs does not seem to be what I want, so I am worried that I have specified the parameters to ms incorrectly. Could you please help me check whether my ms command is right? Please see the command I used, the output, and the phylogeny I want from the links.

(3) command I used

./msdir/ms 12 4000 -T -I 5 4 1 1 5 1 \
-n 1 0.0116299 -n 2 0.0001075 -n 3 0.0000948 -n 4 0.0075559 -n 5 0.00001 \
-ej 0.0022963 1 2 -en 0.0022963 2 0.000218 \
-ej 0.0022984 2 5 -en 0.0022984 5 0.0003916 \
-ej 0.0023044 3 5 -en 0.0023044 5 0.0099899 \
-ej 0.0232045 4 5 -en 0.0232045 5 0.222093 \
> genealogies_A.txt

Perhaps you could tell us what MS is? Multi-Stack? I searched online and could not find anything clear to me and I think I'm pretty current with phylogenetics. If this is a program I can't find anything on it.

Why do you want to simulate a phylogeny? What is your overall goal of doing this?


ms is a simulation tool. It can be found here: http://home.uchicago.edu/~rhudson1/source/mksamples.html


Thanks a lot, Josh. I would welcome help from any of you.

11.7 years ago
David W 4.9k

Jianfeng,

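Yes, ms uses integers for taxon names, but the integers correspond to the populations you set up, in the order you listed the sample sizes after -I. So, in your case (-I 5 4 1 1 5 1), 1-4 will be from population 1, 5 from population 2, 6 from population 3, 7-11 from population 4, and 12 from population 5. The manual for ms describes how to use it alongside Seq-Gen; in your case you'd do something like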

$ ms 12 1 -T -I 5 4 1 1 5 1 [demographic parameters] | tail -n +4 | grep -v // > test.tre
$ seq-gen -mGTR -l 800 < test.tre > test.phy

The two pipes in the first command clean up the output of ms so that only the trees are passed to Seq-Gen in the next command: tail drops the header lines (the echoed command line and the random seeds) and grep drops the // separator. If you'd prefer to avoid the pipes and redirects, there are functions to do this stuff in the phyclust library in R.
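If you later simulate more than one replicate, counting header lines gets fiddly, so here is a small sketch of an alternative filter that keeps only lines that look like Newick trees. The function name extract_trees is just something I made up, and it assumes each tree is printed on a single line that starts with "(" and ends with ";", which is what I'd expect from -T without recombination (with -r the trees carry a [length] prefix and this pattern would skip them).

```shell
# extract_trees: read ms -T output on stdin, print only the Newick tree lines.
# Assumption: every tree sits on one line starting with "(" and ending with ";".
extract_trees() {
    grep -E '^\(.*;$'
}

# Usage (assumes ms and seq-gen are on your PATH):
#   ms 12 1 -T -I 5 4 1 1 5 1 [demographic parameters] | extract_trees > test.tre
#   seq-gen -mGTR -l 800 < test.tre > test.phy
```

The header lines and the // separators never match the pattern, so they are dropped however many replicates there are.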

(btw, you should check your demographic commands: when I ran what you've written above I got a very deep divergence between the second-to-last population and all the others. Is that not what you were aiming for?)
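When checking which population sits on a deep branch, it helps to translate the tip numbers back to populations. Below is a rough sketch of that lookup; pop_of_sample is a hypothetical helper, not part of ms, and it relies on ms numbering the samples in the same order as the sample sizes given to -I.

```shell
# pop_of_sample INDEX N1 N2 ... : print the 1-based population that tip INDEX
# belongs to, given the per-population sample sizes in -I order.
# Assumption: ms numbers tips consecutively, population by population.
pop_of_sample() (
    idx=$1
    shift
    if [ "$idx" -lt 1 ]; then
        echo "tip labels start at 1" >&2
        return 1
    fi
    pop=0
    upper=0
    for n in "$@"; do
        pop=$((pop + 1))
        upper=$((upper + n))   # highest tip label in this population
        if [ "$idx" -le "$upper" ]; then
            echo "$pop"
            return 0
        fi
    done
    echo "tip $idx is beyond the total sample size" >&2
    return 1
)

# Sample sizes taken from: -I 5 4 1 1 5 1
i=1
while [ "$i" -le 12 ]; do
    echo "tip $i -> population $(pop_of_sample "$i" 4 1 1 5 1)"
    i=$((i + 1))
done
```

For your command this lists tips 1-4 under population 1 and tips 7-11 under population 4, so you can see at a glance which tips should end up on the deep branch.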
