Question: Converting ms or msms output to PLINK or VCF files
GabrielMontenegro (United Kingdom) wrote 3.7 years ago:

There are two widely used coalescent simulation tools: Hudson’s MS (Hudson 2002), as well as Ewing’s MSMS (Ewing and Hermisson 2010), which incorporates selection.

I was wondering if anyone knew of, or had, a script that could convert the output of these two programs (which share the same format) to either PLINK or VCF format.

Example output from ms or msms:

segsites: 3
positions: 0.05509 0.21466 0.70900
000
110
100
100
100
100
100
100
001
001
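
For reference, a block like the one above can be read into a list of positions and a list of haplotype strings with a few lines of Python. This is only a minimal sketch: the function name `parse_ms_block` is made up for illustration, and it assumes a single replicate whose haplotype lines contain only the characters 0 and 1.

```python
def parse_ms_block(text):
    """Parse one ms/msms replicate into (positions, haplotypes)."""
    segsites = None
    positions = []
    haplotypes = []
    for raw in text.strip().splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("segsites:"):
            segsites = int(line.split()[1])
        elif line.startswith("positions:"):
            positions = [float(x) for x in line.split()[1:]]
        elif segsites is not None and set(line) <= {"0", "1"}:
            # One haplotype per line, one character per segregating site.
            haplotypes.append(line)
    if segsites is None:
        raise ValueError("no 'segsites:' line found")
    if len(positions) != segsites or any(len(h) != segsites for h in haplotypes):
        raise ValueError("site count does not match 'segsites:'")
    return positions, haplotypes


example = """\
segsites: 3
positions: 0.05509 0.21466 0.70900
000
110
100
100
100
100
100
100
001
001
"""

positions, haplotypes = parse_ms_block(example)
print(len(haplotypes), "haplotypes,", len(positions), "sites")
# prints: 10 haplotypes, 3 sites
```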
modified 4 months ago by alex.klassmann • written 3.7 years ago by GabrielMontenegro

Hello, I have the same question. Have you resolved this? Thanks.

written 2.9 years ago by karine.durand0

This is a tangent but I really like msprime as it can give you VCF format directly.

written 2.9 years ago by Gabriel R.

Another comment: I do not know what you are trying to do, but ms produces this output under an infinite-sites model, so each column represents a mutation that occurred on some branch of the tree. To quote the manual: "An infinite sites model of mutation is assumed, and thus multiple-hits and back mutations do not occur. However, when used in conjunction with other programs, finite site mutation models or micro-satellite models can be studied. For example, the gene trees themselves can be output, and these gene trees can be used as input to other programs which will evolve the sequences under a variety of finite-site models." In the past I have used seq-gen to do this, with the gene trees as input, as the manual describes.

written 2.9 years ago by Gabriel R.

Hello, I would also like to do this. Is there a more straightforward way to do it than using msprime?

Many thanks

written 2.1 years ago by Mr Locuace100

Hello! I'm facing the same issue as well... I've tried to use ms2geno but haven't managed to make it work properly. Has anyone found a better option?
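
For the PLINK side of the question, one option that needs no extra tools is to write the text .ped/.map pair directly. Below is a minimal Python sketch, not a tested converter. The function name `ms_to_plink`, the sequence length of 1000, the chromosome label "1", the sample names, and the A/T allele labels are all assumptions chosen for illustration. It pairs consecutive haplotypes into diploid individuals and does not resolve sites that round to the same base-pair position.

```python
def ms_to_plink(positions, haplotypes, length=1000, chrom="1"):
    """Return (.map text, .ped text) for one ms replicate."""
    if len(haplotypes) % 2:
        raise ValueError("need an even number of haplotypes to form diploids")
    alleles = {"0": "A", "1": "T"}  # arbitrary labels: ancestral / derived

    map_lines = []
    for i, pos in enumerate(positions):
        bp = max(1, round(pos * length))  # scale [0, 1) to base pairs
        # .map columns: chromosome, variant ID, genetic distance, bp position
        map_lines.append(f"{chrom}\tsnp{i + 1}\t0\t{bp}")

    ped_lines = []
    for k in range(0, len(haplotypes), 2):
        h1, h2 = haplotypes[k], haplotypes[k + 1]
        name = f"ind{k // 2 + 1}"
        # .ped columns: FID, IID, father, mother, sex, phenotype, allele pairs
        fields = [name, name, "0", "0", "0", "-9"]
        for a, b in zip(h1, h2):
            fields += [alleles[a], alleles[b]]
        ped_lines.append(" ".join(fields))

    return "\n".join(map_lines) + "\n", "\n".join(ped_lines) + "\n"


map_text, ped_text = ms_to_plink(
    [0.05509, 0.21466, 0.70900], ["000", "110", "100", "001"]
)
print(map_text, end="")
print(ped_text, end="")
```

The two strings can be written to files that share a prefix (for example sim.map and sim.ped) and then loaded with PLINK's --file option.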

written 8 months ago by sonia.olaechea140

alex.klassmann wrote 4 months ago:

I've written a C++ utility 'ms2vcf' which converts ms output into vcf; for simplicity I added the code to a small package with other ms-related stuff: see https://sourceforge.net/projects/coatli/ and the wiki there (last entry).

written 4 months ago by alex.klassmann

I downloaded the coatli package and am trying the ms2vcf function, but I can't see how to specify the input file; there is no option for it. Could you please give me some pointers or a full command line?

written 4 months ago by gaowei11600

The function is intended for piping only and thus has no options to specify input and output files, sorry. A full command line is given in the Wiki there:

ms 10 2 -t 10 -precision 16 | ms2vcf -length 1000 > msout.vcf

However, note that converting ms output to VCF violates the infinite-sites model, since a VCF can accommodate only a finite number of integer positions. This obviously limits its usefulness.
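
To make the integer-position issue concrete, here is a minimal Python sketch of such a conversion. It is not the ms2vcf code, whose internals are not described here. The function name `ms_to_vcf`, the sample names, the contig label "1", the placeholder A/T alleles, and the rule of pushing a colliding site to the next free base are all assumptions for illustration.

```python
def ms_to_vcf(positions, haplotypes, length=1000, chrom="1"):
    """Return VCF text for one ms replicate, pairing haplotypes into diploids."""
    if len(haplotypes) % 2:
        raise ValueError("need an even number of haplotypes to form diploids")
    n_ind = len(haplotypes) // 2
    samples = [f"ind{i + 1}" for i in range(n_ind)]
    out = [
        "##fileformat=VCFv4.2",
        f"##contig=<ID={chrom},length={length}>",
        '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t"
        + "\t".join(samples),
    ]
    last = 0
    for j, pos in enumerate(positions):
        # Map the continuous position in [0, 1) onto 1..length; if an earlier
        # site already occupies that coordinate, shift to the next free base.
        bp = max(last + 1, min(length, int(pos * length) + 1))
        if bp > length:
            raise ValueError("ran out of integer positions; increase length")
        last = bp
        gts = [
            f"{haplotypes[2 * i][j]}|{haplotypes[2 * i + 1][j]}"
            for i in range(n_ind)
        ]
        # REF/ALT are placeholders: ms only reports ancestral (0) vs derived (1).
        out.append("\t".join(
            [chrom, str(bp), f"snp{j + 1}", "A", "T", ".", "PASS", ".", "GT"] + gts
        ))
    return "\n".join(out) + "\n"


# Two sites that both map to base 6 when length=100; the second is moved to 7.
vcf = ms_to_vcf([0.0501, 0.0502, 0.905], ["000", "110", "100", "001"], length=100)
print(vcf, end="")
```

With a small length relative to the number of segregating sites, shifts like this become frequent, which is exactly where the converted data stop reflecting the infinite-sites output.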

written 3 months ago by alex.klassmann