Question

Composite Of Multiple Signals; Where Have You Gone?

5

13.4 years ago

Zev.Kronenberg 12k

This composite method for detecting natural selection seems to have made a big splash.

http://www.sciencemag.org/content/327/5967/883.abstract

However, is there an implementation that can be downloaded? I have read the paper, visited the lab's website, and poked around the internet, but had no luck.

Does anyone know of a newer paper or tool that I should consider using in place of CMS?

fst selection • 4.8k views

ADD COMMENT • link updated 12.2 years ago by Giovanni M Dall'Olio 28k • written 13.4 years ago by Zev.Kronenberg 12k

score 6 · Answer 1 · 2012-01-26

13.4 years ago

Neilfws 49k

According to this press release from 2010: "The software tools for CMS analysis are all homegrown, and should soon be available, perhaps wrapped into a program called Sweep for the long haplotype analysis."

I suggest you email the authors and ask how things are coming along :)

Of course, if journals required that software used for analyses be made available, we would not have this kind of problem. It astonishes me that so-called top-tier journals will accept the results of an analysis with no concern regarding the tools used to do it.

ADD COMMENT • link 13.4 years ago by Neilfws 49k

Even more disturbing is when, as a reviewer, you ask that the code used be made publicly available as a condition of publication, and the editor of the journal doesn't consider this to be a valid or relevant request...

ADD REPLY • link 13.4 years ago by Malachi Griffith 20k

I figured it wasn't available; I just thought I would ask. It seems pretty shady not to even release the code.

ADD REPLY • link 13.4 years ago by Zev.Kronenberg 12k

score 2 · Answer 2 · 2013-04-09

12.2 years ago

bmpbowen ▴ 40

http://www.broadinstitute.org/scientific-community/science/programs/medical-and-population-genetics/cms/cms-composite-multiple-si-0

ADD COMMENT • link 12.2 years ago by bmpbowen ▴ 40

Thanks for the update. It was probably released with: "Identifying Recent Adaptations in Large-Scale Genomic Data"?

ADD REPLY • link 12.2 years ago by Zev.Kronenberg 12k

score 1 · Answer 3 · 2013-04-09

I tried to implement the CMS some time ago, but in the end, due to lack of time, I could not complete it.

The problem with the CMS is that you have to create a set of simulations, using the 90 sets of parameters that can be found in the Supplementary Materials of the paper, plus one scenario for neutral evolution. The simulations have never been made available, but can be generated easily using cosi (you will have to adjust the allele frequency spectrum, and remove the simulations that are too different from the rest of the genome). So, even if you can find the CMS script, you also need the set of simulations; I suppose that because these are large files, the CMS has never been made available online.

Once you have the simulations, you have to calculate Fst, dDAF, iHS, and the other tests on them. From that, you calculate the distribution of values for each test in each of the two sets of simulations (neutral and selection). Then, you can calculate the p-value of a given SNP in the genome by calculating the same tests (Fst, dDAF, iHS, ...) on the real data and calculating the probability of observing each value in the two sets of simulations. The CMS is just the product, over all tests, of the ratios of the p-values calculated in this way. If you have another method to calculate the p-values, you can also consider implementing your own custom CMS, just by multiplying the p-values.
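To make the procedure above concrete, here is a minimal sketch of the "ratio per test, then multiply" idea. It is not the published CMS code: the function names (`bin_index`, `histogram`, `composite_score`), the bin edges, and the toy Gaussian values standing in for real cosi simulations are all assumptions made for illustration. Each test's distribution is estimated with a simple histogram, and the product is accumulated in log space for numerical stability.

```python
import bisect
import math
import random


def bin_index(edges, value):
    """Index of the bin holding value; out-of-range values go to the end bins."""
    i = bisect.bisect_right(edges, value) - 1
    return min(max(i, 0), len(edges) - 2)


def histogram(values, edges):
    """Bin probabilities, with an add-one pseudocount so no bin has probability 0."""
    counts = [1.0] * (len(edges) - 1)
    for v in values:
        counts[bin_index(edges, v)] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def composite_score(observed, neutral, selected, edges):
    """Product over tests of P(value | selected sims) / P(value | neutral sims).

    observed: dict mapping test name -> value at one SNP
    neutral, selected: dicts mapping test name -> list of simulated values
    edges: dict mapping test name -> sorted list of histogram bin edges
    """
    log_score = 0.0
    for test, value in observed.items():
        i = bin_index(edges[test], value)
        p_sel = histogram(selected[test], edges[test])[i]
        p_neu = histogram(neutral[test], edges[test])[i]
        log_score += math.log(p_sel / p_neu)
    return math.exp(log_score)


# Toy stand-in for simulations (hypothetical): neutral values centred on 0,
# selected values centred on 2, for two made-up test names.
rng = random.Random(0)
tests = ["fst", "ihs"]
edges = {t: [float(x) for x in range(-4, 7)] for t in tests}
neutral = {t: [rng.gauss(0.0, 1.0) for _ in range(2000)] for t in tests}
selected = {t: [rng.gauss(2.0, 1.0) for _ in range(2000)] for t in tests}

sweep_like = composite_score({"fst": 2.5, "ihs": 2.5}, neutral, selected, edges)
neutral_like = composite_score({"fst": -0.5, "ihs": -0.5}, neutral, selected, edges)
print(sweep_like, neutral_like)
```

With real data you would replace the toy lists with the test values computed on your neutral and selection simulations, and choose bin edges that suit each test's range.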

One big problem with the CMS is that it assumes that all the tests have the same ability to detect a selective sweep; for example, that Fst and iHS are equally able to detect selection. I personally don't think that this assumption holds, especially considering that these tests detect different types of selection. It would be better to use a method that can give a weight to each of the tests used for selection; for example, you can have a look at this paper (Lin et al. 2011, Distinguishing Positive Selection From Neutral Evolution: Boosting the Performance of Summary Statistics), where the authors used a technique called boosting.
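As a rough illustration of what "giving weights to each test" could look like, the sketch below combines per-test log ratios with a weight per test. This is not boosting and not the method of Lin et al.; it is only a hand-weighted sum, and the function name `weighted_composite`, the test names, and the numbers are hypothetical. With every weight set to 1 it reduces to the plain product of ratios.

```python
import math


def weighted_composite(log_ratios, weights):
    """Weighted sum of per-test log ratios.

    log_ratios: dict mapping test name -> log of that test's ratio at one SNP
    weights: dict mapping test name -> non-negative weight (1.0 = unweighted)
    """
    return sum(weights[test] * lr for test, lr in log_ratios.items())


# Made-up ratios for one SNP: Fst favours selection, iHS mildly favours neutrality.
log_ratios = {"fst": math.log(4.0), "ihs": math.log(0.5)}

equal = weighted_composite(log_ratios, {"fst": 1.0, "ihs": 1.0})
fst_only = weighted_composite(log_ratios, {"fst": 1.0, "ihs": 0.0})
print(math.exp(equal), math.exp(fst_only))
```

In practice the weights would have to be learned from the simulations rather than set by hand, which is the part that a technique such as boosting addresses.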