Question: Is It Possible To Infer Population Genetics Parameters Like Ne Using De-Novo Sequencing Data Of Pooled Samples?
10
9.0 years ago by
Lhl730
United States
Lhl730 wrote:

Hi there,

I have used a 454 GS FLX to sequence (de novo) two plant ecotypes (two divergent populations, each adapted to its own habitat) by pooling 16 individuals from each ecotype. So far I have finished the assembly and the SNP and indel detection. I have also calculated population parameters such as Watterson's theta (θ = 4Neμ) and pi (which is expected to equal theta under neutral equilibrium). However, I am not sure whether it is possible to infer other parameters such as Ne (effective population size) or the divergence time of the two ecotypes.
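For readers who want the two estimators spelled out, here is a minimal sketch of the classical (individual-sample) versions of Watterson's theta and pi. The function names and the input format (one minor-allele count per segregating site, out of n sampled chromosomes) are assumptions made for illustration. With pooled reads the allele counts are not observed directly, so these uncorrected formulas are only a starting point; pool-specific corrections such as those discussed later in the thread would be needed.

```python
def watterson_a(n_chromosomes):
    """Watterson's constant a_n = sum_{i=1}^{n-1} 1/i."""
    return sum(1.0 / i for i in range(1, n_chromosomes))


def watterson_theta(num_segregating_sites, n_chromosomes):
    """Watterson's estimator: theta_W = S / a_n (per region, not per site)."""
    return num_segregating_sites / watterson_a(n_chromosomes)


def nucleotide_diversity(minor_allele_counts, n_chromosomes):
    """Pi as the sum over sites of 2p(1-p) * n/(n-1), for biallelic sites.

    minor_allele_counts: hypothetical input, one count per segregating site.
    """
    n = n_chromosomes
    total = 0.0
    for k in minor_allele_counts:
        p = k / n
        total += 2.0 * p * (1.0 - p) * n / (n - 1)
    return total


if __name__ == "__main__":
    # Illustrative numbers only: 16 diploid individuals -> 32 chromosomes.
    n = 32
    counts = [1, 3, 16, 8, 2]
    print("theta_W:", watterson_theta(len(counts), n))
    print("pi:", nucleotide_diversity(counts, n))
```

Dividing either value by the number of sites surveyed gives a per-site estimate, which is the form usually compared across regions.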

By the way, I would like to know how you identify SNP outliers in your data, if you have done or are doing the same thing. Is it better to use an Fst-based approach or Fisher's exact test?
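To make the two options concrete, here is a rough per-SNP sketch of both: a simple heterozygosity-based Fst from the allele frequencies of the two pools, and a two-sided Fisher's exact test on the 2x2 table of allele counts. Function names, the equal weighting of the two pools, and the example counts are assumptions for illustration, and neither function accounts for the extra sampling noise that pooling and read depth introduce.

```python
from math import comb


def fst_two_pops(p1, p2):
    """Fst = (Ht - Hs) / Ht for one biallelic SNP, pools weighted equally."""
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        return 0.0  # site is monomorphic across both pools
    return (h_total - h_within) / h_total


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Rows are pools, columns are allele counts (e.g. reference, alternate).
    Sums the probabilities of all tables no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return min(1.0, sum(prob(x) for x in range(lo, hi + 1)
                        if prob(x) <= p_obs * (1.0 + 1e-9)))


if __name__ == "__main__":
    # Illustrative allele counts at one SNP in two pools.
    print(fst_two_pops(8 / 10, 1 / 6))
    print(fisher_exact_two_sided(8, 2, 1, 5))
```

In practice either statistic would be computed for every SNP and the outliers taken from the tail of the genome-wide distribution, with a multiple-testing correction for the Fisher p-values.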

Elzed

population analysis sequencing • 6.8k views
ADD COMMENTlink modified 6.3 years ago by Giovanni M Dall'Olio26k • written 9.0 years ago by Lhl730

What system did you use to detect SNPs and INDELs in your dataset?

ADD REPLYlink written 8.2 years ago by Erik Garrison2.2k

Sorry for the late reply. By system, do you mean software? I tried Mosaik and BWA-SW + Samtools for alignment and SNP calling.

ADD REPLYlink written 8.0 years ago by Lhl730
4
9.0 years ago by
Casey Bergman18k
Athens, GA, USA
Casey Bergman18k wrote:

Yes, there is some recent work on this problem; see Futschik & Schlötterer (2010) Genetics.

EDIT: see associated code base at PoPOOLation (Hat tip to RaghuM's answer on this related thread)

ADD COMMENTlink modified 9 weeks ago by RamRS24k • written 9.0 years ago by Casey Bergman18k
4

As you probably are aware, under the standard neutral model you can infer Ne from theta if you assume a mutation rate. You'd have to dig deeper or contact the authors about more complex demographic scenarios. You may want to post your question to evoldir (http://evol.mcmaster.ca/evoldir.html) for a more community-specific response to this question.
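As a small worked example of that point: under the standard neutral model for a diploid population, θ = 4Neμ, so Ne = θ / (4μ). The sketch below just rearranges that formula. The function name and the numbers in the example are hypothetical and not taken from this thread; theta and the mutation rate must be on the same scale (for example both per site, with μ per generation).

```python
def ne_from_theta(theta, mutation_rate):
    """Effective population size from theta = 4 * Ne * mu (diploid, neutral)."""
    if mutation_rate <= 0:
        raise ValueError("mutation_rate must be positive")
    return theta / (4.0 * mutation_rate)


if __name__ == "__main__":
    # Illustrative values: theta = 0.01 per site, mu = 1e-8 per site per generation.
    print(ne_from_theta(0.01, 1e-8))
```

The estimate is only as good as the assumed mutation rate: halving μ doubles the inferred Ne.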

ADD REPLYlink written 9.0 years ago by Casey Bergman18k

Yes, thanks. I read the paper. But do you have any ideas about inferring demographic history, like Ne?

ADD REPLYlink written 9.0 years ago by Lhl730

Thanks a lot, I will try that.

ADD REPLYlink written 9.0 years ago by Lhl730

Thanks Casey, that's a very cool community.

ADD REPLYlink written 9.0 years ago by Lhl730

And does that mean I have to identify regions that are evolving neutrally? Could I define a neutral region simply as one with Theta close to 0?

ADD REPLYlink written 9.0 years ago by Lhl730

And does that mean I have to identify regions that are evolving neutrally? Could I define a neutral region simply as one with Theta close to Pi?

ADD REPLYlink written 9.0 years ago by Lhl730
3
6.3 years ago by
London, UK
Giovanni M Dall'Olio26k wrote:

The software PSMC can infer how the effective population size of a species has changed over time, using only a single diploid genome sequence.

[Figure: estimated history of effective population size in human populations, from Li and Durbin 2012]

ADD COMMENTlink written 6.3 years ago by Giovanni M Dall'Olio26k

I am sorry if I am interrupting the discussion above.

How can I scale down the Y axis (effective population size)?

The scale on my PSMC plot is too large, so the changes in effective population size over time cannot be made out.

ADD REPLYlink modified 10 weeks ago by RamRS24k • written 4.4 years ago by nadiahtohoku0
2
9.0 years ago by
David W4.7k
New Zealand
David W4.7k wrote:

Have you considered the Extended Bayesian Skyline (Heled and Drummond 2008, tutorial here)?

Presuming you have aligned sequences, you should be able to infer changes in population size. (Unless you have an estimate of the mutation rate of some of your genes, you won't be able to express it in 'real' numbers, but that's not always the goal anyway.)

ADD COMMENTlink written 9.0 years ago by David W4.7k
1

Just be aware that skyline assumes no recombination. To counteract this, we should have a sufficient number of loci, I think.

ADD REPLYlink written 9.0 years ago by lh331k
1

I don't know about ms (isn't that a simulation program?). To do the Bayesian analysis you'll need to give each 'partition' in your data a substitution model (so non-coding seqs probably don't need the full GTR, for instance).

One of the problems with using massive multi-locus datasets in this sort of analysis is deciding what a partition is. It's tempting to set each locus as one partition, but that can be a PITA computationally and probably over-fits the data. (I don't have the solution to that problem, by the way, just a warning ;)

ADD REPLYlink written 9.0 years ago by David W4.7k

And should I discriminate between coding and non-coding regions when using this software?

ADD REPLYlink written 9.0 years ago by Lhl730

Thanks. That is a good point. However, should I discriminate between coding and non-coding regions when processing my datasets?

ADD REPLYlink written 9.0 years ago by Lhl730

And do you think it is possible to use ms to solve the same problem?

ADD REPLYlink written 9.0 years ago by Lhl730

Thanks David. ms is a coalescent simulation program created by Richard R. Hudson at the University of Chicago. It is available at https://webshare.uchicago.edu/xythoswfs/webui/users/rhudson1/Public/ms.folder?action=frameset&subaction=print&uniq=yzld0b&stk=2B23BE1D462EA92

ADD REPLYlink written 9.0 years ago by Lhl730
1
8.8 years ago by
Paolo Gratton10 wrote:

Hello!

I have stepped into this thread, which is really interesting. It seems to me that nobody has mentioned what looks to me like a very important matter. Elzed's data are from 16 pooled individuals, without tagging, right? Is it possible to retrieve the true frequency of each haplotype in each sample? If it is not, how is it possible to use coalescent-based algorithms like BEAST's EBSP?

I hope this thread is still active, since I guess I am not grasping something and I would really like to know what.

Paolo

ADD COMMENTlink written 8.8 years ago by Paolo Gratton10

Thanks for your interest in this topic, Paolo, and sorry for the late reply; I was travelling abroad. I have two pools, each consisting of 16 individuals. Each pool has a unique tag. Futschik and Schlötterer (2010) proposed a method to estimate population genetics parameters: http://www.genetics.org/cgi/content/full/186/1/207

I would like to continue our discussion over this rub.

ADD REPLYlink written 8.7 years ago by Lhl730