Population genetics from SNP data, no VCF
4.6 years ago
vidyavuru • 0

Hi, this is going to be a long question:

My dataset is basically flowers that are tetraploid in nature. These flowers can be divided into wild and garden flowers.

1) From what I've read, Tajima's D and pi are population-specific. That is, they mostly describe diversity within a population and should mostly be calculated within a defined population.

Based on this, my idea was to align the sequences by type (wild/garden) and measure Tajima's D separately for each type from that alignment.

My problem is that the R package I found for estimating Tajima's D (PopGenome) does not consider heterozygous SNPs, and so would not be accurate for the tetraploid SNPs in these sequences. I then found the package 'snpR' (Hemstrow), which calculates Tajima's D and pi over a sliding window, but I'm unable to specify populations in that function. I ended up thinking that I could just split the data by flower type, but then I wouldn't be filtering out monomorphic SNPs. I also think that shouldn't matter, because Tajima's D is only measured within a given population.

Could you give any suggestions or ideas on how I could measure Tajima's D and pi?
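To make the calculation concrete, here is a minimal sketch in Python (not R, purely so it is self-contained) of nucleotide diversity and Tajima's D computed from alternate-allele dosages, run separately on each group. It rests on several assumptions that are not from any of the packages above: SNPs are biallelic, each tetraploid individual has a known dosage of 0 to 4 at every site, there is no missing data, and the sample size is taken as ploidy times the number of individuals. The function name `tajimas_d` and the example dosages are hypothetical. Pi is returned as a sum over sites, so divide by the window length for a per-base value.

```python
from math import sqrt


def tajimas_d(dosages, ploidy=4):
    """Return (pi, S, D) for one population.

    dosages: one list per individual, holding the alternate-allele dosage
             (0..ploidy) at each biallelic site. Assumes no missing data.
    pi is the sum of per-site pairwise differences, S the number of
    segregating sites, D Tajima's D (nan when S == 0).
    """
    n = ploidy * len(dosages)          # number of sampled chromosome copies
    n_sites = len(dosages[0])
    alt_counts = [sum(ind[j] for ind in dosages) for j in range(n_sites)]

    # Per-site expected pairwise difference: 2 * k * (n - k) / (n * (n - 1)).
    # Monomorphic sites (k == 0 or k == n) contribute exactly zero.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in alt_counts)
    S = sum(1 for k in alt_counts if 0 < k < n)
    if S == 0:
        return pi, S, float("nan")

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    d = (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))
    return pi, S, d


if __name__ == "__main__":
    # Hypothetical dosages: rows are individuals, columns are SNPs.
    wild = [[1, 0, 2, 0], [0, 1, 2, 0], [0, 0, 3, 0]]
    garden = [[2, 2, 0, 0], [2, 2, 0, 0]]
    for name, pop in (("wild", wild), ("garden", garden)):
        print(name, tajimas_d(pop))
```

In this formulation a site that is monomorphic within a group adds nothing to pi and is not counted in S, so leaving such sites in the per-group input does not change D.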

2) For Ka/Ks, I have come across numerous papers that use DnaSP, KaKs_Calculator, or seqinr::kaks(), but I don't think ploidy is really taken into consideration.

i) From what I understand, Ka/Ks is calculated between populations, to find the rate of mutation relative to a certain reference sequence. In that case, if, for example, I have 10 cut flowers and 10 flowers of other types, how do I compare Ka/Ks?

From what I read, I would take the CDS sequences, align them, and then measure Ka/Ks (using, for example, seqinr::kaks()), but then I might lose information on some of the SNPs by blindly aligning all flowers of the same type together. Is there a better procedure to handle this?


ii) I also found some analyses that assess Ka/Ks per position. My question then is: how do you compare between populations?

Is there a better way to assess Ka/Ks in R?
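For the per-position case, one building block is simply labelling each coding SNP as synonymous or nonsynonymous against the reference CDS. The sketch below, in Python rather than R, shows only that step; it is not a Ka/Ks estimate and is not how seqinr::kaks() works. It assumes the standard genetic code, an in-frame CDS on the forward strand, 0-based SNP positions, and single-base substitutions; `classify_snp` and `CODON_TABLE` are hypothetical names.

```python
# Standard genetic code, built in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)


def classify_snp(cds, pos, alt):
    """Label a single-base change in an in-frame CDS.

    cds: reference coding sequence (forward strand, starts at codon position 1)
    pos: 0-based position of the SNP within the CDS
    alt: alternate base
    Returns "synonymous" or "nonsynonymous" (stop gains/losses count as the latter).
    """
    cds = cds.upper()
    start = pos - pos % 3                      # first base of the affected codon
    ref_codon = cds[start:start + 3]
    if len(ref_codon) < 3:
        raise ValueError("SNP falls in an incomplete trailing codon")
    offset = pos - start
    alt_codon = ref_codon[:offset] + alt.upper() + ref_codon[offset + 1:]
    same = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same else "nonsynonymous"


if __name__ == "__main__":
    # Hypothetical CDS and SNPs given as (position, alternate base).
    cds = "ATGGCTTGG"
    for pos, alt in [(5, "C"), (3, "A"), (8, "A")]:
        print(pos, alt, classify_snp(cds, pos, alt))
```

With SNPs labelled this way, the synonymous and nonsynonymous counts (or dosage-weighted frequencies) could be tallied separately for each group and compared, which keeps every SNP in play without building a single consensus sequence per type.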

R SNP population genetics • 1.2k views

Are you wedded to estimating summary statistics in R? I'm a big fan of using R whenever I can, but I find the packages lacking for this level of non-model population genetics.
