dxy (Nei 1987) calculation
6.7 years ago

Are there any scripts out there that can calculate absolute divergence (dxy) between populations/species from many alignments of sequence data?

snp next-gen sequence alignment • 8.2k views

Could you give the right reference to the paper and possibly explain the terms in the equation?


The equation is described in Box 1 in a recent Molecular Ecology paper by Cruickshank and Hahn (2014). Link to the open access paper: http://onlinelibrary.wiley.com/doi/10.1111/mec.12796/pdf.
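
For readers who cannot open the link, the definition usually attributed to Nei (1987) can be written as below. This is a sketch of the standard form; the symbol names follow common usage and may differ slightly from the notation in Box 1.

```latex
d_{XY} = \sum_{i}\sum_{j} x_i \, y_j \, d_{ij}
% x_i    : frequency of the i-th haplotype in population X
% y_j    : frequency of the j-th haplotype in population Y
% d_{ij} : number of nucleotide differences per site between
%          haplotype i (from X) and haplotype j (from Y)
```

When every sampled sequence is treated as its own haplotype with equal weight, this reduces to the mean per-site difference over all X-by-Y sequence pairs.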


Thanks, and I should have put this in my first comment: do you have an example dataset to play with? (In general, it's good to make life easier for those willing to answer questions...)


Below are 3 example alignments

Locus1:

#NEXUS
begin data;
    dimensions ntax=2 nchar=460;
    format datatype=dna missing=? gap=-;
matrix
species1 ?????????????????????TAGTCCTTACACTGTAAAAAACTTTGGAATTGTTTGACCCTGTAAACACAAAATTCATGTCTCTCACCCTGGGACAAATACATTCTTTTTAAAAGCAGCATATGGGCAGCCTTGGACTGATGTTAGTTTATTGTCACTGCTTGATAACATTTAATGGAAAAGATACAAGAGTGCCAAAGAATTTTAATTATTTTTGTGATAAAGTTATATGTTCGGCCTTGAAAAAGTGGAGATAATGCTGGGATTCATTATTATTCCCAGTGTGTTTAAACAGACGACACAGAATGCAAACAAAAGCAGATGAAATTTGAAAAGTATTATCAATATT???????????????????????????????????????????????????????????????????????????????????????????????????????????????
species2 TGTGTGCATGCACAGATTTTTTAGTCCTTACACTGTAAAAAACTTTGGAATTGTTTGACCCTGTAAACACAAAATTCATGTCTCTCACCCTGGGACAAATACATTCTTTTTAAAAGCAGCATATGGGCAGCCTTGGACTGATGTTAGTTTATTGTCACTGCTTGATAACATTTAATGGAAAAGATACAAGAGTGCCAAAGAATTTTAATTATTTTTGTGATAAAGTTATATGTTCGGCCTTGAAAAAGTGGAGATAATGCTGGGATTCATTATTATTCCCAGTGTGTTTAAACAGACGACACAGAATGCAAACAAAAGCAGATGAAATTTGAAAAGTATTATCAATATTGCAGATAGCAGATGCCCTTTCCAATCAGAACAAGCATATCTTCTATAGCAACTTTATGGTTGAGTAGTTTATTCATTTCTATTAGAAGGTTGTACGTTTCTAAAATATGTA
;
end;

Locus2:

#NEXUS
begin data;
    dimensions ntax=2 nchar=582;
    format datatype=dna missing=? gap=-;
matrix
species1 ????????????????????????????????????????AACCACAATTGGTTGTCTGTTTTCTACTTTATGACATTTCCACTGAAAATTGTAATTCTTTTTTGCTGTGTTCTATTCCCCTTGTACGGAGTGTCCCCTTGGGAAGTGGGGCCCAAGAGCCCTTTCTAGGATGGGACAGGATATTACAGCTTGGTTTGTGCACCAGCATCCTTAATATTTCCTTCCTTTCAGAAGCAAATAGAGCGTACCCTTATCTGAATGCTAATTTCCTAGTTAAAACCCTCCCTTGCTGACAAGGGACTGAAAGAGTTTTAAATCACAGATGTAGAGTATCAAATGCAATAATGCTCTTGCAATAGTGCATTGAAGCCTCAATTAATTAACCCTTGGGCTAAGTAGGCAGGTACATGGTGGTGGCCACAGGCGGTGGATGGATGAGATTTAAATGGGCATCTCATTTCCTCA????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
species2 ????????????????????????????TTCTGAAAAATAAACCACAATTGGTTGTCTGTTTTCTACTTTATGACATTTCCACTGAAAATTGTAATTCTTTTTTGCTGTGTTCTATTCCCCTTGTACGGAGTGTCCCCTTGGGAAGTGGGGCCCAAGAGCCCTTTCTAGGATGGGACAGGATATTACAGCTTGGTTTGTGCACCAGCATCCTTAATATTTCCTTCCTTTCAGAAGCAAATAGAGCGTACCCTTATCTGAATGCTAATTTCCTAGTTAAAACCCTCCCTTGCTGACAAGGGACTGAAAGAGTTTTAAATCACAGATGTAGAGTATCAAATGCAATAATGCTCTTGCAATAGTGCATTGAAGCCTCAATTAATTAACCCTTGGGCTAAGTAGGCAGGTACATGGTGGTGGCCACAGGCGGTGGATGGATGAGATTTAAATGGGCATCTCATTTCCTCAGCACGGAACATGCCGTTTGATTCAGAAAGGAGTCATTTTACACACTCGCCTCATTTACGCTCAGCTTTAATCCCTTTAATTCCACCTGAGATCCAAGCAAGAATGGGAAAAGAGAG
;
end;

Locus3:

#NEXUS
begin data;
    dimensions ntax=2 nchar=610;
    format datatype=dna missing=? gap=-;
matrix
species1 ?????TATGTCTTGGTCTAGACTGAAGCAGAAACTCCAGGTCAGACATATGGTGACTGAAAAAGTGCATGTTATTTATTCATATCTCTTAATGTGAAATGTGTATTTGAAGAGACTTAAAATCTCTGAAAGAGCCAATTACTCTCAGCTTTTTAATTCTAGCAATACATTTGGAACATTTTCATTGTTCTAAGGGTTAAAAACCTCACCGTGACAATGATGAGCCTTATTACTCAGTCAAAGTAAATGGATCACCATATAACCTTTCAGAAATGTTCTTCCTTAAGCTATTAAAACATTCCATGCCCTTAGATGACAACAATTTCTCTGCCTTTTGAAATTTCTTTTCTATCCTGCAGAGTTCATAGAGATATGCTTGGTTAAAATCAACTTATATAAAACTATGCACTGTAAATTCTGACACTTCTGTTTGAATCTCTTTTCAAACACTTGTCTTTGCTCACCATAATAGATGTCAGTTCTTCTGATGTAGTTCAAGCATGAGCTTCATATGAAGACTCAGCTATGTCTATTGCATTTCTGAAGCTACTACTCACTGAAGTTTTGTGCTGTTTGACATCAAAGATAGGCAAGAATCACCTGCTGAGTTC
species2 TAAGGTATGTCTTGGTCTAGACTGAAGCAGAAACTCCAGGTCAGACATATGGTGACTGAAAAAGTGCATGTTATTTATTCATATCTCTTAATGTGAAATGTGTATTTGAAGAGACTTAAAATCTCTGAAAGAGCCAATTACTCTCAGCTTTTTAATTCTAGCAATACATTTGGAACATTTTCATTGTTCTAAGGGTTAAAAACCTCACCGTGACAATGATGAGCCTTATTACTCAGTCAAAGTAAATGGATCACCATATAACCTTTCAGAAATGTTCTTCCTTAAGCTATTAAAACATTCCATGCCCTTAGATGACAACAATTTCTCTGCCTTTTGAAATTTCTTTTCTATCCTGCAGAGTTCATAGAGATATGCTTGGTTAAAATCAACTTATATAAAACTATGCACTGTAAATTCTGACACTTCTGTTTGAATCTCTTTTCAAACACTTGTCTTTGCTCACCATAATAGATGTCAGTTCTTCTGATGTAGTTCAAGCATGAGCTTCATATGAAGACTCAGCTATGTCTATTGCATTTCTGAAGCTACTACTCACTGAAGTTTTGTGCTGTTTGACATCAAAGATAGGCAAGAATCACCTGCTGAGTTC
;
end;
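
For alignments in this shape, a minimal Python sketch of the calculation might look like the following. It is not taken from any of the tools mentioned in this thread. The function names (`parse_nexus_matrix`, `pairwise_differences`, `dxy`) are hypothetical, the parser assumes a simple non-interleaved NEXUS matrix with one sequence per line, and sites where either sequence has `?`, `-`, or an ambiguity code are skipped. Differences and comparable sites are pooled across loci, which is one common convention rather than the only one.

```python
VALID = set("ACGT")


def parse_nexus_matrix(text):
    """Return {taxon: sequence} from a simple, non-interleaved NEXUS matrix."""
    seqs = {}
    in_matrix = False
    for line in text.splitlines():
        line = line.strip()
        if line.lower() == "matrix":
            in_matrix = True
            continue
        if in_matrix:
            if line.startswith(";"):
                break  # end of the matrix block
            if line:
                name, seq = line.split(None, 1)
                seqs[name] = seq.replace(" ", "").upper()
    return seqs


def pairwise_differences(seq_a, seq_b):
    """Return (differences, comparable sites) for two aligned sequences."""
    diffs = sites = 0
    for a, b in zip(seq_a, seq_b):
        if a in VALID and b in VALID:  # skip missing data, gaps, ambiguity codes
            sites += 1
            diffs += a != b
    return diffs, sites


def dxy(loci, pop_x, pop_y):
    """Per-site differences over all X-by-Y sequence pairs, pooled across loci.

    loci  : list of {taxon: sequence} dicts, one per alignment
    pop_x : taxon names belonging to population/species X
    pop_y : taxon names belonging to population/species Y
    """
    total_diffs = total_sites = 0
    for seqs in loci:
        for x in pop_x:
            for y in pop_y:
                if x in seqs and y in seqs:
                    d, n = pairwise_differences(seqs[x], seqs[y])
                    total_diffs += d
                    total_sites += n
    return total_diffs / total_sites if total_sites else float("nan")


# Toy alignment (made up for illustration, not one of the loci above).
toy = """#NEXUS
begin data;
    dimensions ntax=2 nchar=10;
    format datatype=dna missing=? gap=-;
matrix
species1 ??ACGTACGT
species2 TTACGTACGA
;
end;
"""

loci = [parse_nexus_matrix(toy)]
result = dxy(loci, ["species1"], ["species2"])
print(result)  # 1 difference over 8 comparable sites -> 0.125
```

To run it on real data, read each NEXUS file into a string, append the parsed dict to `loci`, and pass the taxon names of each population. With more than one sequence per population, every X-by-Y pair contributes to the pooled totals.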

I've implemented π (nucleotide diversity), which is very similar to dxy: https://github.com/zeeev/popFastaaa

If more people express interest I will implement dxy.

6.5 years ago
polcarel • 0

Hi. Back to the question above: how can I calculate window-based dxy between two populations from a genome-wide SNP dataset using R? I am still at an early stage with R and Perl.

Many thanks in advance.


Homework? Why use R?


I guess R would be the fastest way to do this. By the way, I eventually managed to calculate nucleotide diversity (π) in 10 kb windows for each population (πx and πy). This may be another silly question, but can I use these π values to estimate dxy between the two populations?
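
On the last point: πx and πy only describe diversity within each population, so they do not determine dxy on their own. They become useful once dxy is known, for example in Nei's net divergence, da = dxy - (πx + πy) / 2. One way to get windowed dxy directly is from per-SNP allele frequencies. The sketch below is in Python rather than R and makes several assumptions: SNPs are biallelic, `p_x` and `p_y` are the frequencies of the same allele in each population, positions are 1-based, and every site in a window is callable unless a per-window count is supplied. The function name `windowed_dxy` and its parameters are hypothetical.

```python
from collections import defaultdict


def windowed_dxy(snps, window_size=10_000, callable_sites=None):
    """Windowed dxy from biallelic SNP allele frequencies.

    snps           : iterable of (chrom, pos, p_x, p_y), pos is 1-based
    window_size    : window length in bp (non-overlapping windows)
    callable_sites : optional {(chrom, window_start): n_sites}; defaults
                     to window_size for every window (an assumption)

    Returns {(chrom, window_start): dxy}. Windows without SNPs are omitted.
    """
    sums = defaultdict(float)
    for chrom, pos, p_x, p_y in snps:
        start = (pos - 1) // window_size * window_size + 1
        # Probability that one allele drawn from X and one drawn from Y differ.
        sums[(chrom, start)] += p_x * (1 - p_y) + p_y * (1 - p_x)

    result = {}
    for key, total in sums.items():
        n_sites = callable_sites.get(key, window_size) if callable_sites else window_size
        result[key] = total / n_sites
    return result


# Made-up example: a fixed difference, a shared polymorphism, and one SNP
# that falls in the second 10 kb window.
snps = [
    ("chr1", 100, 1.0, 0.0),
    ("chr1", 5000, 0.5, 0.5),
    ("chr1", 10001, 0.2, 0.2),
]
windows = windowed_dxy(snps)
for (chrom, start), value in sorted(windows.items()):
    print(chrom, start, value)
```

Dividing by the window length instead of the number of callable sites biases dxy downward in poorly covered windows, so passing real per-window site counts through `callable_sites` is worth the extra bookkeeping.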

2.6 years ago
beausoleilmo ▴ 490

In this paper, there is a link to the supplementary material, which includes a script named genomescan_dxy.pl.

I don't understand Perl well enough to read and interpret what the script does, but it could be one way of doing it.
