Sequence divergence of whole genomes
3.0 years ago
Colaptes ▴ 90

Hello,

I sequenced and assembled the genomes of two closely related species, and I would like to calculate the pairwise sequence divergence between the two whole genomes. I have already done this for many orthologous genes extracted from the genomes, but now I would like to align the whole genomes and calculate overall divergence. The genomes are in FASTA format.

I know how to align the genomes and get a SAM/BAM file, and I know how to calculate sequence divergence between aligned FASTA entries, but I am not sure how to link the two steps. Is there a good way to get FASTA-format alignments for whole genomes, or is there another way to calculate their sequence divergence? (Bonus if there is a way to exclude coding regions stored in GFF files!)

Thank you!

alignment divergence • 1.1k views
3.0 years ago
Mensur Dlakic ★ 27k

Would you be interested in whole-genome average nucleotide identity? If so, this program will do what you want:

https://github.com/ParBLiSS/FastANI

It is fast and does not require any external programs.


Thanks! I would definitely be interested in something like that. I should have mentioned that my genomes are eukaryotic (~1 Gb, ~10% repeat), and it seems that the developers of FastANI warn that it is not tested for use on eukaryotic genomes.
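
One possible way to link the alignment step to a divergence estimate, offered only as a rough sketch: if the whole-genome aligner can write PAF (minimap2 does, for example), each record already carries the number of matching bases (column 10) and the alignment block length (column 11). Summing those over all records gives an approximate genome-wide divergence without producing FASTA alignments first. The function name `paf_divergence`, the filter parameters, and the toy records below are made up for illustration. Note that the block length includes gaps, so this estimate counts indels as differences, and overlapping or repeat-induced alignments are not deduplicated here.

```python
import io


def paf_divergence(lines, min_mapq=0, min_block=0):
    """Approximate divergence as 1 - (total matches / total block length).

    `lines` is any iterable of PAF records (e.g. an open file).
    Records below `min_mapq` or shorter than `min_block` are skipped.
    """
    total_matches = 0
    total_block = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # not a complete PAF record
        n_match = int(fields[9])     # column 10: matching bases
        block_len = int(fields[10])  # column 11: alignment block length (gaps included)
        mapq = int(fields[11])       # column 12: mapping quality
        if mapq < min_mapq or block_len < min_block:
            continue
        total_matches += n_match
        total_block += block_len
    if total_block == 0:
        raise ValueError("no alignments passed the filters")
    return 1.0 - total_matches / total_block


# Toy records (hypothetical sequence names and coordinates).
TOY_PAF = (
    "chr1_spA\t5000\t0\t1000\t+\tchr1_spB\t6000\t100\t1100\t980\t1000\t60\n"
    "chr1_spA\t5000\t2000\t2500\t-\tchr1_spB\t6000\t3000\t3500\t450\t500\t10\n"
)

if __name__ == "__main__":
    print(paf_divergence(io.StringIO(TOY_PAF)))
    print(paf_divergence(io.StringIO(TOY_PAF), min_mapq=20))
```

For a real file, the same function could be called as `paf_divergence(open("aln.paf"))`, where `aln.paf` is a placeholder file name. With ~10% repeats, a mapping-quality filter such as `min_mapq=20` is one way to reduce the influence of ambiguous alignments, though the right threshold is a judgment call. Excluding coding regions would need an extra step that drops or clips records overlapping the GFF intervals, which this sketch does not attempt.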
