How to determine % similarity between genomes?
7 months ago
A_heath ▴ 60

Hi all,

I am aligning multiple bacterial genomes and I would like to know how I can obtain a % identity between these genomes.

Is this something that either Mugsy or Mauve can report?

Thank you for your answers.

Audrey

genome alignement mugsy mauve % identity • 345 views
7 months ago
5heikki 10.0k

I recommend Mash

mash dist genome1.fna genome2.fna
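
If you want to post-process the result for many genomes: as far as I know, mash dist prints one tab-separated line per genome pair with five fields (reference, query, distance, p-value, shared hashes). Here is a minimal Python sketch for reading such a line. The helper name parse_mash_dist_line and the sample line are made up for illustration and are not real output.

```python
from typing import NamedTuple


class MashHit(NamedTuple):
    reference: str
    query: str
    distance: float
    p_value: float
    shared: int       # hashes shared between the two sketches
    sketch_size: int  # number of hashes compared (1000 by default)


def parse_mash_dist_line(line: str) -> MashHit:
    """Parse one tab-separated line of `mash dist` output (hypothetical helper)."""
    ref, query, dist, pval, hashes = line.rstrip("\n").split("\t")
    shared, size = hashes.split("/")
    return MashHit(ref, query, float(dist), float(pval), int(shared), int(size))


# Made-up sample line, for illustration only (not real output).
sample = "genome1.fna\tgenome2.fna\t0.0206126\t0\t480/1000"
hit = parse_mash_dist_line(sample)
print(hit.query, hit.distance, f"{hit.shared}/{hit.sketch_size}")
```

The same function could be applied to every line of a saved mash dist output file, and the hits sorted by distance to rank genomes from closest to most distant.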

Thank you for your help.

I used Mash and got the following results:

Mygenome.fasta Close_genome_1.fasta 0.0196 0 494/1000

Mygenome.fasta Close_genome_2.fasta 0.0174 530/1000

I do not really understand the meaning of the two scores.

In this case, which genome is closer? Genome 1 or 2?

Thanks in advance


Close_genome_2 is closer, since it has the smaller distance. The third column is the Mash distance, and ANI ≈ 1 - Mash distance, so here 1 - 0.0174 = 0.9826, i.e. about 98.26% similarity. The last column shows the number of shared hashes out of the sketch size (1,000 by default). You can get more precise results if you first sketch your genomes with, e.g., a k-mer size of 17 and a sketch size of 10,000 (mash sketch -k 17 -s 10000 input.fna) and then compare the resulting .msh files with mash dist.
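
To make the link between the two columns concrete, here is a small Python sketch. It assumes the distance formula from the Mash paper, D = -1/k * ln(2j / (1 + j)) with j = shared hashes / sketch size, and the default k-mer size of 21. The thread does not say which k was used, so that is an assumption, and the function names are made up for illustration.

```python
import math


def mash_distance(shared: int, sketch_size: int, k: int = 21) -> float:
    """Mash distance from shared-hash counts: D = -1/k * ln(2j / (1 + j))."""
    j = shared / sketch_size  # Jaccard estimate from the two sketches
    if j == 0:
        return 1.0  # no shared hashes: treat the pair as maximally distant
    return -math.log(2 * j / (1 + j)) / k


def approx_ani_percent(distance: float) -> float:
    """Approximate ANI in percent, using ANI ~ 1 - Mash distance."""
    return (1 - distance) * 100


# Shared-hash counts reported in the thread (sketch size 1000).
results = {"Close_genome_1.fasta": 494, "Close_genome_2.fasta": 530}

for name, shared in results.items():
    d = mash_distance(shared, 1000)
    print(f"{name}\tD={d:.4f}\tANI~{approx_ani_percent(d):.2f}%")

# The genome with the smallest distance (most shared hashes) is the closest.
closest = min(results, key=lambda name: mash_distance(results[name], 1000))
print("closest:", closest)
```

With k = 21 this gives distances of roughly 0.0197 and 0.0175, which agrees with the 0.0196 and 0.0174 reported above (those look truncated). More shared hashes means a smaller distance, so both columns point to Close_genome_2.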

7 months ago
Carambakaracho ★ 2.7k

What you're probably looking for is average nucleotide identity (ANI).

This is a tool I always wanted to test, but it is no longer relevant for me:

FastANI (publication)

Further reading from a quick web search:

https://www.sciencedirect.com/science/article/pii/S0580951714000087

https://img.jgi.doe.gov/docs/ANI.pdf
