most sequenced genomes (microbial)
5.8 years ago
savscosta • 0

I would like to know whether there is a way to find out which microorganisms have the largest number of sequenced genomes deposited in databases (NCBI). I tried to get this information from the NCBI website, but did not succeed.

genomes microbial • 1.2k views

Thank you.

I want the microorganisms with more complete genomes deposited in NCBI. Can you help me?


What do you mean by more complete genomes?

Note: Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.


Sorry, I meant "the microorganisms with the largest number of complete genomes deposited".


See my second answer below (A: most sequenced genomes).

5.8 years ago
GenoMax 141k

Get the prokaryote genomes summary file (prokaryotes.txt) from NCBI, then count how often each organism name occurs in the first column:

awk -F '\t' '{print $1}' prokaryotes.txt | sort | uniq -c | sort -k1,1nr > bact

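This gives the following (truncated for brevity). Note: this only counts the organism names exactly as they appear in the summary file, so entries for the same species under different names (for example, individual Salmonella enterica serovars) are counted separately.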

   8123 Escherichia coli
   8055 Streptococcus pneumoniae
   4598 Staphylococcus aureus
   3722 Mycobacterium tuberculosis
   3161 Klebsiella pneumoniae
   2492 Pseudomonas aeruginosa
   2372 Listeria monocytogenes
   2062 Salmonella enterica subsp. enterica serovar Typhi
   1923 Acinetobacter baumannii
   1351 Salmonella enterica
   1210 Neisseria meningitidis
   1115 Streptococcus suis
   1079 Clostridioides difficile
   1039 Shigella sonnei
    926 Campylobacter jejuni
    863 Bacillus cereus
    852 Mycobacteroides abscessus subsp. abscessus
    755 Enterococcus faecium
    727 Streptococcus agalactiae
    722 Campylobacter coli
    633 Bordetella pertussis
    600 Vibrio parahaemolyticus
    556 Salmonella enterica subsp. enterica serovar Typhimurium
    553 Enterobacter cloacae
    549 Helicobacter pylori
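
Because the counts above depend on the exact name strings, one possible variation is to collapse each name to its first two words (genus and species) before counting. This is only a rough sketch: the function name `count_by_species` is made up for illustration, it assumes the organism name is in column 1 of a tab-separated file with header lines starting with `#`, and it will mis-group names that do not follow the plain "Genus species" pattern (for example "Candidatus ..." or "... sp." entries).

```shell
# Hypothetical helper: count entries per "Genus species".
# Assumes: tab-separated input, organism name in column 1,
# header lines start with '#'.
count_by_species() {
    awk -F '\t' '!/^#/ && $1 != "" {
        split($1, w, " ")        # split the organism name into words
        print w[1] " " w[2]      # keep only genus and species epithet
    }' "$1" | sort | uniq -c | sort -k1,1nr
}
```

It could then be called as `count_by_species prokaryotes.txt > bact_species`, where the output file name is arbitrary.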
5.8 years ago
GenoMax 141k

To count only genomes marked as "Complete", use the following variation of the command from my first answer.

 grep -w "Complete" prokaryotes.txt | awk -F '\t' '{print $1}' - | sort | uniq -c | sort -k1,1nr > compl_genomes

This is the result at the time of writing.

444 Escherichia coli
343 Bordetella pertussis
177 Klebsiella pneumoniae
159 Staphylococcus aureus
127 Mycobacterium tuberculosis
105 Pseudomonas aeruginosa
 94 Listeria monocytogenes
 88 Campylobacter jejuni
 83 Streptococcus agalactiae
 82 Acinetobacter baumannii
 63 Neisseria meningitidis
 57 Corynebacterium pseudotuberculosis
 54 Helicobacter pylori
 52 Legionella pneumophila
 50 Bacillus velezensis
 49 Brucella melitensis
 47 Burkholderia pseudomallei
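
Note that `grep -w "Complete"` matches the word anywhere on a line, not only in the assembly status field. A stricter variation, sketched here under assumptions, is to test a single column. The function name `count_complete` is hypothetical, and the sketch assumes that the first line of the file is a tab-separated header containing a column literally named `Status` whose values begin with "Complete" for finished genomes; check the header of your copy of the file before relying on it.

```shell
# Hypothetical helper: count organisms whose status column starts with "Complete".
# $1 = summary file, $2 = header name of the status column (assumed "Status").
count_complete() {
    awk -F '\t' -v col="${2:-Status}" '
        NR == 1 {                            # find the status column in the header row
            for (i = 1; i <= NF; i++) if ($i == col) c = i
            next
        }
        c && $c ~ /^Complete/ { print $1 }   # keep names of rows marked Complete
    ' "$1" | sort | uniq -c | sort -k1,1nr
}
```

For example, `count_complete prokaryotes.txt > compl_genomes`. If the column name is not found in the header, the function prints nothing rather than guessing.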