Clustered bacterial RefSeq?
9 months ago
predeus ★ 1.9k

Hi all,

I am sure I am missing something obvious, and hope someone can point me in the right direction.

I was wondering if there are datasets similar to UniRef90/UniRef50, but built from bacterial RefSeq genome sequences, e.g. by clustering on something like ANI (average nucleotide identity)? Basically it would be good to have a "rarefied" database of, say, 10-20k genomes defined by some sort of clustering, without 1000 E. coli genomes and the like.

Thank you in advance, as always!

refseq clustering ANI bacteria • 428 views
9 months ago
GenoMax 142k

This sounds like NCBI's Prokaryotic representative reference genome sequences: https://www.ncbi.nlm.nih.gov/refseq/about/prokaryotes/#representative_genomes

Here is that list (17,500 genomes as of July 2023): https://www.ncbi.nlm.nih.gov/genome/browse#!/prokaryotes/refseq_category:representative

There is already a BLAST database available for this set: ref_prok_rep_genomes
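
If you would rather pull the same selection out of a RefSeq assembly summary table yourself, here is a minimal sketch. It is not from the thread and rests on assumptions: that you have downloaded the bacterial assembly_summary.txt from the NCBI genomes FTP area, that it is tab-separated with a `#`-prefixed header line, and that the `refseq_category` column uses the labels "reference genome" and "representative genome". Check the labels against the current file, since they may have changed. The function name and the sample rows are made up for illustration.

```python
import io

# Assumed labels in the refseq_category column; verify against the current file.
SELECTED = {"reference genome", "representative genome"}


def select_representatives(lines, categories=SELECTED):
    """Yield (accession, organism_name, ftp_path) for rows in the chosen categories."""
    header = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            # The column names sit on a commented line; other comment lines are skipped.
            fields = line.lstrip("#").strip().split("\t")
            if "assembly_accession" in fields:
                header = fields
            continue
        if header is None or not line:
            continue
        row = dict(zip(header, line.split("\t")))
        if row.get("refseq_category") in categories:
            yield (
                row["assembly_accession"],
                row.get("organism_name", ""),
                row.get("ftp_path", ""),
            )


# Tiny made-up table with only the columns the function reads.
SAMPLE = (
    "#   See README\n"
    "# assembly_accession\trefseq_category\torganism_name\tftp_path\n"
    "GCF_A.1\trepresentative genome\tOrganism one\tpath/a\n"
    "GCF_B.1\tna\tOrganism one\tpath/b\n"
    "GCF_C.1\treference genome\tOrganism two\tpath/c\n"
)

for accession, organism, path in select_representatives(io.StringIO(SAMPLE)):
    print(accession, organism, path)
```

With the real table, pass an open file handle instead of the in-memory sample, e.g. `select_representatives(open("assembly_summary.txt"))`.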


Thank you! I must have seen the "representative genome" descriptor a hundred times, yet it never occurred to me that's what it is.
