16S NCBI database to train feature classifier
4.6 years ago
the_dummy ▴ 30

Hello, I want to train feature classifiers on NCBI data, as I did with the SILVA and GreenGenes databases. But I couldn't figure out which sequences I should get from NCBI, since the database is complex and not as straightforward as SILVA for 16S amplicon analysis. I need to get the reference sequences and taxonomy files from NCBI somehow. Any help would be appreciated. Thank you very much.

amplicon 16S NCBI classifier

NCBI has a collection of 16S sequences available as a pre-formatted BLAST database. These sequences come from two BioProjects, 33175 (bacterial 16S) and 33317 (archaeal 16S), which you can search for at NCBI.

You can recover the FASTA sequences from the BLAST database by converting it back to FASTA with the blastdbcmd utility included in the BLAST+ package.
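As a rough sketch of that conversion, the Python script below calls blastdbcmd and splits its output into a FASTA file and an accession-to-taxid table. It assumes BLAST+ is on your PATH and that the database was downloaded and unpacked under the name `16S_ribosomal_RNA` in the working directory; the output file names and the helper functions are made up for illustration. Note that it only gives you NCBI taxids, so you would still need to map those to full lineage strings before training a classifier.

```python
import subprocess

# Assumed database name and output paths; adjust to your setup.
DB_NAME = "16S_ribosomal_RNA"
FASTA_OUT = "ncbi_16S.fasta"
TAXMAP_OUT = "ncbi_16S_taxids.tsv"


def blastdbcmd_args(db):
    """Build a command that dumps every entry as accession,taxid,sequence."""
    # %a = accession, %T = taxid, %s = sequence data
    return ["blastdbcmd", "-db", db, "-entry", "all", "-outfmt", "%a,%T,%s"]


def split_records(lines):
    """Turn 'accession,taxid,sequence' lines into FASTA records and taxid rows."""
    fasta, taxmap = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        acc, taxid, seq = line.split(",", 2)
        fasta.append(f">{acc}\n{seq}")
        taxmap.append(f"{acc}\t{taxid}")
    return fasta, taxmap


if __name__ == "__main__":
    # Run blastdbcmd and capture its text output.
    result = subprocess.run(
        blastdbcmd_args(DB_NAME), capture_output=True, text=True, check=True
    )
    fasta, taxmap = split_records(result.stdout.splitlines())
    with open(FASTA_OUT, "w") as fh:
        fh.write("\n".join(fasta) + "\n")
    with open(TAXMAP_OUT, "w") as fh:
        fh.write("\n".join(taxmap) + "\n")
```

If you only need the sequences, passing `-outfmt %f` to blastdbcmd writes FASTA directly and you can skip the parsing step.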


Yes, thank you for pointing out the utility. I had found this data, but I assumed it wasn't relevant because there was no FASTA file. Great help!
