how to annotate draft genome
3.7 years ago
zhangdengwei ▴ 210

Hi all,

I have several bacterial draft genomes assembled with SPAdes. After checking them with CheckM, I want to annotate the ones with 'contamination' < 5% in order to identify their species. I aligned the draft genomes to the nt database, but I found that one particular genome could be assigned to several different bacteria, so I am afraid this approach is not sound. Is there a good approach for quick taxonomic annotation of bacterial genomes? Thanks in advance.

contigs bacteria draft genome • 1.5k views

Can you clarify what you mean by "'contamination' < 5%"?

Are these mixed samples (e.g. metagenomes)? Are you expecting contamination? If you have single, clean draft genome assemblies, Prokka is a tool of choice for annotation.


Yes, my draft genomes were derived from metagenomes, and I tried to split them into single bacterial genomes. Afterward, I used CheckM to determine whether each one is clean. Each separated genome has hundreds of contigs. So now I want to know which organism each separated genome belongs to.
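
For anyone who wants to script the "contamination < 5%" selection, here is a minimal sketch. It assumes a tab-separated CheckM summary with `Bin Id` and `Contamination` columns (check the header of your own file, as the column names here are an assumption), and the three bins in the sample table are hypothetical values for illustration only.

```python
import csv
import io

# Hypothetical sample table; real values come from your own CheckM run.
# Column names are an assumption about the tab-separated summary format.
SAMPLE_TABLE = (
    "Bin Id\tCompleteness\tContamination\n"
    "bin_01\t95.2\t1.3\n"
    "bin_02\t88.0\t7.9\n"
    "bin_03\t61.5\t4.9\n"
)


def clean_bins(handle, max_contamination=5.0):
    """Return the IDs of bins whose contamination is below the cutoff."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        row["Bin Id"]
        for row in reader
        if float(row["Contamination"]) < max_contamination
    ]


# In practice, pass an open file handle instead of the in-memory sample.
selected = clean_bins(io.StringIO(SAMPLE_TABLE))
print(selected)
```

The cutoff is a parameter, so the same function can be reused with a stricter or looser threshold.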


I would try using a dedicated metagenomic annotation and binning pipeline such as: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-020-03585-4

Though I've never tried it myself, so I can't vouch for it.

3.7 years ago
Mensur Dlakic ★ 27k

Assuming that you have a computer with at least 128 GB of RAM (or a combination of RAM + swap > 128 GB), the most consistent way of doing this is with the GTDB toolkit (GTDB-Tk). Below is an example of the final output for each bin that is more than 10% complete (it is truncated on the right side so that it doesn't run too far out). As you can see, most metagenomic bins are classified down to the family or genus level, with a couple of them having a species designation.

group_06        d__Archaea;p__Crenarchaeota;c__Thermoprotei;o__Desulfurococcales;f__Desulfurococcaceae;g__Thermosphaera;s__Thermosphaera aggregans
group_11        d__Archaea;p__Nanoarchaeota;c__Nanoarchaeia;o__Nanoarchaeales;f__Nanopusillaceae;g__Nanopusillus;s__
group_16        d__Archaea;p__Crenarchaeota;c__Thermoprotei;o__Desulfurococcales;f__Acidilobaceae;g__;s__
group_04        d__Archaea;p__Crenarchaeota;c__Thermoprotei;o__Thermoproteales;f__Thermoproteaceae;g__Pyrobaculum;s__
group_18        d__Archaea;p__Crenarchaeota;c__Thermoprotei;o__Thermofilales;f__Thermofilaceae;g__;s__ 
group_15        d__Bacteria;p__Deinococcota;c__Deinococci;o__Deinococcales;f__Thermaceae;g__Thermus;s__Thermus aquaticus
group_03        d__Bacteria;p__Aquificota;c__Aquificae;o__Aquificales;f__Aquificaceae;g__Thermocrinis;s__
group_17        d__Bacteria;p__Desulfobacterota;c__Thermodesulfobacteria;o__Thermodesulfobacteriales;f__Thermodesulfobacteriaceae;g__;s__
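
If you want to summarize such lineage strings programmatically, the following is a minimal sketch (not part of the GTDB toolkit itself) that splits each string on its rank prefixes and reports the lowest rank that actually has a name. The function names are my own, and the three example lines are copied from the output above.

```python
RANKS = ["domain", "phylum", "class", "order", "family", "genus", "species"]
PREFIXES = ["d__", "p__", "c__", "o__", "f__", "g__", "s__"]


def parse_lineage(lineage):
    """Split a GTDB-style lineage string into a {rank: name} dict."""
    taxa = {}
    for field in lineage.strip().split(";"):
        field = field.strip()
        prefix, name = field[:3], field[3:]
        if prefix in PREFIXES:
            taxa[RANKS[PREFIXES.index(prefix)]] = name
    return taxa


def lowest_assigned_rank(lineage):
    """Return (rank, name) for the most specific rank that is not empty."""
    taxa = parse_lineage(lineage)
    for rank in reversed(RANKS):
        if taxa.get(rank):
            return rank, taxa[rank]
    return None, None


# Example lines taken from the output shown above.
lines = [
    "group_06        d__Archaea;p__Crenarchaeota;c__Thermoprotei;o__Desulfurococcales;f__Desulfurococcaceae;g__Thermosphaera;s__Thermosphaera aggregans",
    "group_11        d__Archaea;p__Nanoarchaeota;c__Nanoarchaeia;o__Nanoarchaeales;f__Nanopusillaceae;g__Nanopusillus;s__",
    "group_16        d__Archaea;p__Crenarchaeota;c__Thermoprotei;o__Desulfurococcales;f__Acidilobaceae;g__;s__",
]

summary = {}
for line in lines:
    # Split only on the first run of whitespace: species names contain spaces.
    bin_id, lineage = line.split(None, 1)
    summary[bin_id] = lowest_assigned_rank(lineage)
    print(bin_id, *summary[bin_id])
```

An empty `g__` or `s__` field simply means no assignment at that rank, so the sketch falls back to the next rank up.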

Many thanks. That's what I want.

3.7 years ago

As Mensur said, GTDB can be used to determine the taxonomic affiliation of your bins. In case you do not have 128 GB of RAM, the GTDB toolkit is also available in KBase.
