Assess The Characteristics Of High Numbers Of Bacterial Species
10.8 years ago
Daniel ★ 4.0k

When investigating a 16S rRNA dataset, I often identify several dozen to several hundred taxa (species or families) that are found in higher or lower abundance between samples. I then start doing literature searches to see what they could be doing, where they have been observed before, etc.

To me this approach seems:

  1. Really selective, only sampling a few papers for each species.
  2. Limiting, as there is no way to do this fully for tens of species.
  3. Incredibly time consuming.

What I'm really looking for is a system into which I can put a list of taxa and have it say "Those ones are all anoxic" or "Those 5 have shown denitrifying ability". I don't know whether this could be done with literature mining, or where to start, or whether there is an existing database that curates data like this...

Any suggestions are appreciated.

10.8 years ago
Daniel ★ 4.0k

Following Neil's advice, and after reading some papers, I went back to IMG to see what metadata they collect with their published genomes, and it's actually quite substantial.

Clicking into the 'Genome Browser' section presents you with a list of species and some basic data on each sequencing project. But if you navigate to the bottom of the page, there is a set of selectable metadata fields (screenshot: IMG_Metadata_example).

This allows you to generate a table with the data you're interested in. A selection like the one above produces a table of genomes and their metadata, which can then be exported to Excel or TSV format and used however you like. (Full-size screenshot available here: http://i.imgur.com/KjMXRDu.png)

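Once the table is exported as TSV, the "put a taxa list in, get traits out" lookup from the question can be approximated with a few lines of Python. This is only a minimal sketch: the column names `Genus` and `Oxygen Requirement`, the function name `group_taxa_by_trait`, and the sample rows are all assumptions for illustration, not actual IMG output, so adjust them to whatever headers your export contains.

```python
import csv
import io
from collections import defaultdict


def group_taxa_by_trait(tsv_file, taxa, name_col, trait_col):
    """Group the taxa of interest by the value of one metadata column.

    Returns (groups, missing):
      groups  - dict mapping each trait value to a sorted list of taxa
      missing - sorted list of taxa that have no row in the table
    """
    # Case-insensitive lookup that remembers the caller's original spelling.
    wanted = {t.strip().lower(): t for t in taxa}
    groups = defaultdict(set)
    seen = set()
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        key = (row.get(name_col) or "").strip().lower()
        if key not in wanted:
            continue
        seen.add(key)
        # Empty cells are common in curated metadata; label them explicitly.
        value = (row.get(trait_col) or "").strip() or "unknown"
        groups[value].add(wanted[key])
    missing = sorted(t for k, t in wanted.items() if k not in seen)
    return {v: sorted(ts) for v, ts in groups.items()}, missing


# Toy rows standing in for an exported table (hypothetical, not real IMG data).
sample_tsv = (
    "Genus\tOxygen Requirement\n"
    "Clostridium\tAnaerobe\n"
    "Bacteroides\tAnaerobe\n"
    "Pseudomonas\tAerobe\n"
    "Escherichia\tFacultative\n"
)
my_taxa = ["Clostridium", "Bacteroides", "Pseudomonas", "Nitrospira"]

groups, missing = group_taxa_by_trait(
    io.StringIO(sample_tsv), my_taxa, "Genus", "Oxygen Requirement"
)
for trait, members in groups.items():
    print(trait, "->", ", ".join(members))
print("Not in table:", ", ".join(missing))
```

For a real export you would pass an open file instead of the `io.StringIO` object, e.g. `open("img_export.tsv", newline="")` (the file name is hypothetical). A taxon with several genomes carrying different annotations will appear under more than one trait value, and the `missing` list shows which taxa the table does not cover at all.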


That's great. I had ignored IMG since people often want more than "just the organisms with sequenced genomes". And I had not looked at it in a while. But there are so many microbial genomes now...and great to see the associated physiological data.


I was just going to tick this 'answered', but I'm not allowed to do so on my own post... Do you know if there is a way to do it?


I thought this used to be possible. See what Istvan says (contact him if no response here).

10.8 years ago
Neilfws 49k

There are not many good online resources with programmatic access for this kind of task. Microbiology data still tends to be stashed away in expensive manuals and books.

One good starting point might be the Microbial Life Database and links off that page. In particular they have a useful Google spreadsheet.

Less comprehensive but also useful: MicrobeWiki.


Thanks, Neil, for putting me on to this. The Google doc looked nice and full of data, but unfortunately it is a bit out of date, and several of the families and orders I'm looking at aren't present. But whilst exploring the links and papers, I've found that IMG (Integrated Microbial Genomes) actually has a large amount of metadata, which is looking really good. I'll put it as a separate answer as I think it's worth noting. Thanks again.
