Question: Assess The Characteristics Of High Numbers Of Bacterial Species
Daniel (3.5k), Cardiff University, wrote 4.3 years ago:

When investigating a 16S rRNA dataset, I often identify several dozen to several hundred species or families that are found at higher or lower abundance. I then start doing literature searches to see what they could be doing, where they have been observed before, etc.

To me this sounds:

  1. Really selective, only sampling a few papers for each species.
  2. Limiting, as there is no way to do this fully for tens of species.
  3. Incredibly time consuming.

What I'm really looking for is a system which I can put a taxa list into and it'll say "Those ones are all anoxic" or "Those 5 have shown denitrifying ability". I don't know if this could be done with literature mining or where to start this, or if there is a database around in the world which curates data like this...

Any suggestions are appreciated.

Daniel (3.5k), Cardiff University, wrote 4.3 years ago:

Following Neil's advice, and after reading some papers, I went back to IMG to see what metadata they collect with their published genomes, and it's actually quite substantial.

Clicking into the 'Genome Browser' section presents you with a list of species and some basic data on each sequencing project. If you navigate to the bottom of the page, there is a field selection panel (screenshot: IMG_Metadata_example).

This allows you to generate a table with the data you're interested in. A selection like the one above results in the following output, which can then be exported to Excel or TSV format and used however you would like. (Full size available here: http://i.imgur.com/KjMXRDu.png)

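As a rough sketch of what "used however you would like" could look like: assuming the export is a TSV with one row per genome, something like the Python below groups a taxa list by one metadata column. The column names (`Genome Name`, `Oxygen Requirement`), the function names, and the matching rule (the first word of the genome name is taken as the genus) are all assumptions for illustration, not IMG's documented format, so check the headers in your own export.

```python
import csv
import io
from collections import defaultdict


def read_trait_rows(tsv_file):
    """Read a tab-separated export into a list of dicts keyed by column header."""
    return list(csv.DictReader(tsv_file, delimiter="\t"))


def group_taxa_by_trait(taxa, rows, name_column, trait_column):
    """Map each trait value to the sorted taxa that have a genome with that value.

    Assumption of this sketch: a taxon matches a row when it equals the
    first word of the row's name (case-insensitive), i.e. genus-level matching.
    """
    wanted = {taxon.lower(): taxon for taxon in taxa}
    groups = defaultdict(set)
    for row in rows:
        words = (row.get(name_column) or "").split()
        value = (row.get(trait_column) or "").strip()
        if not words or not value:
            continue  # skip rows with no name or no recorded trait value
        taxon = wanted.get(words[0].lower())
        if taxon is not None:
            groups[value].add(taxon)
    return {value: sorted(members) for value, members in groups.items()}


if __name__ == "__main__":
    # Made-up rows standing in for an exported table; not real organisms or data.
    example = io.StringIO(
        "Genome Name\tOxygen Requirement\n"
        "Genusone alpha 1\tAnaerobe\n"
        "Genustwo beta 7\tAerobe\n"
        "Genusone gamma 3\tAnaerobe\n"
    )
    rows = read_trait_rows(example)
    groups = group_taxa_by_trait(
        ["Genusone", "Genustwo"], rows, "Genome Name", "Oxygen Requirement"
    )
    for value, members in groups.items():
        print(value, members)
```

Because a taxon can have several genomes with different recorded values, the same taxon may show up under more than one trait value. Families or orders would need a different column to match on.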


That's great. I had ignored IMG since people often want more than "just the organisms with sequenced genomes". And I had not looked at it in a while. But there are so many microbial genomes now...and great to see the associated physiological data.

Written 4.3 years ago by Neilfws (47k)

I was just going to tick this 'answered', but I'm not allowed to do so on my own post... Do you know if there is a way to do it?

Written 4.2 years ago by Daniel (3.5k)

I thought this used to be possible. See what Istvan says (contact him if no response here).

Written 4.2 years ago by Neilfws (47k)
Neilfws (47k), Sydney, Australia, wrote 4.3 years ago:

There are not many good online resources with programmatic access for this kind of task. Microbiology data still tends to be stashed away in expensive manuals and books.

One good starting point might be the Microbial Life Database and links off that page. In particular they have a useful Google spreadsheet.

Less comprehensive but also useful: MicrobeWiki.


Thanks, Neil, for putting me on to this. The Google doc looked nice and full of data, but it is unfortunately a bit out of date, and several of the families and orders I'm looking at aren't present. While exploring the links and papers, though, I've found that IMG (Integrated Microbial Genomes) actually has a large amount of metadata, which is looking really good. I'll put it as a separate answer as I think it's worth noting. Thanks again.

Written 4.3 years ago by Daniel (3.5k)