Question: Identifying Certain Genes in a Bacterial Genome
Identifying some genes for arsenic and nitric oxide metabolism in some microbial genomes

Isokpehi et al. (2014) developed a suite of bioinformatics and visual analytics methods to evaluate the availability (presence or absence) and abundance of functional annotations in a microbial genome. Genes of protein families (Pfam) that are known to participate in arsenic metabolism and stress response in microbes were downloaded from the Integrated Microbial Genomes (IMG) system, and a list of reference genomes sequenced by the HMP was obtained from the HMP catalog. The Pfams were downloaded as an Excel spreadsheet and integrated into visualization software (Tableau Desktop) against the genome sequences, I think, to display the absence and abundance of genes annotated with each Pfam function in the genome of each species. A binary matrix that encodes the presence (1) or absence (0) of a relevant annotation for the selected genomes was constructed and visualized with matrix2png.

I have not been able to replicate this, and I have yet to come across any other workable approach.

  1. How can I get the list of genes known to be responsible for arsenic metabolism and nitric oxide metabolism in microbes?

  2. Aside from Tableau Desktop, for which I only have access to the demo, is there other visualization software that can serve the same purpose?

  3. With over 2000 species whose genomes I need to search for these genes, is there a way to download all the genomes into the software at once instead of one at a time?

Thank you.

Have you looked for any ontologies related to this function in GO or KEGG?

written 12 weeks ago by jrj.healey (11k)

I advise you to break your question down into more manageable, focused questions. As it stands it is too broad (and lacks detail in places), and it would take too much time and effort to answer.

  1. You will have to search the literature, or use a database. KEGG, for example, has these metabolic pathways for several microbial genomes.

  2. What should the visualisation look like? I don't have Tableau and you didn't show any plot. Anyway, you can probably put together a solution in R, for example using ggplot.
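
The presence/absence matrix described in the question can also be built without Tableau. Below is a minimal Python sketch of that step only, assuming the annotations have been exported as a tab-delimited table with one row per genome and Pfam. The column names (`genome`, `pfam`, `gene_count`), the placeholder identifiers, and the helper functions are all hypothetical and would need to be adapted to the actual export.

```python
import csv
import io

# Hypothetical long-format export: one row per (genome, Pfam) with a gene count.
# Column names and values are placeholders, not real IMG output.
EXAMPLE_TSV = """genome\tpfam\tgene_count
genome_A\tpfam_1\t2
genome_A\tpfam_2\t0
genome_B\tpfam_2\t1
genome_C\tpfam_1\t1
genome_C\tpfam_3\t4
"""


def build_presence_matrix(handle):
    """Return (genomes, pfams, matrix); matrix[i][j] is 1 if genome i has
    at least one gene annotated with Pfam j, otherwise 0."""
    counts = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        key = (row["genome"], row["pfam"])
        counts[key] = counts.get(key, 0) + int(row["gene_count"])
    genomes = sorted({g for g, _ in counts})
    pfams = sorted({p for _, p in counts})
    # Missing (genome, Pfam) pairs are treated as absent.
    matrix = [
        [1 if counts.get((g, p), 0) > 0 else 0 for p in pfams]
        for g in genomes
    ]
    return genomes, pfams, matrix


def write_matrix(genomes, pfams, matrix, handle):
    """Write the matrix as a labelled tab-delimited table."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["genome"] + pfams)
    for genome, row in zip(genomes, matrix):
        writer.writerow([genome] + row)


genomes, pfams, matrix = build_presence_matrix(io.StringIO(EXAMPLE_TSV))
out = io.StringIO()
write_matrix(genomes, pfams, matrix, out)
print(out.getvalue())
```

The labelled tab-delimited table this produces is a plain text matrix, so it should be straightforward to load into a heatmap tool or into R for plotting with ggplot. Keeping the summed counts instead of collapsing them to 0/1 would give the abundance view rather than the presence/absence view.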


ADD REPLYlink written 12 weeks ago by h.mon24k
