Question: Getting Phyla Using A List Of Organisms, GIs, Or GenBank IDs
bioinfo (EU) wrote, 5.7 years ago:

I have a list of 500 proteins with their sequences, GI numbers, the organisms they belong to, and GenBank IDs. I want to create a pie chart of the taxonomic phyla of these proteins. How can I proceed?

So far, my plan is to use ID mapping from the GI/GenBank IDs to taxonomy IDs, and then use those taxonomy IDs to get the list of phyla. However, I have not seen any option in the UniProt browser for mapping GI/GenBank IDs to taxonomy IDs. The NCBI Taxonomy browser does have an option to enter a list of organisms and get their taxonomy IDs, but then how do I get the list of phyla? In the final output, I'm expecting something like this:

Phylum               Counts
Proteobacteria       300
Acodonacteria        100
Cyanobacteria        100
---------------------------
Total                500
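
Once a phylum name is known for each protein, a table of this shape can be tallied with the standard library. The following is a minimal sketch, not a definitive implementation; the function name `phylum_table` and the column widths are assumptions made for illustration.

```python
from collections import Counter


def phylum_table(phyla):
    """Return a plain-text count table for an iterable of phylum names."""
    counts = Counter(phyla)
    lines = [f"{'Phylum':<20}{'Counts':>7}"]
    # most_common() sorts by descending count
    for name, n in counts.most_common():
        lines.append(f"{name:<20}{n:>7}")
    lines.append("-" * 27)
    lines.append(f"{'Total':<20}{sum(counts.values()):>7}")
    return "\n".join(lines)


# Hypothetical input: one phylum name per protein
example = ["Proteobacteria"] * 3 + ["Cyanobacteria"]
print(phylum_table(example))
```

The same `Counter` holds the labels and sizes a plotting library would need for the pie chart.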
David Westergaard (Copenhagen, Denmark) wrote, 5.7 years ago:

If you know the organism each protein belongs to, you can use the NCBI E-utilities (eutils) to search for its taxonomy ID:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=taxonomy&term=bacillus+cereus

Then look up the lineage in the taxonomy database:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=taxonomy&id=1396

Then it's just a matter of parsing the lineage entry in the XML result.
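
The two calls and the parsing step can be sketched in Python with only the standard library. This is a rough sketch under stated assumptions: the helper names (`esearch_url`, `efetch_url`, `first_taxid`, `phylum_of`, `fetch`) are made up for illustration, the two sample documents are hand-written and heavily abbreviated rather than verbatim NCBI responses, and the code assumes the efetch result lists ancestors under `LineageEx` with a `Rank` element each. It also uses `https`, which NCBI now requires.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(organism):
    """Build the esearch URL for an organism name in the taxonomy database."""
    query = urllib.parse.urlencode({"db": "taxonomy", "term": organism})
    return f"{EUTILS}/esearch.fcgi?{query}"


def efetch_url(taxid):
    """Build the efetch URL for one taxonomy ID."""
    query = urllib.parse.urlencode({"db": "taxonomy", "id": taxid})
    return f"{EUTILS}/efetch.fcgi?{query}"


def fetch(url):
    """Download a URL and return the raw bytes (network access needed)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def first_taxid(esearch_xml):
    """Return the first ID listed in an esearch result, or None."""
    return ET.fromstring(esearch_xml).findtext("IdList/Id")


def phylum_of(efetch_xml):
    """Return the name of the ancestor ranked 'phylum', or None."""
    taxon = ET.fromstring(efetch_xml).find("Taxon")
    if taxon is None:
        return None
    for ancestor in taxon.findall("LineageEx/Taxon"):
        if ancestor.findtext("Rank") == "phylum":
            return ancestor.findtext("ScientificName")
    return None


# Hand-written, abbreviated samples (assumed shape, not real responses)
SAMPLE_ESEARCH = "<eSearchResult><IdList><Id>1396</Id></IdList></eSearchResult>"
SAMPLE_EFETCH = """
<TaxaSet>
  <Taxon>
    <TaxId>1396</TaxId>
    <ScientificName>Bacillus cereus</ScientificName>
    <Rank>species</Rank>
    <LineageEx>
      <Taxon><ScientificName>Bacteria</ScientificName><Rank>superkingdom</Rank></Taxon>
      <Taxon><ScientificName>Firmicutes</ScientificName><Rank>phylum</Rank></Taxon>
    </LineageEx>
  </Taxon>
</TaxaSet>
"""

print(esearch_url("bacillus cereus"))
print(first_taxid(SAMPLE_ESEARCH), phylum_of(SAMPLE_EFETCH))
```

With network access, `phylum_of(fetch(efetch_url(first_taxid(fetch(esearch_url(name))))))` would chain the steps for one organism name. Selecting the ancestor by its `Rank` rather than by position avoids depending on how many levels a particular lineage contains.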


As I have the GI numbers, I can match them against the NCBI taxonomy file that maps GI to taxid. But looking up the lineage and getting the phylum name for those 500 taxids in a single run is not that easy, I guess. It looks easy when you have a single taxid.

Reply written 5.7 years ago by bioinfo

I don't understand. Why is it not easy? What is the problem? You will have to be more specific if you want help.

Lineage always comes (to my knowledge) in the form organism, superkingdom, phylum, class, order, family, genus, species, at least from the NCBI EUtils service. I guess some of the last ones depend on the tax ID you use for the lookup. E.g. if you look something up by its family tax ID, you probably won't get genus and species.

Reply written 5.7 years ago by David Westergaard
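
For many taxids at once, efetch accepts a comma-separated `id` list, so the lookup does not have to be repeated per taxid. The sketch below is an illustration under assumptions, not a definitive implementation: `batch_efetch_url` and `ranks_by_taxid` are hypothetical helper names, and the sample document is hand-written and abbreviated rather than a verbatim NCBI response. It keys every lineage level by its `Rank`, which also shows the point made in the reply: a record fetched by a family-level ID carries no genus or species.

```python
import xml.etree.ElementTree as ET

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def batch_efetch_url(taxids):
    """Build one efetch URL covering several taxonomy IDs."""
    ids = ",".join(str(t) for t in taxids)
    return f"{EFETCH}?db=taxonomy&id={ids}"


def ranks_by_taxid(taxaset_xml):
    """Map each returned taxid to a {rank: name} dict (ancestors plus itself)."""
    result = {}
    # Only direct children of the root are the queried records
    for taxon in ET.fromstring(taxaset_xml).findall("Taxon"):
        ranks = {}
        for node in taxon.findall("LineageEx/Taxon") + [taxon]:
            rank = node.findtext("Rank")
            if rank and rank != "no rank":
                ranks[rank] = node.findtext("ScientificName")
        result[taxon.findtext("TaxId")] = ranks
    return result


# Hand-written, abbreviated sample (assumed shape, illustrative values)
SAMPLE = """
<TaxaSet>
  <Taxon>
    <TaxId>1396</TaxId>
    <ScientificName>Bacillus cereus</ScientificName>
    <Rank>species</Rank>
    <LineageEx>
      <Taxon><ScientificName>Firmicutes</ScientificName><Rank>phylum</Rank></Taxon>
      <Taxon><ScientificName>Bacillaceae</ScientificName><Rank>family</Rank></Taxon>
      <Taxon><ScientificName>Bacillus</ScientificName><Rank>genus</Rank></Taxon>
    </LineageEx>
  </Taxon>
  <Taxon>
    <TaxId>186817</TaxId>
    <ScientificName>Bacillaceae</ScientificName>
    <Rank>family</Rank>
    <LineageEx>
      <Taxon><ScientificName>Firmicutes</ScientificName><Rank>phylum</Rank></Taxon>
    </LineageEx>
  </Taxon>
</TaxaSet>
"""

for taxid, ranks in ranks_by_taxid(SAMPLE).items():
    # .get() returns None when a lineage has no entry at that rank
    print(taxid, ranks.get("phylum"), ranks.get("genus"))
```

Results are keyed by the `TaxId` inside each record rather than by request order, so the mapping back to the input list does not depend on the order of the response. For a list of 500 taxids it may be safer to split the request into smaller chunks.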