Query bacterial genus on NCBI from a list of accessions
Hey guys,
I have a list of several accessions, such as
NZ_JRTV01000009.1
NZ_CEWL01000009.1
NZ_CP013481.2
NZ_CP009553.3
NZ_CBYD010000018.1
NZ_CBYE010000015.1
NZ_CP016370.1
...
I would like to fetch the genus for each accession. Please be aware that some accessions are old and have been replaced by new accessions on NCBI.
Thanks!
echo "NZ_JRTV01000009.1 NZ_CEWL01000009.1 NZ_CP013481.2 NZ_CP009553.3 NZ_CBYD010000018.1 NZ_CBYE010000015.1 NZ_CP016370.1" | tr " " "\n" | while read -r A ; do
  echo -n "$A "
  # Look up the TaxId for the accession, then the genus for that TaxId
  wget -O - -q "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=nuccore&id=${A}" |
    xmllint --xpath '//Item[@Name="TaxId"]/text()' - |
    xargs -I % wget -O - -q "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=taxonomy&id=%" |
    xmllint --xpath '//Item[@Name="Genus"]/text()' -
  echo
done
NZ_JRTV01000009.1 Klebsiella
NZ_CEWL01000009.1 Pandoraea
NZ_CP013481.2 Pandoraea
NZ_CP009553.3 Pandoraea
NZ_CBYD010000018.1 Elizabethkingia
NZ_CBYE010000015.1 Elizabethkingia
NZ_CP016370.1 Elizabethkingia
Using Entrez Direct (EDirect):
$ more acc
NZ_JRTV01000009.1
NZ_CEWL01000009.1
NZ_CP013481.2
NZ_CP009553.3
NZ_CBYD010000018.1
NZ_CBYE010000015.1
NZ_CP016370.1
$ epost -db nuccore -input acc | esummary | xtract -pattern DocumentSummary -element Caption,Organism
NZ_JRTV01000009 Klebsiella variicola
NZ_CP013481 Pandoraea apista
NZ_CEWL01000009 Pandoraea apista
NZ_CP009553 Pandoraea pnomenusa
NZ_CBYD010000018 Elizabethkingia anophelis PW2806
NZ_CBYE010000015 Elizabethkingia anophelis PW2809
NZ_CP016370 Elizabethkingia anophelis
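If you only want the genus name, append
| awk 'BEGIN{OFS="\t"} {print $1, $2}'
to the end of the command above. This prints the accession and the first word of the organism name, which is the genus for ordinary binomial names.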
You could use the approach of my code at https://github.com/jrjhealey/PYlogeny/tree/master/PYlogeny
Use Entrez to query the accessions and get their TaxIDs, then use ETE3's NCBITaxa class to extract the associated genus etc. This will do nothing about converting your obsolete IDs though, so you'll need to tackle that yourself.
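A minimal sketch of that two-step idea, assuming Biopython and ete3 are installed. The names genus_from_lineage and fetch_genera and the email parameter are placeholders made up for illustration; they are not taken from the PYlogeny repository.

```python
from typing import Dict, Iterable, List, Optional


def genus_from_lineage(
    lineage: List[int], ranks: Dict[int, str], names: Dict[int, str]
) -> Optional[str]:
    """Return the name of the lineage member ranked 'genus', or None if absent."""
    for taxid in lineage:
        if ranks.get(taxid) == "genus":
            return names.get(taxid)
    return None


def fetch_genera(accessions: Iterable[str], email: str) -> Dict[str, Optional[str]]:
    """Map each accession to its genus via Entrez (TaxId) and ETE3 (lineage)."""
    # Imported lazily so the helper above works without these packages.
    from Bio import Entrez
    from ete3 import NCBITaxa

    Entrez.email = email  # NCBI asks for a contact address
    ncbi = NCBITaxa()     # first use downloads the NCBI taxonomy dump
    result: Dict[str, Optional[str]] = {}
    for acc in accessions:
        # Step 1: accession -> TaxId from the nuccore document summary
        handle = Entrez.esummary(db="nuccore", id=acc)
        summary = Entrez.read(handle)
        handle.close()
        taxid = int(summary[0]["TaxId"])
        # Step 2: TaxId -> lineage -> the entry whose rank is 'genus'
        lineage = ncbi.get_lineage(taxid)
        result[acc] = genus_from_lineage(
            lineage, ncbi.get_rank(lineage), ncbi.get_taxid_translator(lineage)
        )
    return result
```

Calling fetch_genera(["NZ_CP013481.2"], email="you@example.org") should give a dict keyed by accession. Replaced or withdrawn accessions are not handled here, in line with the caveat above.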