I want to programmatically list all the aliases for a large number of gene names. I am using Entrez/NCBI and the "gene" database to do this.
For example, when I search the NCBI gene database for XRCC1, I get these results:
http://www.ncbi.nlm.nih.gov/gene/?term=XRCC1
In the browser, NCBI gives you a nice table with the "Name", "Geneid", "Location" and, importantly, "Aliases". How do I get these aliases programmatically from a Biopython or Entrez eutils query?
Currently I can get a bunch of information from a function like:
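    from Bio import Entrez

    def main(infile_name):
        Entrez.email = "myemail@myworkplace.com"
        # Assumes infile_name holds one gene name per line
        with open(infile_name) as infile:
            for line in infile:
                input_term = line.strip()
                # How do I set up this query to get Aliases?
                query = Entrez.esearch(db="gene", term=input_term, retmode="xml")
                # Process this result and extract Aliases?
                result = Entrez.read(query)
                query.close()
                print(result)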
Thanks for your help
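For reference, here is a minimal sketch of one way this might be done with Biopython: run esearch to get the gene IDs, then pass them to esummary, whose gene records (as far as I know) carry the aliases as a comma-separated "OtherAliases" string. The function names `split_aliases` and `fetch_aliases`, the organism default, and the `[Gene Name]` / `[Organism]` field tags in the search term are my own assumptions for illustration, not something taken from the question.

```python
def split_aliases(other_aliases):
    """Split a comma-separated alias string into a clean list (hypothetical helper)."""
    return [alias.strip() for alias in other_aliases.split(",") if alias.strip()]


def fetch_aliases(symbol, organism="Homo sapiens", email="myemail@myworkplace.com"):
    """Return {gene_id: (name, [aliases])} for one gene symbol (hypothetical helper).

    Needs network access and Biopython.
    """
    # Imported here so split_aliases stays usable without Biopython installed.
    from Bio import Entrez

    Entrez.email = email

    # Assumption: restrict the search to the gene-name field and one organism.
    term = "{}[Gene Name] AND {}[Organism]".format(symbol, organism)
    handle = Entrez.esearch(db="gene", term=term)
    ids = Entrez.read(handle)["IdList"]
    handle.close()
    if not ids:
        return {}

    # esummary returns one DocumentSummary per gene ID.
    handle = Entrez.esummary(db="gene", id=",".join(ids))
    summary = Entrez.read(handle)
    handle.close()

    aliases = {}
    for doc in summary["DocumentSummarySet"]["DocumentSummary"]:
        gene_id = doc.attributes["uid"]
        aliases[gene_id] = (doc["Name"], split_aliases(doc["OtherAliases"]))
    return aliases
```

Calling `fetch_aliases("XRCC1")` should then give a dictionary keyed by gene ID, which can be looped over for each name in the input file. The search can return more than one gene ID per symbol, so the results probably still need to be checked against the symbol you started from.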
Thanks for the pointer to Entrez Direct. I used the workflow you outlined to fetch all the IDs and then iterated through them to find a match for the ID used in the list of aliases. Using code like
edirect is a nice tool