Question: How to go from locus tag to FASTA sequence using Biopython / specify db for blast query
2.5 years ago, kahackbarth27 wrote:

I am trying to take a list of S. cerevisiae gene locus tags [YDR251W, YDR342C, YPR022C, ...] and run a blastx search for homologs in some related species (e.g. S. paradoxus, K. lactis). To automate this, I wrote a Biopython script that looks up the NCBI Gene ID for each locus tag [851838, 851943, 856133, ...]. While testing my code, I realized that running the blast searches manually with those Gene IDs as the query does not return the correct results. Although the Gene IDs I have match the locus tags, when I put an ID into blast it usually seems to pull a record from the EST database instead of the Gene database. There are a couple of ways I could go about fixing this, but I'm not sure which approach is most straightforward (I am very new to the world of bioinformatics).

1. Is there a way in a blast search to specify which database the query ID should be pulled from? I would later figure out how to do that within a Biopython script.
2. Is there a way to get Biopython to take a locus tag and retrieve the corresponding gene's nucleotide sequence in FASTA format? Searching the Gene database for my locus tags and clicking "Go to nucleotide: FASTA" gives me the sequence I want to use in my blastx searches, but I don't know how to code that using Biopython.
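For option 2, one possible route is sketched below. It is not a tested pipeline. It assumes the Gene record's esummary output exposes a `GenomicInfo` entry with `ChrAccVer`, `ChrStart` and `ChrStop` fields, that those coordinates are 0-based, and that a start greater than the stop means the gene is on the minus strand. The helper names (`gene_search_term`, `slice_from_genomic_info`, `fetch_gene_fasta`, `blastx_in_species`) and the e-mail address are placeholders made up for illustration. The last function passes the fetched sequence itself to blastx as the query, so no bare numeric ID has to be interpreted by blast, and restricts the hits to one species through `entrez_query`.

```python
def gene_search_term(locus_tag, organism="Saccharomyces cerevisiae"):
    """Build an Entrez Gene query limited to one organism."""
    return f"{locus_tag} AND {organism}[Organism]"


def slice_from_genomic_info(info):
    """Turn one esummary GenomicInfo entry into efetch keyword arguments.

    Assumption: ChrStart/ChrStop are 0-based and start > stop marks the
    minus strand; efetch expects 1-based coordinates and strand 1 or 2.
    """
    start, stop = int(info["ChrStart"]), int(info["ChrStop"])
    return {
        "id": info["ChrAccVer"],
        "seq_start": min(start, stop) + 1,
        "seq_stop": max(start, stop) + 1,
        "strand": 1 if start <= stop else 2,
    }


def fetch_gene_fasta(locus_tag, email):
    """Locus tag -> genomic nucleotide FASTA text (needs network access)."""
    # Imported here so the pure helpers above work without Biopython.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks for a contact address

    handle = Entrez.esearch(db="gene", term=gene_search_term(locus_tag))
    gene_ids = Entrez.read(handle)["IdList"]
    handle.close()
    if not gene_ids:
        raise LookupError(f"no Gene record found for {locus_tag}")

    handle = Entrez.esummary(db="gene", id=gene_ids[0])
    summary = Entrez.read(handle)
    handle.close()
    # Assumed layout of the Gene esummary record.
    doc = summary["DocumentSummarySet"]["DocumentSummary"][0]
    region = slice_from_genomic_info(doc["GenomicInfo"][0])

    handle = Entrez.efetch(db="nuccore", rettype="fasta", retmode="text", **region)
    fasta_text = handle.read()
    handle.close()
    return fasta_text


def blastx_in_species(fasta_text, species):
    """Run blastx against nr, keeping only hits from one species."""
    from Bio.Blast import NCBIWWW

    # Returns a handle to the BLAST XML output.
    return NCBIWWW.qblast(
        "blastx", "nr", fasta_text, entrez_query=f"{species}[Organism]"
    )
```

A call such as `blastx_in_species(fetch_gene_fasta("YDR251W", "you@example.org"), "Saccharomyces paradoxus")` would then cover one locus tag and one target species; looping over both lists, with a pause between requests, would automate the rest.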

blast • 1.1k views
modified 2.4 years ago by Biostar • written 2.5 years ago by kahackbarth27