I am trying to take a list of S. cerevisiae gene locus tags [YDR251W, YDR342C, YPR022C, ...] and run a blastx search for homologs in some related species (e.g. S. paradoxus, K. lactis). To automate this, I created a script using Biopython that looks up the NCBI Gene ID for each locus tag [851838, 851943, 856133, ...].

While testing my code, I realized that running the blast searches manually with the gene IDs doesn't return the correct results. Although the gene IDs I have match the locus tags, when I put an ID into blast it usually seems to pull a sequence from the EST database instead of the Gene database.

There are a couple of ways I could go about fixing this, but I'm not sure which approach is most straightforward (I am very new to the world of bioinformatics).

1. Is there a way in a blast search to specify which database the query ID should be pulled from? I would later figure out how to do that within a Biopython script.

2. Is there a way to get Biopython to take a locus tag and retrieve the corresponding gene's nucleotide sequence in FASTA format? Searching the Gene database for my locus tags and clicking "Go to nucleotide: FASTA" gives me the sequence I want to use in my blastx searches, but I don't know how to code that using Biopython.
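To make the second option concrete, here is a rough sketch of what such a lookup might look like. It is not a tested solution, and it rests on several assumptions: that an unfielded Gene search for the locus tag returns the right gene as its first hit, that the Gene summary exposes `GenomicInfo` with `ChrAccVer`, `ChrStart` and `ChrStop`, and that those coordinates are 0-based with start greater than stop for minus-strand genes. The helper names (`gene_query`, `efetch_range`, `locus_tag_to_fasta`, `blastx_against`) are made up for illustration.

```python
def gene_query(locus_tag, organism="Saccharomyces cerevisiae"):
    """Build an Entrez Gene search term for one locus tag.

    The locus tag is left unfielded (an assumption), so check that the
    search really returns the gene you expect.
    """
    return f"{locus_tag} AND {organism}[Organism]"


def efetch_range(chr_start, chr_stop):
    """Turn Gene summary coordinates into efetch arguments.

    Assumes the summary coordinates are 0-based and that start > stop
    marks a minus-strand gene. efetch expects 1-based positions and
    strand=1 (plus) or strand=2 (minus).
    """
    start, stop = int(chr_start), int(chr_stop)
    strand = 1 if start <= stop else 2
    low, high = sorted((start, stop))
    return low + 1, high + 1, strand


def locus_tag_to_fasta(locus_tag, email):
    """Fetch the genomic FASTA for a locus tag (needs network access)."""
    # Imported here so the two helpers above work without Biopython.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks for a contact address

    # Step 1: locus tag -> Gene ID
    handle = Entrez.esearch(db="gene", term=gene_query(locus_tag))
    ids = Entrez.read(handle)["IdList"]
    handle.close()
    if not ids:
        raise ValueError(f"no Gene record found for {locus_tag}")

    # Step 2: Gene ID -> chromosome accession and coordinates
    handle = Entrez.esummary(db="gene", id=ids[0])
    summary = Entrez.read(handle)
    handle.close()
    doc = summary["DocumentSummarySet"]["DocumentSummary"][0]
    info = doc["GenomicInfo"][0]
    start, stop, strand = efetch_range(info["ChrStart"], info["ChrStop"])

    # Step 3: fetch just that slice of the chromosome as FASTA
    handle = Entrez.efetch(
        db="nuccore",
        id=info["ChrAccVer"],
        rettype="fasta",
        retmode="text",
        seq_start=start,
        seq_stop=stop,
        strand=strand,
    )
    fasta = handle.read()
    handle.close()
    return fasta


def blastx_against(fasta_text, organism, database="nr"):
    """Run blastx with the sequence itself as the query (needs network).

    `database` is the protein database searched against; `entrez_query`
    limits the hits to one organism.
    """
    from Bio.Blast import NCBIWWW

    handle = NCBIWWW.qblast(
        "blastx", database, fasta_text,
        entrez_query=f"{organism}[Organism]",
    )
    xml = handle.read()
    handle.close()
    return xml
```

With something like this, `locus_tag_to_fasta("YDR251W", "me@example.org")` would be called once per locus tag and the returned FASTA text passed to `blastx_against(fasta, "Saccharomyces paradoxus")`. Submitting the sequence itself rather than a bare numeric ID would also sidestep the question of which database blast resolves the ID against.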
Question: How to go from locus tag to FASTA sequence using Biopython / specify db for blast query
2.5 years ago by
kahackbarth27 • 10