You may skip to the question part if it seems clear enough without the context.
I'm trying to automate the process of extracting specific info from BLAST searches.
When doing it manually, I would blastn some sequences against the nt database and record the following info for the top hit: the query coverage, the percent identity, the start of the aligned range (hit_start), and the gene product of the first CDS.
The first two are on the result page. The last two require clicking on the hit to see the range first, and then opening the GenBank info to find the CDS info.
Since I need to go through a large number of sequences, I'd like to automate this process. I managed to get everything with Biopython except for the product name. I guess that's because the other attributes can be extracted from the BLAST results, while the GenBank info is independent of the BLAST results.
So my question now is: if I have the accession ID or GI number from the BLAST results, as well as the range of the alignment (i.e. the location within the hit sequence), is there a way to connect to GenBank or another database and extract the name of the first gene product within or overlapping this range?
I hope I explained myself clearly... I've been trying to rearrange keywords to look for a solution or part of a solution to this problem but haven't been successful.
Thank you in advance.
Edit: More specifically, my immediate question would be: is there a way to extract GenBank info for a specific region instead of pulling the entire file? For example, on the BLAST result page, if I click on the GenBank info of a hit, it specifies that this is a partial GenBank file, e.g. ACCESSION CP009256 REGION: 3404381..3405501. When I do it in Python, is there an option for retrieving a partial file like this, and then looking for the product name in the CDS?
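
To make the question concrete, here is a rough sketch of what I have in mind. It assumes NCBI's efetch E-utility accepts `seq_start` and `seq_stop` to return only part of a nuccore record, which seems to be what the "REGION" view on the website does. The helper names (`build_efetch_url`, `fetch_region`, `first_cds_product`) are made up for illustration, and the product lookup is just a plain-text scan of the feature table rather than a proper GenBank parser, so treat it as a starting point and not a tested solution.

```python
import re
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities efetch endpoint.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# A CDS feature starts with 5 spaces + "CDS " and runs until the next
# feature key (5 spaces + non-space), the next top-level section, or EOF.
_CDS_BLOCK = re.compile(r"^ {5}CDS .*?(?=^ {5}\S|^\S|\Z)", re.M | re.S)
_PRODUCT = re.compile(r'/product="([^"]*)"')


def build_efetch_url(accession, start, stop):
    """Build a URL for a GenBank flat file limited to start..stop (1-based)."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "gb",
        "retmode": "text",
        "seq_start": start,  # assumption: restricts the record to this range
        "seq_stop": stop,
    }
    return EFETCH + "?" + urlencode(params)


def fetch_region(accession, start, stop):
    """Download the partial GenBank record as text (needs network access)."""
    with urlopen(build_efetch_url(accession, start, stop)) as resp:
        return resp.read().decode("utf-8")


def first_cds_product(genbank_text):
    """Return the /product of the first CDS that has one, else None."""
    for block in _CDS_BLOCK.finditer(genbank_text):
        match = _PRODUCT.search(block.group(0))
        if match:
            # Product names may wrap across lines; collapse the whitespace.
            return " ".join(match.group(1).split())
    return None
```

With the example above, the call would be something like `first_cds_product(fetch_region("CP009256", 3404381, 3405501))`. If I understand correctly, Biopython's `Entrez.efetch` forwards extra keyword arguments to the same service, so passing `seq_start` and `seq_stop` there and reading the result with `SeqIO` might be the cleaner route, but I have not confirmed that.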