I have a 24-column BLAST tabular file, generated by BLASTing the longest contigs in our genome assembly. The file shows the aligned query and subject sequences, as well as the GenBank accession number of each subject sequence. The results for the first two sequence hits are shown below:
Since the GI numbers are not particularly useful by themselves, I am wondering how difficult it would be to write a Python script that looks up each GI number and replaces it with the actual gene ID. I don't mind if the output goes to a separate file, as I assume I could merge it back into the main tabular file.
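To make the request concrete, here is a minimal sketch of the kind of script I have in mind. It rests on several assumptions: that the subject ID sits in the second column and looks like `gi|12345|gb|ABC.1|`, that the file is plain tab-separated text, and that the actual lookup is supplied as a function (`lookup` is a hypothetical placeholder; a real one would presumably query NCBI, for example through Biopython's `Bio.Entrez` module). The names `extract_gi` and `annotate_lines` are made up for illustration.

```python
import re

# Assumption: subject IDs embed the GI number as "gi|<digits>|..."
GI_PATTERN = re.compile(r"gi\|(\d+)")


def extract_gi(field):
    """Return the GI number found in a subject ID field, or None if absent."""
    match = GI_PATTERN.search(field)
    return match.group(1) if match else None


def annotate_lines(lines, lookup, subject_col=1):
    """Yield each tab-separated line with the looked-up ID appended as a new column.

    `lookup` is any callable mapping a GI string to an identifier (or None).
    `subject_col` is the zero-based index of the subject ID column (assumed to be 1).
    Each distinct GI is looked up only once; results are cached in memory.
    """
    cache = {}
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        gi = extract_gi(fields[subject_col]) if len(fields) > subject_col else None
        if gi is not None and gi not in cache:
            cache[gi] = lookup(gi) or ""
        # Rows without a GI, or with a failed lookup, get an empty extra column
        fields.append(cache.get(gi, ""))
        yield "\t".join(fields)
```

With something like this, the original columns stay untouched and the new identifier lands in an extra last column, so no separate merge step would be needed. Writing the result would then be a loop over `annotate_lines(open("blast.tsv"), my_lookup)` that prints each line to a new file, where `blast.tsv` and `my_lookup` are placeholder names.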
Any help you might have would be greatly appreciated.