How can I make multiple taxID queries using qblast and store multiple blast handles using NCBIXML?
4 months ago

Hello everybody, I'm new to Biopython and to programming in general. I am trying to write a small script that iterates through a dictionary of taxIDs and, for each taxID, runs a BLAST search of every sequence in my protein FASTA file, restricted to that organism. When I run this code without the loop, passing a single txid in the entrez_query argument, it usually works. When I use the dictionary, as in the script below, the final .txt file turns out to be empty. Does anyone have an idea why? Any help is welcome!

A glimpse of my multiple_aa.faa:

>sp|Q9FW44|ADR1_ARATH Disease resistance protein ADR1 OS=Arabidopsis thaliana OX=3702 GN=ADR1 PE=2 SV=2
MASFIDLFAGDITTQLLKLLALVANTVYSCKGIAERLITMIRDVQPTIREIQYSGAELSN
HHQTQLGVFYEILEKARKLCEKVLRCNRWNLKHVYHANKMKDLEKQISRFLNSQILLFVL
AEVCHLRVNGDRIERNMDRLLTERNDSLSFPETMMEIETVSDPEIQTVLELGKKKVKEMM
FKFTDTHLFGISGMSGSGKTTLAIELSKDDDVRGLFKNKVLFLTVSRSPNFENLESCIRE
FLYDGVHQRKLVILDDVWTRESLDRLMSKIRGSTTLVVSRSKLADPRTTYNVELLKKDEA
MSLLCLCAFEQKSPPSPFNKYLVKQVVDECKGLPLSLKVLGASLKNKPERYWEGVVKRLL
RGEAADETHESRVFAHMEESLENLDPKIRDCFLDMGAFPEDKKIPLDLLTSVWVERHDID
EETAFSFVLRLADKNLLTIVNNPRFGDVHIGYYDVFVTQHDVLRDLALHMSNRVDVNRRE
RLLMPKTEPVLPREWEKNKDEPFDAKIVSLHTGEMDEMNWFDMDLPKAEVLILNFSSDNY
VLPPFIGKMSRLRVLVIINNGMSPARLHGFSIFANLAKLRSLWLKRVHVPELTSCTIPLK
NLHKIHLIFCKVKNSFVQTSFDISKIFPSLSDLTIDHCDDLLELKSIFGITSLNSLSITN
CPRILELPKNLSNVQSLERLRLYACPELISLPVEVCELPCLKYVDISQCVSLVSLPEKFG
KLGSLEKIDMRECSLLGLPSSVAALVSLRHVICDEETSSMWEMVKKVVPELCIEVAKKCF
TVDWLDD

>sp|Q9FKZ1|DRL42_ARATH Probable disease resistance protein At5g66900 OS=Arabidopsis thaliana OX=3702 GN=At5g66900 PE=3 SV=1
MNDWASLGIGSIGEAVFSKLLKVVIDEAKKFKAFKPLSKDLVSTMEILFPLTQKIDSMQK
ELDFGVKELKELRDTIERADVAVRKFPRVKWYEKSKYTRKIERINKDMLKFCQIDLQLLQ
HRNQLTLLGLTGNLVNSVDGLSKRMDLLSVPAPVFRDLCSVPKLDKVIVGLDWPLGELKK
RLLDDSVVTLVVSAPPGCGKTTLVSRLCDDPDIKGKFKHIFFNVVSNTPNFRVIVQNLLQ
HNGYNALTFENDSQAEVGLRKLLEELKENGPILLVLDDVWRGADSFLQKFQIKLPNYKIL
VTSRFDFPSFDSNYRLKPLEDDDARALLIHWASRPCNTSPDEYEDLLQKILKRCNGFPIV
IEVVGVSLKGRSLNTWKGQVESWSEGEKILGKPYPTVLECLQPSFDALDPNLKECFLDMG
SFLEDQKIRASVIIDMWVELYGKGSSILYMYLEDLASQNLLKLVPLGTNEHEDGFYNDFL
VTQHDILRELAICQSEFKENLERKRLNLEILENTFPDWCLNTINASLLSISTDDLFSSKW
LEMDCPNVEALVLNLSSSDYALPSFISGMKKLKVLTITNHGFYPARLSNFSCLSSLPNLK
RIRLEKVSITLLDIPQLQLSSLKKLSLVMCSFGEVFYDTEDIVVSNALSKLQEIDIDYCY
DLDELPYWISEIVSLKTLSITNCNKLSQLPEAIGNLSRLEVLRLCSSMNLSELPEATEGL
SNLRFLDISHCLGLRKLPQEIGKLQNLKKISMRKCSGCELPESVTNLENLEVKCDEETGL
LWERLKPKMRNLRVQEEEIEHNLNLLQMF

import time
from urllib.error import HTTPError

from Bio import Entrez, SeqIO
from Bio.Blast import NCBIWWW, NCBIXML

dic_tx = {"nicotiana":'"(txid4097[ORGN])"',"grapevine":'"(txid:29760[ORGN])"',"almond":'"(txid:3755[ORGN])"',"apple":'"(txid:3750[ORGN])"',"citrus":'"(txid:2711[ORGN])"',"coffee":'"(txid:13443[ORGN])"', "olive":'"(txid:4146[ORGN])"'}

for k, v in dic_tx.items():
    print(k)
    print(v)
    Entrez.email = '...@...'
    list_record_host = []
    for record in SeqIO.parse("multiple_aa.faa", format="fasta"):
        print(record.id)
        # print(record.seq)

        # online request; retry once after a short pause on an HTTP error
        try:
            result_handle = NCBIWWW.qblast("blastp", "nr", record.format("fasta"), entrez_query=v, hitlist_size=1)
            print(result_handle)
        except HTTPError:
            time.sleep(5)
            result_handle = NCBIWWW.qblast("blastp", "nr", record.format("fasta"), entrez_query=v, hitlist_size=1)

        # result handle stored in a list
        list_record_host.append(result_handle)

    # write the raw XML of every search for this organism into one file
    result_handle_list_host = open("%s.xml" % k, "w")
    for item in list_record_host:
        result_handle_list_host.write(item.read())
    result_handle_list_host.close()

    # reopen the saved XML and write one title line per HSP
    reopen_result_handle = "%s.xml" % k
    blast_records = NCBIXML.parse(open(reopen_result_handle))
    save_file = open("%s_NLR.txt" % k, 'w')
    for blast_record in blast_records:
        for alignment in blast_record.alignments:
            for hsp in alignment.hsps:
                save_file.write('>%s\n' % (alignment.title,))
        # here possibly to output something to file, between each blast_record
    save_file.close()
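
One thing that may be worth checking is the query strings themselves. In dic_tx only the nicotiana entry has the form txid4097[ORGN]; the other entries have a colon after txid, and every value is wrapped in literal double quotes. The sketch below is a guess at a cleaner way to build the strings, not a confirmed fix for the empty output. The names build_entrez_query and taxids are hypothetical and used only for illustration, and the sketch assumes the plain txidNNNN[ORGN] form is what the search expects.

```python
def build_entrez_query(taxid):
    """Return an organism filter such as 'txid4097[ORGN]'.

    Hypothetical helper: assumes the plain 'txid<number>[ORGN]' form,
    with no colon and no surrounding quotes, is the intended syntax.
    """
    return "txid%d[ORGN]" % int(taxid)


# Same organisms and taxIDs as in dic_tx, stored as plain integers.
taxids = {
    "nicotiana": 4097,
    "grapevine": 29760,
    "almond": 3755,
    "apple": 3750,
    "citrus": 2711,
    "coffee": 13443,
    "olive": 4146,
}

# Build one query string per organism.
queries = {name: build_entrez_query(taxid) for name, taxid in taxids.items()}

for name, query in queries.items():
    print(name, query)
```

Each value in queries could then be passed as entrez_query=... to NCBIWWW.qblast in place of v. Storing the taxIDs as integers keeps the formatting in one place, so all entries end up with the same syntax.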
python NCBIXML biopython help qblast • 122 views