Question: blastn biopython error: positional arguments
2.1 years ago by
Sethzard10 wrote:

I'm running blast via biopython and when I run it I get the error:

Error: Too many positional arguments (1), the offending value: alecto[Organism]

The script which seems to be running into trouble is:

command = "blastn -entrez_query Pteropus alecto[Organism] OR Hendra virus[Organism] -db nr -outfmt '6 qseqid sseqid pident' -query sequence.fasta -out sequence.out " + remote_arg

Anyone know where I'm going wrong?

blast rna-seq • 835 views
modified 2.1 years ago • written 2.1 years ago by Sethzard10

It's complaining about your Entrez query terms. What is the rest of the code you use to run the BLAST command line from Biopython? Also, you should mention that you're running a remote BLAST. Could you do this locally? Ideally, you would run blastn separately, parse the output with Biopython, and then continue with the rest of your script.
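As a rough illustration of the "run blastn separately, then parse" idea, here is a minimal Python 3 sketch that reads the three-column tabular output requested by -outfmt '6 qseqid sseqid pident'. It uses only the standard library rather than Biopython's own parsers, the function name parse_tabular is hypothetical, and the sample line is made up purely to show the format.

```python
import csv
import io


def parse_tabular(handle):
    """Yield (qseqid, sseqid, pident) tuples from three-column tabular BLAST output."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        qseqid, sseqid, pident = row
        yield qseqid, sseqid, float(pident)


# Made-up sample line (not real BLAST output), only to show the layout.
sample = io.StringIO("query1\tgi|0000|ref|XX_000000.1|\t98.50\n")
hits = list(parse_tabular(sample))
```

In a real script the handle would be the file written by -out, for example open("sequence.out").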

written 2.1 years ago by

In an ideal world I would be doing this very differently; however, this is part of a software development project, so I have to run it in Biopython. I know that it doesn't like my queries, but I have no idea what's wrong with them.

This is the code we're using.

for fasta in glob.glob(os.path.join(data_directory, "*_fixed.fasta")):
    print "Blast file %s ..." % fasta

    # Open CSV file to write results
    report = fasta.replace("_fixed.fasta", ".csv")
    with open(report, 'wb') as csvfile:
        writer = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
        writer.writerow(["Accession Number", "Gene Symbol", "FPKM Value"])

        # Iterate via sequences
        for sequence in open(fasta).read().split('>'):
            # Skip empty entries (the text before the first '>' is empty)
            if not sequence:
                continue

            # Write sequence into file to blast
            with open("sequence.fasta", "w") as fasta_sequence:
                fasta_sequence.write(">%s" % sequence)

            # Blast sequence
            command = "blastn -entrez_query Pteropus Alecto[Organism] OR Hendra virus[Organism] -db nr -outfmt '6 qseqid sseqid pident' -query sequence.fasta -out sequence.out " + remote_arg
            subprocess.call(command, shell=True)

            # Get GI, accession number and FPKM value
            line = open("sequence.out").readlines()
            if line:

                line = line[0].replace("\t", "|").split("|")
                gene_gi = line[2]
                accession_number = line[4]
                fpkm_value = sequence.split("\n")[0].split()[-1]

                # Find gene name
                url = "" % gene_gi
                data = urllib2.urlopen(url).readlines()
                i = 0
                for j, t in enumerate(data):
                    t = t.strip()
                    if t.startswith("gene"):
                        i = j + 1
                gene_name = data[i].strip().replace("/gene=", "").replace('"', "")

                # Save data
                writer.writerow([accession_number, gene_name, fpkm_value])

                # Remove temporary files
                subprocess.call("rm -f sequence.fasta sequence.out", shell=True)
written 2.1 years ago by Sethzard10

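I've now fixed it and run into a different problem. The fix was to put quotation marks around the whole Entrez query, so that the shell passes it to blastn as a single argument instead of splitting it at the spaces:

-entrez_query 'Pteropus alecto[Organism] OR Hendra virus[Organism]'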
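To make the cause visible, here is a small Python 3 sketch using shlex.split from the standard library, which tokenizes a string with shell-like rules. Without quotes the query breaks into several words: -entrez_query receives only "Pteropus", and "alecto[Organism]" is left over as the positional argument blastn complained about. The helper build_blastn_args is a hypothetical name, and extra_args is an assumed stand-in for whatever remote_arg holds in the script above.

```python
import shlex

QUERY = "Pteropus alecto[Organism] OR Hendra virus[Organism]"

# Unquoted: the query is split at every space.
unquoted = shlex.split("blastn -entrez_query " + QUERY + " -db nr")

# Quoted: the whole query stays together as one argument.
quoted = shlex.split("blastn -entrez_query '" + QUERY + "' -db nr")


def build_blastn_args(query_file, out_file, entrez_query, extra_args=()):
    """Hypothetical helper: build an argument list so no shell quoting is needed."""
    args = [
        "blastn",
        "-entrez_query", entrez_query,
        "-db", "nr",
        "-outfmt", "6 qseqid sseqid pident",
        "-query", query_file,
        "-out", out_file,
    ]
    args.extend(extra_args)  # assumption: e.g. the contents of remote_arg
    return args
```

Passing such a list to subprocess.call without shell=True sidesteps the quoting problem, because each list element reaches blastn as exactly one argument.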

written 2.1 years ago by Sethzard10

What is your other problem? A few notes: (1) you can use Biopython to run blastn; (2) when removing temporary files at the end, you can import os and use os.remove('sequence.fasta') instead of calling subprocess; (3) out of curiosity, why are you using the csv writer? If the point is to open the results in Excel later, you could write them tab-delimited (just my preference): csvfile.write('\t'.join([accession_number, gene_name, fpkm_value]) + '\n')
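A minimal Python 3 sketch of points (2) and (3), assuming you keep the csv module and only switch its delimiter to a tab. The function names write_tsv and remove_if_present are hypothetical, and the row values below are placeholders rather than real results.

```python
import csv
import os
import tempfile


def write_tsv(path, rows):
    """Write a header plus rows as tab-separated text."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["Accession Number", "Gene Symbol", "FPKM Value"])
        writer.writerows(rows)


def remove_if_present(*paths):
    """Delete temporary files, ignoring missing ones (similar to rm -f)."""
    for path in paths:
        try:
            os.remove(path)
        except FileNotFoundError:
            pass


# Demonstration in a scratch directory with placeholder values.
workdir = tempfile.mkdtemp()
report = os.path.join(workdir, "report.tsv")
write_tsv(report, [["ACC0001", "geneX", "12.5"]])
with open(report, newline="") as handle:
    written = list(csv.reader(handle, delimiter="\t"))
remove_if_present(report, os.path.join(workdir, "sequence.out"))
```

Using os.remove keeps the cleanup inside Python, so it does not depend on a shell or on rm being available.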

written 2.1 years ago by