blastn biopython error: positional arguments
7.2 years ago
Sethzard ▴ 20

I'm running blast via biopython and when I run it I get the error:

Error: Too many positional arguments (1), the offending value: alecto[Organism]

The script which seems to be running into trouble is:

command = "blastn -entrez_query Pteropus alecto[Organism] OR Hendra virus[Organism] -db nr -outfmt '6 qseqid sseqid pident' -query sequence.fasta -out sequence.out " + remote_arg

Anyone know where I'm going wrong?

RNA-Seq blast • 2.1k views

It's complaining about the value you pass to -entrez_query. What does the rest of the code that runs the BLAST command line look like? Also, it would help to let everyone know that you're running a remote BLAST. Could you do this locally? Ideally, you would run blastn separately, then parse the output with Biopython and continue with the rest of your script.
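
To illustrate the "run blastn separately, then parse" idea, here is a minimal sketch in plain Python 3 (not Biopython's own parser). It assumes the three-column tabular output requested by -outfmt '6 qseqid sseqid pident' in the question; the function name `parse_tabular_hits` and the sample line are hypothetical placeholders, not real BLAST results.

```python
def parse_tabular_hits(text):
    """Parse tab-separated 'qseqid sseqid pident' lines into dictionaries."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        qseqid, sseqid, pident = line.split("\t")
        hits.append({"qseqid": qseqid, "sseqid": sseqid, "pident": float(pident)})
    return hits


# Hypothetical placeholder line, only here to exercise the parser.
example_output = "query1\tgi|0000|gb|XX000000.1|\t98.50\n"
hits = parse_tabular_hits(example_output)
print(hits[0]["sseqid"], hits[0]["pident"])
```

In a real script the text would come from the file written by -out, for example `open("sequence.out").read()`.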


In an ideal world I would be doing this very differently. However, this is part of a software development project, so I have to run it in Biopython. I know that it doesn't like my queries; I just have no idea what's wrong with them.

This is the code we're using.

for fasta in glob.glob(os.path.join(data_directory, "*_fixed.fasta")):
    print "Blast file %s ..." % fasta

    # Open CSV file to write results
    report = fasta.replace("_fixed.fasta", ".csv")
    with open(report, 'wb') as csvfile:
        writer = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
        writer.writerow(["Accession Number", "Gene Symbol", "FPKM Value"])

        # Iterate via sequences
        for sequence in open(fasta).read().split('>'):
            # Skip empty chunks produced by split('>')
            if not sequence:
                continue

            # Write sequence into file to blast
            with open("sequence.fasta", "w") as fasta_sequence:
                fasta_sequence.write(">%s" % sequence)

            # Blast sequence
            command = "blastn -entrez_query Pteropus Alecto[Organism] OR Hendra virus[Organism] -db nr -outfmt '6 qseqid sseqid pident' -query sequence.fasta -out sequence.out " + remote_arg
            subprocess.call(command, shell=True)

            # Get GI, accession number and FPKM value
            line = open("sequence.out").readlines()
            if line:

                line = line[0].replace("\t", "|").split("|")
                gene_gi = line[2]
                accession_number = line[4]
                fpkm_value = sequence.split("\n")[0].split()[-1]

                # Find gene name
                url = "https://www.ncbi.nlm.nih.gov/sviewer/viewer.fcgi?id=%s&db=nuccore&report=genbank&retmode=text" % gene_gi
                data = urllib2.urlopen(url).readlines()
                i = 0
                for j, t in enumerate(data):
                    t = t.strip()
                    if t.startswith("gene"):
                        i = j + 1
                        break
                gene_name = data[i].strip().replace("/gene=", "").replace('"', "")

                # Save data
                writer.writerow([accession_number, gene_name, fpkm_value])

                # Remove temporary files
                subprocess.call("rm -f sequence.fasta sequence.out", shell=True)

I've now fixed it and run into a different problem. The fix was to put quotation marks around the whole query, so that the shell passes it to blastn as a single argument:

    -entrez_query 'Pteropus alecto[Organism] OR Hendra virus[Organism]'
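
A minimal sketch of why the quotes matter, using Python's shlex module to approximate how a POSIX shell splits the command into arguments. The command strings are shortened versions of the one in the question (only -entrez_query and -db are kept), and the code is written for Python 3, which is an assumption since the script above is Python 2.

```python
import shlex

# Shortened versions of the command from the question, for illustration only.
unquoted = "blastn -entrez_query Pteropus alecto[Organism] OR Hendra virus[Organism] -db nr"
quoted = "blastn -entrez_query 'Pteropus alecto[Organism] OR Hendra virus[Organism]' -db nr"

unquoted_args = shlex.split(unquoted)
quoted_args = shlex.split(quoted)

# Without quotes, -entrez_query only receives "Pteropus"; the next word,
# "alecto[Organism]", is left over as a stray positional argument.
print(unquoted_args[2:4])

# With quotes, the whole query is a single argument.
print(quoted_args[2])
```

The same effect can be had without any shell quoting by passing a list of arguments to subprocess.call and leaving out shell=True, since each list element then reaches blastn as exactly one argument.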


What is your other problem? A few suggestions: (1) You can use Biopython to run blastn. (2) When removing the temporary files at the end, you can import os and use os.remove('sequence.fasta') instead of calling subprocess. (3) Out of curiosity, why are you using the csv writer? If the point is to open the results in Excel later, you could write them tab-delimited (just my preference), e.g. csvfile.write('\t'.join([accession_number, gene_name, fpkm_value]) + '\n').
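
Suggestions (2) and (3) could look roughly like the sketch below. It is written for Python 3 (an assumption, since the script above is Python 2), keeps the csv module but switches its delimiter to a tab, and uses made-up helper names (`write_rows_tab_delimited`, `remove_if_exists`) and placeholder row values that are not from the original script.

```python
import csv
import io
import os


def write_rows_tab_delimited(handle, rows):
    """Write a header plus rows as tab-separated text to an open text handle."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["Accession Number", "Gene Symbol", "FPKM Value"])
    for row in rows:
        writer.writerow(row)


def remove_if_exists(path):
    """Delete a temporary file with os.remove; return True if it was removed."""
    if os.path.exists(path):
        os.remove(path)
        return True
    return False


# Placeholder values, only to show the resulting layout.
buffer = io.StringIO()
write_rows_tab_delimited(buffer, [("XX000000.1", "geneX", "12.5")])
print(buffer.getvalue())
```

With a real file, the handle would come from something like `open(report, "w", newline="")`, and `remove_if_exists("sequence.fasta")` would replace the `rm -f` call.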
