Question: blastn biopython error: positional arguments
Sethzard wrote:

I'm running BLAST via Biopython, and when I run it I get the error:

Error: Too many positional arguments (1), the offending value: alecto[Organism]

The line in the script that seems to be running into trouble is:

command = "blastn -entrez_query Pteropus alecto[Organism] OR Hendra virus[Organism] -db nr -outfmt '6 qseqid sseqid pident' -query sequence.fasta -out sequence.out " + remote_arg

Anyone know where I'm going wrong?

blast rna-seq • 995 views
modified 2.7 years ago • written 2.7 years ago by Sethzard

It's complaining about your Entrez query terms. What is the rest of your code for running the BLAST command line from Biopython? Also, I would let everyone know that you're running a remote BLAST. Could you do this locally? Ideally, you would run blastn separately, parse the output with Biopython, and then continue with the rest of your script.

written 2.7 years ago by st.ph.n

In an ideal world I would be doing this very differently, but this is part of a software development project, so I have to run it in Biopython. I know that it doesn't like my queries; I just have no idea what's wrong with them.

This is the code we're using:

for fasta in glob.glob(os.path.join(data_directory, "*_fixed.fasta")):
    print "Blast file %s ..." % fasta

    # Open CSV file to write results
    report = fasta.replace("_fixed.fasta", ".csv")
    with open(report, 'wb') as csvfile:
        writer = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
        writer.writerow(["Accession Number", "Gene Symbol", "FPKM Value"])

        # Iterate over sequences
        for sequence in open(fasta).read().split('>'):
            # Skip empty entries
            if not sequence:
                continue

            # Write sequence into file to blast
            with open("sequence.fasta", "w") as fasta_sequence:
                fasta_sequence.write(">%s" % sequence)

            # Blast sequence
            command = "blastn -entrez_query Pteropus Alecto[Organism] OR Hendra virus[Organism] -db nr -outfmt '6 qseqid sseqid pident' -query sequence.fasta -out sequence.out " + remote_arg
            subprocess.call(command, shell=True)

            # Get GI, accession number and FPKM value
            line = open("sequence.out").readlines()
            if line:

                line = line[0].replace("\t", "|").split("|")
                gene_gi = line[2]
                accession_number = line[4]
                fpkm_value = sequence.split("\n")[0].split()[-1]

                # Find gene name
                url = "https://www.ncbi.nlm.nih.gov/sviewer/viewer.fcgi?id=%s&db=nuccore&report=genbank&retmode=text" % gene_gi
                data = urllib2.urlopen(url).readlines()
                i = 0
                for j, t in enumerate(data):
                    t = t.strip()
                    if t.startswith("gene"):
                        i = j + 1
                        break
                gene_name = data[i].strip().replace("/gene=", "").replace('"', "")

                # Save data
                writer.writerow([accession_number, gene_name, fpkm_value])

                # Remove temporary files
                subprocess.call("rm -f sequence.fasta sequence.out", shell=True)
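
The script above turns tabs into pipes and then indexes into the combined list, which only works while the query ID itself contains no "|". A minimal sketch of a more explicit parse follows, written for Python 3. It assumes the subject ID has the layout gi|<GI>|<db>|<accession>| that the indices in the script imply; the function name parse_hit and the sample line are made up for illustration.

```python
def parse_hit(line):
    """Split one line of '6 qseqid sseqid pident' output into named fields."""
    # Tabular output (format 6) separates the requested columns with tabs
    qseqid, sseqid, pident = line.rstrip("\n").split("\t")
    # Assumed subject ID layout: gi|<GI>|<db>|<accession>|
    parts = sseqid.split("|")
    return {
        "qseqid": qseqid,
        "gi": parts[1],
        "accession": parts[3],
        "pident": float(pident),
    }


# Made-up example line, not real BLAST output
example = "query1\tgi|12345|gb|AB000001.1|\t98.50\n"
hit = parse_hit(example)
print(hit["gi"], hit["accession"], hit["pident"])
```

Splitting on the tab first keeps the three columns apart, so a pipe inside the query ID no longer shifts the positions of the GI and accession number.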
written 2.7 years ago by Sethzard

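I've now fixed it and run into a different problem. The fix was to put quotation marks around the whole Entrez query, so that the shell passes it to blastn as a single argument instead of splitting it at the spaces: -entrez_query 'Pteropus alecto[Organism] OR Hendra virus[Organism]'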
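
To see why the quotes matter, here is a small sketch (Python 3; in Python 2 the quoting helper is pipes.quote instead of shlex.quote) that uses shlex.split to mimic how a POSIX shell breaks the command string into arguments. The shortened command and the variable names are only for illustration.

```python
import shlex

query = "Pteropus alecto[Organism] OR Hendra virus[Organism]"

# Without quotes, every space starts a new argument
unquoted = "blastn -entrez_query " + query + " -db nr"
# shlex.quote wraps the query in single quotes so it stays in one piece
quoted = "blastn -entrez_query " + shlex.quote(query) + " -db nr"

unquoted_args = shlex.split(unquoted)
quoted_args = shlex.split(quoted)

print(unquoted_args)  # the query is spread over several arguments
print(quoted_args)    # the query is a single argument

# Alternative: build the argument list yourself and skip the shell entirely.
# This list mirrors the options from the script in this thread.
args = [
    "blastn",
    "-entrez_query", query,
    "-db", "nr",
    "-outfmt", "6 qseqid sseqid pident",
    "-query", "sequence.fasta",
    "-out", "sequence.out",
]
```

In the unquoted version only "Pteropus" is taken as the value of -entrez_query, and "alecto[Organism]" is left over as a stray positional argument, which matches the error message. Passing a list like args to subprocess.call without shell=True avoids the quoting question altogether, since no shell is involved in splitting the command.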

written 2.7 years ago by Sethzard

What is your other problem? (1) You can use Biopython to run blastn. (2) When removing temporary files at the end, you can import os and use os.remove('sequence.fasta') instead of calling subprocess. (3) Out of curiosity, I'm wondering why you're using the csv writer. If the point is to open the results in Excel later, you could write them tab-delimited (just my preference): csvfile.write('\t'.join([accession_number, gene_name, fpkm_value]) + '\n')
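
Suggestions (2) and (3) could look roughly like the sketch below, written for Python 3. Instead of joining the fields by hand, it keeps csv.writer and switches its delimiter to a tab. The temporary directory, the report name report.tsv, and the data row are placeholders made up for the example.

```python
import csv
import os
import tempfile

# Work in a scratch directory so the example does not touch real files
workdir = tempfile.mkdtemp()
report = os.path.join(workdir, "report.tsv")
temp_files = [os.path.join(workdir, name)
              for name in ("sequence.fasta", "sequence.out")]

# Create placeholder files standing in for the per-sequence BLAST files
for path in temp_files:
    with open(path, "w") as handle:
        handle.write("placeholder\n")

# Write the results tab-delimited; csv.writer handles the joining
with open(report, "w", newline="") as tsvfile:
    writer = csv.writer(tsvfile, delimiter="\t", lineterminator="\n")
    writer.writerow(["Accession Number", "Gene Symbol", "FPKM Value"])
    writer.writerow(["AB000001.1", "GENE1", "12.5"])  # made-up row

# Remove the temporary files with os.remove instead of calling "rm -f"
for path in temp_files:
    if os.path.exists(path):
        os.remove(path)

with open(report) as handle:
    rows = handle.read().splitlines()
print(rows)
```

The os.path.exists check plays the role of the -f flag in "rm -f": a missing file is skipped instead of raising an error.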

written 2.7 years ago by st.ph.n