Question: How to communicate with an online search engine from a Python or R script?
2.7 years ago, boaty wrote:

Hi guys,

I'm looking for Python or R packages that can communicate with the RepeatMasker online service from a local script. Yes, I could run RepeatMasker locally, but I'd rather not deal with keeping its databases up to date.

I am a lazy person, so instead of uploading a FASTA file and downloading the result manually, I am wondering whether there is a way to send my request to RepeatMasker with its parameters and get the results back automatically. I know Biopython can do this for BLAST because BLAST has an API, but it cannot do it for RepeatMasker.

Thanks

Comment by genomax, 2.7 years ago:

"I am a lazy person"

That took courage :-)
Reply by boaty, 2.7 years ago:

Sorry, I was trying to borrow a saying from the programmer community.
2.0 years ago, boaty wrote:

Yes, I am answering the question I asked 8 months ago. I wanted to write a script that automates the online search and links it to the FISH probe design program of the FISH-quant tools, so that our biologists, who only have Windows, can run the probe design script by themselves.

The main Python tool I used is Selenium, an excellent browser automation tool with bindings for Python and Java. I also used Katalon Recorder, a Firefox plugin that records your actions while you navigate a website and exports them as code, so you can paste them directly into your script to reproduce the same actions. Of course, the Firefox inspector is needed to understand the web pages.

Here is my script. It uploads the sequences from a .fasta file, runs the online repeat search with the cross_match option, and finally retrieves the .masked result.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException
import sys
import time

fasta_path = "your/fasta/file"          # your local FASTA file
with open(fasta_path) as handle:
    seqs = handle.read()

driver = webdriver.Firefox()            # open Firefox
try:
    driver.get("http://www.repeatmasker.org/cgi-bin/WEBRepeatMasker")   # go to RepeatMasker

    # find_element(By...) replaces the find_element_by_* helpers,
    # which newer Selenium releases no longer provide
    driver.find_element(By.NAME, "sequence").send_keys(seqs)
    # choose the cross_match search engine
    driver.find_element(By.XPATH, "(.//*[normalize-space(text()) and normalize-space(.)='Search Engine'])[1]/following::input[3]").click()
    driver.find_element(By.NAME, "submit").click()

    # poll for results: at most 8 attempts, waiting 30 s longer each time
    found = False
    for i in range(1, 9):
        print("waiting " + str(i * 30) + " seconds...")
        time.sleep(30 * i)      # wait for the page to load and the job to finish
        try:
            # if the masked result is available, open it and stop polling
            driver.find_element(By.PARTIAL_LINK_TEXT, ".masked").click()
            found = True
            break
        except NoSuchElementException:
            pass
        # queued request: follow the link to the status page
        headings = driver.find_elements(By.TAG_NAME, "h2")
        if headings and headings[0].text == "Request Queued":
            try:
                driver.find_element(By.PARTIAL_LINK_TEXT, ".html").click()
            except NoSuchElementException as e:
                print(e)
        # no repeats found: nothing to download
        blocks = driver.find_elements(By.TAG_NAME, "pre")
        if blocks and "No repetitive sequences" in blocks[0].text:
            sys.exit("no repetitive sequences were detected")
        driver.refresh()
        print("page refreshed")

    if not found:
        sys.exit("gave up waiting for RepeatMasker results")

    # the masked sequences are shown inside a <pre> element; take its text
    content = driver.find_element(By.TAG_NAME, "pre").text

    # write result to file
    with open("seqs.masked", "w") as out:
        out.write(content)

except Exception as e:
    print(e)

finally:
    driver.quit()               # always close the browser, even on errors

Selenium + Katalon Recorder: a very strong combination!
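
The waiting loop in the script above (sleep 30 s, then 60 s, then 90 s, and so on, checking the page each time) can be pulled out into a small helper so it is easier to test without a browser. This is only a minimal sketch: `poll_until`, `check`, `max_attempts`, `base_delay` and `sleep` are hypothetical names chosen for illustration, not part of Selenium or RepeatMasker.

```python
import time


def poll_until(check, max_attempts=8, base_delay=30, sleep=time.sleep):
    """Call check() until it returns a non-None value or attempts run out.

    The wait before attempt i is base_delay * i seconds, mirroring the
    30 s, 60 s, 90 s ... schedule of the script above.
    Returns the value from check(), or None if every attempt came back empty.
    """
    for attempt in range(1, max_attempts + 1):
        sleep(base_delay * attempt)
        result = check()
        if result is not None:
            return result
    return None


# Demo with a fake check and a fake sleep, so nothing actually waits.
waits = []
answers = iter([None, None, "masked sequence"])
result = poll_until(lambda: next(answers), sleep=waits.append)
print(result, waits)
```

In the Selenium script, `check` would be a function that looks for the `.masked` link and returns the result text when it is there, or `None` while the request is still queued. Passing `sleep` as a parameter is just a convenience for testing the schedule without waiting.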
