Question: how can I combine uniprot's random and limit functionality?
4.8 years ago by
arronslacey240
United Kingdom
arronslacey240 wrote:

Hi - thanks to some help on here I am getting used to querying UniProt. My question is how to use both the "random" and "limit" options in the same query.

 

For example, I have:

http://www.uniprot.org/uniprot/?query=reviewed:yes+AND+organism:9606+AND+annotation:(type:transmem)&format=fasta&random=yes&limit=10

 

With this I am trying to get some transmembrane proteins, randomize the order in which they appear, and then take the first 10. If the random flag is honoured, I would expect to see different(!) proteins each time I run the query. However, I get the same 10 proteins every time, so it seems the random flag is being ignored. Maybe this isn't what it's meant for and I have it wrong.

 

Can I use the random and limit flags together in this way?

 

 

***EDIT***

Following the thread "Is it possible to download a random set of proteins? (fasta files)" and using Elisabeth's answer there, I have taken the UniProt query and wrapped it in a little R script. The result is similar to Pierre's answer in that thread; I went this way because my campus firewall doesn't allow me to connect via MySQL. Here's the script:

 

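library(XML)   # htmlParse, xpathSApply, xmlGetAttr
library(httr)  # GET, content

suppressPackageStartupMessages(library("methods"))
search.term <- "reviewed:yes+AND+organism:9606+AND+annotation:(type:transmem)&random=yes"
url.name <- paste0("http://www.uniprot.org/uniprot/?query=", search.term)
for (i in 1:10) {
  # with random=yes, each request returns the HTML page of one randomly chosen entry
  url.get <- GET(url.name)
  url.content <- content(url.get, as = "text")
  # collect the href of every link on the page that points at a FASTA file
  links <- xpathSApply(htmlParse(url.content, asText = TRUE), "//a[contains(@href, 'fasta')]", xmlGetAttr, "href")
  fasta_link <- paste0("http://www.uniprot.org", links[1])
  # mode = "a" appends, so all sequences end up in one file
  download.file(fasta_link, "myseqs.fasta", quiet = FALSE, mode = "a")
}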

 

 

This downloads 10 transmembrane sequences chosen at random. Each draw is independent, so the same protein can turn up more than once; I haven't quite worked out how to do this without replacement yet, but will update when I have. The download.file "mode" argument is set to "a" (append) because I wanted to collect all the sequences into one file.
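One possible way to sample without replacement is to fetch the full list of matching accessions once, pick n distinct ones locally, and only then download each FASTA record. The sketch below covers just the sampling and URL-building steps. The helper names pick_accessions and fasta_urls are hypothetical, and it assumes the same query can be requested as a plain list of accessions (format=list, one per line) and that each entry is reachable as <accession>.fasta under the base URL used above. Neither assumption is confirmed in this thread.

```r
# Sketch only: helper names are hypothetical, not part of any package.

# Pick up to n distinct accessions from a character vector.
pick_accessions <- function(accessions, n = 10, seed = NULL) {
  # drop empty strings and repeated IDs so every draw is a distinct protein
  accessions <- unique(accessions[nzchar(accessions)])
  if (length(accessions) == 0) return(character(0))
  if (!is.null(seed)) set.seed(seed)
  # sample.int() draws without replacement by default
  idx <- sample.int(length(accessions), min(n, length(accessions)))
  accessions[idx]
}

# Build one FASTA URL per accession (assumed URL layout, see above).
fasta_urls <- function(accessions, base = "http://www.uniprot.org/uniprot/") {
  paste0(base, accessions, ".fasta")
}

# Possible usage (needs network access; assumes format=list is supported):
# query <- "reviewed:yes+AND+organism:9606+AND+annotation:(type:transmem)"
# ids <- readLines(paste0("http://www.uniprot.org/uniprot/?query=", query, "&format=list"))
# for (u in fasta_urls(pick_accessions(ids, 10))) {
#   cat(readLines(u), file = "myseqs.fasta", sep = "\n", append = TRUE)
# }
```

Sampling locally means the server is asked for the ID list only once, and duplicates are ruled out by construction rather than by checking after each download.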

cheers.

4.8 years ago by
Geneva
Elisabeth Gasteiger1.6k wrote:

The &random=yes flag was designed to pick a random entry from a query and to work with the HTML format only, for interactive use. We can look into providing it for other formats as well.

You might want to give this tool a try: http://www.rocrooks.co.uk/biology/uniprot-random.php

 

 

 


Thanks Elisabeth, that would be really useful. In the meantime, I have used your answer from "Is it possible to download a random set of proteins? (fasta files)" and made a little R script which grabs the FASTA file from a random entry page. It can loop through the UniProt query as many times as you like.

written 4.8 years ago by arronslacey240