I have a list of Swiss-Prot IDs. How can I download the complete entries in text format from Swiss-Prot based on these IDs? I need the data for these IDs only.
Your suggestions would be appreciated!!
Assuming there is one ID per line in a file named "ids.txt", do this from a bash shell:
while read -r id; do wget "http://www.uniprot.org/uniprot/$id.txt"; done < ids.txt
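If ids.txt might contain blank lines, stray spaces, or Windows line endings, the loop above will request malformed URLs. Here is a minimal sketch of a more defensive variant. The function name uniprot_urls is my own (hypothetical), and it assumes the same URL pattern as the one-liner; it only builds the URL list and leaves the download to wget.

```shell
# Print one UniProt flat-text URL per ID read from standard input.
# Whitespace (including Windows carriage returns) is stripped from each
# line, and lines that end up empty are skipped.
uniprot_urls() {
    # "|| [ -n "$id" ]" also handles a last line with no trailing newline
    while IFS= read -r id || [ -n "$id" ]; do
        id=$(printf '%s' "$id" | tr -d '[:space:]')
        [ -n "$id" ] || continue
        printf 'http://www.uniprot.org/uniprot/%s.txt\n' "$id"
    done
}
```

You could then run `uniprot_urls < ids.txt | wget -i -`, since `wget -i -` reads the URLs to fetch from standard input. This still makes one request per ID, so it is no faster than the loop.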
For a non-programmatic way of addressing the question...
If you go to the "Retrieve" tab at the top of the UniProt webpage (http://www.uniprot.org/), you can upload a file containing a list of IDs, or just copy and paste them into the query box. You can then get the entries in GFF, "Flat Text", FASTA, or XML format. This help page on the UniProt site gives more information: http://www.uniprot.org/help/batch
This method is much faster than using wget to download fasta files separately.
It won't be "much faster" than wget, unless using the website magically increases the bandwidth of your network connection :)
I was also using wget before to download thousands of sequences separately which was taking ages. Thanks to batch download, it is very fast now.
Excellent solution; I had forgotten that UniProt offers batch retrieval.
If you only want a few sequences, do it manually via http://www.uniprot.org/uniprot/SWISSPROT_ID_YOU_WANT.txt
changing the ID each time.
Or, if you are using Windows, "wget" is the best tool for downloads from the command prompt.
This could help http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc132