Question: (Closed) How to download a database of human protein sequences with subcellular locations?
11 months ago by
nkhan.mscs15seecs (50) wrote:

How can I download a human protein database that gives, for every protein sequence, its subcellular locations? I know that UniProt hosts the protein database and that it can be downloaded in FASTA format. The site also has information on protein subcellular locations.

Is there a built-in tool, or do I need to write a web crawler for that?

modified 11 months ago by Pierre Lindenbaum (122k) • written 11 months ago by nkhan.mscs15seecs (50)

I had the same problem recently. I will link you to my question and close this one, OK?

Extracting Sub-cellular location from Uniprot into tabular format

modified 11 months ago • written 11 months ago by Michael Dondrup (46k)

Hello nkhan.mscs15seecs!

Questions similar to yours can already be found at:

We have closed your question to allow us to keep similar content in the same thread.

If you disagree with this please tell us why in a reply below. We'll be happy to talk about it.

Cheers!

written 11 months ago by Michael Dondrup (46k)
11 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum (122k) wrote:

Use the UniProt XML dump, an XSL transformation stylesheet, and my program XsltStream: http://lindenb.github.io/jvarkit/XsltStream.html
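
For readers who prefer a script over XSLT, here is a minimal Python sketch of the same streaming idea (it is not the XsltStream approach itself). It walks a UniProt XML dump entry by entry and yields the primary accession with the locations listed in the "subcellular location" comment. The namespace and element names are assumptions based on the public UniProt XML schema, and the function name `iter_locations` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the UniProt XML schema.
NS = "{http://uniprot.org/uniprot}"


def iter_locations(source):
    """Yield (accession, [locations]) for each <entry> in a UniProt XML stream.

    `source` is a filename or a binary file object.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != NS + "entry":
            continue
        # The first <accession> is taken as the primary accession.
        accession = elem.findtext(NS + "accession")
        locations = []
        for comment in elem.findall(NS + "comment"):
            if comment.get("type") != "subcellular location":
                continue
            path = f"{NS}subcellularLocation/{NS}location"
            for loc in comment.findall(path):
                if loc.text and loc.text not in locations:
                    locations.append(loc.text)
        yield accession, locations
        # Drop the entry's children so memory use stays low on large dumps.
        elem.clear()
```

To get a tabular file, loop over `iter_locations(gzip.open(path, "rb"))` and print the accession and `"; ".join(locations)` separated by a tab; `path` here stands for whatever compressed XML file you downloaded.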

written 11 months ago by Pierre Lindenbaum (122k)

Hi, in my experience that gets you about half of the annotations; however, the way the UniProt web front end calculates the localization seems more complex. It also assigns a localization to entries that have no comment section, based solely on GO terms. So if you are looking for the 'final verdict' from UniProt, there is no real solution other than screen-scraping, getting the UniProt code (it would be awesome if this were open source; I am not sure why they won't share it), or parsing GO terms to test whether they are derived from the cellular component ontology.
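
The last option, parsing GO terms, can be sketched in Python as follows. This only collects GO cross-references whose term belongs to the cellular component ontology; it does not reproduce whatever logic the UniProt website applies. It assumes that GO cross-references in the UniProt XML appear as `dbReference` elements of type "GO" with a `term` property whose value is prefixed with `C:`, `F:`, or `P:`. The function name `go_cellular_components` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the UniProt XML schema.
NS = "{http://uniprot.org/uniprot}"


def go_cellular_components(entry):
    """Return [(go_id, term_name)] for cellular component GO terms of an <entry>."""
    terms = []
    for ref in entry.iter(NS + "dbReference"):
        if ref.get("type") != "GO":
            continue
        for prop in ref.findall(NS + "property"):
            value = prop.get("value", "")
            # Assumption: "C:" marks the cellular component ontology.
            if prop.get("type") == "term" and value.startswith("C:"):
                terms.append((ref.get("id"), value[2:]))
    return terms
```

One way to use it is as a fallback: call it only for entries whose "subcellular location" comment is missing, and keep the two sources in separate columns so they can be told apart later.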

modified 11 months ago • written 11 months ago by Michael Dondrup (46k)