Question: How to map sub-cellular localisation to entries in a UniProt database FASTA file?
24 months ago, wl284 (UK) wrote:

I have a dataset of proteins that I have BLASTed against the UniProt/Swiss-Prot database.

I'd now like to identify which proteins are likely to have a mitochondrial sub-cellular localisation, based on the sub-cellular localisation of their best BLAST hit in the Swiss-Prot database.

The FASTA headers of the UniProt proteins look like this:

">sp|Q64602|AADAT_RAT Kynurenine/alpha-aminoadipate aminotransferase, mitochondrial OS=Rattus norvegicus GN=Aadat PE=1 SV=1"

I have found a Gene Ontology mapping file (link below), but the FASTA headers don't contain the GO IDs needed to map them: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/external2go/uniprotkb_sl2go

Is there some intermediate file that I need to use and does anyone know where to find it? Any help would be appreciated.

blast sequence • 705 views
modified 24 months ago by Pierre Lindenbaum • written 24 months ago by wl284

24 months ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

Using XSLT:

$ awk -F '|' '/^>/ {printf("%s\n",$2);}' input.fa | while read ACN ; do curl -s "https://www.uniprot.org/uniprot/${ACN}.xml"| xsltproc transform.xsl - ; done

Q64602  Mitochondrion

with transform.xsl:
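
The stylesheet body is not shown here. As a rough stand-in for the same idea, below is a minimal Python sketch rather than the original XSLT. It takes the accession from a FASTA header and reads the sub-cellular location out of a UniProt entry XML document. The function names, the trimmed SAMPLE_XML document, and the assumption that entries use the http://uniprot.org/uniprot namespace with comment elements of type "subcellular location" are illustrative and should be checked against a real response. Fetching the XML is left out on purpose; the text would come from the same URL used in the curl command above.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for UniProt entry XML; verify against a real download.
NS = {"u": "http://uniprot.org/uniprot"}


def accession_from_header(header):
    """Return the accession from a header like '>sp|Q64602|AADAT_RAT ...'."""
    fields = header.lstrip(">").split("|")
    if len(fields) < 3:
        raise ValueError("not a UniProt-style header: %r" % header)
    return fields[1]


def subcellular_locations(xml_text):
    """Return (accession, location) pairs for every entry in the XML text."""
    root = ET.fromstring(xml_text)
    path = ("u:comment[@type='subcellular location']"
            "/u:subcellularLocation/u:location")
    pairs = []
    for entry in root.findall("u:entry", NS):
        # The first <accession> element is taken as the primary accession.
        acc = entry.findtext("u:accession", default="", namespaces=NS)
        for loc in entry.findall(path, NS):
            pairs.append((acc, (loc.text or "").strip()))
    return pairs


# Hypothetical, heavily trimmed entry used only to exercise the parser.
SAMPLE_XML = """<uniprot xmlns="http://uniprot.org/uniprot">
  <entry>
    <accession>Q64602</accession>
    <comment type="subcellular location">
      <subcellularLocation>
        <location>Mitochondrion</location>
      </subcellularLocation>
    </comment>
  </entry>
</uniprot>"""

HEADER = (">sp|Q64602|AADAT_RAT Kynurenine/alpha-aminoadipate "
          "aminotransferase, mitochondrial OS=Rattus norvegicus "
          "GN=Aadat PE=1 SV=1")

if __name__ == "__main__":
    print(accession_from_header(HEADER))
    for acc, loc in subcellular_locations(SAMPLE_XML):
        print("%s\t%s" % (acc, loc))
```

To screen BLAST hits, one would keep the accessions whose location list contains "Mitochondrion" (or a more specific mitochondrial term) and join them back to the query proteins.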

modified 9 months ago by Michael Dondrup • written 24 months ago by Pierre Lindenbaum

Thanks, that's awesome.

written 24 months ago by wl284

I changed the protocol to HTTPS; otherwise the response could be empty, because UniProt has moved to HTTPS and now sends a "document moved" redirect header.

written 9 months ago by Michael Dondrup

Yes, I saw it this morning! :-D https://github.com/lindenb/jvarkit/commit/5f6b66bc05201d2d543e1b1214640dd5c84051f8

written 9 months ago by Pierre Lindenbaum