Question: Uniprot In Gff3 Format
8.2 years ago by
User 9996 (800) wrote:

Where can I find valid versions of the UniProt database (for all isoforms of all genes) in GFF3 format? I'm interested in this for hg18/hg19 and mm9. Thanks.

gene gff uniprot • 2.2k views
modified 8.2 years ago by Jarretinha • written 8.2 years ago by User 9996
8.2 years ago by
Jerven (640) wrote:

Building on Pierre's answer, you can then get each UniProt record in GFF, either one by one or by using batch retrieve to get the entries in one go. With batch retrieve, look for the small link back to UniProt, then download the entries in GFF using the orange download button.

This is GFF but not 100% GFF3: the Sequence Ontology does not have terms for all UniProt features, so those features can't be described in fully valid GFF3. That makes it rather hard to encode UniProt in GFF3.
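To see which feature types a downloaded file actually uses, and so which ones you would have to check against the Sequence Ontology, something like the following could help. It is only a sketch: `gff_feature_types` is a made-up helper name, and it assumes a tab-separated file with nine columns and the feature type in column 3, as in standard GFF.

```shell
# Count how often each feature type (column 3) occurs in a GFF file.
# Comment/pragma lines starting with '#' and short lines are skipped.
# Output: one "type<TAB>count" line per feature type, sorted by type.
gff_feature_types() {
  awk -F'\t' '
    !/^#/ && NF >= 9 { count[$3]++ }
    END { for (t in count) print t "\t" count[t] }
  ' "$1" | LC_ALL=C sort
}

# Example call (file name is hypothetical):
# gff_feature_types P12345.gff
```

Any type in that list without a matching Sequence Ontology term is what keeps the file from being strictly valid GFF3.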

8.2 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum (119k) wrote:
  • go to the UCSC Table Browser:
  • select mammal/human/hg18
  • select genes/UCSC genes/knownGene
  • output format: BED
  • get output

The proteinID column should be the UniProt ID.
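Once the table is saved locally, the IDs can be pulled out by column name rather than by position. A minimal sketch, assuming the output is tab-separated and its first line is a header (possibly prefixed with `#`) that contains a `proteinID` field; `extract_column` is a made-up helper name:

```shell
# Print the distinct, non-empty values of one named column
# from a tab-separated table whose first line is the header.
extract_column() {
  local file="$1" column="$2"
  awk -F'\t' -v col="$column" '
    NR == 1 {
      sub(/^#/, "", $1)                      # drop a leading "#" from the header
      for (i = 1; i <= NF; i++) if ($i == col) idx = i
      if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
      next
    }
    $idx != "" && !seen[$idx]++ { print $idx }
  ' "$file"
}

# Example call (file names are hypothetical):
# extract_column knownGene.tsv proteinID > uniprot_ids.txt
```

Looking the column up by name avoids hard-coding its position, which can differ depending on which fields were selected for output.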

8.2 years ago by
São Paulo, Brazil
Jarretinha (3.3k) wrote:

By taking advantage of Pierre's tip, you'll just need to get the ID list here.

With the list in hand, remove all header/RefSeq things and the second column with:

grep -v "NP_" hgTables | awk '{print $1}' > hgTablesUniProt

Then, get your files (Beware! Loooong list!):

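# UNIPROT_BASE_URL is a placeholder: the URL prefix is missing from the command as posted
while read -r line; do wget "${UNIPROT_BASE_URL}/${line}.gff"; done < hgTablesUniProt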

As Pierre says: That's it!

Just to mention, I've assumed a bash shell. Adding a delay between wget calls would also be polite.
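One possible way to add that delay to the download loop is sketched below. The function names and the `UNIPROT_BASE_URL` variable are hypothetical; the real URL prefix has to be filled in by you, since it is not given above.

```shell
# Print one download URL per accession listed in the ID file.
# base_url is supplied by the caller; blank lines are skipped.
gff_urls() {
  local base_url="$1" id_file="$2" id
  while IFS= read -r id || [ -n "$id" ]; do
    [ -n "$id" ] && printf '%s/%s.gff\n' "${base_url%/}" "$id"
  done < "$id_file"
  return 0
}

# Read URLs from stdin and download them one at a time,
# sleeping for the given number of seconds (default 1) between requests.
fetch_politely() {
  local delay="${1:-1}" url
  while IFS= read -r url; do
    wget --quiet "$url" || echo "failed: $url" >&2
    sleep "$delay"
  done
}

# Example call (UNIPROT_BASE_URL must be set by you):
# gff_urls "$UNIPROT_BASE_URL" hgTablesUniProt | fetch_politely 2
```

wget can also do the pausing itself: writing the URLs to a file and running `wget --wait=2 -i urls.txt` should give much the same effect.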
