Question: Need a comprehensive list of human Gene Ontology IDs and Names.
4.1 years ago by
India
siddharth.avadhanam wrote:

Hi,

I've downloaded the gene_association.goa_human file from the EBI website. I tried running a Python script to filter out GO IDs with "membrane" in the description. This approach doesn't seem to work, because the entry for the target receptor I am looking for doesn't contain "membrane" in the description (though it is a membrane receptor). However, searching QuickGO on the EBI website with the GO ID for the target receptor gives me the correct description in the name field (name: membrane). Is there any way I can get a list of human GO IDs with the appropriate descriptions (the gene_association.goa_human file doesn't work for this)? My aim is to generate a list of GO IDs and gene names that correspond to membrane-expressed proteins.

Thanks
Appreciate any help 
Siddharth 

4.1 years ago by
a.zielezinski wrote:

You can easily generate a list of all human membrane-associated genes using the web service AmiGO2 (Gene Ontology).

  1. Search for a GO record of membrane as a cellular component. I think the accession number is GO:0016020.
  2. In the Associations tab, filter protein products according to a given taxon (Homo sapiens) and source database (e.g. UniProt, Ensembl, RefSeq etc.).
  3. Download the resulting list choosing a format that suits you (e.g. tab-separated file, sequences in FASTA format).

For example, here's a list of all gene products related to membrane.
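For the gene_association.goa_human file mentioned in the question, a similar selection can be sketched locally by matching on the GO ID rather than on a description. In the GAF format, column 5 holds the GO ID and column 10 holds the name of the gene product, not the name of the GO term, which is probably why searching for "membrane" in the description misses membrane receptors. The sketch below is only an illustration: the function name and the sample rows are made up, and an exact match on GO:0016020 will not pick up annotations to more specific child terms, which the AmiGO search does include.

```python
# Column positions in a GAF file (0-based).
SYMBOL_COL = 2      # column 3: DB Object Symbol (gene name)
QUALIFIER_COL = 3   # column 4: Qualifier (may contain NOT)
GO_ID_COL = 4       # column 5: GO ID


def genes_for_go_ids(gaf_lines, go_ids):
    """Return sorted unique (GO ID, gene symbol) pairs annotated to go_ids."""
    pairs = set()
    for line in gaf_lines:
        if line.startswith("!"):          # header and comment lines
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) <= GO_ID_COL:        # skip malformed or empty lines
            continue
        if "NOT" in cols[QUALIFIER_COL].split("|"):
            continue                      # negated annotation, not a match
        if cols[GO_ID_COL] in go_ids:
            pairs.add((cols[GO_ID_COL], cols[SYMBOL_COL]))
    return sorted(pairs)


# Made-up rows for illustration only (truncated to the first nine columns).
sample = [
    "!gaf-version: 2.1\n",
    "UniProtKB\tX00001\tGENEA\t\tGO:0016020\tPMID:0\tIDA\t\tC\n",
    "UniProtKB\tX00002\tGENEB\t\tGO:0005634\tPMID:0\tIDA\t\tC\n",
    "UniProtKB\tX00003\tGENEC\tNOT\tGO:0016020\tPMID:0\tIDA\t\tC\n",
]
membrane_pairs = genes_for_go_ids(sample, {"GO:0016020"})
print(membrane_pairs)

# With the real file, something along these lines should work:
# with open("gene_association.goa_human") as handle:
#     membrane_pairs = genes_for_go_ids(handle, {"GO:0016020"})
```

To cover child terms as well, the set passed as go_ids would need to contain every GO ID below membrane in the ontology, which this sketch does not compute.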

 


Hi. 
Thanks a ton, this is exactly what I was looking for. I am having problems with the download, however. I am unable to download more than 5000 of the results at a time (the prompt on the website says 10,000 lines). How do I go about downloading the next batch of results (5000 - 10000), and so on?

written 4.1 years ago by siddharth.avadhanam

Could you send me a link to the results you're trying to download? I'm asking because I just downloaded the file containing all gene products related to membrane. It's way more than 10,000 lines, and I didn't encounter any prompt saying there's a limit.

written 4.1 years ago by a.zielezinski

http://amigo.geneontology.org/amigo/term/GO:0016020#display-lineage-tab 
That's the results page. When I click the download link, I get the prompt.

written 4.1 years ago by siddharth.avadhanam

You're right: there is a limit of 10,000 lines per download. I don't see a simple way around this limit, but you can download the results in chunks of 10,000 lines. Click the download button to display the first 10,000 lines, then look at the URL in your browser. For example, in my browser it looks like this:

http://golr.geneontology.org/solr/select?defType=edismax&qt=standard&indent=on&wt=csv&rows=10000&start=0&fl=source,bioentity_internal_id,bioentity_label,qualifier,annotation_class,reference,evidence_type,evidence_with,aspect,bioentity_name,synonym,type,taxon,date,assigned_by,annotation_extension_class,bioentity_isoform&facet=true&facet.mincount=1&facet.sort=count&json.nl=arrarr&facet.limit=25&hl=true&hl.simple.pre=%3Cem%20class=%22hilite%22%3E&csv.encapsulator=&csv.separator=%09&csv.header=false&csv.mv.separator=%7C&fq=document_category:%22annotation%22&fq=regulates_closure:%22GO:0016020%22&fq=taxon_closure_label:%22Homo%20sapiens%22&facet.field=source&facet.field=assigned_by&facet.field=aspect&facet.field=evidence_type_closure&facet.field=panther_family_label&facet.field=qualifier&facet.field=taxon_closure_label&facet.field=annotation_class_la​

Notice that there is a start variable in the URL. It sets the number of the first row to be returned. For example, to get the next 10,000 lines, change the value of start from 0 to 10000.
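Editing the URL by hand gets tedious with many pages, so here is a minimal sketch of generating the paged URLs in Python, using only the standard library. The function name page_urls, the shortened example URL, and the total of 25,000 rows are all made up for illustration; in practice you would paste the full URL from your browser and use the real number of results shown on the page.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def page_urls(download_url, total_rows, page_size=10000):
    """Yield copies of download_url with start/rows set for each page."""
    parts = urlsplit(download_url)
    # keep_blank_values keeps empty parameters such as csv.encapsulator=
    params = parse_qsl(parts.query, keep_blank_values=True)
    # Drop the existing paging parameters; repeated keys like fq stay in order.
    params = [(k, v) for k, v in params if k not in ("start", "rows")]
    for start in range(0, total_rows, page_size):
        query = urlencode(params + [("rows", page_size), ("start", start)])
        yield urlunsplit(parts._replace(query=query))


# Shortened, illustrative URL; replace it with the one from your browser.
example_url = (
    "http://golr.geneontology.org/solr/select?wt=csv&rows=10000&start=0"
    "&fq=regulates_closure:%22GO:0016020%22"
)
# 25000 is a placeholder for the real number of results.
paged = list(page_urls(example_url, total_rows=25000))
for url in paged:
    print(url)
```

Each printed URL can then be opened in the browser or fetched with a download tool, and the resulting files concatenated. The query string is re-encoded along the way, so the URLs may look slightly different from the original even though they carry the same parameters.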

Alternatively, you can download all gene products related to membrane using QuickGO from EBI. The approach is the same: enter the GO accession number to see all proteins that are annotated to it.

written 4.1 years ago by a.zielezinski

Thanks a ton. I was agonising over that next step. :) 

written 4.1 years ago by siddharth.avadhanam