Filter Biological Process Related to Mitochondria for a List of Genes
6.9 years ago
Bahrani ▴ 30

I have a list of UniProt IDs with their associated Gene Ontology (GO) biological process annotations, which I obtained from UniProt.org. I am showing only one ID with its associated biological processes, because the annotations for the other genes are lengthy.

O95831    activation of cysteine-type endopeptidase activity involved in apoptotic process; apoptotic DNA fragmentation; apoptotic process; cell redox homeostasis; chromosome condensation; DNA catabolic process; intrinsic apoptotic signaling pathway in response to endoplasmic reticulum stress; mitochondrial respiratory chain complex I assembly; NAD(P)H oxidase activity; neuron apoptotic process; neuron differentiation; oxidoreductase activity, acting on NAD(P)H; positive regulation of apoptotic process; regulation of apoptotic DNA fragmentation.

 

Problem: I need a way to text-mine the biological processes that are related to mitochondria (that is, those in which mitochondria are mentioned). Would regex be a useful way to solve this problem, or are there other approaches that might work?

Expected result: the output I want is the following:

O95831    mitochondrial respiratory chain complex I assembly

Your help is appreciated.

 

biological_process text mining gene ontology • 2.3k views
6.9 years ago
Siva ★ 1.8k

+1 for stating your question very clearly.

You might also want to check MitoMiner. It uses UniProt ID/names as the primary ID. It has several templates (or predefined queries). The linked query is "Show all genes and their corresponding proteins for a specific species that are annotated as mitochondrial according to the Gene Ontology (GO)". You need to edit the query to control what information you want to see in the output. Here you can add the GO annotation field.

ADD COMMENT

This should be helpful! Since my goal is to annotate about 1,000 genes, do you think it would be possible to query more than one UniProt ID at a time to find the corresponding annotations? Or should this be done programmatically?

ADD REPLY

Yes, you can do large-scale analysis. Actually, the query I linked in my previous comment will give you all the human genes that are annotated as mitochondrial by GO (>7,000 records) from Swiss-Prot and TrEMBL. You need to modify the query and output fields to narrow the results to exactly what you want. Then you parse the results, which are provided in several formats, and select the UniProt IDs you are interested in.

Or, you can upload your list of UniProt IDs to MitoMiner and save it as a list. Then you can use that list as a query.
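If you prefer to do the parsing step programmatically, a regex is enough for the simple "where mitochondria is mentioned" filter from the question. Below is a minimal Python sketch, assuming each record looks like the O95831 example above (an accession, then whitespace, then GO terms separated by semicolons). The names `mito_terms` and `filter_records` are made up for illustration; this is not MitoMiner's or UniProt's API, and the `example` record is a shortened copy of the line in the question.

```python
import re

# Matches "mitochondria", "mitochondrial", "mitochondrion", and so on.
MITO = re.compile(r"mitochondri", re.IGNORECASE)


def mito_terms(record):
    """Return (accession, [terms mentioning mitochondria]) for one record.

    Assumed layout: accession, whitespace, then semicolon-separated terms.
    """
    fields = record.strip().split(None, 1)
    accession = fields[0] if fields else ""
    annotation = fields[1] if len(fields) > 1 else ""
    # Collapse stray internal whitespace and drop the trailing full stop.
    terms = (" ".join(t.split()).rstrip(".") for t in annotation.split(";"))
    return accession, [t for t in terms if MITO.search(t)]


def filter_records(records, wanted_ids=None):
    """Yield only records with at least one mitochondrial term.

    wanted_ids is an optional set of accessions to restrict the output to.
    """
    for record in records:
        accession, terms = mito_terms(record)
        if not accession or not terms:
            continue
        if wanted_ids is not None and accession not in wanted_ids:
            continue
        yield accession, terms


# Shortened copy of the record from the question.
example = (
    "O95831    apoptotic process; cell redox homeostasis; "
    "mitochondrial  respiratory chain complex I assembly; "
    "neuron differentiation."
)

for accession, terms in filter_records([example]):
    print(accession, "; ".join(terms), sep="\t")
```

To run it over a whole export, pass an open file instead of the one-element list, since a file object iterates line by line. Keep in mind that this only sees the wording of each term name, not the ontology relationships, so the GO-based MitoMiner query is still the more complete route.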

 

ADD REPLY

Thanks a lot. Your comment is very valuable!

ADD REPLY

You are welcome. Glad to help.

ADD REPLY
