I'm looking to get a file formatted with two columns. One column should contain the actual GO IDs. The second should contain the vocabulary associated with the GO ID that is in the same row. Rows in the file should look something like this:
GO:0000002	mitochondrial genome maintenance
GO:0000003	reproduction
GO:0000009	alpha-1,6-mannosyltransferase activity
GO:0000012	single strand break repair
So I want something like this, but for all GO IDs and their associated terms. How can I do this? I looked at the download options on the GO website and did not see a way to download something like this.
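One way this could be done in R, as a minimal sketch: it assumes the Bioconductor annotation package GO.db is installed, and the output file name is made up for illustration. It pulls every GO ID with its term and writes them as two tab-separated columns (tabs rather than spaces, because the terms themselves contain spaces).

```r
# Assumes GO.db is installed, e.g. via BiocManager::install("GO.db").
# Loading GO.db also attaches AnnotationDbi, which provides select() and keys().
library(GO.db)

# One row per GO ID, with the term name in the second column.
# AnnotationDbi::select is spelled out to avoid clashes with dplyr::select.
go_terms <- AnnotationDbi::select(
  GO.db,
  keys    = AnnotationDbi::keys(GO.db),  # all GO IDs in the database
  columns = "TERM",
  keytype = "GOID"
)

# Write a plain two-column file: GO ID, tab, term.
# "go_id_to_term.tsv" is a hypothetical file name.
write.table(
  go_terms,
  file      = "go_id_to_term.tsv",
  sep       = "\t",
  quote     = FALSE,
  row.names = FALSE,
  col.names = FALSE
)
```

Note that the terms come from whichever GO release the installed GO.db version was built from, so they may lag behind the current ontology on the GO website.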
Thank you, this works for me. I have a follow-up question. I see some files with GO IDs separated into GO IDs for three categories: CC (cellular component), BP (biological process), and MF (molecular function). If I'm not mistaken, all GO terms fall within these three categories. Is there a way to create three separate dataframes, one containing GO IDs belonging to each of the three categories?
EDIT: I looked into it and if I'm not mistaken, what I'm asking for should conform to the categories in the Ontology column of the GO.db dataframe. So I guess I can just use that.
Yes. You can simply modify the select statement:
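As a rough sketch of what that modification could look like (assuming GO.db is installed; the variable names are just for illustration), adding ONTOLOGY to the requested columns gives a column you can filter on:

```r
# Assumes GO.db is installed, e.g. via BiocManager::install("GO.db").
library(GO.db)

# Same select call as for the two-column table, with ONTOLOGY added.
go_table <- AnnotationDbi::select(
  GO.db,
  keys    = AnnotationDbi::keys(GO.db),
  columns = c("TERM", "ONTOLOGY"),
  keytype = "GOID"
)

# One data frame per category. %in% is used instead of == so that any
# row with a missing ONTOLOGY value is dropped rather than kept as NA.
go_bp <- go_table[go_table$ONTOLOGY %in% "BP", ]  # biological process
go_cc <- go_table[go_table$ONTOLOGY %in% "CC", ]  # cellular component
go_mf <- go_table[go_table$ONTOLOGY %in% "MF", ]  # molecular function
```

Filtering on the three codes explicitly, rather than splitting on whatever values happen to be in the column, keeps the result to exactly the three categories asked about.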
Just thought I would come back here and add the exact script that generates a table for each ontology and saves them all to file, in case anyone wants to copy and paste it.
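A minimal sketch of such a script, assuming GO.db is installed. The output file names (GO_BP.tsv and so on) are made up for illustration, and this is not necessarily the exact script referred to above:

```r
# Assumes GO.db is installed, e.g. via BiocManager::install("GO.db").
library(GO.db)

# GO ID, term name, and ontology code for every GO ID in the database.
go_table <- AnnotationDbi::select(
  GO.db,
  keys    = AnnotationDbi::keys(GO.db),
  columns = c("TERM", "ONTOLOGY"),
  keytype = "GOID"
)

# Write one tab-separated file per ontology into the working directory.
for (ont in c("BP", "CC", "MF")) {
  # Keep only this ontology's rows and the two columns of interest.
  ont_table <- go_table[go_table$ONTOLOGY %in% ont, c("GOID", "TERM")]

  out_file <- paste0("GO_", ont, ".tsv")  # hypothetical file name
  write.table(
    ont_table,
    file      = out_file,
    sep       = "\t",
    quote     = FALSE,
    row.names = FALSE
  )
}
```

Each file keeps a header row (GOID, TERM); pass `col.names = FALSE` to `write.table` if you want the bare two columns.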