I downloaded a dataset from http://comppi.linkgroup.hu/downloads (Integrated protein-protein interaction dataset / C. elegans), but it does not label the proteins as essential or not. How could I get these labels?
I found some of them at http://ogee.medgenius.info, but many of them are missing from that site.
A protein is only essential in some context, i.e. what is it essential for?
The reference for C. elegans is WormBase. You could search there for "essential" genes.
Thank you for your reply. Many of the proteins in my dataset do not have a WBGene ID, and even when I find the IDs for these proteins I cannot find the labels on this site either.
Most of the proteins have a gene ID (e.g. WBGene00003576), but for more than half of them I could not find out whether the gene is essential or not, like these ones:
P03949 WBGene00000018
Q23500 WBGene00000040
Q95ZL1 WBGene00000066
...
I really do not know what to do!
I am not sure I understand the problem. WBGene00000040 has no phenotype and lethal phenotypes are reported as not observed for WBGene00000018. However, WBGene00000066 is reported as embryonic lethal.
See my answer below for a way of doing it.
Assuming that for you essential means loss of function is lethal, here is how you can proceed:
- On the WormBase site, go to Tools > Ontology browser.
- In the phenotype ontology (bottom of the page), look for the phenotype "lethal" (variant > development variant > organism development variant > lethal).
- Click on "lethal".
- In the tree at the bottom of the page, the term becomes highlighted in green, and next to it you have the number of gene products annotated with this term.
- Click on this number and you get a list of the corresponding genes.
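Once the gene list from the last step is saved locally, the labelling itself is a set lookup. Here is a minimal sketch in Python, assuming the list has been exported as plain text with one WBGene ID at the start of each line. The export format and the helper names `load_gene_ids` and `label_genes` are assumptions for illustration, not something WormBase prescribes. The example IDs are the three discussed in the comments.

```python
def load_gene_ids(lines):
    """Collect WBGene IDs from text lines (assumes the ID is the first field)."""
    ids = set()
    for line in lines:
        fields = line.split()
        if fields and fields[0].startswith("WBGene"):
            ids.add(fields[0])
    return ids


def label_genes(gene_ids, lethal_ids):
    """Map each gene ID to True if it appears in the lethal set, else False."""
    return {gene: gene in lethal_ids for gene in gene_ids}


# Hypothetical export content; in practice pass an open file handle instead.
lethal_export = ["WBGene00000066\n"]
lethal_ids = load_gene_ids(lethal_export)

my_genes = ["WBGene00000018", "WBGene00000040", "WBGene00000066"]
labels = label_genes(my_genes, lethal_ids)

for gene, is_lethal in labels.items():
    print(gene, "lethal" if is_lethal else "no lethal annotation")
```

Note that `False` here only means the gene is absent from the lethal list. It does not distinguish "lethality tested and not observed" from "no phenotype data at all", so you may want a third label for genes with no phenotype annotation.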
I really appreciate your answer and consideration. I only want to find out whether the WBGene of a specific protein is essential or not, as on this site: http://ogee.medgenius.info/browse/Caenorhabditis%20elegans but that site is missing many of the WBGene IDs.
From this link, the data correspond to only one publication testing ~11000 genes, and essentiality is defined as embryonic lethality or sterility. You can also get this from WormBase by selecting the relevant phenotypes in the way I explained above. The difference is that WormBase data is collected and curated over many different screens and experiments, so it will be more comprehensive.
You'll have to convert your protein IDs to gene IDs.
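The conversion amounts to a lookup from protein accession to WBGene ID. Here is a minimal sketch in Python, assuming you already have such a table as two whitespace-separated columns, like the pairs quoted earlier in the thread. Where the table comes from is up to you; `parse_mapping` and `to_gene_ids` are hypothetical helper names, and `X00000` is a made-up accession used only to show what happens to an unmapped protein.

```python
def parse_mapping(lines):
    """Build a protein-accession -> WBGene ID dict from two-column lines."""
    mapping = {}
    for line in lines:
        fields = line.split()
        # Skip blank lines and rows without a WBGene ID in the second column.
        if len(fields) >= 2 and fields[1].startswith("WBGene"):
            mapping[fields[0]] = fields[1]
    return mapping


def to_gene_ids(proteins, mapping):
    """Split proteins into those with a gene ID and those without one."""
    mapped = {p: mapping[p] for p in proteins if p in mapping}
    unmapped = [p for p in proteins if p not in mapping]
    return mapped, unmapped


table = [
    "P03949 WBGene00000018",
    "Q23500 WBGene00000040",
    "Q95ZL1 WBGene00000066",
]
mapping = parse_mapping(table)

# "X00000" is a placeholder accession that is not in the table.
mapped, unmapped = to_gene_ids(["P03949", "Q95ZL1", "X00000"], mapping)
print(mapped)
print(unmapped)
```

Keeping the unmapped accessions in a separate list makes it easy to see how many proteins still need an ID before any phenotype lookup is attempted.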