Question: Question about the consequence types from VEP
vasilislenis (United Kingdom) wrote, 2.8 years ago:


I'm really new to the field, so my question may be a little naive.

I'm trying to annotate my SNPs using VEP, in order to find the nonsense, synonymous and non-synonymous SNPs in each genome. The organism I'm working on is sheep. What confuses me a little is the tag system that Ensembl uses. For example, "synonymous_variant" is clearly for the synonymous SNPs, but I'm not so sure about the non-synonymous and the nonsense ones. I'm taking "coding_sequence_variant" and "stop_gained", respectively. Am I right? Also, I cannot identify the CNVs. Is there a particular tag for these?

A second issue I faced is that for some gene IDs there is no gene name (symbol tag). Is there any way to take a list of these IDs and look up the names of these genes?

Thank you very much in advance and I'm really sorry for the questions "bombing".

Denise - Open Targets (UK, Hinxton, EMBL-EBI) wrote, 2.8 years ago:

The tag system is based on the Sequence Ontology (SO) consequence terms. Non-synonymous is not an SO term; according to SO, it should be referred to as missense_variant. Check the SO definitions, and a diagram showing the location of each variant type, on the Calculated variant consequences page. Nonsense is known as stop_gained. If you annotate CNVs (larger insertions or deletions, for example), you will get the same SO consequence terms. SO lists some of the consequences for copy_number_variation.

As for your second issue, I'd guess there is no gene name for the sheep gene, but you will still have the Ensembl stable ID, e.g. ENSOARG00000005819. If you send some examples, it will be easier to help.
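
To make the mapping between the informal categories and the SO terms concrete, here is a minimal Python sketch. It assumes you have already extracted the Consequence field of each VEP output line as a string, and that multiple terms in that field are joined by "," (tab-delimited output) or "&" (VCF CSQ field). The names `SO_TO_CATEGORY` and `classify` are made up for illustration and are not part of VEP.

```python
import re

# SO consequence terms named above, mapped to the informal categories
# used in the question. Extend this table if you need more terms.
SO_TO_CATEGORY = {
    "synonymous_variant": "synonymous",
    "missense_variant": "non-synonymous",
    "stop_gained": "nonsense",
}


def classify(consequence_field):
    """Return the informal categories found in one Consequence field.

    Assumption: terms are separated by ',' or '&'.
    Terms not listed in SO_TO_CATEGORY are ignored.
    """
    terms = [t for t in re.split(r"[,&]", consequence_field) if t]
    return sorted({SO_TO_CATEGORY[t] for t in terms if t in SO_TO_CATEGORY})


if __name__ == "__main__":
    for field in ("missense_variant,splice_region_variant",
                  "stop_gained&splice_region_variant",
                  "intron_variant"):
        print(field, "->", classify(field))
```

A variant can carry several consequence terms at once (for example, one per overlapping transcript), so the function returns a list rather than a single label.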


Thank you very much for your help :)

Here are some examples of gene IDs for which the --symbol flag didn't give me a gene name.

ENSOARG00000000134 ENSOARG00000000154 ENSOARG00000000161

Reply written 2.8 years ago by vasilislenis

ENSOARG00000000134, ENSOARG00000000154 and ENSOARG00000000161 are all uncharacterised proteins with no gene name, unlike ENSOARG00000019179, for example, which is named CD96.

Reply written 2.8 years ago by Denise - Open Targets

Thank you very much for your help! So, I will leave them as unknown genes in my code.
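
For anyone doing the same, this is roughly what that fallback could look like in Python. It is only a sketch: `gene_label` is a hypothetical helper, and it assumes a missing symbol shows up as `None`, an empty string, or "-" once the VEP output has been parsed.

```python
def gene_label(gene_id, symbol):
    """Return the gene symbol, or a placeholder built from the stable ID.

    Assumption: a missing symbol is None, an empty string, or '-'.
    """
    if symbol is None or symbol.strip() in ("", "-"):
        return f"unknown ({gene_id})"
    return symbol.strip()


if __name__ == "__main__":
    print(gene_label("ENSOARG00000019179", "CD96"))
    print(gene_label("ENSOARG00000000134", "-"))
```

Keeping the Ensembl stable ID inside the placeholder means the gene can still be looked up later if it gets a name in a future annotation release.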

Reply written 2.8 years ago by vasilislenis