Why does the CADD database have multiple lines for the same mutation/substitution with different gene IDs?
22 months ago
4galaxy77 2.8k

I grepped the position 12_111803962_G_A out of the CADD database and it returned this:

12_111803962_G_A        32      Intergenic      DOWNSTREAM      ENSG00000274697 ENST00000617899
12_111803962_G_A        32      CodingTranscript        NON_SYNONYMOUS  ENSG00000111275 ENST00000261733

I'm confused as to why there are multiple lines for this specific mutation.

If I look it up on NCBI, the mutation is in the ALDH2 gene, as expected. ALDH2 maps to ENSG00000111275, which is the second entry in my grep results above. However, the first entry maps to ENSG00000274697, which is a different gene, MIR6761, which I believe is next door to ALDH2.

This seems very confusing to me: the position 12_111803962 isn't in the MIR6761 gene, so why does it map there?

cadd • 477 views
22 months ago
tomas4482 ▴ 390

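This variant lies downstream of ENSG00000274697, so it is also annotated as a downstream variant of that gene. The annotation comes from Ensembl VEP, which reports a separate consequence for each gene or transcript a variant affects, hence one line per gene.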
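To see this in your own grep output, here is a minimal Python sketch that groups the rows by variant and lists each gene with its consequence. The column names in `FIELDS` are labels I made up for illustration (they are not CADD's official header), and `group_by_variant` is a hypothetical helper, not part of any CADD tooling.

```python
from collections import defaultdict

# The two rows from the question, whitespace-separated.
ROWS = """\
12_111803962_G_A        32      Intergenic      DOWNSTREAM      ENSG00000274697 ENST00000617899
12_111803962_G_A        32      CodingTranscript        NON_SYNONYMOUS  ENSG00000111275 ENST00000261733
"""

# Assumed column labels for illustration only; check the header of your own file.
FIELDS = ("variant", "value", "annotype", "consequence", "gene_id", "transcript_id")


def group_by_variant(text):
    """Return {variant: [record, ...]} with one record per annotation row."""
    grouped = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = dict(zip(FIELDS, line.split()))
        grouped[record["variant"]].append(record)
    return dict(grouped)


annotations = group_by_variant(ROWS)
for variant, records in annotations.items():
    for rec in records:
        # One line per gene/transcript the variant is annotated against.
        print(variant, rec["gene_id"], rec["transcript_id"], rec["consequence"])
```

The same variant shows up once per gene: NON_SYNONYMOUS for ENSG00000111275 (ALDH2) and DOWNSTREAM for ENSG00000274697. If you only care about ALDH2, you can filter the records on `gene_id`.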
