Question: How to calculate the position (c.4879+41) of "Mutation Call: Relative To CDS (e.g. c.4879+41delG)" based on the HG19 RefGene data
2.3 years ago by merobin80

I need to calculate the position (e.g. c.4879+41) of "Mutation Call: Relative To CDS (e.g. c.4879+41delG)" based on the HG19 RefGene file, which has fields such as [txName], [chrom], [strand], [txStart], [txEnd], [cdsStart], [cdsEnd], [exonCount], [exonStarts], [exonEnds], [score], [geneName], [cdsStartStat], [cdsEndStat], [exonFrames].

However, some of my results match and some do not. Does anyone know the algorithm for this calculation?

Below are some examples:

``````
NextGene Output         My Result
c.1-33T>C               c.1-33
c.294A>G                c.294
c.728-47A>G             c.728-47
c.728-45C>G             c.728-45
c.728-39C>A             c.728-39
c.728-39C>G             c.728-39
c.1178-6T>C             c.1178-6
c.1384+28G>A            c.1384+28
c.1385-15_1385-14delCT  c.1385-15
c.1385-15_1385-14delCT  c.1385-14
c.2537-26A>G            c.2537-26
c.3011T>C               c.3011
c.3066A>G               c.3066
c.3517-12T>C            c.3517-12
c.3558T>C               c.3558
c.4161T>C               c.4161
c.4299-48T>G            c.4299-48
c.4745-17C>T            c.4745-17
c.4879+41delG           c.4879+27
c.4879+32G>A            c.4879+32
c.4879+33G>A            c.4879+33
c.4879+43T>C            c.4879+43
c.5651+5C>T             c.5651+5
c.6057C>T               c.6057
c.534G>A                c.534
c.57+41C>A              c.58+41
``````

What is the gene name or transcript name?

To me it looks like you have a list of variants in HGVS notation and you would like to strip the base-change information, leaving only the `c.` position?

``````
c.1385-15_1385-14delCT  c.1385-15
c.1385-15_1385-14delCT  c.1385-14
``````

Why does this entry appear twice, with a different result each time?
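If stripping the base change really is the goal, a minimal Python sketch could look like the one below. The function name `c_positions` and the regular expression are my own assumptions, not part of NextGene or any HGVS library, and the pattern only covers simple `c.` descriptions like the ones in this thread. It also shows why a range such as `c.1385-15_1385-14delCT` yields two positions.

```python
import re

# One HGVS coding position: optional '-' or '*' prefix, digits,
# optional intronic offset such as +41 or -15. (Assumed pattern.)
_POS = r"[-*]?\d+(?:[+-]\d+)?"
_HGVS = re.compile(rf"c\.({_POS})(?:_({_POS}))?")

def c_positions(hgvs):
    """Return the c. position(s) of a variant, without the base change."""
    m = _HGVS.match(hgvs)
    if m is None:
        raise ValueError(f"not a c. description: {hgvs!r}")
    # A range (start_end) gives two positions, a single base gives one.
    return ["c." + p for p in m.groups() if p is not None]

print(c_positions("c.4879+41delG"))
print(c_positions("c.1385-15_1385-14delCT"))
```

A deletion of two bases is written as a range, so it maps to a start and an end position; that would explain one variant showing up on two rows.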

fin swimmer

Thanks for your reply. What I want to do is generate the same result as the NextGene output (left column) by processing only the refGene database, given chr + chrPos + transcript name + gene name. For now, however, all I can do is calculate the c. position first. What I posted is just a few examples for reference. I don't yet have a full picture of how the c. position is calculated; there seem to be many conditions to consider. My algorithm handles some of them but not others. Since I am new to this area, I hope someone can explain in full how to do this. Below are some of the records that do not match, for reference only.

``````
Chr Position  Chr:Position  Gene   NextGene output  My output  RNA Accession
22208030      1:22208030    HSPG2  c.1593-35G>A     c.1655-35  NM_005529
22211217      1:22211217    HSPG2  c.1445+43C>T     c.1508+43  NM_005529
22211222      1:22211222    HSPG2  c.1445+38A>G     c.1508+38  NM_005529
22214127      1:22214127    HSPG2  c.682T>C         c.744      NM_005529
22216574      1:22216574    HSPG2  c.412G>T         c.474      NM_005529
22216604      1:22216604    HSPG2  c.382G>C         c.444      NM_005529
22216877      1:22216877    HSPG2  c.351+43delG     c.414+43   NM_005529
22217108      1:22217108    HSPG2  c.262C>T         c.324      NM_005529
22222633      1:22222633    HSPG2  c.137+35T>G      c.200+35   NM_005529
22222638      1:22222638    HSPG2  c.137+30A>G      c.200+30   NM_005529
``````
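A quick way to see how the two columns relate is to tabulate the difference between the anchor numbers. The sketch below uses only the ten rows above; the helper name `anchor` and the grouping by offset sign are my own choices for illustration.

```python
import re
from collections import defaultdict

# (NextGene output, my output) pairs copied from the table above.
PAIRS = [
    ("c.1593-35G>A", "c.1655-35"),
    ("c.1445+43C>T", "c.1508+43"),
    ("c.1445+38A>G", "c.1508+38"),
    ("c.682T>C", "c.744"),
    ("c.412G>T", "c.474"),
    ("c.382G>C", "c.444"),
    ("c.351+43delG", "c.414+43"),
    ("c.262C>T", "c.324"),
    ("c.137+35T>G", "c.200+35"),
    ("c.137+30A>G", "c.200+30"),
]

ANCHOR = re.compile(r"c\.(\d+)([+-])?")

def anchor(hgvs):
    """Return (exonic anchor number, '+', '-' or 'exonic')."""
    m = ANCHOR.match(hgvs)
    return int(m.group(1)), m.group(2) or "exonic"

# Collect the set of shifts (mine minus NextGene) per kind of position.
shifts = defaultdict(set)
for theirs, mine in PAIRS:
    (a_theirs, kind), (a_mine, _) = anchor(theirs), anchor(mine)
    shifts[kind].add(a_mine - a_theirs)

for kind, values in sorted(shifts.items()):
    print(kind, sorted(values))
```

On these rows the shift is 62 for exonic and `-` positions and 63 for `+` positions, so the two outputs differ by a fixed offset rather than at random. The data alone cannot say which numbering is the right one.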

Hello merobin,

Now I think I've got it :)

I cannot provide an algorithm, but I checked your results using Mutalyzer. The result is that (most of) your c. positions are correct, and the ones from NextGene are not. For variants with `+` you have an off-by-one problem. So e.g. `c.200+30` should be `c.199+30`.
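For what it's worth, here is a rough Python sketch of how I understand the HGVS numbering, not the algorithm NextGene or Mutalyzer actually use. It assumes the refGene coordinates are 0-based and half-open (as in the UCSC tables), takes a 1-based genomic position, and uses made-up names (`Transcript`, `c_position`). It ignores non-coding transcripts and positions outside the transcript. The point for the `+` case is that the anchor is the last base of the upstream exon; with half-open coordinates `exonEnds` already is that base in 1-based terms, which may be where an off-by-one creeps in (that part is a guess).

```python
from dataclasses import dataclass

@dataclass
class Transcript:              # hypothetical container for one refGene row
    strand: str                # "+" or "-"
    cds_start: int             # cdsStart, 0-based inclusive
    cds_end: int               # cdsEnd, 0-based exclusive
    exon_starts: list          # exonStarts, 0-based inclusive
    exon_ends: list            # exonEnds, 0-based exclusive

def _exons(tx):
    """Exons as 1-based inclusive (start, end), in transcript order."""
    exons = [(s + 1, e) for s, e in zip(tx.exon_starts, tx.exon_ends)]
    return exons if tx.strand == "+" else exons[::-1]

def _tx_index(tx, pos):
    """1-based index along the spliced transcript, or None if intronic."""
    offset = 0
    for start, end in _exons(tx):
        if start <= pos <= end:
            return offset + (pos - start + 1 if tx.strand == "+" else end - pos + 1)
        offset += end - start + 1
    return None

def _coding(tx, pos):
    """Coding number (without 'c.') of an exonic position."""
    first = _tx_index(tx, tx.cds_start + 1 if tx.strand == "+" else tx.cds_end)
    last = _tx_index(tx, tx.cds_end if tx.strand == "+" else tx.cds_start + 1)
    t = _tx_index(tx, pos)
    if t < first:
        return str(t - first)          # 5' UTR: there is no position 0
    if t > last:
        return "*" + str(t - last)     # 3' UTR
    return str(t - first + 1)

def c_position(tx, pos):
    if _tx_index(tx, pos) is not None:
        return "c." + _coding(tx, pos)
    exons = _exons(tx)
    for (s1, e1), (s2, e2) in zip(exons, exons[1:]):
        # donor: last base of upstream exon; acceptor: first base of next exon
        donor, acceptor = (e1, s2) if tx.strand == "+" else (s1, e2)
        lo, hi = sorted((donor, acceptor))
        if lo < pos < hi:
            d_donor, d_acceptor = abs(pos - donor), abs(pos - acceptor)
            if d_donor <= d_acceptor:  # a tie in the intron centre goes to '+'
                return f"c.{_coding(tx, donor)}+{d_donor}"
            return f"c.{_coding(tx, acceptor)}-{d_acceptor}"
    raise ValueError("position lies outside the transcript")

# Toy transcript (invented numbers): exons 101-110, 201-210, 301-310.
toy = Transcript("+", 105, 305, [100, 200, 300], [110, 210, 310])
print(c_position(toy, 111), c_position(toy, 200))
```

In the toy example the first intronic base after the exon ending at c.5 comes out as `c.5+1`, not `c.6+1`, and the last intronic base before the next exon as `c.6-1`.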

fin swimmer