I have a dataset of CDS mutations. I want to convert them to amino acid (AA) mutations and obtain their genomic coordinates. Could someone guide me on how to do this? Many thanks.
You can use the data you have as HGVS input for the Ensembl VEP. That will give you amino acid substitutions, genomic coordinates, and a bunch of other information. Because you only have gene names, and not versioned transcript IDs, your variants are somewhat ambiguous: the VEP has to work out which transcript you're referring to, and it may get it wrong. This blog post talks about protein HGVS, but the discussion of transcript choice is relevant here too.
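As a rough sketch of what that could look like in practice, the Python below turns a gene name and a CDS mutation into an HGVS-style string and builds a request against the Ensembl REST `vep/human/hgvs` endpoint. The helper names (`to_hgvs`, `vep_url`, `query_vep`, `summarize`) are made up for illustration, the gene and mutation in the usage note are a placeholder rather than a row from the poster's file, and the response field names read in `summarize` reflect my understanding of the VEP JSON output, so check them against a real response.

```python
import json
import urllib.parse
import urllib.request

# Ensembl REST endpoint for VEP with HGVS input (human).
VEP_HGVS_URL = "https://rest.ensembl.org/vep/human/hgvs/"


def to_hgvs(gene, cds_mutation):
    """Join a gene name and a CDS mutation into 'GENE:c.xxx' notation."""
    gene = gene.strip()
    cds = cds_mutation.strip()
    if not cds.startswith("c."):
        cds = "c." + cds  # assumption: bare mutations are coding (c.) changes
    return f"{gene}:{cds}"


def vep_url(hgvs):
    """Build the request URL, percent-encoding characters such as '>'."""
    return VEP_HGVS_URL + urllib.parse.quote(hgvs, safe=":")


def query_vep(hgvs):
    """Send one HGVS string to VEP and return the parsed JSON (needs network)."""
    req = urllib.request.Request(
        vep_url(hgvs), headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def summarize(record):
    """Pull coordinates and per-transcript amino acid changes from one result.

    Field names are assumptions about the VEP JSON layout; .get() keeps the
    function from failing if a field is absent.
    """
    changes = [
        (tc.get("transcript_id"), tc.get("amino_acids"), tc.get("protein_start"))
        for tc in record.get("transcript_consequences", [])
        if tc.get("amino_acids")
    ]
    return {
        "chrom": record.get("seq_region_name"),
        "start": record.get("start"),
        "end": record.get("end"),
        "aa_changes": changes,
    }
```

For a placeholder input such as `to_hgvs("BRAF", "c.1799T>A")`, you would pass the resulting string to `query_vep` and then run `summarize` on each element of the returned list. Because a gene name can match several transcripts, expect more than one entry in `aa_changes` and pick the transcript you actually mean. For a whole spreadsheet, uploading the HGVS strings to the VEP web interface or running the command-line VEP is likely more practical than one request per row.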
What format are they in? Can you show us a single line of the file, please?
They are in an Excel file with two columns: Gene name and CDS Mutation.
Please see an example below:
And I want to find the AA mutations and genomic coordinates for each of the CDS mutations.
Many thanks, Emily. I will try the VEP tool.
Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. SUBMIT ANSWER is for new answers to the original question.