Question: Annovar And Protein Sequence
9.3 years ago by
Chris1.6k wrote:

I'm using ANNOVAR to map 1000 Genomes (1KG) data onto mRNA transcripts. Since I'm interested in nsSNPs, I would like to know the whole protein sequence. ANNOVAR reports the position and the wild-type (wt) and mutant (mt) residues, but as far as I can tell it gives no information on the full sequence. I parsed the RefSeq mRNA identifiers out of ANNOVAR's output and tried to map them to RefSeq protein sequences. However, the wt amino acid in ANNOVAR's annotation often differs from the residue found at that position in the RefSeq protein sequence. What is the correct approach? How does ANNOVAR determine amino acids: by 'simple' translation of the mutant codon in the RefSeq mRNA file?

amino-acids annovar protein • 2.4k views

I think the answer to your last question is, "yes".

written 9.3 years ago by Sean Davis
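To make "simple translation of the mutant codon" concrete, here is a minimal sketch. It is not ANNOVAR's actual code; it assumes the coding sequence (CDS) has already been cut out of the RefSeq mRNA (no UTRs), that the standard genetic code applies, and that the variant is given as a 1-based position within the CDS. The function names `translate` and `residue_change` are made up for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), amino acids listed
# in TCAG codon order: TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a CDS into one-letter amino acids ('*' = stop, 'X' = unknown codon)."""
    cds = cds.upper().replace("U", "T")
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds) - 2, 3)
    )


def residue_change(cds, cds_pos, alt_base):
    """Return (wt_residue, protein_position, mt_residue) for a single-base change.

    cds_pos is the 1-based position of the changed base within the CDS
    (an assumption of this sketch, not an ANNOVAR convention).
    """
    cds = cds.upper().replace("U", "T")
    idx = cds_pos - 1
    offset = idx % 3                 # position of the base inside its codon
    start = idx - offset             # first base of the affected codon
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    return CODON_TABLE[ref_codon], start // 3 + 1, CODON_TABLE[alt_codon]


if __name__ == "__main__":
    toy_cds = "ATGGCCAAATAA"         # toy sequence, not a real transcript
    print(translate(toy_cds))
    print(residue_change(toy_cds, 5, "T"))
```

If the wt residue computed this way from the mRNA record disagrees with the RefSeq protein record, the mRNA and protein versions being compared are probably not the same.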
9.3 years ago by
Boston, MA USA
Larry_Parnell16k wrote:

This sounds like a case we have often seen: the human reference genome, from which RefSeq mRNA and protein sequences were built, contains minor alleles, in places even homozygous minor alleles. In other words, when the consensus genome of six individuals is used, as was done to build the human reference genome, some minor alleles can end up in the gene models. When many more genomes are examined, however, it becomes clear that certain positions in the reference sequence do not represent the major allele. An extreme case is that some UGT2A and UGT2B gene family members are absent from some Asian populations.

So, you could do the mapping as you describe, but let ANNOVAR overrule RefSeq for single-residue discrepancies, as long as the ANNOVAR residue is based on sampling of many individuals. If you are working with data from a population or individual not of European origin, you may need to consider comparing against a reference genome from that population.
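As a rough illustration of that suggestion, the sketch below takes a RefSeq protein sequence plus the position, wt and mt residues reported by ANNOVAR, and builds full wt and mt protein sequences. Where the RefSeq residue disagrees with the annotated wt residue, the annotation wins and the mismatch is flagged. The function name `build_sequences` and the input layout are hypothetical; parsing ANNOVAR output and fetching RefSeq records are left out.

```python
def build_sequences(refseq_protein, pos, wt, mt):
    """Return (wt_sequence, mt_sequence, agrees) for one annotated substitution.

    refseq_protein: protein sequence from the RefSeq record (one-letter codes)
    pos:            1-based residue position from the annotation
    wt, mt:         wild-type and mutant residues from the annotation
    agrees:         True if RefSeq already has `wt` at `pos`
    """
    if not 1 <= pos <= len(refseq_protein):
        raise ValueError(f"position {pos} is outside the protein (length {len(refseq_protein)})")
    idx = pos - 1
    agrees = refseq_protein[idx] == wt
    # The annotated residue overrules RefSeq at this single position.
    wt_sequence = refseq_protein[:idx] + wt + refseq_protein[idx + 1:]
    mt_sequence = refseq_protein[:idx] + mt + refseq_protein[idx + 1:]
    return wt_sequence, mt_sequence, agrees


if __name__ == "__main__":
    toy_protein = "MAKT"             # toy sequence, not a real RefSeq entry
    print(build_sequences(toy_protein, 2, "A", "V"))   # RefSeq and annotation agree
    print(build_sequences(toy_protein, 2, "G", "V"))   # discrepancy, annotation wins
```

Keeping the `agrees` flag makes it easy to count how many discrepancies there are; if there are many per transcript, a version mismatch between the mRNA and protein records is a more likely cause than minor alleles in the reference.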
