Question: Variant Annotation - Which Transcript(s) Are The Best Representatives Of The Variant?
Biomed (Bethesda, MD, USA) wrote, 7.9 years ago:

I would like to learn which rules one should follow when annotating variants, specifically when selecting the transcripts used for variant annotation. The topic is discussed here, and there are many tools that can annotate a variant starting from chr, position, variant_allele, but most of the downstream annotation depends on which transcript you choose. Many genes have alternative transcripts, and the transcript you choose determines whether the variant is a coding variant, an intronic variant, etc. So how do you choose which transcript should represent your variant?

annotation variant genetics • 3.4k views

Also see the discussion here: http://biostar.stackexchange.com/questions/2992/how-to-assess-the-effect-of-snps-based-on-multiple-transcripts

(comment written 7.9 years ago by Khader Shameer)
Larry_Parnell (Boston, MA, USA) wrote, 7.9 years ago:

Before I get to your main question of transcript selection, I will briefly bring up haplotypes. It may be appropriate to analyze a haplotype rather than a single variant because of high linkage disequilibrium (LD) that allows certain variants to "travel" together. I find analysis of haplotypes to be extremely important when looking for allele-specific effects on RNA folding.

Let's say SNP 1 is A (major allele) or C (minor), while SNP 2 is G (major) or T (minor). If SNPs 1 and 2 are in high (or absolute) LD, the SNP 1 major allele A and the SNP 2 major allele G will often (always) be found together, while SNP 1 allele A and SNP 2 allele T (its minor allele) will be observed rarely or never. Thus, one should analyze two mRNA isoforms: one that is A at SNP 1 and G at SNP 2, and a second that is C at SNP 1 and T at SNP 2.
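To make the two-haplotype idea concrete, here is a minimal Python sketch that builds the two allele-specific sequences one would then feed to a folding tool. The 12-nt sequence, the SNP positions, and the helper name `apply_haplotype` are all made up for illustration; a real analysis would use the actual transcript sequence and phased genotypes.

```python
def apply_haplotype(reference, alleles):
    """Return a copy of `reference` with each 0-based position set to its allele."""
    seq = list(reference)
    for pos, allele in alleles.items():
        seq[pos] = allele
    return "".join(seq)

# Hypothetical transcript fragment: SNP 1 sits at index 2, SNP 2 at index 8.
reference = "GCATTCAAGGCT"
SNP1_POS, SNP2_POS = 2, 8

# Major haplotype: A at SNP 1 together with G at SNP 2.
major_haplotype = apply_haplotype(reference, {SNP1_POS: "A", SNP2_POS: "G"})
# Minor haplotype: C at SNP 1 together with T at SNP 2.
minor_haplotype = apply_haplotype(reference, {SNP1_POS: "C", SNP2_POS: "T"})

print(major_haplotype)
print(minor_haplotype)
```

The rarely observed A/T and C/G combinations are deliberately not generated here, since under high LD they would seldom exist in real individuals.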

Which transcript to analyze can depend on many factors. Perhaps you should analyze the best-known or best-characterized mRNA. Perhaps you should analyze all of the mRNA isoforms that are expressed in the tissue(s) relevant to your phenotype(s) of interest. In other cases, you may want to be blind to expression and phenotype and just analyze all reported mRNA isoforms for that gene.

Regardless of the above, I find folding of the 3'-UTR to be the most complicated and intensive of possible analyses one can perform in annotating variants. This is less the case for 5'-UTRs because they are usually much shorter than the 3'-UTR.

Sean Davis (National Institutes of Health, Bethesda, MD) wrote, 7.9 years ago:

The answer to this question depends heavily on what you want to do downstream of the variant annotation. The most general way of proceeding is to simply annotate against ALL transcripts at a given locus and then make heuristic rules about which variant annotation to use downstream rather than arbitrarily choosing a transcript a priori.
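As a rough illustration of "annotate against all transcripts, then apply a rule", here is a small Python sketch. The transcript models, coordinates, consequence labels, and the severity ranking are all hypothetical and far simpler than what real annotation tools use; the ranking is just one possible heuristic, not a recommendation.

```python
# Hypothetical ranking, most severe first. Real tools use much finer categories.
SEVERITY = ["coding", "UTR", "intronic", "outside"]

def classify(pos, tx):
    """Classify a 1-based genomic position against one toy transcript model.

    Assumes exons are sorted, non-overlapping (start, end) pairs, inclusive.
    """
    exons = tx["exons"]
    if not (exons[0][0] <= pos <= exons[-1][1]):
        return "outside"
    if not any(start <= pos <= end for start, end in exons):
        return "intronic"
    cds_start, cds_end = tx["cds"]
    if cds_start <= pos <= cds_end:
        return "coding"
    return "UTR"

def annotate_all(pos, transcripts):
    """Annotate the position against every transcript at the locus."""
    return {tx["name"]: classify(pos, tx) for tx in transcripts}

def most_severe(annotations):
    """One possible downstream rule: keep the most severe consequence."""
    return min(annotations.items(), key=lambda item: SEVERITY.index(item[1]))

# Two made-up isoforms that share a first exon but differ in the second.
transcripts = [
    {"name": "tx_A", "exons": [(100, 200), (300, 400)], "cds": (150, 350)},
    {"name": "tx_B", "exons": [(100, 200), (500, 600)], "cds": (150, 550)},
]

annotations = annotate_all(320, transcripts)
chosen = most_severe(annotations)
print(annotations)
print(chosen)
```

Keeping the full `annotations` dictionary around means the downstream rule can be swapped (most severe, canonical transcript, tissue-expressed only) without re-running the annotation step.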


I agree with Sean's assessment (+1), but would add that if the variant is linked to a brain phenotype (e.g., cognition or Parkinson's), then you are justified in choosing to annotate only brain-specific transcripts, provided those are known.
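A tissue filter of this kind could be sketched as follows in Python. The expression table, transcript names, the `min_level` threshold, and the fallback to all transcripts when nothing is known for the tissue are assumptions made for the example, not values from any real dataset.

```python
def transcripts_for_tissue(expression, tissue, min_level=1.0):
    """Return transcripts expressed in `tissue` at or above `min_level`.

    Falls back to every transcript when none is known to be expressed
    in that tissue, so no variant is left without an annotation.
    """
    selected = [
        tx for tx, levels in expression.items()
        if levels.get(tissue, 0.0) >= min_level
    ]
    return selected or list(expression)

# Hypothetical per-transcript expression levels (arbitrary units).
expression = {
    "tx_A": {"brain": 12.5, "liver": 0.2},
    "tx_B": {"liver": 8.0},
    "tx_C": {"brain": 3.1},
}

brain_transcripts = transcripts_for_tissue(expression, "brain")
kidney_transcripts = transcripts_for_tissue(expression, "kidney")
print(brain_transcripts)
print(kidney_transcripts)
```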

(comment written 7.9 years ago by Larry_Parnell)