For example, if I have a list of variants like this:
Gene_ID  Transcript    Coding   Amino_Acid_Change
TP53     NM_000546     c.G830T  p.C277F
TP53     NM_001126112  c.G830T  p.C277F
TP53     NM_001126113  c.G830T  p.C277F
TP53     NM_001126114  c.G830T  p.C277F
TP53     NM_001126115  c.G434T  p.C145F
TP53     NM_001126116  c.G434T  p.C145F
TP53     NM_001126117  c.G434T  p.C145F
TP53     NM_001126118  c.G713T  p.C238F
How can I figure out which of these transcripts is the canonical transcript?
Supposedly, transcripts are listed in places like UCSC, RefSeq, and Ensembl. But I have gone through each of these and have not been able to find anything that resembles the information I've listed above (which is ANNOVAR refGene annotation output). The closest I've come is the UCSC Table Browser, which returns a 'knownCanonical' table for UCSC genes, but the output is BED-style and uses identifiers that do not resemble the RefSeq accessions (NM_...) in my data. ANNOVAR's own documentation says that it does not support any differential reporting for canonical transcripts.
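To make the goal concrete, here is a minimal sketch of the filtering step I have in mind, assuming I could somehow obtain a gene-to-canonical-RefSeq-accession lookup. The `CANONICAL` dictionary and the function names `parse_rows` and `pick_canonical` are hypothetical, and the TP53 entry in the lookup is only a placeholder that I have not verified against any of the resources above. Building that lookup is exactly the part I am missing.

```python
# Rows copied from the ANNOVAR refGene output above (whitespace-separated).
ANNOVAR_ROWS = """\
Gene_ID Transcript Coding Amino_Acid_Change
TP53 NM_000546 c.G830T p.C277F
TP53 NM_001126112 c.G830T p.C277F
TP53 NM_001126113 c.G830T p.C277F
TP53 NM_001126114 c.G830T p.C277F
TP53 NM_001126115 c.G434T p.C145F
TP53 NM_001126116 c.G434T p.C145F
TP53 NM_001126117 c.G434T p.C145F
TP53 NM_001126118 c.G713T p.C238F
"""

# HYPOTHETICAL lookup: gene symbol -> canonical RefSeq accession.
# The entry below is an unverified placeholder, not a curated answer.
CANONICAL = {"TP53": "NM_000546"}


def parse_rows(text):
    """Turn the whitespace-separated table into a list of dicts keyed by header."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, body = lines[0], lines[1:]
    return [dict(zip(header, fields)) for fields in body]


def pick_canonical(rows, canonical):
    """Keep only rows whose transcript matches the lookup entry for their gene."""
    return [row for row in rows if canonical.get(row["Gene_ID"]) == row["Transcript"]]


rows = parse_rows(ANNOVAR_ROWS)
selected = pick_canonical(rows, CANONICAL)
for row in selected:
    print(row["Gene_ID"], row["Transcript"], row["Coding"], row["Amino_Acid_Change"])
```

With a trustworthy lookup in place of the placeholder, this would reduce each gene to a single reported change. Genes missing from the lookup are silently dropped here, which may or may not be what I want.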