The answer from the author of ANNOVAR is this:
There has never been a consensus in the field which transcript should
be used to represent a gene when multiple transcripts are available.
The most popular approach is to use the longest transcript nowadays.
However, in the medical genetics field, for certain specific diseases
and specific genes, there are 'canonical' transcripts that everybody
uses by default for historical reasons, and you will need to manually
select this canonical transcript from ANNOVAR output file to
communicate with the rest of the field.
In a way, he is correct, and I feel that the field should increasingly embrace (and report) multiple transcript isoforms, even with the increased data load. Too many variants are reported on isoforms that may have minimal relevance in the tissue of study. Also, for many well-studied genes, like BRCA1, more than 10 isoforms have been identified, whereas for other, less-studied genes we don't yet understand the alternative splicing patterns.
Note that VEP does allow you to output the canonical isoform, but to Ensembl the canonical is, where a CCDS exists for the gene, the isoform with the longest CCDS translation: https://www.ensembl.org/Help/Glossary?id=346
On the last point, researchers even disagree about what canonical means. For some it is the most highly expressed isoform in the tissue being studied, which is not necessarily the longest. At least Ensembl's definition is tissue-independent, so it applies the same way everywhere.
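As for the manual selection step mentioned in the ANNOVAR quote, here is a minimal Python sketch of how it could be scripted. It assumes the annotation string follows the comma-separated `GENE:TRANSCRIPT:exon:cDNA:protein` layout that ANNOVAR writes in its amino acid change column. The `PREFERRED` table and the `pick_transcript` helper are hypothetical names made up for illustration, and the example variant is invented. Which transcript counts as canonical for a given gene is something you have to fill in from your own field's conventions.

```python
# Hypothetical lookup of the transcript your field reports for each gene.
# The entry below is only an illustration; fill in your own choices.
PREFERRED = {"BRCA1": "NM_007294"}


def pick_transcript(aa_change, preferred=PREFERRED):
    """Keep only the annotation on the preferred transcript of each gene.

    aa_change is assumed to be a comma-separated string of
    GENE:TRANSCRIPT:exon:cDNA:protein entries. If no entry matches a
    preferred transcript, all entries are returned unchanged, so nothing
    is silently dropped.
    """
    entries = [e for e in aa_change.split(",") if e]
    chosen = []
    for entry in entries:
        fields = entry.split(":")
        if len(fields) < 2:
            continue  # not in the assumed layout (e.g. "UNKNOWN")
        gene, transcript = fields[0], fields[1]
        # Compare without any version suffix such as ".3"
        if preferred.get(gene) == transcript.split(".")[0]:
            chosen.append(entry)
    return chosen or entries


# Invented example annotation with two isoforms of the same gene
example = (
    "BRCA1:NM_007300:exon10:c.A1G:p.M1V,"
    "BRCA1:NM_007294:exon10:c.A1G:p.M1V"
)
print(pick_transcript(example))
```

Falling back to all entries when nothing matches is a deliberate choice here: it keeps the other isoforms visible for genes where no agreed transcript exists, which is in line with the argument above for reporting multiple isoforms.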