I am picking a single protein sequence for each gene in Ensembl genomes, specifically human and mouse. So far I have selected the longest isoform. This generally works, but when I looked through the results in the UCSC Genome Browser, some of the longest isoforms are clearly not real, judging by the expression and conservation evidence. I can recognize these cases by eye in the Genome Browser, but how can I handle them automatically? Thanks.
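
For concreteness, the longest-isoform selection described above can be sketched roughly as follows. This is only a minimal sketch, assuming an Ensembl peptide FASTA file whose header lines carry a `gene:` field; the function names `read_fasta` and `longest_isoform_per_gene` are hypothetical, and ties are broken by keeping the first record seen.

```python
import re
from typing import Dict, Iterable, Iterator, Tuple

# Assumption: headers look like Ensembl peptide FASTA headers, e.g.
# ">ENSP... pep chromosome:... gene:ENSG... transcript:ENST... ..."
GENE_RE = re.compile(r"\bgene:(\S+)")


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def longest_isoform_per_gene(lines: Iterable[str]) -> Dict[str, Tuple[str, str]]:
    """Map each gene ID to (protein ID, sequence) of its longest isoform."""
    best: Dict[str, Tuple[str, str]] = {}
    for header, seq in read_fasta(lines):
        match = GENE_RE.search(header)
        if match is None:
            continue  # skip records without a gene: field
        gene = match.group(1)
        seq = seq.rstrip("*")  # drop a trailing stop symbol, if present
        # Strictly longer wins, so the first record seen is kept on ties.
        if gene not in best or len(seq) > len(best[gene][1]):
            best[gene] = (header.split()[0], seq)
    return best
```

It can be run on an open file handle, e.g. `longest_isoform_per_gene(open(path))`, where `path` points to a decompressed peptide FASTA file. Length is the only criterion here, which is exactly why spurious long isoforms get picked.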