I'm trying to convert gene IDs to Ensembl IDs using files downloaded from BioMart. It seems that some protein-coding genes do not have an Ensembl ID, but their synonyms do. For instance, no Ensembl ID is available for 'MLL2', but its synonym 'KMT2D' does have one (ENSG00000167548). How can I account for all of the synonyms when converting from one ID type to another? Thanks.
MLL2 is an alias; the gene's official symbol is KMT2D, which is why that's the name associated with the Ensembl ID. What you can do is use the HGNC BioMart to get a list of official gene symbols and their aliases. You can then convert as needed: map each alias to its official symbol, then map the official symbol to its Ensembl ID. Note that some names may map to more than one ID, since gene names and aliases aren't unique.
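As a rough sketch of that lookup in Python: the column names (`symbol`, `aliases`, `ensembl_id`), the comma-separated alias field, and the inline table are assumptions made for illustration, so adjust them to match whatever your actual export contains. Each name maps to a set of IDs because a name can belong to more than one gene.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tab-separated export; the column names and layout are
# assumptions, not the real download format. The single row uses the
# example from the question.
HGNC_TSV = (
    "symbol\taliases\tensembl_id\n"
    "KMT2D\tMLL2\tENSG00000167548\n"
)


def build_lookup(tsv_text):
    """Map every official symbol and alias to a set of Ensembl IDs."""
    lookup = defaultdict(set)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        ensembl_id = row["ensembl_id"].strip()
        if not ensembl_id:
            continue  # no Ensembl ID recorded for this gene
        # The official symbol and each alias all point to the same ID.
        names = [row["symbol"]] + row["aliases"].split(",")
        for name in names:
            name = name.strip()
            if name:
                lookup[name].add(ensembl_id)
    return lookup


def to_ensembl(gene, lookup):
    """Return all Ensembl IDs for a symbol or alias (possibly none or several)."""
    return sorted(lookup.get(gene.strip(), ()))


lookup = build_lookup(HGNC_TSV)
print(to_ensembl("MLL2", lookup))
print(to_ensembl("KMT2D", lookup))
```

If a query returns more than one ID, the name is ambiguous and you will have to decide how to resolve it, for example by preferring matches on the official symbol over matches on an alias.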