Extracting orthology relationships generally follows three main steps: clustering, multiple sequence alignment, and tree generation. Broadly, protein sequences are first clustered using all-against-all BLASTP; the sequences within each cluster are then multiply aligned, and the alignment is used to build a phylogenetic tree. Orthology relationships are then inferred from this tree. This is more or less the pipeline used by eggNOG, TreeFam and EnsemblCompara.
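To make the last step concrete, here is a minimal sketch of how I understand the tree-based inference could work. It assumes a simple species-overlap rule (a node whose two subtrees share a species is treated as a duplication, otherwise as a speciation), which may not be exactly what any of the pipelines above do. The toy tree, the `species|gene` leaf naming, and the function names are all made up for illustration.

```python
from itertools import product

def species_of(leaf):
    # Hypothetical leaf naming convention: "species|gene"
    return leaf.split("|")[0]

def leaves(node):
    # A node is either a leaf (str) or a (left, right) tuple
    if isinstance(node, str):
        return [node]
    left, right = node
    return leaves(left) + leaves(right)

def classify_pairs(node, orthologs=None, paralogs=None):
    """Label every gene pair by the type of its last common ancestor node."""
    if orthologs is None:
        orthologs, paralogs = set(), set()
    if isinstance(node, str):
        return orthologs, paralogs
    left, right = node
    left_leaves, right_leaves = leaves(left), leaves(right)
    left_species = {species_of(x) for x in left_leaves}
    right_species = {species_of(x) for x in right_leaves}
    # Species-overlap rule (an assumption): shared species => duplication
    is_duplication = bool(left_species & right_species)
    target = paralogs if is_duplication else orthologs
    for a, b in product(left_leaves, right_leaves):
        target.add(frozenset((a, b)))
    classify_pairs(left, orthologs, paralogs)
    classify_pairs(right, orthologs, paralogs)
    return orthologs, paralogs

# Toy gene family: one duplication (A1/A2) after the split from fly
toy_tree = ((("human|A1", "mouse|A1"), ("human|A2", "mouse|A2")), "fly|A")
orthologs, paralogs = classify_pairs(toy_tree)

for pair in sorted(sorted(p) for p in orthologs):
    print("ortholog:", pair)
for pair in sorted(sorted(p) for p in paralogs):
    print("paralog: ", pair)
```

In this toy case all five genes would presumably end up in the same similarity cluster, while the tree step labels pairs such as human A1 and mouse A2 as paralogs.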
My question is: what is the purpose of building the phylogenetic tree? If I only want the orthologs of a gene, the clusters should be enough, since they already give me a group of genes based on sequence similarity. What does the phylogenetic tree add, and is it required to get the orthologs of a gene?