I am using codeml to calculate dN/dS ratios on a data set of over 1000 genes. I was hoping to speed up the process by finding a way to calculate the LRT before running codeml, so that I would not have to run it on both the null tree and the alternative tree. Is anyone familiar with a way of doing this, given that the trees will be labelled in the appropriate way for codeml (e.g. #1)?
I think you are confusing two things. You also need to give more info on which models you are fitting (branch, site, or branch-site), but I'll answer generally.
LRTs are used to select among nested models. In codeml, this typically means comparing models with and without a site class that corresponds to an omega > 1; evidence in favor of the model with such a site class allows for the secondary inference of codons under positive selection. This is how you select between M7 and M8 to determine the best-fit model, or look for selection along particular branches, for example. The LRT can NOT be used for non-nested models.
In molecular phylogenetics, the LRT is often used to select between clocklike and non-clocklike trees (e.g., testing the molecular clock hypothesis) and to select between pairs of nested models of nucleotide sequence evolution (say, submodels of GTR). Note that in the second case you will probably be better off testing among sets of models using the AIC, BIC, or a decision-theoretic criterion.
What this means is that in order to test your hypothesis, you'll need the likelihood of that particular model given your tree and data in order to take the likelihood ratio and assess fit. You can't get this likelihood without fitting the model, i.e., running codeml in the first place.
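To make the dependency concrete, here is a minimal sketch of the test itself, assuming you have already run codeml under both models and copied the two log-likelihoods (lnL) out of its output. The function names `chi2_sf` and `lrt` and the lnL values in the usage lines are made up for illustration. The degrees of freedom are supplied by the caller and are assumed to equal the difference in the number of free parameters between the two models (2 for M7 vs M8).

```python
import math

def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square distribution with integer df."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if stat <= 0:
        return 1.0
    y = stat / 2.0
    # Start from a closed form, then step the shape parameter up by 1
    # using Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)               # df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))    # df = 1
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return min(q, 1.0)

def lrt(lnl_null, lnl_alt, df):
    """Return (test statistic, p-value) for two nested models."""
    # A slightly negative statistic can arise from imperfect optimisation;
    # it is clamped to zero here (an assumption of this sketch).
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf(stat, df)

# Hypothetical lnL values, as if read from two codeml runs on one gene.
stat, p = lrt(lnl_null=-1000.0, lnl_alt=-995.0, df=2)
print(f"2*delta(lnL) = {stat:.3f}, p = {p:.4g}")
```

Both arguments to `lrt` come from fitted models, so nothing here can be evaluated before codeml has run on the null and the alternative.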
Does this answer your question?