The CAI ranges from zero to one; it is one if a gene always uses, for each encoded amino acid, the most frequently used synonymous codon in the reference set. The significance cutoff obtained from my reference set (Arabidopsis) is 0.836. If I am getting per-codon values higher than the cutoff, what does that tell me about the specific codons in my sequence?
Codon Adaptation Index (CAI): The codon adaptation index is a measurement of the relative adaptiveness of the codon usage of a gene towards the codon usage of highly expressed genes. The relative adaptiveness (w) of each codon is the ratio of the usage of that codon to that of the most abundant codon for the same amino acid. The CAI is defined as the geometric mean of these relative adaptiveness values. Non-synonymous codons and termination codons (dependent on genetic code) are excluded. CAI values range from 0 to 1, with higher values indicating a higher proportion of the most abundant codons. [Sharp, P. M., and Li, W. H. (1987). The codon adaptation index - a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 15: 1281-1295. Also see: Jansen, R., Bussemaker, H. J., and Gerstein, M. (2003). Revisiting the codon adaptation index from a whole-genome perspective: analyzing the relationship between gene expression and codon occurrence in yeast using a variety of models. Nucleic Acids Res. 31(8): 2242-2251.]
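Written out, the definition quoted above looks roughly like this. The symbols are my own notation, not taken from the quoted text: x_{ij} is the usage of the j-th synonymous codon for amino acid i in the reference set, w_k is the relative adaptiveness of the k-th scored codon in the gene, and L is the number of scored codons.

```latex
w_{ij} = \frac{x_{ij}}{\max_{j} x_{ij}},
\qquad
\mathrm{CAI} = \left( \prod_{k=1}^{L} w_k \right)^{1/L}
             = \exp\!\left( \frac{1}{L} \sum_{k=1}^{L} \ln w_k \right)
```

Because it is a geometric mean, a few codons with very low w pull the CAI down more than they would in a simple average.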
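You've answered your question in your post. Strictly speaking, CAI is not a percentage: it is the geometric mean of the relative adaptiveness of the codons in a gene, so it rises as more of the codons match the most abundant synonymous choice in the reference set for that organism. For instance, here are the choices for isoleucine in E. coli from the Codon Usage Database: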
AUU  Ile  0.50
AUC  Ile  0.40
AUA  Ile  0.09
A value of 1.0 would indicate all Ile codons are the most abundant (AUU). As you move away from 1.0 you have increasingly more of the AUC and AUA choices.
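To make the Ile example concrete, here is a minimal Python sketch that turns the three fractions quoted above into relative adaptiveness values and then takes the geometric mean over a toy list of codons. The function names and the toy codon lists are made up for illustration, and a real calculation would use weights for every amino acid taken from a proper reference set.

```python
import math

# Fractions for isoleucine in E. coli, as quoted above from the Codon Usage Database.
ILE_FRACTIONS = {"AUU": 0.50, "AUC": 0.40, "AUA": 0.09}


def relative_adaptiveness(fractions):
    """Divide each codon's usage by the usage of the most used synonymous codon."""
    top = max(fractions.values())
    return {codon: value / top for codon, value in fractions.items()}


def cai(codons, weights):
    """Geometric mean of the weights of the codons that appear in `weights`.

    Codons without a weight are skipped. Assumes every weight is > 0,
    since log(0) is undefined.
    """
    scored = [weights[c] for c in codons if c in weights]
    if not scored:
        raise ValueError("no scorable codons")
    return math.exp(sum(math.log(w) for w in scored) / len(scored))


weights = relative_adaptiveness(ILE_FRACTIONS)  # AUU -> 1.0, AUC -> 0.8, AUA -> 0.18
all_best = cai(["AUU", "AUU", "AUU"], weights)  # every Ile uses the top codon
mixed = cai(["AUU", "AUC", "AUA"], weights)     # one of each choice

print(weights)
print(all_best, mixed)
```

With only the top codon the result is 1.0; with one of each codon it drops to roughly 0.52, which is the "moving away from 1.0" described above.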
If you are seeing higher CAI values in your test set than the reference cutoff, this means those sequences have a higher proportion of the most used codon choices.
CAI is often equated to expression (high CAI = high expression) although the relationship is not that simple. This graphic from DNA 2.0 (a gene synthesis company) demonstrates cases where CAI does not correlate with gene expression.