As I understand it, you need to count the occurrences of many given trinucleotides. I would suggest reading the sequence once, extracting each trinucleotide with a substring operation, and maintaining a hash with each trinucleotide as the key and its occurrence count as the value. Once the hash is ready, you can look up any trinucleotide directly by its key. I think this will take the least time, because the exome sequence is read only once.
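Here is a minimal Python sketch of that idea, where the hash is a dictionary (a `Counter`). The function name `count_trinucleotides`, the example sequence, and the choice to count overlapping windows are my assumptions, not something from your question. If you only want codons in one reading frame, step by 3 instead of 1.

```python
from collections import Counter


def count_trinucleotides(sequence, step=1):
    """Count 3-base windows in one pass over the sequence.

    step=1 counts overlapping windows (assumed here);
    step=3 counts non-overlapping windows from position 0.
    """
    seq = sequence.upper()
    counts = Counter()
    # Stop 2 bases before the end so every window has 3 bases.
    for i in range(0, len(seq) - 2, step):
        counts[seq[i:i + 3]] += 1
    return counts


# Hypothetical example sequence, just for illustration.
counts = count_trinucleotides("ATGATGCATG")

# Each lookup is a single hash access; missing keys count as 0.
for tri in ("ATG", "CAT", "GGG"):
    print(tri, counts[tri])
```

After the hash is built, checking any number of trinucleotides costs one lookup each, so the sequence never has to be scanned again.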