I have a set of kmer counts coming from 2 groups, each containing 25 RNA-seq samples. I'm interested in identifying kmers whose counts differ between the 2 groups. For example, I have the count of the 3mer AAT for each sample in both groups, and I want to test whether the number of occurrences of this 3mer is significantly different between the 2 groups. Note that I normalize my data to account for the different library sizes of the samples.

Would it be correct to treat this as testing whether two distributions are significantly different (e.g., whether the distribution of AAT counts in the first group differs significantly from the distribution of AAT counts in the second group)? In that case I could use a statistical test such as the Kolmogorov–Smirnov test. Or is there a better approach to tackle this problem?
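To make the question concrete, here is a minimal sketch of the per-kmer comparison I have in mind, under several assumptions: the normalized counts below are made-up placeholder values rather than real data, the function names `ks_statistic` and `ks_permutation_pvalue` are hypothetical, and the p-value is estimated by permuting group labels rather than from the asymptotic Kolmogorov–Smirnov distribution.

```python
import random

def ks_statistic(a, b):
    """Two-sample KS statistic: largest absolute gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    na, nb = len(a), len(b)
    i = j = 0
    best = 0  # integer numerator of the gap, avoids floating-point tie issues
    while i < na and j < nb:
        x = min(a[i], b[j])
        # Step past every value equal to x in both samples (handles ties).
        while i < na and a[i] == x:
            i += 1
        while j < nb and b[j] == x:
            j += 1
        best = max(best, abs(i * nb - j * na))
    return best / (na * nb)

def ks_permutation_pvalue(a, b, n_perm=2000, seed=0):
    """Return (statistic, p-value), with the p-value estimated by shuffling group labels."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical normalized AAT counts for 25 samples per group (placeholder values).
group1 = [10.0 + 0.1 * k for k in range(25)]
group2 = [20.0 + 0.1 * k for k in range(25)]

d, p = ks_permutation_pvalue(group1, group2)
print(f"KS statistic = {d:.3f}, permutation p-value = {p:.4f}")
```

This would be repeated for every kmer, so the resulting p-values would also need a multiple-testing correction.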