k-tuple (k-mer) frequencies can be used to cluster sequences.
I wonder whether this method can still be used when the target sequences are very short, such as reads from next-generation sequencing. Will the results be unstable? (I have read some papers, and all of them use k-tuple frequency only to categorize different meta-data, not to cluster the reads within those meta-data.)
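To make the concern concrete, here is a minimal sketch of what I mean by a k-tuple frequency vector. The function name `ktuple_frequencies`, the ACGT-only alphabet, and the 36 bp example read are my own assumptions for illustration and are not taken from any particular paper or tool. A read of length L contains at most L - k + 1 k-tuples, so for larger k most of the 4^k entries are necessarily zero, which is why I suspect the profile may be unstable for short reads.

```python
from collections import Counter
from itertools import product


def ktuple_frequencies(seq, k):
    """Return normalized k-tuple frequencies as a list of length 4**k.

    Assumption: only windows made entirely of A, C, G, T are counted;
    windows containing other symbols (e.g. N) are ignored.
    """
    seq = seq.upper()
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Fixed ordering of all 4**k possible k-tuples.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        # Sequence shorter than k, or no valid windows.
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


# Hypothetical 36 bp read, used only to illustrate sparsity.
read = "ACGTACGTTGCA" * 3
for k in (2, 4, 6):
    freqs = ktuple_frequencies(read, k)
    nonzero = sum(1 for f in freqs if f > 0)
    print(f"k={k}: {len(freqs)} dimensions, {nonzero} non-zero")
```

Two reads could then be compared with any distance on these vectors (Euclidean, for example), but with so few k-tuples observed per read I am not sure such distances are meaningful.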
Any hints or papers on this would be helpful. Thank you!