I want to find every occurrence of a specific 9-mer (GATCGATGC) in the human genome and export the hits to a BED file with the chromosome, start, and end position of each one. Many tools, such as Jellyfish and DSK, only count k-mer occurrences and cannot export where each k-mer is located. Does anybody know how to do this? Any suggestion would be greatly appreciated.
UCSC BLAT is not ideal for short sequences, but a command-line version of BLAT can be run locally with a small tile size and options such as -minIdentity to export a PSL file. From there, a conversion script like psl2bed can be used to get a BED file for downstream set operations.
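
To make the PSL-to-BED step concrete, here is a minimal Python sketch of the coordinate mapping. It is not psl2bed itself and does not reproduce its full output; it only assumes the standard 21-column PSL layout, where the target name, start, and end sit in columns 14, 16, and 17 and are already 0-based, half-open like BED. The function name psl_to_bed and the example record are made up for illustration.

```python
def psl_to_bed(psl_lines):
    """Yield BED6 lines (chrom, start, end, name, score, strand) from PSL lines.

    Assumes standard 21-column, tab-separated PSL from a nucleotide search.
    """
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip the optional PSL header plus blank or malformed lines:
        # real records start with an integer match count.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        matches, strand, query_name = fields[0], fields[8], fields[9]
        target_name, target_start, target_end = fields[13], fields[15], fields[16]
        # PSL target coordinates are 0-based, half-open, so they map to BED as-is.
        yield "\t".join(
            [target_name, target_start, target_end, query_name, matches, strand]
        )


# Hypothetical PSL record for a 9-base hit on chr1 (values invented for illustration).
example_record = "\t".join([
    "9", "0", "0", "0", "0", "0", "0", "0", "+", "kmer", "9", "0", "9",
    "chr1", "248956422", "1000", "1009", "1", "9,", "0,", "1000,",
])

for bed_line in psl_to_bed(["psLayout version 3", "", example_record]):
    print(bed_line)  # prints: chr1	1000	1009	kmer	9	+
```

The score column here simply carries the PSL match count. If you only need exact hits, filter for records where the match count equals the k-mer length before writing the BED file.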