4.2 years ago
lstephens03
Hi,
I'm trying to build my own version of BLAST (almost from scratch) to run locally. I'm currently at the stage of generating a list of k-mers from the query sequence, and that part works fine.
My question relates to the neighbourhood words. In the classic BLAST setting, are the neighbourhood words generated as one large collection of words? That is, if my alphabet consists of 20 letters and I want words of length k = 3, is a complete list of all possible words (20^3 = 8,000) generated, with each k-mer in the query sequence then compared against every word in that list and a score generated for each pair?
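To make the question concrete, here is a minimal sketch of the brute-force approach I'm describing. It is only an illustration of the idea, not a claim about how BLAST itself does it. The function names, the threshold value, and especially `toy_score` are my own placeholders; a real run would use a substitution matrix such as BLOSUM62 instead of the match/mismatch scores assumed here.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_score(a, b):
    """Placeholder scoring: +5 for a match, -1 for a mismatch.

    This is NOT BLOSUM62; it is an assumption made purely so the
    sketch is self-contained.
    """
    return 5 if a == b else -1


def all_words(k, alphabet=AMINO_ACIDS):
    """Enumerate every possible word of length k (20^k for proteins)."""
    return ["".join(p) for p in product(alphabet, repeat=k)]


def query_kmers(query, k):
    """Overlapping k-mers of the query, in order of position."""
    return [query[i:i + k] for i in range(len(query) - k + 1)]


def neighbourhood(kmer, words, threshold, score=toy_score):
    """Words whose summed per-position score against kmer is >= threshold."""
    return [
        w for w in words
        if sum(score(a, b) for a, b in zip(kmer, w)) >= threshold
    ]


def build_lookup(query, k, threshold, score=toy_score):
    """Map each neighbourhood word to the query positions it came from."""
    words = all_words(k)  # generated once, reused for every query k-mer
    table = {}
    for pos, kmer in enumerate(query_kmers(query, k)):
        for w in neighbourhood(kmer, words, threshold, score):
            table.setdefault(w, []).append(pos)
    return table
```

For example, `build_lookup("ACDE", 3, 9)` would compare each of the two query 3-mers against all 8,000 words, which costs (query length - k + 1) x 20^k score evaluations in total.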
Thanks in advance
Forgive me, but I have to ask:
Why?
Maybe OP doesn't know conda :D. But seriously, I think it's a great thing to try to (re-)build tools such as BLAST. Not necessarily for production purposes, but I think it is the best way to understand how things work (and, with that, to hone one's skill set). Of course, one could say 'read the BLAST papers' (and that text by Gene Myers that goes into a bit more detail on the algorithms and data structures used in BLAST), but to be fair, those don't exactly make for great reading material...