Hi
I am comparing one query sequence (~1800 nt) against a set of target sequences with blastn (BLAST+), looking for similarities.
I want to find all of the similarities that exist, even ones as short as 5 nucleotides. Because BLAST normally doesn't return matches that short, I set the e-value threshold very high and raised the maximum number of target sequences. I have also set the word size to a small number (5).
Is there anything else I can adjust to make sure I have found ALL of the similarities as short as, say, 5 nucleotides?
Thank you very much.
In genomes, a sequence usually only starts to be unique at around 17 bases or longer. If you still want 5-base similarities, you should probably use a script that matches patterns from a file. A basic grep should also do that.
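As a rough illustration of the script idea, here is a minimal Python sketch that reports every exact 5-base match between a query and a set of targets. The function names, the example sequences, and the dict-of-targets layout are all made up for the example, and it only checks the forward strand with exact matches, so treat it as a starting point rather than a replacement for BLAST.

```python
from collections import defaultdict


def index_kmers(seq, k):
    """Map each k-mer in seq to the 0-based positions where it starts."""
    index = defaultdict(list)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def shared_kmers(query, targets, k=5):
    """Yield (target_name, query_pos, target_pos, kmer) for every exact match.

    targets is a hypothetical dict of {name: sequence}; positions are 0-based.
    Only the forward strand is compared.
    """
    query_index = index_kmers(query, k)
    for name, seq in targets.items():
        seq = seq.upper()
        for j in range(len(seq) - k + 1):
            kmer = seq[j:j + k]
            # .get avoids adding empty entries to the defaultdict
            for i in query_index.get(kmer, ()):
                yield name, i, j, kmer


# Toy data, not real sequences
query = "ACGTACGGT"
targets = {"t1": "TTACGTA", "t2": "GGGGG"}
hits = list(shared_kmers(query, targets, k=5))
for hit in hits:
    print(hit)
```

With an ~1800 nt query, expect a very large number of 5-base hits against any sizeable target set, since there are only 1024 possible 5-mers. To cover the other strand as well, you could run the same function again on the reverse complement of the query.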
Thanks Rohit. I am not specifically looking into the uniqueness but thanks for your advice. I think you're right, I should try grep too.