Exact match in blastn
Hi
I've used blastall to find known miRNAs among my small reads. My parameters were:
blastall -p blastn -F F -e 0.01
My problem is that I can't restrict the results to exact matches: it finds target sequences, but they do not all have the same length as the query, and some of them contain mismatches.
Is there a way to require perfect/exact matches in BLAST?
Thanks, friends.
blastall
blastn
mismatch
written 11.0 years ago by Whoknows; updated 3.7 years ago by Ram
Hi,
If the tabular output format of BLAST is enough for you, you can try the following:
blastall -p blastn -F F -e 0.01 -d bank -i query -m8 | gawk '{if(index($3 ,"100.0")>0) { print $0 }}'
It keeps only alignments with 100% identity.
Note: you may have to use 'awk' if you don't have 'gawk'.
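A slightly stricter variant compares the columns numerically instead of searching for the string "100.0". This is only a minimal sketch, assuming the standard 12-column -m8 layout (query, subject, % identity, alignment length, mismatches, gap openings, q.start, q.end, s.start, s.end, e-value, bit score). The function name perfect_hits and the sample rows are made up for illustration.

```shell
# Keep only hits with 100% identity, no mismatches and no gap openings.
# Reads BLAST tabular (-m8) lines on stdin, writes the surviving lines to stdout.
perfect_hits() {
    awk -F'\t' '$3 == 100 && $5 == 0 && $6 == 0'
}

# Made-up example rows: only the read1 row is a perfect alignment.
printf 'read1\tmirA\t100.00\t22\t0\t0\t1\t22\t1\t22\t1e-05\t44.1\nread2\tmirB\t95.45\t22\t1\t0\t1\t22\t1\t22\t0.002\t36.2\n' | perfect_hits
```

With real data the function would sit at the end of the pipeline in place of the gawk command, e.g. `blastall -p blastn -F F -e 0.01 -d bank -i query -m8 | perfect_hits`. Note that this still accepts alignments that cover only part of the query.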
written 11.0 years ago by edrezen; updated 3.7 years ago by Ram
You are better off using something other than BLAST if you want exact matching. An example is this Perl script:
USAGE:
./probe_specificity_test test_file.fasta probe_seq outfile.txt
Something like:
./probe_specificity_test my_mirna.fasta CATGCATCGATGCATCGTA matching_sequences.txt
That will do gapped matching too so you would need to cut that bit out. It's also IUPAC compliant.
Hope this helps.
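For plain exact matching, a lookup table can be enough, with no alignment step at all. The sketch below is not the Perl script mentioned above; it is a hypothetical awk alternative. It assumes both files are FASTA, that case should be ignored, and that U in the miRNA file should be read as T (an assumption for RNA-alphabet miRNA files compared against DNA reads). A read is reported only if its whole sequence is identical to a whole miRNA sequence. The function name and the sample sequences are made up.

```shell
# exact_fasta_matches MIRNA_FASTA READS_FASTA
# Prints "read_id<TAB>mirna_id" for every read identical to a full miRNA sequence.
# If several miRNAs share a sequence, only the last one read is reported.
exact_fasta_matches() {
    awk '
        # Normalise a sequence: upper case, RNA U -> DNA T (assumption).
        function norm(s) { s = toupper(s); gsub(/U/, "T", s); return s }
        # Handle the record collected so far: store it (miRNA file)
        # or look it up (reads file).
        function emit() {
            if (id != "") {
                if (from_db) db[norm(seq)] = id
                else if (norm(seq) in db) print id "\t" db[norm(seq)]
            }
            id = ""; seq = ""
        }
        /^>/ { emit(); id = substr($1, 2); from_db = (FILENAME == ARGV[1]); next }
        { seq = seq $0 }          # sequences may span several lines
        END { emit() }
    ' "$1" "$2"
}

# Made-up example: read1 is identical to mirA, read2 is one base short,
# read3 differs in the last base.
tmp=$(mktemp -d)
printf '>mirA\nACGUACGUAAGGCCUUAACCGG\n' > "$tmp/mirna.fa"
printf '>read1\nACGTACGTAAGGCCTTAACCGG\n>read2\nACGTACGTAAGGCCTTAACCG\n>read3\nACGTACGTAAGGCCTTAACCGA\n' > "$tmp/reads.fa"
exact_fasta_matches "$tmp/mirna.fa" "$tmp/reads.fa"   # prints: read1<TAB>mirA
rm -rf "$tmp"
```

Because the comparison is on whole sequences, length differences and mismatches are both excluded by construction; reads on the reverse strand are not considered in this sketch.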
written 11.0 years ago by Daniel; updated 3.7 years ago by Ram
Thanks a lot for the answer. I don't know what 'gawk' is, but sometimes you get 100% identity when only part of the query is aligned to the target sequence.
I want to find only those sequences that have the same length as the query and no mismatches.
gawk is a tool for simple processing of text files. It is usually available on Linux (on macOS, use awk).
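On the follow-up about partial alignments: the tabular output has no query-length column, so 100% identity alone cannot tell a full-length hit from a partial one. One possible workaround, sketched here under assumptions, is to read the sequence lengths from the query and database FASTA files first, then keep a hit only if it has no mismatches, no gaps, and an alignment length equal to both the query length and the subject length. The function name full_length_hits, the file names, and the sample data are hypothetical. It also assumes the standard -m8 column order, that the IDs in the BLAST output equal the first word of each FASTA header, and that neither FASTA file is empty.

```shell
# full_length_hits QUERY_FASTA DB_FASTA BLAST_M8
# Prints the -m8 rows that align the whole query to the whole subject
# with no mismatches and no gaps.
full_length_hits() {
    awk '
        FNR == 1 { fileno++ }               # 1 = queries, 2 = database, 3 = BLAST rows
        fileno < 3 {
            # FASTA input: accumulate the length of each sequence.
            if (/^>/) { name = substr($1, 2); next }
            len[fileno, name] += length($0)
            next
        }
        # BLAST rows: $1 query, $2 subject, $4 alignment length,
        # $5 mismatches, $6 gap openings.
        $5 == 0 && $6 == 0 && $4 == len[1, $1] && $4 == len[2, $2]
    ' "$1" "$2" "$3"
}

# Made-up example: only read1 vs mirA covers both sequences completely.
tmp=$(mktemp -d)
printf '>read1\nACGTACGTAAGGCCTTAACCGG\n>read2\nTTGACCAGTTGACCAGTTGACC\n' > "$tmp/query.fa"
printf '>mirA\nACGTACGTAAGGCCTTAACCGG\n>mirB\nATTGACCAGTTGACCAGTTGACCA\n' > "$tmp/db.fa"
printf 'read1\tmirA\t100.00\t22\t0\t0\t1\t22\t1\t22\t1e-05\t44.1\n'  > "$tmp/hits.m8"
printf 'read2\tmirB\t100.00\t22\t0\t0\t1\t22\t2\t23\t1e-05\t44.1\n' >> "$tmp/hits.m8"
printf 'read2\tmirA\t100.00\t18\t0\t0\t3\t20\t3\t20\t0.002\t36.2\n' >> "$tmp/hits.m8"
full_length_hits "$tmp/query.fa" "$tmp/db.fa" "$tmp/hits.m8"
rm -rf "$tmp"
```

In the sample, the read2 vs mirB row is dropped because the subject is longer than the alignment, and the read2 vs mirA row is dropped because only 18 of the 22 query bases are aligned.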