8.4 years ago
Whoknows ▴ 950

Hi

I used blastall to identify known miRNAs among my small RNA reads. My parameters were:

blastall -p blastn -F F -e 0.01


My problem is that I cannot restrict the results to exact matches. BLAST finds target sequences, but some of them differ in length from the query and some contain mismatches.

Is there a way to make BLAST report only perfect (exact) matches?

Thanks friends.

8.4 years ago
edrezen ▴ 730

Hi,

If the tabular output format of BLAST is enough for you, you can try the following:

blastall -p blastn -F F -e 0.01 -d bank -i query -m8 | gawk '{if(index($3,"100.0")>0) { print$0}}'


It will keep only the alignments with 100% identity.

Note: you may have to use 'awk' if you don't have 'gawk'.

Thanks a lot for the answer. I don't know what 'gawk' is. Also, a hit can have 100% identity while only part of the query is aligned to the target sequence. For example:

Query : 21 nt
Target : 25 nt
Exact matched : Query (1,18) with 100% similarity


I want to find only those sequences that have the same length as the query and no mismatches.

gawk (GNU awk) is a tool for simple processing of text files. It is usually available on Linux; on macOS, use awk instead.
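For the full-length requirement raised in the comment above, one possible approach is to post-filter the -m8 table against the sequence lengths, since the table itself does not report them. The Python sketch below assumes the standard 12-column -m8 layout (query id, subject id, % identity, alignment length, mismatches, gap openings, ...) and keeps a hit only when it has no mismatches, no gaps, and an alignment length equal to both the query and the target length. The function names and the demo records are made up for illustration; in practice you would pass open file handles for your query FASTA, your miRNA FASTA and the BLAST output.

```python
import io


def fasta_lengths(handle):
    """Return {sequence_id: length} for a FASTA file-like object."""
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header, as BLAST does in -m8 output.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def full_length_exact_hits(m8_lines, query_len, subject_len):
    """Yield -m8 rows with no mismatches or gaps that span both sequences."""
    for line in m8_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        qid, sid = fields[0], fields[1]
        aln_len = int(fields[3])
        mismatches = int(fields[4])
        gap_opens = int(fields[5])
        if (mismatches == 0 and gap_opens == 0
                and aln_len == query_len[qid] == subject_len[sid]):
            yield line.rstrip("\n")


# Made-up demo data: a 21 nt read, and targets of 21 nt and 25 nt.
queries = io.StringIO(">read1\nACGTACGTACGTACGTACGTA\n")
targets = io.StringIO(
    ">mirA\nACGTACGTACGTACGTACGTA\n"
    ">mirB\nTTTTACGTACGTACGTACGTAGGCC\n"
)
hits = [
    "read1\tmirA\t100.00\t21\t0\t0\t1\t21\t1\t21\t1e-06\t42.1",
    "read1\tmirB\t100.00\t17\t0\t0\t1\t17\t5\t21\t0.002\t34.2",
]

kept = list(full_length_exact_hits(hits, fasta_lengths(queries),
                                   fasta_lengths(targets)))
for row in kept:
    print(row)
```

Only the read1/mirA row survives here: the second row is 100% identical but covers just part of the query and of a longer target, which is the case described in the comment.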

8.4 years ago
Daniel ★ 3.9k

You are better off using something other than BLAST if you want exact matching. One example is this Perl script.

USAGE:

./probe_specificity_test test_file.fasta probe_seq outfile.txt


Something like:

./probe_specificity_test my_mirna.fasta CATGCATCGATGCATCGTA matching_sequences.txt


The script also does gapped matching, so you would need to remove that part. It also handles IUPAC ambiguity codes.
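If you only need identical sequences of identical length, a plain dictionary lookup is enough and needs no aligner at all. Here is a minimal Python sketch of that idea. It is not the Perl script above, and the function names and example sequences are hypothetical. It assumes the miRNA file may be written in RNA letters (U) while the reads use T, so both are normalised before comparison.

```python
def read_fasta(handle):
    """Yield (sequence_id, sequence) pairs from a FASTA file-like object."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def normalise(seq):
    """Upper-case the sequence and write it in DNA letters (U -> T)."""
    return seq.upper().replace("U", "T")


def exact_matches(reads, mirnas):
    """Yield (read_id, mirna_id) for reads identical to a miRNA over its full length."""
    by_sequence = {}
    for name, seq in mirnas:
        by_sequence.setdefault(normalise(seq), []).append(name)
    for read_name, read_seq in reads:
        # A dictionary hit means same length and zero mismatches by construction.
        for mirna_name in by_sequence.get(normalise(read_seq), []):
            yield read_name, mirna_name


# Hypothetical example: r1 is identical to the miRNA, r2 is two bases shorter.
mirnas = [("mir_example", "UGAGGUAGUAGGUUGUAUAGUU")]
reads = [
    ("r1", "TGAGGTAGTAGGTTGTATAGTT"),
    ("r2", "TGAGGTAGTAGGTTGTATAG"),
]
for read_id, mirna_id in exact_matches(reads, mirnas):
    print(read_id, mirna_id)
```

With real files you would pass `read_fasta(open("reads.fasta"))` and `read_fasta(open("my_mirna.fasta"))` instead of the lists; those file names are placeholders.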

Hope this helps.