I set up a BLAST search as part of an automated typing program.
I have one question: how can I retrieve the original sequences from the output file?
/shared/MiSeq/BLAST/ncbi-blast-2.2.29+/bin/blastn -num_alignments 500 -word_size 50 -db ./test -query ./test.fasta -out ./test.out -outfmt 5
The output file is in XML format because of -outfmt 5.
I convert these results and process the data programmatically.
But I wonder: is there any way to get the "original sequence" in the output? Or is there any idea for how to handle the cases below?
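For context, a minimal sketch of how the XML could be read with Python's standard xml.etree.ElementTree. It assumes the usual BLAST XML element names (Iteration_query-len, Hit_len, Hsp_query-from, Hsp_hit-from, and so on); the function name parse_hsps and the dictionary keys are hypothetical. Note that Hsp_qseq and Hsp_hseq hold only the aligned portion, so the full original sequence would still have to come from the query FASTA file.

```python
import xml.etree.ElementTree as ET


def parse_hsps(xml_text):
    """Return one dict per HSP with the coordinates needed to judge coverage.

    parse_hsps and the dict keys are illustrative names, not part of BLAST.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for iteration in root.iter("Iteration"):
        query_id = iteration.findtext("Iteration_query-def")
        query_len = int(iteration.findtext("Iteration_query-len"))
        for hit in iteration.iter("Hit"):
            hit_id = hit.findtext("Hit_def")
            hit_len = int(hit.findtext("Hit_len"))
            for hsp in hit.iter("Hsp"):
                rows.append({
                    "query_id": query_id,
                    "query_len": query_len,
                    "hit_id": hit_id,
                    "hit_len": hit_len,
                    # 1-based, inclusive coordinates of the aligned region
                    "query_from": int(hsp.findtext("Hsp_query-from")),
                    "query_to": int(hsp.findtext("Hsp_query-to")),
                    "hit_from": int(hsp.findtext("Hsp_hit-from")),
                    "hit_to": int(hsp.findtext("Hsp_hit-to")),
                    # aligned portions only, not the full original sequences
                    "qseq": hsp.findtext("Hsp_qseq"),
                    "hseq": hsp.findtext("Hsp_hseq"),
                })
    return rows
```

With the command above, this could be called as parse_hsps(open("./test.out").read()).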
I have a raw sequence like the one below,
where the "TTTTTTTTTTTTT" positions are Exon2.
I also have a reference like the one below.
If I align the two, the result looks like this:
-----GTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTG-----
      ||||||||||||||||||||||||||||||||
AAAAACTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTCAAAAA
The BLAST search result will be a 100% match, like this:
TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT
||||||||||||||||||||||||||||||||
TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT
In this case I want to determine whether the raw sequence amplified 100% of that database segment.
Here, even though the 2 mismatched bases at the ends are trimmed from the alignment, Exon2 is actually fully amplified.
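One possible check, sketched under the assumption that the raw read is the BLAST query and the database segment is the subject: compare the unaligned flanks of the read with the unaligned ends of the segment. If the read still has at least as many bases on each side as the segment is missing, the read physically spans the whole segment even though BLAST trimmed the mismatched ends. The function name spans_whole_subject is hypothetical, and all coordinates are 1-based and inclusive, as in the XML output.

```python
def spans_whole_subject(query_from, query_to, query_len,
                        hit_from, hit_to, hit_len):
    """Return True if the read extends past both ends of the subject.

    Assumes 1-based inclusive coordinates with query_from <= query_to.
    A minus-strand hit is recognised by hit_from > hit_to.
    """
    # Bases of the read left unaligned on each side of the alignment
    left_flank = query_from - 1
    right_flank = query_len - query_to

    if hit_from <= hit_to:
        # Plus strand: read start lines up with the low end of the subject
        missing_left = hit_from - 1
        missing_right = hit_len - hit_to
    else:
        # Minus strand: read start lines up with the high end of the subject
        missing_left = hit_len - hit_from
        missing_right = hit_to - 1

    return left_flank >= missing_left and right_flank >= missing_right


# Example from the alignment above: a 44-base read, a 34-base segment,
# and one mismatched base trimmed at each end of the segment.
print(spans_whole_subject(7, 38, 44, 2, 33, 34))
```

This only compares lengths; it does not re-check whether the trimmed flanking bases match, so a stricter version would compare them against the original sequences as well.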
However, in some cases, like the example below, not every segment is 100% amplified:
Intron 1                     Exon2                       Intron 2
TTTTTTTTTTTTTTTTTTTTTTTTTTT  AAAAAAAAAAAAAAAAAAAAAAAAAA  TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT
                      TTTTT  AAAAAAAAAAAAAAAAAAAAAAAAAA  TTTTT
Here only Exon2 is fully amplified. I'd like to retrieve these fully amplified sequences (that is, to know which segments are fully amplified).
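If the segment boundaries on the reference are known, one way to sketch this is to test which segments lie entirely inside the aligned range of the reference. The segment table, its coordinates, and the function name fully_covered_segments are hypothetical; the lengths simply mirror the example above (27, 26, and 30 bases), with the read covering the last 5 bases of Intron 1 and the first 5 bases of Intron 2.

```python
def fully_covered_segments(segments, hit_from, hit_to):
    """Return names of segments lying entirely within the aligned range.

    segments maps a name to (start, end) on the reference, 1-based inclusive.
    hit_from and hit_to may be in either order (minus-strand hits descend).
    """
    lo, hi = min(hit_from, hit_to), max(hit_from, hit_to)
    return [name for name, (start, end) in segments.items()
            if lo <= start and end <= hi]


# Hypothetical annotation of the reference in the example
segments = {
    "Intron 1": (1, 27),
    "Exon2": (28, 53),
    "Intron 2": (54, 83),
}

# The read aligns to reference positions 23..58
print(fully_covered_segments(segments, 23, 58))  # ['Exon2']
```

This requires the reference to be stored with its introns, so that the hit coordinates can be compared with the annotated boundaries.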
I am thinking of getting this by comparing against the original sequences. Does anyone have a good idea? Thank you.