blast with large deletion
4.7 years ago
haewon • 0

Hi,

I'm trying to align sequences that contain a large deletion (~150 bp) using command-line blastn.

The subject is 266 bp and the query is about 115 bp. Blastn can align the query to either the 5' end or the 3' end of the subject, but it cannot produce a single alignment that spans the large gap. I tried setting the gap opening and gap extension penalties to 0, but that didn't help. Does anyone know how to run blastn so that it returns a single alignment with a large gap? My command and example sequences are below.

blastn -db ~/data/raw/GSKOampliconNGS/IntronGSgene.fa -query test.fa -out test_results.out -max_hsps 1 -gapopen 0 -gapextend 0

subject: ttggccttattctaaaggtcaacatgttctttctagtgggaattccaaataggaccctgtgaaggaatccgcatgggagatcatctctgggtggcccgtttcatcttgcatcgagtatgtgaagactttggggtaatagcaacctttgaccccaagcccattcctgggaactggaatggtgcaggctgccataccaactttagcaccaaggccatgcgggaggagaatggtctgaagtaagtagcttcctctggagccatctttat

query: ATAAAGATGGCTCCAGAGGAAGCTACTTACTTCAGACCATTCTCCTCCCGGATTCCTTCACAGGGTCCTATTTGGAATTCCCACTAGAAAGAACATGTTGACCTTTAGAATAAGGCCAA

Thanks,

blast deletion • 1.4k views
4.7 years ago
h.mon 35k

BLAST is a local aligner, so it discards alignment regions whose score falls below a threshold; a gap of ~150 bp will usually split the hit into separate HSPs instead of being bridged. You can try lowering the gap opening and gap extension penalties, but ultimately BLAST is not the right tool for what you want. Try a multiple sequence alignment program such as mafft or muscle; you will probably get better results.
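To illustrate why a global aligner behaves differently, here is a minimal pure-Python sketch of global alignment with affine gap penalties (the Gotoh variant of Needleman-Wunsch). It is not what mafft or muscle run internally; the function names `global_align_affine` and `reverse_complement`, the scoring values, and the toy sequences are all assumptions made for the example. Because a gap of length k costs one opening penalty plus k extensions, one long deletion is cheaper than several short gaps, so the deletion comes out as a single gap. Note also that the posted query appears to be the reverse complement of the two ends of the subject, so it would need to be reverse-complemented before being given to an aligner that does not search both strands.

```python
NEG = float("-inf")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T, either case)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]

def global_align_affine(query, subject, match=2, mismatch=-3,
                        gap_open=-5, gap_extend=-1):
    """Global alignment with affine gaps; a gap of length k costs
    gap_open + k * gap_extend. Scoring values are illustrative only.
    Returns (score, aligned_query, aligned_subject)."""
    n, m = len(query), len(subject)
    # M: last column pairs two bases; X: gap in subject; Y: gap in query.
    M = [[NEG] * (m + 1) for _ in range(n + 1)]
    X = [[NEG] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + i * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + j * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if query[i - 1] == subject[j - 1] else mismatch
            M[i][j] = max(M[i-1][j-1], X[i-1][j-1], Y[i-1][j-1]) + s
            X[i][j] = max(M[i-1][j] + gap_open, Y[i-1][j] + gap_open,
                          X[i-1][j]) + gap_extend
            Y[i][j] = max(M[i][j-1] + gap_open, X[i][j-1] + gap_open,
                          Y[i][j-1]) + gap_extend

    # Traceback from the best-scoring state in the bottom-right cell.
    tables = {"M": M, "X": X, "Y": Y}
    i, j = n, m
    state = max(tables, key=lambda k: tables[k][i][j])
    score = tables[state][n][m]
    aln_q, aln_s = [], []
    while i > 0 or j > 0:
        if state == "M":
            aln_q.append(query[i - 1]); aln_s.append(subject[j - 1])
            i, j = i - 1, j - 1
            state = max(tables, key=lambda k: tables[k][i][j])
        elif state == "X":
            aln_q.append(query[i - 1]); aln_s.append("-")
            i -= 1
            # Staying in X extends the gap; coming from M or Y opens it.
            cands = {"X": X[i][j], "M": M[i][j] + gap_open,
                     "Y": Y[i][j] + gap_open}
            state = max(cands, key=cands.get)
        else:
            aln_q.append("-"); aln_s.append(subject[j - 1])
            j -= 1
            cands = {"Y": Y[i][j], "M": M[i][j] + gap_open,
                     "X": X[i][j] + gap_open}
            state = max(cands, key=cands.get)
    return score, "".join(reversed(aln_q)), "".join(reversed(aln_s))

# Toy data (invented): the query is the subject with its middle removed.
toy_subject = "GATTACAGC" + "T" * 12 + "CCGGAAGCA"
toy_query = "GATTACAGC" + "CCGGAAGCA"
score, aligned_query, aligned_subject = global_align_affine(toy_query, toy_subject)
print(aligned_query)
print(aligned_subject)
```

The sketch fills three full score matrices, which is fine for amplicon-sized sequences like the ones in the question but would be slow and memory-hungry for long inputs.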


Thanks! I'll try them.


mafft and muscle both work very well. However, I need a more machine-readable output format, similar to what -outfmt 6 gives in blastn. Below are my example command and its output.

blastn -db ~/data/raw/GSKOampliconNGS/IntronGSgene.fa -query test.fa -out test_results.out -outfmt 6

test IntronContainingGSamplicon 100.000 71 0 0 110 180 1 71 8.80e-36 132
test IntronContainingGSamplicon 100.000 66 0 0 1 66 1 66 5.30e-33 122
test IntronContainingGSamplicon 98.113 53 1 0 176 228 214 266 4.15e-24 93.5
test IntronContainingGSamplicon 100.000 46 0 0 63 108 221 266 6.95e-22 86.1

Is there any way to get a similar output format from mafft or muscle? I read the manuals but couldn't find an option that generates it. Alternatively, are there other aligners or tools that can produce the format above, or convert mafft output into something like it?
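One possible route, sketched here as an assumption and not as a feature of mafft or muscle, is to post-process the aligned output yourself. Both tools write gapped FASTA by default, so given the two aligned rows of a pairwise alignment you can derive the first ten -outfmt 6 columns (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send). The function name `alignment_to_tabular` and the toy rows are hypothetical; the e-value and bit score columns are left out because they come from BLAST's own statistics and cannot be recovered from the alignment alone.

```python
def alignment_to_tabular(qid, sid, aln_q, aln_s):
    """Summarise one gapped pairwise alignment as a tab-separated line with
    the first ten BLAST -outfmt 6 fields. Terminal columns containing a gap
    are trimmed; coordinates are 1-based positions in the ungapped sequences."""
    if len(aln_q) != len(aln_s):
        raise ValueError("aligned rows must have equal length")
    cols = list(zip(aln_q.upper(), aln_s.upper()))

    # Trim leading and trailing columns where either row has a gap.
    start, end = 0, len(cols)
    while start < end and "-" in cols[start]:
        start += 1
    while end > start and "-" in cols[end - 1]:
        end -= 1
    if start == end:
        raise ValueError("no aligned residues")

    # Count the residues consumed before the first aligned column.
    qpos = sum(1 for q, _ in cols[:start] if q != "-")
    spos = sum(1 for _, s in cols[:start] if s != "-")
    qstart, sstart = qpos + 1, spos + 1

    matches = mismatches = gap_opens = 0
    prev_gap_row = None  # row that carried a gap in the previous column
    for q, s in cols[start:end]:
        if q == "-" or s == "-":
            row = "q" if q == "-" else "s"
            if row != prev_gap_row:
                gap_opens += 1  # a new run of gaps starts here
            prev_gap_row = row
        else:
            prev_gap_row = None
            if q == s:
                matches += 1
            else:
                mismatches += 1
        qpos += q != "-"
        spos += s != "-"

    length = end - start
    pident = 100.0 * matches / length
    fields = [qid, sid, f"{pident:.3f}", length, mismatches, gap_opens,
              qstart, qpos, sstart, spos]
    return "\t".join(str(f) for f in fields)

# Toy rows (invented) with one 12-column deletion in the query.
row_q = "GATTACAGC" + "-" * 12 + "CCGGAAGCA"
row_s = "GATTACAGC" + "T" * 12 + "CCGGAAGCA"
print(alignment_to_tabular("test", "subj", row_q, row_s))
```

Like BLAST, this divides identities by the full alignment length, gap columns included, so a long deletion lowers pident even when every aligned base matches. If the aligner reverse-complemented the query, the coordinates would still need to be flipped to match BLAST's minus-strand convention.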


You may want to look at the FASTA package - see this post.
