Hi,
I'm trying to align some sequences that have a large deletion (~150 bp) using command-line blastn.
The subject is 266 bp and the query is about 115 bp. Blastn can align the query to either the 5' end or the 3' end of the subject, but it cannot align the query as a single alignment spanning the large gap. I tried setting the gap opening and gap extension penalties to 0, but that didn't help. Does anyone have an idea how to run blastn to get a single alignment with a large gap? Below is my command and example sequences.
blastn -db ~/data/raw/GSKOampliconNGS/IntronGSgene.fa -query test.fa -out test_results.out -max_hsps 1 -gapopen 0 -gapextend 0
subject - ttggccttattctaaaggtcaacatgttctttctagtgggaattccaaataggaccctgtgaaggaatccgcatgggagatcatctctgggtggcccgtttcatcttgcatcgagtatgtgaagactttggggtaatagcaacctttgaccccaagcccattcctgggaactggaatggtgcaggctgccataccaactttagcaccaaggccatgcgggaggagaatggtctgaagtaagtagcttcctctggagccatctttat
query - ATAAAGATGGCTCCAGAGGAAGCTACTTACTTCAGACCATTCTCCTCCCGGATTCCTTCACAGGGTCCTATTTGGAATTCCCACTAGAAAGAACATGTTGACCTTTAGAATAAGGCCAA
Thanks,
Thanks! I'll try them.
mafft and muscle both work very well. However, I need a more machine-readable output format, something like what you get with -outfmt 6 in blastn. Below is my example code and output format.
test IntronContainingGSamplicon 100.000 71 0 0 110 180 1 71 8.80e-36 132
test IntronContainingGSamplicon 100.000 66 0 0 1 66 1 66 5.30e-33 122
test IntronContainingGSamplicon 98.113 53 1 0 176 228 214 266 4.15e-24 93.5
test IntronContainingGSamplicon 100.000 46 0 0 63 108 221 266 6.95e-22 86.1
Is there any way I can get a similar output format using mafft or muscle? I read the manuals but couldn't find any option that generates such output. Or are there any other aligners or tools that can generate the above format, or convert mafft output into something like it?
You may want to look at the FASTA package; see this post.
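If you would rather stay with mafft or muscle, converting their output yourself is not much work. Below is a minimal Python sketch, not a tested tool. It assumes you have already read the two gapped rows (query and subject) out of the aligned FASTA, that both are on the same strand, and that gaps are written as `-`. The function name `aligned_pair_to_tabular` is made up for this example. It fills only the first ten columns of blastn's default `-outfmt 6` (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send); e-value and bit score are left out because a global aligner does not produce them.

```python
def aligned_pair_to_tabular(qid, qaln, sid, saln):
    """Summarise one gapped pairwise alignment as a BLAST-tabular-like row.

    qaln and saln are the aligned (gapped) query and subject strings,
    which must have the same length. Terminal gaps are trimmed so the
    reported coordinates cover only the region where both overlap.
    """
    if len(qaln) != len(saln):
        raise ValueError("aligned sequences must have equal length")

    cols = [(q.upper(), s.upper()) for q, s in zip(qaln, saln)]

    # Columns where both sequences have a residue define the aligned span.
    both = [i for i, (q, s) in enumerate(cols) if q != "-" and s != "-"]
    if not both:
        raise ValueError("no overlapping residues in the alignment")
    first, last = both[0], both[-1]

    # 1-based start coordinates: residues seen before the span, plus one.
    qstart = sum(1 for q, _ in cols[:first] if q != "-") + 1
    sstart = sum(1 for _, s in cols[:first] if s != "-") + 1

    matches = mismatches = gap_opens = length = 0
    q_residues = s_residues = 0
    prev_gap = None  # which sequence carried a gap in the previous column

    for q, s in cols[first:last + 1]:
        if q == "-" and s == "-":
            continue  # all-gap column left over from a multiple alignment
        if q == "-" or s == "-":
            which = "q" if q == "-" else "s"
            if which != prev_gap:
                gap_opens += 1  # a new gap run starts here
            prev_gap = which
        else:
            prev_gap = None
            if q == s:
                matches += 1
            else:
                mismatches += 1
        if q != "-":
            q_residues += 1
        if s != "-":
            s_residues += 1
        length += 1

    pident = 100.0 * matches / length  # identities over alignment length
    fields = [
        qid, sid, f"{pident:.3f}", length, mismatches, gap_opens,
        qstart, qstart + q_residues - 1,
        sstart, sstart + s_residues - 1,
    ]
    return "\t".join(str(f) for f in fields)


if __name__ == "__main__":
    # Toy example: a 4-base deletion in the query relative to the subject.
    print(aligned_pair_to_tabular("q", "ACGT----ACGA", "s", "ACGTTTTTACGT"))
```

Note that the whole deletion counts as a single gap opening, so you get one row per query/subject pair instead of one row per HSP. If the read is on the opposite strand from the reference, reverse-complement it before aligning; otherwise the coordinates will not be meaningful.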