Hi. I'm using ggsearch36 from the FASTA package to build a similarity matrix of protein sequences, with global identity as the similarity measure. I noticed that for some query sequences, ggsearch36 does not print alignments against every sequence in the library, and I can't figure out which parameter controls this.
For the query
>Query1
MCPRAARAPATLLLALGAVLWPAAGAWELTILHTNDVHSRLEQTSEDSSKCVNASRCMGGVARLFTKVQQ
the command
fasta36/bin/ggsearch36 -E 20758 testqry.tmp library.fasta
works fine and prints all alignments. The E-value threshold (-E) is set to the size of the library. The statistics section of the output reads:
Statistics: Unscaled normal statistics: mu= -27.8916 var=246.9351 Ztrim: 0
statistics sampled from 1375 (1376) to 1375 sequences
Algorithm: Global/Global affine Needleman-Wunsch (SSE2, Michael Farrar 2010) (6.0 April 2007)
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
Whereas for the query
>Query2
MKVVIFIFALLATICAAFAYVPLPNVPQPGRRPFPTFPGQGPFNPKIKWPQGY
the same command returns only 12 alignments:
Statistics: (shuffled [100]) Unscaled normal statistics: mu= -29.4000 var=283.6970 Ztrim: 0
statistics sampled from 12 (12) to 100 sequences
Algorithm: Global/Global affine Needleman-Wunsch (SSE2, Michael Farrar 2010) (6.0 April 2007)
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
I can't figure out what causes this behaviour. Increasing the E-value threshold did not increase the number of printed alignments. The only difference that is obvious to me is that Query1 is 70 amino acids long, whereas Query2 is only 53. I couldn't find any option related to sequence length in the configuration, though.
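One way to test the length hypothesis would be to count how many library sequences fall within some length window around each query and compare that count with the number of sequences reported in the "statistics sampled from" line. The following is only a minimal sketch: the function names are made up for illustration, and the 0.8 and 1.25 bounds are placeholder assumptions, not values taken from the ggsearch36 configuration.

```python
import io


def read_fasta_lengths(handle):
    """Yield (identifier, sequence_length) for each record in a FASTA stream."""
    header = None
    length = 0
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, length
            # Keep only the first word of the header line as the identifier.
            header = line[1:].split()[0] if len(line) > 1 else ""
            length = 0
        elif header is not None:
            length += len(line)
    if header is not None:
        yield header, length


def count_in_window(handle, query_len, low_frac, high_frac):
    """Return (inside, total): how many records have a length between
    query_len * low_frac and query_len * high_frac (inclusive), and how
    many records were read in total."""
    low = query_len * low_frac
    high = query_len * high_frac
    inside = 0
    total = 0
    for _, seq_len in read_fasta_lengths(handle):
        total += 1
        if low <= seq_len <= high:
            inside += 1
    return inside, total


# Tiny in-memory demo; the window fractions are ASSUMED placeholders.
sample = io.StringIO(">a\nMKVV\n>b\nMKVVIFIFAL\nLATIC\n>c\nMK\n")
inside, total = count_in_window(sample, query_len=5, low_frac=0.8, high_frac=1.25)
print(f"{inside} of {total} sequences fall inside the length window")
```

For the real data, the same function could be called on an open handle to library.fasta with query_len=70 and query_len=53. If the counts differ between the two queries in the same way as the numbers of printed alignments, that would point towards a length-based filter.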
Any ideas what the problem might be? Thanks for your help.