ggsearch: not returning all alignments for short sequences
4 months ago

Hi. I'm using ggsearch36 from the FASTA package to build a similarity matrix of protein sequences, with global identity as the similarity measure. I noticed that for some query sequences, ggsearch36 does not print an alignment against every sequence in the library, and I can't work out which parameter controls this.

For query

>Query1
MCPRAARAPATLLLALGAVLWPAAGAWELTILHTNDVHSRLEQTSEDSSKCVNASRCMGGVARLFTKVQQ

the command

fasta36/bin/ggsearch36 -E 20758 testqry.tmp library.fasta

works fine and prints all alignments. The E-value threshold (-E 20758) is set to the size of the library.

Statistics:  Unscaled normal statistics: mu= -27.8916  var=246.9351 Ztrim: 0
statistics sampled from 1375 (1376) to 1375 sequences
Algorithm: Global/Global affine Needleman-Wunsch (SSE2, Michael Farrar 2010) (6.0 April 2007)
Parameters: BL50 matrix (15:-5), open/ext: -10/-2

Whereas for query

>Query2
MKVVIFIFALLATICAAFAYVPLPNVPQPGRRPFPTFPGQGPFNPKIKWPQGY

the same command returns only 12 alignments.

Statistics: (shuffled [100]) Unscaled normal statistics: mu= -29.4000  var=283.6970 Ztrim: 0
statistics sampled from 12 (12) to 100 sequences
Algorithm: Global/Global affine Needleman-Wunsch (SSE2, Michael Farrar 2010) (6.0 April 2007)
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
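
Since I run this for many queries, it helps to pull the statistics lines out of each report to see which queries are affected. Below is a minimal sketch, assuming the lines always look like the two reports above. The function and field names are my own labels, not FASTA terminology, and I make no claim about what each number means internally.

```python
import re
from typing import NamedTuple, Optional

# Pattern for lines such as: "statistics sampled from 12 (12) to 100 sequences"
_SAMPLED_RE = re.compile(r"statistics sampled from (\d+) \((\d+)\) to (\d+) sequences")


class SampledLine(NamedTuple):
    first: int          # number before the parentheses
    parenthesised: int  # number inside the parentheses
    last: int           # number after "to"


def parse_sampled_line(line: str) -> Optional[SampledLine]:
    """Return the three integers from a 'statistics sampled from' line, or None."""
    match = _SAMPLED_RE.search(line)
    if match is None:
        return None
    return SampledLine(*(int(group) for group in match.groups()))


def is_shuffled(statistics_line: str) -> bool:
    """True if a 'Statistics:' line carries the '(shuffled [...])' marker."""
    return statistics_line.lstrip().startswith("Statistics:") and "(shuffled" in statistics_line
```

Applied to the two reports in this post, this separates Query1 (1375 sampled, no shuffle marker) from Query2 (12 sampled, shuffle marker present).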

I can't figure out what causes this behaviour. Increasing the E-value threshold did not increase the number of printed alignments. The only difference that is obvious to me is length: Query1 is 70 amino acids long, whereas Query2 is only 53. I couldn't find anything related to sequence length in the configuration, though.
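
To test the length hypothesis, one could count how many library sequences have a length close to that of each query and compare that count with the number of alignments returned. Below is a minimal sketch, assuming a plain FASTA library. The `low` and `high` ratios (0.8 and 1.25) are placeholders chosen for illustration and are not taken from the ggsearch36 output above; `read_fasta` and `count_in_length_window` are hypothetical helper names.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_in_length_window(query_len: int, library_lengths: Iterable[int],
                           low: float = 0.8, high: float = 1.25) -> int:
    """Count library lengths within [low * query_len, high * query_len].

    The default ratios are illustrative assumptions only.
    """
    lower, upper = low * query_len, high * query_len
    return sum(1 for n in library_lengths if lower <= n <= upper)


# Example usage (file name as in the command above):
# with open("library.fasta") as handle:
#     lengths = [len(seq) for _, seq in read_fasta(handle)]
# print(count_in_length_window(53, lengths), count_in_length_window(70, lengths))
```

If the count for the 53-residue query comes out near 12 for some window while the count for the 70-residue query is much larger, that would point towards a length-based restriction on which library sequences are compared.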

Any ideas what the problem might be? Thanks for your help.
