I am aligning two sequences using ggsearch, an implementation of Needleman-Wunsch that is part of the FASTA package. The alignments are not the same when the query and subject are switched. From my understanding of the basic Needleman-Wunsch algorithm, the results should be symmetric. Below are the two sequences followed by the alignments.
>3hea:B
STFVAKDGTQIYFKDWGSGKPVLFSHGWPLDADMWEYQMEYLSSRGYRTIAFDRRGFGRSDQPWTGNDYDTFADDI
AQLIEHLDLKEVTLVGFSMGGGDVARYIARHGSARVAGLVLLGAVTPIFGQKPDYPQGVPLDVFARFKTELLKDRA
QFISDFNAPFYGINKGQVVSQGVQTQTLQIALLASLKATVDCVTAFAETDFRPDMAKIDVPTLVIHGDGDQIVPFE
TTGKVAAELIKGAELKVYKDAPHGFAVTHAQQLNEDLLAFLKR
>1hkh:B
GYITVGNENSTPIELYYEDQGSGQPVVLIHGYPLDGHSWERQTRELLAQGYRVITYDRRGFGGSSKVNTGYDYDTF
AADLHTVLETLDLRDVVLVGFSMGTGELARYVARYGHERVAKLAFLASLEPFLVQRDDNPEGVPQEVFDGIEAAAK
GDRFAWFTDFYKNFYNLDENLGSRISEQAVTGSWNVAIGSAPVAAYAVVPAWIEDFRSDVEAVRAAGKPTLILHGT
KDNILPIDATARRFHQAVPEADYVEVEGAPHGLLWTHADEVNAALKTFLAK
This is the alignment when 1hkh:B is the query and 3hea:B is the subject.
>>3hea:B (271 aa)
n-w opt: 497 Z-score: 399.8 bits: 81.8 E(1): 2.4e-268
global/global (N-W) score: 497; 39.6% identity (69.3% similar) in 280 aa overlap
10 20 30 40 50 60
1hkh:B GYITVGNENSTPIELYYEDQGSGQPVVLIHGYPLDGHSWERQTRELLAQGYRVITYDRRG
. : ...: ..:..: :::.::.. ::.:::. :: : . : ..:::.:..::::
3hea:B S--TFVAKDGT--QIYFKDWGSGKPVLFSHGWPLDADMWEYQMEYLSSRGYRTIAFDRRG
10 20 30 40 50
70 80 90 100 110 120
1hkh:B FGGSSKVNTGYDYDTFAADLHTVLETLDLRDVVLVGFSMGTGELARYVARYGHERVAKLA
:: :.. :: :::::: :. ..: :::..:.::::::: :..:::.::.: ::: :.
3hea:B FGRSDQPWTGNDYDTFADDIAQLIEHLDLKEVTLVGFSMGGGDVARYIARHGSARVAGLV
60 70 80 90 100 110
130 140 150 160 170 180
1hkh:B FLASLEPFLVQRDDNPEGVPQEVFDGIEAAAKGDRFAWFTDFYKNFYNLDENLGSRISEQ
.:... :.. :. : :.::: .:: ... :: ...:: ::.. : :. .:.
3hea:B LLGAVTPIFGQKPDYPQGVPLDVFARFKTELLKDRAQFISDFNAPFYGI--NKGQVVSQG
120 130 140 150 160 170
190 200 210 220 230
1hkh:B AVTGSWNVAIGSAPVAAYAVVPAWIE-DFRSDVEAVRAAGKPTLILHGTKDNILPIDATA
. : . ..:. .. :. : :. : ::: :. . . :::..:: :.:.:...:.
3hea:B VQTQTLQIALLASLKATVDCVTAFAETDFRPDMAKIDV---PTLVIHGDGDQIVPFETTG
180 190 200 210 220 230
240 250 260 270
1hkh:B RRFHQAVPEADYVEVEGAPHGLLWTHADEVNAALKTFLAK
. . . :. . ::::. :::...: : .:: .
3hea:B KVAAELIKGAELKVYKDAPHGFAVTHAQQLNEDLLAFLKR
240 250 260 270
This the alignment when the query and subject are switched.
>>1hkh:B (279 aa)
n-w opt: 497 Z-score: 381.8 bits: 78.4 E(1): 1.1e-241
global/global (N-W) score: 497; 38.9% identity (69.3% similar) in 280 aa overlap
10 20 30 40 50
3hea:B STFVAKDGT----QIYFKDWGSGKPVLFSHGWPLDADMWEYQMEYLSSRGYRTIAFDRRG
. ... . . ..:..: :::.::.. ::.:::. :: : . : ..:::.:..::::
1hkh:B GYITVGNENSTPIELYYEDQGSGQPVVLIHGYPLDGHSWERQTRELLAQGYRVITYDRRG
10 20 30 40 50 60
60 70 80 90 100 110
3hea:B FGRSDQPWTGNDYDTFADDIAQLIEHLDLKEVTLVGFSMGGGDVARYIARHGSARVAGLV
:: :.. :: :::::: :. ..: :::..:.::::::: :..:::.::.: ::: :.
1hkh:B FGGSSKVNTGYDYDTFAADLHTVLETLDLRDVVLVGFSMGTGELARYVARYGHERVAKLA
70 80 90 100 110 120
120 130 140 150 160 170
3hea:B LLGAVTPIFGQKPDYPQGVPLDVFARFKTELLKDRAQFISDFNAPFYGI--NKGQVVSQG
.:... :.. :. : :.::: .:: ... :: ...:: ::.. : :. .:.
1hkh:B FLASLEPFLVQRDDNPEGVPQEVFDGIEAAAKGDRFAWFTDFYKNFYNLDENLGSRISEQ
130 140 150 160 170 180
180 190 200 210 220 230
3hea:B VQTQTLQIALLASLKATVDCVTAFAETDFRPDMAKIDV---PTLVIHGDGDQIVPFETTG
. : . ..:. .. :. : :. : ::: :. . . :::..:: :.:.:...:.
1hkh:B AVTGSWNVAIGSAPVAAYAVVPAWIE-DFRSDVEAVRAAGKPTLILHGTKDNILPIDATA
190 200 210 220 230
240 250 260 270
3hea:B KVAAELIKGAELKVYKDAPHGFAVTHAQQLNEDLLAFLKR
. . . :. . ::::. :::...: : .:: .
1hkh:B RRFHQAVPEADYVEVEGAPHGLLWTHADEVNAALKTFLAK
240 250 260 270
When I use the Needleman-Wunsch alignment tool on the PDB website I get symmetric behavior. Using 1hkh:B as the query, these are the results from the PDB website (the page may take a second to load).
http://www.rcsb.org/pdb/workbench/workbench.do?action=pw_needlemanwunsch&mol=1hkh.B&mol=3hea.B
Using 3hea:B as the query, these are the results
http://www.rcsb.org/pdb/workbench/workbench.do?action=pw_needlemanwunsch&mol=3hea.A&mol=1hkh.B
Maybe the non-symmetric behavior is due to some feature of the ggsearch implementation? Or perhaps there's an option, to change this behavior? Here's the command I used to run ggsearch.
ggsearch36 path/to/query.fa path/to/subject.fa -s BP62 -E 10 -F 0
From my understanding -s BP62 sets the scoring matrix to Blosum62 and the gap penalties to -11/-1, the same as in BLASTP.
-E
sets the maximum evalue and -F
sets the minimum evalue, which doesn't matter too much here since I am only aligning two sequences that I know are fairly close.
Barring a bug, the alignment is not symmetric if the substitution matrix is not symmetric. Are you sure that BP62 refers to the BLOSUM62 matrix ? On the FASTA web site page for ggsearch, there's no BLOSUM62 but a BlastP62 matrix.
In the FASTA package manual page 16 it says
Then Christian spotted the most likely explanation. It's possible that PDB picked the same solution twice from the possible ones just by chance. It could also be that PDB's implementation finds all optimal alignments then sorts them by % identity and/or other criterion (on PDB's site, the alignment returned looks like the one with the highest % identity from the two you got with ggsearch).