Short protein sequence alignment
20 months ago

Hello all, I was wondering if there are any tools available for doing multiple sequence alignment of N-terminal residues (say, just 20 residues). I tried BLAST, but it gives me an alignment with only one of the two subject sequences I provided. I had separated the two subject sequences with a comma (is that the correct way?). I know I can do it manually, especially the identity part, but for similarity I might have to open the amino acid table. So I just wanted some insight into any tool or way to do this.

alignment sequence blast • 427 views

For multiple sequence alignment you can use MAFFT:

https://mafft.cbrc.jp/alignment/server/

https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=BlastHelp


Thanks for replying, Fatima! For example, these are the query protein's residues: SDPLSMVGPSQGRSPSYAS. I want to know the identity and similarity of this query with subject protein 1: VNTHAGGTGPEGCRPFAKF and subject protein 2: HLESDMFSSPLETDSMDPF. Again, these are short and I can do it manually, but I wanted to know if there is an in silico way to do it.

>query
SDPLSMVGPSQGRSPSYAS

>subject1
VNTHAGGTGPEGCRPFAKF
>subject2
HLESDMFSSPLETDSMDPF


If your sequences were longer you could use blastp (Align two or more sequences option).

MAFFT output

id1             -------SDPLSMVGPSQGRSPSYAS
id3             HLESDMFSSPLETDSMD----PF---
id2             ------VNTHAGGTGPEGCR-PFAKF


MAFFT FASTA output

>id1
-------SDPLSMVGPSQGRSPSYAS
>id3
HLESDMFSSPLETDSMD----PF---
>id2
------VNTHAGGTGPEGCR-PFAKF


Then you can get the pairs that you are interested in and clean up the columns with gaps in both sequences:

>id1
-SDPLSMVGPSQGRSPSYAS
>id2
VNTHAGGTGPEGCR-PFAKF

>id1
-------SDPLSMVGPSQGRSPSYAS
>id3
HLESDMFSSPLETDSMD----PF---
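
The pair extraction described above can also be scripted. This is only a minimal sketch in plain Python, not something MAFFT provides: the helper names `parse_fasta` and `extract_pair` are made up for illustration, and the input is the MAFFT FASTA output pasted as a string.

```python
def parse_fasta(text):
    """Return {name: sequence} from FASTA-formatted text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def extract_pair(msa, name_a, name_b):
    """Take two rows of an alignment and drop columns that are gaps in both."""
    kept = [(a, b) for a, b in zip(msa[name_a], msa[name_b])
            if not (a == "-" and b == "-")]
    row_a = "".join(a for a, _ in kept)
    row_b = "".join(b for _, b in kept)
    return row_a, row_b


# MAFFT FASTA output from this thread, pasted as a string.
MSA_FASTA = """\
>id1
-------SDPLSMVGPSQGRSPSYAS
>id3
HLESDMFSSPLETDSMD----PF---
>id2
------VNTHAGGTGPEGCR-PFAKF
"""

msa = parse_fasta(MSA_FASTA)
pair_12 = extract_pair(msa, "id1", "id2")
pair_13 = extract_pair(msa, "id1", "id3")

for row in pair_12 + pair_13:
    print(row)
```

Columns where only one of the two rows has a gap are kept, so both rows of a pair always stay the same length.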


https://mafft.cbrc.jp/alignment/server/spool/_out.200213155522842eoKWxmY7tMldACkJ1fPvVlsfnormal.pir

Other tools: https://www.ebi.ac.uk/Tools/psa/
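
For sequences this short, a global pairwise alignment can also be sketched directly in Python. The version below is a bare-bones Needleman-Wunsch with a linear gap penalty; the scores (match +1, mismatch -1, gap -2) are arbitrary assumptions for illustration. The web tools typically use a substitution matrix such as BLOSUM62 and affine gap penalties, so their alignments will probably differ from what this produces.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of a and b; returns (row_a, row_b, score)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1] (toy scoring, an assumption).
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            row_a.append(a[i - 1])
            row_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1])
            row_b.append("-")
            i -= 1
        else:
            row_a.append("-")
            row_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(row_a)), "".join(reversed(row_b)), score[n][m]


query = "SDPLSMVGPSQGRSPSYAS"
subject1 = "VNTHAGGTGPEGCRPFAKF"
aligned_query, aligned_subject1, nw_score = needleman_wunsch(query, subject1)
print(aligned_query)
print(aligned_subject1)
print("score:", nw_score)
```

With identity-only scoring like this, the result says nothing about similarity between different residues; swapping `sub` for a lookup in a real substitution matrix would be the obvious next step.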


Fatima, I tried MAFFT with my original dataset, but I am not sure how to interpret the results, since no E-value, identity, or similarity percentage is given. For example, how can I interpret the output below?

>id1
-SDPLSMVGPSQGRSPSYAS
>id2
VNTHAGGTGPEGCR-PFAKF


I'm not sure about pairwise alignments, but for multiple sequence alignment you can use MAFFT and then score the alignment with GUIDANCE.

Please see the output of guidance:

http://guidance.tau.ac.il/results/15816469509376/MSA.MAFFT.Guidance_res_pair_res.html

GUIDANCE alignment score: 0.306977

https://www.nature.com/articles/s41598-019-56499-4

Pairwise alignment tools:

https://www.ebi.ac.uk/Tools/psa/
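
On the identity/similarity question: once you have a gapped pair, both numbers can be counted directly. The sketch below is only an illustration. The function name `alignment_stats` is made up, and the similarity groups are a hypothetical, simplified classification by side-chain chemistry rather than the substitution-matrix definition that most tools use. Tools also differ in the denominator (full alignment length versus ungapped columns); this sketch uses ungapped columns.

```python
# Hypothetical similarity groups (an assumption for illustration only;
# real tools usually call a pair "similar" when its matrix score is positive).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "STNQ", "AG"]


def alignment_stats(row_a, row_b, groups=SIMILAR_GROUPS):
    """Count identical and similar residues over the ungapped columns of a pair."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    identical = similar = aligned = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either row
        aligned += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in g and b in g for g in groups):
            similar += 1
    pct = (lambda x: 100.0 * x / aligned) if aligned else (lambda x: 0.0)
    return {
        "aligned": aligned,
        "identical": identical,
        "similar": similar,
        "pct_identity": pct(identical),
        "pct_similarity": pct(similar),
    }


stats = alignment_stats("-SDPLSMVGPSQGRSPSYAS", "VNTHAGGTGPEGCR-PFAKF")
print(stats)
```

For the id1/id2 pair this counts 4 identical residues over 18 ungapped columns, which is about 22% identity. An E-value is a database-search statistic, so it is not something you would expect from a multiple sequence alignment.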


Thank you Fatima for all your help and also introducing me to MAFFT and guidance :) I shall read more about these.