Calculates the percent sequence identity for a pairwise sequence alignment
17 months ago

Hello! How can I calculate the percent identity between a pair of sequences?

Tem1_SRR6260399.808938_1_250_-
TTGATTTGGCATATGTGGACAATAAGACGACAAATGATTATCAGATTATTCGGGTTCCAGCTCTTGTTTAATTTTACATGGAGCATTTTCTTCTTTTATCTCCGGAGTCCGTTGCTAGGTTTTGCAAACATTTTGGTGCTGGATGTGCTTGTTGTTTATTATATGATAGAAAGTTATCCGGTGAAGAAATCTTCGGCATACCTTTTTGTTCCTTATCTTTTGTGGTTGATTCTTGCCACTTATCTTAAC

Tem5_SRR6260399.418888_1_220_+
CTCCTTATGAAAAAAATAATCCCCATACTGATTGCCATACTCATCTGTTTTGGTGTAGGCTGTACTGCTTCTTATTTTCAGTCGGAGGCCATACTCAACTGGTATCCTACATTGGACAAACCTTCTCTTACACCACCTGATATGGCTTTTCCCATTGCTTGGAGCCTTATCTATCTGTGCATGGGAATTTCTCTCGGGTTGATTTGGCATATGTGG

pairwise sequence alignment identity • 549 views

Is this homework? (Do be honest, and the community will help.)

What have you tried already? Have the sequences been aligned? Are these the only sequences you need to deal with, or are there more?


I have 14 sequences that correspond to the TSPO protein, found by HMM in a control condition. I aligned them with Clustal and saw that they align perfectly in the middle. The question is whether they are the same protein, so I want to calculate the percent identity.


Are you using Clustal-Omega? If yes, you can supply --distmat-out=<somefilename> and --percent-id to it on the command line (along with --full), and it will output a pairwise distance matrix that reports percent identity. So something like this:

clustalo --full --percent-id --distmat-out=test_distmat.txt -o test_out.msa -i test.fa
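
If you want to sanity-check a single pair by hand, here is a minimal Python sketch that computes percent identity from two rows that have already been aligned (for example, two rows copied out of the MSA written by the command above). The function name `percent_identity` and the choice of denominator are my own assumptions: it counts only columns where both sequences have a residue, and Clustal-Omega or other tools may define the denominator differently, so the numbers may not match the distance matrix exactly.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of an alignment.

    Assumption (not taken from Clustal-Omega): the denominator is the
    number of columns in which neither sequence has a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        return 0.0  # no overlapping residues to compare
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned rows, purely illustrative (not the sequences from the question).
    row_a = "ACGT-ACGT"
    row_b = "ACGTTAC-A"
    print(f"{percent_identity(row_a, row_b):.2f}% identity")
```

Note that the input must be the gapped, aligned rows, not the raw reads. If your sequences only overlap "in the middle", a gap-excluding denominator like this one reports identity over the overlapping region only, which can look much higher than identity over the full length.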