Question

Calculate the percent sequence identity for a pairwise sequence alignment

4.0 years ago

Emy Alade ▴ 20

Hello! How can I calculate the percentage identity between a pair of sequences?

Tem1_SRR6260399.808938_1_250_-
TTGATTTGGCATATGTGGACAATAAGACGACAAATGATTATCAGATTATTCGGGTTCCAGCTCTTGTTTAATTTTACATGGAGCATTTTCTTCTTTTATCTCCGGAGTCCGTTGCTAGGTTTTGCAAACATTTTGGTGCTGGATGTGCTTGTTGTTTATTATATGATAGAAAGTTATCCGGTGAAGAAATCTTCGGCATACCTTTTTGTTCCTTATCTTTTGTGGTTGATTCTTGCCACTTATCTTAAC
Tem5_SRR6260399.418888_1_220_+
CTCCTTATGAAAAAAATAATCCCCATACTGATTGCCATACTCATCTGTTTTGGTGTAGGCTGTACTGCTTCTTATTTTCAGTCGGAGGCCATACTCAACTGGTATCCTACATTGGACAAACCTTCTCTTACACCACCTGATATGGCTTTTCCCATTGCTTGGAGCCTTATCTATCTGTGCATGGGAATTTCTCTCGGGTTGATTTGGCATATGTGG

pairwise sequence alignment identity • 1.2k views

updated 3.6 years ago by Jeremy Leipzig 23k • written 4.0 years ago by Emy Alade ▴ 20

Is this homework? (Do be honest, and the community will help.)

What have you tried already? Have the sequences been aligned? Are these the only sequences you need to deal with, or are there more?

4.0 years ago by Dunois ★ 2.9k

I have 14 sequences that correspond to the TSPO protein, found by HMM in a control condition. I aligned them with Clustal and saw that they align perfectly in the middle. The question is whether they are the same protein, so I want to calculate the percentage of identity.

4.0 years ago by Emy Alade ▴ 20

Are you using Clustal-Omega? If yes, you can supply --distmat-out=<somefilename> and --percent-id to it on the command line (along with --full), and it will output a pairwise distance matrix that reports percent identity. So something like this:

clustalo --full --percent-id --distmat-out=test_distmat.txt -o test_out.msa -i test.fa

4.0 years ago by Dunois ★ 2.9k
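
For anyone who wants to check a single pair by hand, here is a minimal Python sketch that computes percent identity from two rows of an existing alignment. It is an illustration, not Clustal Omega's own calculation: the function name `percent_identity` is made up for this example, and it assumes one common definition (identical residues divided by the number of columns where neither sequence has a gap). Other definitions use the full alignment length or the shorter sequence length as the denominator, so values may differ from what other tools report.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither row has a gap.

    Assumption: both strings are rows taken from the same alignment,
    so they have equal length and use `gap` as the gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0   # columns with identical residues
    compared = 0  # columns where both rows have a residue

    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns containing a gap in either row
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        return 0.0  # nothing to compare (e.g. one row is all gaps)
    return 100.0 * matches / compared


# Toy aligned pair: 7 gap-free columns, 6 of them identical.
print(percent_identity("ACGT-ACGT", "ACGTTAC-A"))
```

The two rows must come from an alignment (for example the `test_out.msa` file written by the command above); running this on raw, unaligned sequences would not give a meaningful number.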