I have a file containing aligned DNA sequences of a gene from 3 species:
>species1
ATGCTGATTCGATTGCGATTAGCTAGCTG
>species2
ATGCTGTTTCGATTTTTATTAGCTAGCTG
>species3
ATGCTCCTTCGATTGCGATTAGTTAGCTG
I would like to recover every site that differs between species 1 and species 2, determine whether each difference is a synonymous or non-synonymous mutation, and infer whether the mutation more likely occurred in species 1 or in species 2, based on the species 3 sequence (so if 1 = A, 2 = C and 3 = A, then the mutation arose in species 2).
I tried to write a script in Python, but it does not distinguish between synonymous and non-synonymous mutations. Is there something more reliable and commonly used for this? Here is my script:
seq1 = Species1
seq2 = Species2
seq3 = Species3
if seq3 != "":
    for i in range(len(seq1)):
        if seq1[i] != seq2[i]:
            if seq3[i] == seq1[i]:
                print("Mutation on seq2 : " + seq1[i] + "->" + seq2[i], i)
            elif seq3[i] == seq2[i]:
                print("Mutation on seq1 : " + seq2[i] + "->" + seq1[i], i)
            else:
                print("Non orientable mutation, seq1 : " + seq1[i] + " vs seq2 : " + seq2[i], i)
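To make the missing part concrete, here is a rough sketch of how the synonymous / non-synonymous call might be added to the same loop. It rests on several assumptions that are not guaranteed by the data above: the alignment is in frame starting at position 0, the standard genetic code applies, and sequences are uppercase. The names `CODON_TABLE`, `classify_site` and `compare_sites` are made up for this illustration. Each site is tested on its own by placing the species 2 base into the species 1 codon, which is only an approximation when several sites differ within one codon.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG order (assumption:
# the gene uses the standard code).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_site(seq1, seq2, i):
    """Classify the difference at position i by swapping only that base
    into the species 1 codon (assumes the reading frame starts at 0)."""
    start = i - i % 3
    codon1 = seq1[start:start + 3]
    codon2 = codon1[:i % 3] + seq2[i] + codon1[i % 3 + 1:]
    aa1 = CODON_TABLE.get(codon1)
    aa2 = CODON_TABLE.get(codon2)
    if aa1 is None or aa2 is None:
        # Incomplete codon at the end, gap, or ambiguous base.
        return "unknown"
    return "synonymous" if aa1 == aa2 else "non-synonymous"


def compare_sites(seq1, seq2, outgroup):
    """Return (position, base1, base2, lineage, effect) for each site
    where seq1 and seq2 differ; positions are 0-based."""
    results = []
    for i, (b1, b2, b3) in enumerate(zip(seq1, seq2, outgroup)):
        if b1 == b2:
            continue
        if b3 == b1:
            lineage = "species2"
        elif b3 == b2:
            lineage = "species1"
        else:
            lineage = "unresolved"
        results.append((i, b1, b2, lineage, classify_site(seq1, seq2, i)))
    return results


species1 = "ATGCTGATTCGATTGCGATTAGCTAGCTG"
species2 = "ATGCTGTTTCGATTTTTATTAGCTAGCTG"
species3 = "ATGCTCCTTCGATTGCGATTAGTTAGCTG"

for row in compare_sites(species1, species2, species3):
    print(*row)
```

On the three sequences above this reports differences at positions 6, 14, 15 and 16 (0-based). Using a single third species to orient a mutation is a parsimony-style guess, so a site where all three bases differ stays "unresolved".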
Thanks for the help,