4.1 years ago
ranusharma09
Hi
I have a file with n columns. Two of the columns contain UniProt IDs, and I would like to calculate the percent identity between the two sequences in each row. What would be the best way to do it?
Input file is:
Col1 Col2 -------- Coln
G8JKW1 P20248
U6GD34 P20249
Output should be:
Col1 Col2 %identity -------- Coln
G8JKW1 P20248 60
U6GD34 P20249 40
Thanks
How are you getting 60 and 40? Provide example data and expected output.
You'll have to make some decisions/assumptions. Do you want global or local alignment? What similarity matrix do you want to use? etc. I'm not aware of a tool that computes alignment in R, so you will potentially have to extract the sequences through biomaRt, call an external program like needle from EMBOSS to align each pair, and parse the results.
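To make the "decisions/assumptions" point concrete, here is a minimal sketch in Python of what a global alignment plus percent identity calculation involves. Everything in it is an assumption for illustration: the function names `global_align` and `percent_identity` are made up, the scoring is a plain match/mismatch scheme with a linear gap penalty rather than a substitution matrix, and identity is defined as matches divided by alignment length including gaps. needle by default uses EBLOSUM62 with affine gap penalties for proteins, so its numbers will generally differ from this.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    The scoring values are illustrative assumptions, not needle's defaults.
    Returns the two aligned strings, with '-' marking gaps.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment length."""
    aln_a, aln_b = global_align(a, b)
    if not aln_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aln_a, aln_b))
    return 100.0 * matches / len(aln_a)
```

You would still need to fetch the sequence for each UniProt ID yourself and call `percent_identity` once per row. The choice of denominator (alignment length, shorter sequence, or ungapped positions) changes the result, so whichever definition you pick should be stated alongside the numbers.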