Mapping a list of UniProt IDs to calculate percent identity
5.6 years ago

Hi

I have a file with n columns, two of which contain UniProt IDs. For each row, I would like to calculate the percent identity between the two proteins listed in those columns. What would be the best way to do this?

The input file is:

Col1      Col2      ...  Coln
G8JKW1    P20248
U6GD34    P20249

The output should be:

Col1      Col2      %identity  ...  Coln
G8JKW1    P20248    60
U6GD34    P20249    40

Thanks

sequence R gene • 708 views

How are you getting 60 and 40? Provide example data and expected output.


You'll have to make some decisions and assumptions first. Do you want a global or a local alignment? Which similarity matrix do you want to use? I'm not aware of a tool that computes alignments in R, so you will probably have to extract the sequences (potentially through biomaRt), call outside software such as needle from EMBOSS to align each pair, and parse the results.
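
To make the choices above concrete, here is a rough sketch in Python (not R) of one possible definition: percent identity as identical columns divided by alignment length, taken from a simple global (Needleman-Wunsch) alignment. Everything in it is an assumption for illustration: the function names `global_identity` and `add_identity_column` are made up, the match/mismatch/linear-gap scores stand in for a real substitution matrix and affine gaps such as needle would use, and the `sequences` dictionary is assumed to have been filled beforehand with the sequence for each ID. Different scoring choices or a different denominator will give different percentages.

```python
def global_identity(a, b, match=1, mismatch=-1, gap=-2):
    """Percent identity from a simple global alignment.

    Assumption: identity = identical columns / alignment length * 100.
    Scoring is a toy match/mismatch/linear-gap scheme, not BLOSUM62.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back one optimal path, counting columns and identical pairs.
    i, j = n, m
    identical = columns = 0
    while i > 0 or j > 0:
        columns += 1
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            identical += a[i - 1] == b[j - 1]
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
    return 100.0 * identical / columns if columns else 0.0


def add_identity_column(rows, sequences):
    """Insert a %identity value after the first two columns of each row.

    rows: list of lists, with the two IDs in columns 0 and 1.
    sequences: dict mapping ID -> sequence string (assumed already fetched).
    """
    out = []
    for row in rows:
        pid = global_identity(sequences[row[0]], sequences[row[1]])
        out.append(row[:2] + [f"{pid:.0f}"] + row[2:])
    return out


if __name__ == "__main__":
    # Placeholder IDs and toy strings, not real UniProt entries.
    sequences = {"ID_A": "ACGT", "ID_B": "ACT"}
    rows = [["ID_A", "ID_B", "other"]]
    for row in add_identity_column(rows, sequences):
        print("\t".join(row))
```

The dynamic programming is quadratic in sequence length, so for many or long proteins an established aligner is the more practical route; the sketch is only meant to show where the global/local and scoring decisions enter the number you end up with.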
