Question: Probing specific protein amino acid position changes in R
4.9 years ago by
debra.ragland0 wrote:

I have 15 protein sequences of 99 amino acids each. After looking around, I have found several ways to read sequences into R and do pairwise or multiple alignments. What I do not know is how to probe changes at specific positions. For instance, I would like to know the best way to align a standard sequence with one or several mutant sequences and check each amino acid position that does not match the standard sequence. In other words, with seq1 = "standard amino acid seq" and seq2 = "mutant seq", I want to align the two and then ask R whether there is a change at position 10, or 11, or 12, and so on, so that R reports (for example) TRUE or FALSE for each position. All the sequences that report TRUE for a change at position X could then be grouped against those that have no change at that position.

I'm not even sure that R is the best way to do this, but it's the only language I'm somewhat familiar with.

I hope this makes sense. Any help will be appreciated.

4.9 years ago by
5utr350 wrote:

There are indeed many different functions that you can use to manipulate strings; here is an example using Biostrings:

```r
# load package
library(Biostrings)
# reference sequence
standardseq <- AAString("MARKSLEMSIR")
# query sequences
seq1 <- AAString("MARKSLEMSER")
seq2 <- AAString("MDRKSLEMSER")
seq3 <- AAString("MDRKSAEMSER")
seq4 <- AAString("MARKSLEMSIR")
# merge them into a named list
queryseq <- list(seq1 = seq1, seq2 = seq2, seq3 = seq3, seq4 = seq4)

# split the reference into single residues
stdchars <- strsplit(as.character(standardseq), "")[[1]]
# compare every query seq to the reference residue by residue and return the
# mismatching positions (assumes equal lengths, i.e. substitutions only, no indels)
mutationpositions <- lapply(queryseq, function(X) {
  which(strsplit(as.character(X), "")[[1]] != stdchars)
})
# mismatching positions for every query seq
mutationpositions
```

You can then use the list 'mutationpositions' to split matching/unmatching sequences or check the AA change.
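
To get the TRUE/FALSE answer per position that the question asks for, one option is a logical matrix with one row per sequence and one column per position. The following is only a rough sketch in base R, using plain character strings so that it runs without Biostrings (with `AAString` objects, convert them with `as.character()` first). It assumes that all sequences have the same length as the standard and differ only by substitutions; if there are insertions or deletions, the sequences would need to be aligned first. The names `standard`, `mutants`, `changed`, `groups` and `subs` are made up for this illustration.

```r
# Illustrative data: one standard sequence and several named mutants
standard <- "MARKSLEMSIR"
mutants <- c(seq1 = "MARKSLEMSER",
             seq2 = "MDRKSLEMSER",
             seq3 = "MDRKSAEMSER",
             seq4 = "MARKSLEMSIR")

# A position-by-position comparison only makes sense for equal lengths
stopifnot(all(nchar(mutants) == nchar(standard)))

# Split every sequence into single residues
std_chars <- strsplit(standard, "")[[1]]
mut_chars <- strsplit(mutants, "")

# Logical matrix: rows = sequences, columns = positions,
# TRUE where the residue differs from the standard
changed <- t(vapply(mut_chars,
                    function(x) x != std_chars,
                    logical(length(std_chars))))
rownames(changed) <- names(mutants)
colnames(changed) <- seq_along(std_chars)

# Is there a change at position 10? One TRUE/FALSE per sequence
changed[, 10]

# Group the sequence names by whether position 10 is changed
groups <- split(rownames(changed), changed[, 10])

# Describe each substitution as <standard residue><position><mutant residue>
subs <- lapply(mut_chars, function(x) {
  pos <- which(x != std_chars)
  paste0(std_chars[pos], pos, x[pos])
})
```

Here `groups[["TRUE"]]` holds the sequences with a change at position 10 and `groups[["FALSE"]]` those without, while `subs` lists the individual amino acid changes for each sequence. The same `split()` call works for any other column of `changed`.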