Question: Pairwise Sequence Alignment With R
gravatar for Lakshmi
8.8 years ago by
Lakshmi50 wrote:

Hi all,

I used the following R program for pairwise sequence alignment. My data set contains 90 protein sequences. I would like to do pairwise sequence alignment with each pair of sequences.

seq1<- read.fasta(file = " first.fasta")
seq2<- read.fasta(file = " second.fasta")
seq1string <- toupper(c2s(seq1[[1]]))
seq2string <- toupper(c2s(seq2 [[2]]))
globalAlign<- pairwiseAlignment(seq1string, seq2string)
pid(globalAlign, type = "PID3")

file second.fasta is my dataset. first.fasta file contains first sequence . Using this program I am doing pairwise sequence alignment with first sequence and second sequence.Next, I have to do alignment with first and third sequence. first and 4th sequence etc upto 90. What all changes do I have to make in my program for visualizing the alignment of each pair of sequences?

R alignment • 14k views
ADD COMMENTlink modified 5 months ago by gjhansi1110 • written 8.8 years ago by Lakshmi50

put a loop around... what else?

ADD REPLYlink written 8.8 years ago by Michael Dondrup47k

Hello, I am trying to align 50 sequences with one sequence and find percent identity, i.e seq1 vs seq2, seq1 vs seq2 etc.

i have tried the above code, @David W x <- sapply(seq2,function(seq1) pairwiseAlignment(seq2,seq1,type ="global",substitutionMatrix = mat, gapOpening = 10, gapExtension = 1))

I could get the alignment score but when i am trying to get percent identity it gives me the following error, pid(x, type = "PID4")

   Error:         Error in (function (classes, fdef, mtable)  : 
                    unable to find an inherited method for function ‘pid’ for signature ‘"list"’

The error may be because sapply returns list, but "pid" takes pariwiseAlignmentsingle subject.

Thank you

ADD REPLYlink written 5 months ago by gjhansi1110

Hi gjhansi111,

I am having the same problem - do you have a solution for this?

Many thanks


ADD REPLYlink written 3 months ago by tom.lewin10
gravatar for David W
8.8 years ago by
David W4.8k
New Zealand
David W4.8k wrote:

Hi Laksmi,

It's not quite clear from your questoin, but do you want to do a pairwise alignment of each of your 90 sequences against a particular sequence (ie seq2[[1]] v seq1 then seq2[[2]] v seq1 in your example) or you want to do all the possible pairwsie comparisons between your 90 sequences.

The first one is easy, use an apply function. I don't have bioconductor on this computer, so this isn't tested, but something like

sapply( seq2, function(x) pairwiseAlignment(toupper(c2s(x)), seq1string)) )

It's probably more readible if you define a function first:

convert_then_align <- function(a,ref_seq){
  seq_string <- toupper(c2s(a))
  return(pairwiseAlignment(seq_string, ref_seq))

sapply(seq2, convert_then_align, seq2string)

All possible combinations is a little tricker, the way I do these is to make the indices first:

all_pairs <- combn(1:length(seq2), 2)

Then you need a function to do your converting and aligning

align_from_index <- function(sequences, index.a, index.b){
  seq1 <- toupper(c2s(sequences[index.a]))
  seq2 <- toupper(c2s(sequences[index.b]))
  return( pairwiseAlignment(seq1, seq2) )

res <- apply(all_pairs, 2, function(indices) align_from_index(sequences, indices[1], indices[2]) )

Which you can turn into a matrix if pairwiseAlignment returns something that makes sense for that (there must be a less hacky way of doing this!):

attributes(res) <- attributes(dist(1:length(seq2))
res <- as.matrix(res)
ADD COMMENTlink modified 8.8 years ago • written 8.8 years ago by David W4.8k

@david Thank you very much for your answer.

ADD REPLYlink written 8.8 years ago by Lakshmi50
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1703 users visited in the last hour