Pairwise Sequence Alignment With R
1
5
Entering edit mode
12.3 years ago
Lakshmi ▴ 60

Hi all,

I used the following R program for pairwise sequence alignment. My data set contains 90 protein sequences. I would like to do pairwise sequence alignment with each pair of sequences.

library("seqinr")
seq1<- read.fasta(file = " first.fasta")
seq2<- read.fasta(file = " second.fasta")
seq1string <- toupper(c2s(seq1[[1]]))
seq2string <- toupper(c2s(seq2 [[2]]))
library(Biostrings)
globalAlign<- pairwiseAlignment(seq1string, seq2string)
globalAlign 
pid(globalAlign, type = "PID3")

file second.fasta is my dataset. first.fasta file contains first sequence . Using this program I am doing pairwise sequence alignment with first sequence and second sequence.Next, I have to do alignment with first and third sequence. first and 4th sequence etc upto 90. What all changes do I have to make in my program for visualizing the alignment of each pair of sequences?

alignment r • 19k views
ADD COMMENT
5
Entering edit mode

put a loop around... what else?

ADD REPLY
0
Entering edit mode

Hello, I am trying to align 50 sequences with one sequence and find percent identity, i.e seq1 vs seq2, seq1 vs seq2 etc.

i have tried the above code, @David W x <- sapply(seq2,function(seq1) pairwiseAlignment(seq2,seq1,type ="global",substitutionMatrix = mat, gapOpening = 10, gapExtension = 1))

I could get the alignment score but when i am trying to get percent identity it gives me the following error, pid(x, type = "PID4")

   Error:         Error in (function (classes, fdef, mtable)  : 
                    unable to find an inherited method for function ‘pid’ for signature ‘"list"’

The error may be because sapply returns list, but "pid" takes pariwiseAlignmentsingle subject.

Thank you

ADD REPLY
0
Entering edit mode

Hi gjhansi111,

I am having the same problem - do you have a solution for this?

Many thanks

Tom

ADD REPLY
5
Entering edit mode
12.3 years ago
David W 4.9k

Hi Laksmi,

It's not quite clear from your questoin, but do you want to do a pairwise alignment of each of your 90 sequences against a particular sequence (ie seq2[[1]] v seq1 then seq2[[2]] v seq1 in your example) or you want to do all the possible pairwsie comparisons between your 90 sequences.

The first one is easy, use an apply function. I don't have bioconductor on this computer, so this isn't tested, but something like

sapply( seq2, function(x) pairwiseAlignment(toupper(c2s(x)), seq1string)) )

It's probably more readible if you define a function first:

convert_then_align <- function(a,ref_seq){
  seq_string <- toupper(c2s(a))
  return(pairwiseAlignment(seq_string, ref_seq))
}

sapply(seq2, convert_then_align, seq2string)

All possible combinations is a little tricker, the way I do these is to make the indices first:

all_pairs <- combn(1:length(seq2), 2)

Then you need a function to do your converting and aligning

align_from_index <- function(sequences, index.a, index.b){
  seq1 <- toupper(c2s(sequences[index.a]))
  seq2 <- toupper(c2s(sequences[index.b]))
  return( pairwiseAlignment(seq1, seq2) )
}

res <- apply(all_pairs, 2, function(indices) align_from_index(sequences, indices[1], indices[2]) )

Which you can turn into a matrix if pairwiseAlignment returns something that makes sense for that (there must be a less hacky way of doing this!):

attributes(res) <- attributes(dist(1:length(seq2))
res <- as.matrix(res)
ADD COMMENT
0
Entering edit mode

@david Thank you very much for your answer.

ADD REPLY

Login before adding your answer.

Traffic: 2957 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6