Question: Multi Pairwise Sequence Alignment in R
4.9 years ago, mastercod3r wrote:

Hello guys, I am a final-year bioinformatics student, currently doing an internship, so apologies in advance for the beginner question.

Here is what I was told to do: I have 100+ FASTA files and need to run a pairwise sequence alignment of each one against every other and save the results. Here is what I did and the error I am getting.

library(seqinr)
library(Biostrings)

setwd("C:\\Users\\Celik\\Downloads\\fasta\\R_deneme")
files = list.files(pattern="*.fasta")
seq = lapply(files, read.fasta)

pws.alignments <- sapply(1:(length(seq)-1), function(ind)
{
  sapply((ind+1):length(seq), function(sec.ind)
  {
    return(pairwiseAlignment(seq[[ind]], seq[[sec.ind]]))
  })
})

Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘seqtype’ for signature ‘"SeqFastadna"’

 

I appreciate your help, thanks.

modified 4.9 years ago by Brice Sarver • written 4.9 years ago by mastercod3r

Maybe try readDNAStringSet instead of read.fasta?

written 4.9 years ago by poisonAlien
4.9 years ago, Brice Sarver (United States) wrote:

Are you doing a true pairwise alignment, i.e., does each FASTA file hold a single sequence? If not, check Biostrings for MSAs. Make sure that seq[[ind]] etc. are of class DNAStringSet; you might need single-bracket indexing (seq[ind]) depending on what the lapply returns. Also, the comment about using readDNAStringSet is correct: read.fasta returns seqinr objects of class SeqFastadna, which is what the error message is complaining about.
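The class check suggested above can be sketched as follows. This is a minimal illustration, not the poster's code: it writes two tiny made-up FASTA files to a temporary directory so that it runs on its own, and it assumes one DNA sequence per file. It also assumes pairwiseAlignment is available; in newer Bioconductor releases that function has moved from Biostrings to the pwalign package, so the sketch looks for pwalign first.

```r
library(Biostrings)

# Two tiny single-sequence FASTA files (made-up data) in a temp directory;
# in practice these would be the real input files.
demo_dir <- tempfile("fasta_demo")
dir.create(demo_dir)
writeLines(c(">seqA", "ACGTACGTAC"), file.path(demo_dir, "a.fasta"))
writeLines(c(">seqB", "ACGTTCGTAC"), file.path(demo_dir, "b.fasta"))

files <- list.files(demo_dir, pattern = "\\.fasta$", full.names = TRUE)

# readDNAStringSet() returns a DNAStringSet for each file.
seqs <- lapply(files, readDNAStringSet)

# Each list element should be a DNAStringSet holding exactly one sequence.
stopifnot(
  all(vapply(seqs, is, logical(1), "DNAStringSet")),
  all(vapply(seqs, length, integer(1)) == 1L)
)

# Use pwalign's pairwiseAlignment when that package is installed,
# otherwise fall back to the one in (older) Biostrings.
align <- if (requireNamespace("pwalign", quietly = TRUE)) {
  pwalign::pairwiseAlignment
} else {
  Biostrings::pairwiseAlignment
}

# Double brackets pull the DNAStringSet out of the list.
aln <- align(seqs[[1]], seqs[[2]], type = "global")
print(aln)
```

Note that seqs[1] (single brackets) would return a plain list of length one rather than the DNAStringSet itself, so double brackets are what this sketch needs.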

Also, alignments in general are quite slow in R, except in simple cases. I would recommend either 1) doing this with calls to MUSCLE or MAFFT in a shell script, or 2) using a system call from R.
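As a rough illustration of option 2, a system call from R could look like the sketch below. It assumes MAFFT is installed and on the PATH; run_mafft and mafft_args are hypothetical helper names, and the file names in the example call are placeholders. MAFFT aligns all sequences found in its input file, so for a pairwise run each input file would need to contain the two sequences to compare.

```r
# Argument vector for one MAFFT run. "--auto" asks MAFFT to pick a strategy.
mafft_args <- function(infile) c("--auto", shQuote(infile))

# Hypothetical wrapper: aligns the sequences in `infile` and writes the
# alignment (FASTA, taken from MAFFT's standard output) to `outfile`.
run_mafft <- function(infile, outfile) {
  if (!nzchar(Sys.which("mafft"))) {
    stop("mafft was not found on the PATH")
  }
  status <- system2("mafft", args = mafft_args(infile), stdout = outfile)
  if (status != 0) stop("mafft exited with status ", status)
  invisible(outfile)
}

# Example call (not run here): "pair_1_2.fasta" is a placeholder for a file
# containing the two sequences to align.
# run_mafft("pair_1_2.fasta", "pair_1_2.aln.fasta")
```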

written 4.9 years ago by Brice Sarver

Hello, thanks for your replies, guys.

Yes, I need to do a true pairwise alignment; each FASTA file has a single sequence. I'll try readDNAStringSet; hopefully it will work.

And lastly, yes, I have been told many times that standalone alignment programs would make this job easier, but my supervisor insists on doing it in R, so I have to do it in R. Thanks again.

written 4.9 years ago by mastercod3r

Hello Brice, after switching to readAAStringSet it works, but only partially. I put 5 files in a directory to test the code, and this is what it does:

It runs the pairwise alignment for file1 vs file2 ... up to file5 (no problem), but then file2 only goes up to file4, file4 up to file2, and file5 up to file1.

I don't know where the mistake is. Here is my updated code. Sorry for my bad English. Thanks in advance.

 

library(seqinr)
library(Biostrings)

setwd("C:\\Users\\Celik\\Downloads\\fasta\\R_deneme")
files = list.files(pattern="*.fasta")
seq = lapply(files, readAAStringSet)

pws.alignments <- sapply(1:(length(seq)), function(ind)
{
  sapply((ind):length(seq), function(sec.ind)
  {
    return(pairwiseAlignment(seq[[ind]], seq[[sec.ind]]))
  })
})

 

written 4.9 years ago by mastercod3r
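One possible reading of the output described above: in the updated code the inner loop runs from ind to length(seq), so each outer index produces one fewer alignment than the previous one (and each file is also aligned against itself). With five files that gives 5, 4, 3, 2 and 1 results, which may be what looks like "file2 up to file4" and so on. The base-R sketch below shows those counts and one way to enumerate every unordered pair exactly once with combn. The helper all_pairs and the stand-in comparison function are hypothetical; in real use the function passed in would be pairwiseAlignment and the items would be the sequences read from the files.

```r
# Hypothetical file names standing in for the five test files.
files <- paste0("file", 1:5, ".fasta")

# Number of inner-loop iterations per outer index when the inner loop
# runs from ind to the end, as in the updated code.
per_index <- sapply(seq_along(files), function(ind) length(ind:length(files)))
print(per_index)  # 5 4 3 2 1

# Apply `fun` to every unordered pair of a named list, each pair once,
# with no self-pairs. Needs at least two items.
all_pairs <- function(items, fun) {
  idx <- combn(seq_along(items), 2)  # 2 x n_pairs matrix of indices
  res <- lapply(seq_len(ncol(idx)), function(k) {
    fun(items[[idx[1, k]]], items[[idx[2, k]]])
  })
  names(res) <- apply(idx, 2, function(p) {
    paste(names(items)[p], collapse = "_vs_")
  })
  res
}

# Stand-in data and a stand-in comparison (joins the two strings);
# replace both with real sequences and pairwiseAlignment.
items <- list(file1 = "ACGT", file2 = "ACGA", file3 = "TTGT")
res <- all_pairs(items, function(a, b) paste(a, b))
print(names(res))
```

The result is a flat named list with one entry per pair, which is easier to save or loop over than the nested, unequal-length structure returned by two nested sapply calls. Note also that the posted code names its list seq, which masks the base R function seq; a different name such as seqs avoids confusion.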