5.3 years ago
mariasv1
Hi all,
I have a large multiple nucleotide alignment, and I need to delete specific nucleotide positions from each sequence (basically, I need to delete several columns of nucleotides). Is anyone familiar with an efficient way to do this?
I was thinking of reading the alignment into R as a data.frame and then deleting the selected columns, but I am having trouble separating the nucleotides into columns.
What I have now:
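library(Biostrings)  # provides readDNAStringSet
mydata <- readDNAStringSet("myalignmet.fas")
names <- names(mydata)
sequence <- paste(mydata)
ptid <- names
df <- data.frame(names, sequence)
df$sequence <- as.character(df$sequence)
# split each sequence into single nucleotides ("" splits between every character)
df$sequence <- strsplit(df$sequence, "")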
This gives me a df$sequence column in which each entry is a vector like c("A", "C", "G", "T", ...), rather than one nucleotide per column.
Thanks in advance, Maria
How do you determine which to delete? Do you know the column numbers? I have edited your title to make it more specific.
Yes, I have specific nucleotide positions that I want to delete (they are associated with resistance).
There probably is a way to do this in R, but I'd do it in Python. Do you know any Python?
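In case it helps, here is a minimal sketch of the Python route using only the standard library. It assumes the alignment is in FASTA format, that all sequences have the same aligned length, and that the positions to delete are 1-based column numbers. The function names (read_fasta, drop_columns, write_fasta) and the toy sequences are made up for illustration, not taken from any library.

```python
import io


def read_fasta(handle):
    """Parse FASTA text into a list of (name, sequence) tuples."""
    records, name, chunks = [], None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def drop_columns(records, positions):
    """Remove the given 1-based alignment columns from every sequence."""
    drop = {p - 1 for p in positions}  # convert to 0-based indices
    return [
        (name, "".join(base for i, base in enumerate(seq) if i not in drop))
        for name, seq in records
    ]


def write_fasta(records, handle, width=60):
    """Write records as FASTA, wrapping sequence lines at `width` characters."""
    for name, seq in records:
        handle.write(f">{name}\n")
        for start in range(0, len(seq), width):
            handle.write(seq[start:start + width] + "\n")


# Toy alignment (hypothetical data): delete columns 2 and 5.
example = io.StringIO(">seq1\nACGTAC\n>seq2\nA-GTTA\n")
trimmed = drop_columns(read_fasta(example), positions=[2, 5])

out = io.StringIO()
write_fasta(trimmed, out)
print(out.getvalue())
```

For a real file you would replace the io.StringIO objects with open("myalignmet.fas") for reading and a second open(..., "w") for writing, and pass your own list of positions. Deleting the same indices from every sequence keeps the remaining columns aligned.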
Column-wise deletion from an MSA can be done in BioEdit, mariasv1.