Position Weight Matrix From Sequence Alignment
13.0 years ago
Timtico ▴ 330

Hello,

I use a Perl script to find a specific sequence, and as output I get a FASTA file containing the matched sequences, each including a defined number of base pairs.

Now I want to do a multiple sequence alignment (e.g. with ClustalW), but I want a position weight matrix as output, because for compatibility reasons I need to create the sequence logo with an R package (instead of online tools such as WebLogo).

Does anyone know an alignment tool (BioPerl, Bioconductor, online?) that gives me the matrix as output?

Regards

clustalw multiple • 9.7k views

Duplicate? Consensus Sequence

13.0 years ago
brentp 24k

You can create a position weight matrix with motility (Python or C++). The interface is very simple, so even if you don't use Python it should be straightforward.

import motility

# build a PWM from aligned sites of equal length
seqs = ['AGATAA', 'TGATAA', 'AGATAG']
pwm = motility.make_pwm(seqs)
print((pwm.max_score(), pwm.min_score()))
# (11.1699250014, 0.0)
# one row per position, one score per base
print(pwm)
#[[1.5849625007211563, 0.0, 0.0, 1.0],
# [0.0, 0.0, 2.0, 0.0],
# [2.0, 0.0, 0.0, 0.0],
# [0.0, 0.0, 0.0, 2.0],
# [2.0, 0.0, 0.0, 0.0],
# [1.5849625007211563, 0.0, 1.0, 0.0]]
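
If motility is not installed, the numbers above can still be reproduced by hand. They are consistent with a score of log2(count + 1) per base per position, with columns in the order A, C, G, T (for example, log2(3) ≈ 1.585 for a base seen in two of the three sequences). That rule and the column order are inferred from the output shown here, not taken from motility's documentation, so treat this plain-Python version as a rough sketch. `make_log_pwm` is a made-up helper name.

```python
import math

BASES = "ACGT"  # column order assumed from the output above

def make_log_pwm(seqs):
    """Return one row per alignment column, holding log2(count + 1) for each base."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    pwm = []
    for pos in range(length):
        # collect the bases seen at this position across all sequences
        column = [s[pos].upper() for s in seqs]
        pwm.append([math.log2(column.count(b) + 1) for b in BASES])
    return pwm

seqs = ["AGATAA", "TGATAA", "AGATAG"]
pwm = make_log_pwm(seqs)

# best and worst total score: sum of the per-position maxima and minima
max_score = sum(max(row) for row in pwm)
min_score = sum(min(row) for row in pwm)

for row in pwm:
    print([round(v, 3) for v in row])
print(round(max_score, 4), round(min_score, 4))
```

The sequences must already be aligned and of equal length. The sketch does no alignment itself.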

Thanks for the answer. I'm not familiar with Python, however, and would like to keep working in R and Perl.

13.0 years ago
hadasa ★ 1.0k

Have a look at the Biostrings package in R. It is part of Bioconductor. To install:

source("http://www.bioconductor.org/biocLite.R")
biocLite("Biostrings")

Nice, the consmat() function in the Biostrings package seems to do the trick :)


Just an update: the function is now called consensusMatrix().
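
For anyone wondering what such a matrix contains: as far as I understand it, consensusMatrix() returns letter counts with one row per letter and one column per alignment position. Here is a minimal Python sketch of that kind of count matrix, in the same language as the motility example in this thread. It is an illustration only, not how Biostrings implements it, and `consensus_counts` and `to_probabilities` are made-up names.

```python
from collections import Counter

def consensus_counts(seqs, alphabet="ACGT"):
    """Count each letter per alignment column; rows are letters, columns are positions."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    # one Counter per column; letters that never occur count as 0
    columns = [Counter(s[i].upper() for s in seqs) for i in range(length)]
    return {letter: [col[letter] for col in columns] for letter in alphabet}

def to_probabilities(counts, n_seqs):
    """Divide every count by the number of sequences."""
    return {letter: [c / n_seqs for c in row] for letter, row in counts.items()}

seqs = ["AGATAA", "TGATAA", "AGATAG"]
counts = consensus_counts(seqs)
probs = to_probabilities(counts, len(seqs))

for letter, row in counts.items():
    print(letter, row)
```

Letters outside the chosen alphabet (gaps, N) are simply ignored here, so a column of probabilities can sum to less than 1 when they occur.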
