Looking For A Simple Method For Calculating PWMs/PSSMs
14.3 years ago
Will 4.6k

I'm looking for a simple method for calculating position-specific weight matrices (PWMs) from a position occurrence matrix, like those found in the JASPAR database:

    >MA0004.1 Arnt
    A  [ 4 19  0  0  0  0 ]
    C  [16  0 20  0  0  0 ]
    G  [ 0  1  0 20  0 20 ]
    T  [ 0  0  0  0 20  0 ]

I need to scan a large collection of sequences, and submitting them to online services would be a complete hassle.

Once I have a PWM I know how to scan sequences, but I'm just having trouble creating them.

Thanks,

Will

PS. I'm looking for an equation (or pseudocode) for finding the PWM, not a library. I plan to implement it in Python and MATLAB. I'd prefer not to write a wrapper around system calls but to actually implement it in my code. Thanks

transcription pssm • 5.5k views

Have you checked the Wikipedia page for PSSM? It covers the math behind PWMs/PSSMs comprehensively. It is not clear to me what you mean by 'position occurrence matrix'.


I presumed that by 'position occurrence matrix' they meant a matrix of counts, rather than frequencies or weights, such as those in JASPAR, based on experimental data. The Wikipedia page is fine, but I think the primary reference by Hertz and Stormo is preferable, as Will can cite it in any future publications.

14.3 years ago
Stew ★ 1.4k

This paper describes the process, and you should be able to follow it:

Identifying DNA and protein patterns with statistically significant alignments of multiple sequences.
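
For reference, here is a minimal Python sketch of the common log-odds calculation from a count matrix. It is an illustration under stated assumptions, not necessarily the exact formulation used in the paper: the background is assumed uniform (0.25 per base), the pseudocount of 1 per column is an arbitrary choice, and the function name `counts_to_pwm` is made up for this example. Each weight is log2(((n + p*b) / (N + p)) / b), where n is the count of the base at that position, N is the column total, b is the background frequency of the base, and p is the pseudocount.

```python
import math

# Count matrix for MA0004.1 (Arnt), copied from the question.
counts = {
    "A": [4, 19, 0, 0, 0, 0],
    "C": [16, 0, 20, 0, 0, 0],
    "G": [0, 1, 0, 20, 0, 20],
    "T": [0, 0, 0, 0, 20, 0],
}


def counts_to_pwm(counts, background=None, pseudocount=1.0):
    """Convert per-position counts into log2-odds weights.

    Assumptions (not from the original post): the background defaults to
    uniform, and the pseudocount is split across symbols in proportion to
    the background, which must sum to 1.
    """
    symbols = sorted(counts)
    if background is None:
        background = {s: 1.0 / len(symbols) for s in symbols}
    width = len(counts[symbols[0]])
    pwm = {s: [] for s in symbols}
    for i in range(width):
        # Column total plus the pseudocount mass added to that column.
        total = sum(counts[s][i] for s in symbols) + pseudocount
        for s in symbols:
            # Smoothed frequency; the pseudocount avoids log(0) for zero counts.
            freq = (counts[s][i] + pseudocount * background[s]) / total
            pwm[s].append(math.log2(freq / background[s]))
    return pwm


pwm = counts_to_pwm(counts)
for base in "ACGT":
    print(base, " ".join(f"{w:6.2f}" for w in pwm[base]))
```

A sequence window can then be scored by summing the weight of the observed base at each position. If your sequences have a skewed base composition, passing a non-uniform `background` is probably more appropriate than the uniform default.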

14.3 years ago
brentp 24k

I know you say you're not looking for a library, but I feel compelled to link you to this. It even has a section titled "Loading and using JASPAR and TFD".

14.2 years ago
Will 4.6k

Although this question is answered and closed, I figured I'd post a new library I found that implements everything I wanted to do. It's called MOODS, and the source can be found here. The program uses a C++ interface to improve speed but has Python and Perl bindings.
