Creating a PWM from a PFM - redistribute gapped alignment count
9.3 years ago
Affan ▴ 300

I have a position frequency matrix (PFM) that I want to convert to a position weight matrix (PWM) using the PWM() function from Bioconductor in R.

The PFM (one row per alignment symbol, one column per motif position) is:

A          271          342          445          1017          547           648          673          722          660          935          793          531
C          262          673          316          155           80            54           70           90           67           55           88           100
G          98           83           75           58            43            41           56           126          254          539          443          220
T          532          468          872          597           1046          891          991          855          879          267          412          153
M          0            0            0            0             0             0            0            0            0            0            0            0
R          0            0            0            0             0             0            0            0            0            0            0            0
W          0            0            0            0             0             0            0            0            0            0            0            0
S          0            0            0            0             0             0            0            0            0            0            0            0
Y          0            0            0            0             0             0            0            0            0            0            0            0
K          0            0            0            0             0             0            0            0            0            0            0            0
V          0            0            0            0             0             0            0            0            0            0            0            0
H          0            0            0            0             0             0            0            0            0            0            0            0
D          0            0            0            0             0             0            0            0            0            0            0            0
B          0            0            0            0             0             0            0            0            0            0            0            0
N          0            0            0            0             0             0            0            0            0            0            0            0
-          712          309          167          48            159           241          85           82           15           79           139          871
+          0            0            0            0             0             0            0            0            0            0            0            0
.          0            0            0            0             0             0            0            0            0            0            0            0

The problem is that the "-" row has nonzero counts, which come from gaps in the alignment (done with ClustalW).

My main question: would it be acceptable to redistribute the "-" counts equally across the other bases? I have also heard the suggestion of ignoring the "-" row and just using the PFM as it is for the PWM. Which would be the better solution for research? My own inclination is that redistributing the counts should be fine.
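To make the two options concrete, here is a minimal sketch in plain Python (not the Bioconductor PWM() call itself) that applies both to the counts above and compares the resulting log2-odds weights. The uniform background of 0.25, the pseudocount of 1 per base, and the helper names `drop_gaps`, `spread_gaps` and `to_pwm` are assumptions made for illustration only; they are not taken from any library.

```python
import math

BASES = "ACGT"

# Counts copied from the PFM above (rows A, C, G, T and the gap row "-").
pfm = {
    "A": [271, 342, 445, 1017, 547, 648, 673, 722, 660, 935, 793, 531],
    "C": [262, 673, 316, 155, 80, 54, 70, 90, 67, 55, 88, 100],
    "G": [98, 83, 75, 58, 43, 41, 56, 126, 254, 539, 443, 220],
    "T": [532, 468, 872, 597, 1046, 891, 991, 855, 879, 267, 412, 153],
    "-": [712, 309, 167, 48, 159, 241, 85, 82, 15, 79, 139, 871],
}


def drop_gaps(pfm):
    """Option 1: ignore the gap row; column totals then differ by position."""
    return {b: list(pfm[b]) for b in BASES}


def spread_gaps(pfm):
    """Option 2: add a quarter of each column's gap count to every base."""
    return {b: [c + g / 4 for c, g in zip(pfm[b], pfm["-"])] for b in BASES}


def to_pwm(counts, background=0.25, pseudocount=1.0):
    """Log2-odds weights per column (background and pseudocount are assumed)."""
    width = len(counts["A"])
    pwm = {b: [] for b in BASES}
    for j in range(width):
        total = sum(counts[b][j] for b in BASES) + 4 * pseudocount
        for b in BASES:
            p = (counts[b][j] + pseudocount) / total
            pwm[b].append(math.log2(p / background))
    return pwm


pwm_dropped = to_pwm(drop_gaps(pfm))
pwm_spread = to_pwm(spread_gaps(pfm))

# Compare a heavily gapped, a nearly gap-free, and the most gapped position.
for j in (0, 8, 11):
    gap_frac = pfm["-"][j] / sum(pfm[r][j] for r in pfm)
    print(f"position {j + 1}: gap fraction {gap_frac:.2f}")
    for b in BASES:
        print(f"  {b}: dropped {pwm_dropped[b][j]:+.2f}  spread {pwm_spread[b][j]:+.2f}")
```

Every column sums to 1875 once the gap row is included, so the gap counts are a sizeable share at some positions (871 of 1875 at position 12). Spreading them equally mixes each column with a uniform distribution, which pulls every weight toward zero in proportion to the gap fraction, whereas dropping the row keeps the base proportions but leaves columns with different totals.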

position-weight-matrix • 2.7k views