Position Frequency Matrix to Position Weight Matrix, whose colsums need to be 1.
7.0 years ago
Zhilong Jia ★ 2.0k

In R, Biostrings::PWM converts a position frequency matrix (PFM) to a position weight matrix (PWM), but the column sums of the resulting PWM are not 1. However, seqLogo::makePWM requires a PWM whose columns each sum to 1. What is the reason for this, or what am I doing wrong? Is there another way to convert the PFM to a PWM? I need to use seqLogo::makePWM. Thank you.

Update: the code is below.

    library(Biostrings)
    data(HNF4alpha)
    pfm <- consensusMatrix(HNF4alpha)

    # JASPAR_CORE/MA0002.2.pfm
    pfm1 <- c(1387, 2141, 727, 1517, 56, 0, 0, 62, 346, 3738, 460, 0, 116,
              1630, 1060, 1506, 519, 1199, 5098, 4762, 1736, 2729, 236, 0, 0, 1443,
              851, 792, 884, 985, 3712, 0, 0, 85, 1715, 920, 4638, 5098, 3455,
              1230, 1105, 1981, 2077, 131, 0, 336, 3215, 308, 204, 0, 0, 84)

    pfm1 <- as.integer(pfm1)
    pfm1 <- matrix(pfm1, nrow = 4, ncol = 13, byrow = TRUE)
    rownames(pfm1) <- c("A", "C", "G", "T")
    pfm[rownames(pfm1), ] <- pfm1

    pwm <- PWM(pfm)
    colSums(pwm)
    ##  0.24534982  0.24245488  0.24187243  0.23899403  0.17464467 -0.15326904 -0.03531437
    ##  0.17292782  0.21962851  0.20141431 -0.03056159 -0.15326904  0.18119170

R bioconductor motif PWM PFM • 5.5k views

Hi, thanks for this. Did you find an answer to the issue above? Also, could you recommend the best way to obtain PWMs from the JASPAR collection of PFMs?

6.9 years ago

Sorry for not picking this up earlier, but if you are still following could you post an example?

I think converting a frequency matrix to a weight matrix should simply be a matter of dividing each column by its column sum. So it might just be a rounding error, possibly introduced by large column sums, but that is hard to say without an example.
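
The column-wise division described above can be sketched in base R as follows. This is a minimal illustration, not a statement of what Biostrings or seqLogo do internally; it reuses the counts posted in the question, and the name `ppm` is only an illustrative choice.

```r
# Counts from the question (JASPAR MA0002.2), one row per base.
pfm1 <- matrix(
  c(1387, 2141,  727, 1517,   56,    0,    0,   62,  346, 3738,  460,    0,  116,
    1630, 1060, 1506,  519, 1199, 5098, 4762, 1736, 2729,  236,    0,    0, 1443,
     851,  792,  884,  985, 3712,    0,    0,   85, 1715,  920, 4638, 5098, 3455,
    1230, 1105, 1981, 2077,  131,    0,  336, 3215,  308,  204,    0,    0,   84),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("A", "C", "G", "T"), NULL)
)

# Divide every column by its own sum.
ppm <- sweep(pfm1, 2, colSums(pfm1), "/")

# Each column should now sum to 1 (up to floating-point tolerance).
stopifnot(isTRUE(all.equal(colSums(ppm), rep(1, ncol(ppm)))))
```

Assuming seqLogo is installed, a matrix normalized this way should satisfy the column-sum requirement that seqLogo::makePWM checks for.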


Thank you for the code but Michael and others can help you better if you provide an example, with the actual numbers you see in the PWM and the PFM.


@RamRs, @Michael, judging from the output, it's not a rounding error. Thank you.
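
If rounding is ruled out, one possible explanation, offered here as an assumption and not as a description of Biostrings internals, is that a weight matrix in the log-odds sense stores scores such as log2(p / background), not probabilities, so its columns have no reason to sum to 1. The sketch below uses base R only; the toy counts, the pseudocount of 1, and the uniform background of 0.25 are all illustrative choices that do not come from the thread.

```r
# Toy count matrix (hypothetical numbers, 4 bases x 3 positions).
counts <- matrix(c(8, 1,  0,
                   1, 7,  0,
                   1, 1, 10,
                   0, 1,  0),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(c("A", "C", "G", "T"), NULL))

pseudocount <- 1     # assumed; avoids taking log2(0)
background  <- 0.25  # assumed uniform base frequencies

# Column-normalized probabilities: every column sums to 1.
probs <- prop.table(counts + pseudocount, margin = 2)

# Log-odds scores relative to the background: column sums are not 1.
scores <- log2(probs / background)

colSums(probs)
colSums(scores)
```

Under this reading, the probability matrix is the form that fits a column-sum-of-1 requirement, and the log-odds matrix is a separate quantity derived from it.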

6.1 years ago
pengchy ▴ 450

This can be calculated using the prop.table function with margin = 2:

    > a <- matrix(1:20, nrow = 4)
    > a
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    5    9   13   17
    [2,]    2    6   10   14   18
    [3,]    3    7   11   15   19
    [4,]    4    8   12   16   20

    > prop.table(a, 2)
         [,1]      [,2]      [,3]      [,4]      [,5]
    [1,]  0.1 0.1923077 0.2142857 0.2241379 0.2297297
    [2,]  0.2 0.2307692 0.2380952 0.2413793 0.2432432
    [3,]  0.3 0.2692308 0.2619048 0.2586207 0.2567568
    [4,]  0.4 0.3076923 0.2857143 0.2758621 0.2702703
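
As a quick check on the prop.table approach above, the following sketch confirms that the column sums come out as 1 and points out one caveat: a position with no counts at all yields NaN, because the division is 0/0. The names `p`, `z`, and `has_nan` are illustrative only.

```r
a <- matrix(1:20, nrow = 4)
p <- prop.table(a, margin = 2)

# Every column of p should sum to 1 (up to floating-point tolerance).
stopifnot(isTRUE(all.equal(colSums(p), rep(1, ncol(p)))))

# Caveat: appending an all-zero column gives 0/0, i.e. NaN, in that column.
z <- cbind(a, 0L)
has_nan <- any(is.nan(prop.table(z, margin = 2)))
```

If a real count matrix could contain an empty position, it may be worth checking for it before normalizing.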