Reverse Engineer a PWM from a logo?
3 months ago
Laura ▴ 50

Hi!

I am looking for a way to create a position weight matrix (PWM) from a sequence logo. There are many tools to create a logo from a PWM, but not the other way around. Basically, I'm looking for a way to reverse engineer a logo back to a PWM.

There was a tool called Logo2PWM, but it does not seem to work any longer.

Does anyone know of any tools for this?

I am hoping to not have to re-invent a version of Logo2PWM. For example, if I have a logo like:

[image: example sequence logo]

how can I convert it to a matrix like:

[image: example matrix]

Obviously these don't match but you get the idea.

3 months ago
ngarber ▴ 60

Yes, this is possible. As Mensur said, you will need to measure the letter heights as accurately as possible. If you have a vector graphics image, you can extract the actual heights from it using any vector graphics software. If you only have a rasterized image (e.g. PNG or JPEG), you can still do it by measuring letter heights in pixels, although this is less accurate, and the accuracy improves with the resolution.

However, you don't actually need to reverse the information content calculation (measured in bits). That value determines only the height of the whole stack of nucleotides at a particular position, not the relative heights of the nucleotides within the stack. In fact, each nucleotide's share of the stack height is the relative frequency of that letter at that position, as found in the position weight matrix.

So to make up some numbers to illustrate the point, let's say you have a sequence logo, and at position 1, the heights are:

A = 100 pixels
C = 10 pixels
T = 25 pixels
G = 37 pixels
Sum = 100 + 10 + 25 + 37 = 172

The relative frequencies would be:

A = 100/172 = 0.581
C = 10/172 = 0.058
T = 25/172 = 0.145
G = 37/172 = 0.215

And those would form the values in your position weight matrix for the column at position 1.
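The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `heights_to_frequencies` and the dictionary layout are my own choices, not part of any existing tool, and the heights are the made-up pixel values from the example.

```python
def heights_to_frequencies(heights):
    """Normalize one logo column's letter heights so they sum to 1.

    `heights` maps each nucleotide to its measured height (pixels or
    any other unit, as long as it is the same unit within the column).
    """
    total = sum(heights.values())
    if total <= 0:
        raise ValueError("column has no measurable letter height")
    return {base: h / total for base, h in heights.items()}


# Position 1 from the example above (heights in pixels).
column_1 = {"A": 100, "C": 10, "T": 25, "G": 37}
freqs = heights_to_frequencies(column_1)

for base in "ACGT":
    print(f"{base} = {freqs[base]:.3f}")
```

Repeating this for every position in the logo gives one column of the matrix per position.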

Again though, this gets inaccurate if the number of pixels is low, so you want as high a resolution as possible (assuming you don't have vector graphics files)! Hope this helps :)

3 months ago
Mensur Dlakic ★ 21k

I don't think there is a way to do this from an image, or at least not very accurately. If you have an EPS file, it could be done because letter heights are plain numbers in EPS files. From them, and with the number of sequences in the alignment, one could reconstruct the PWM by reversing the information content calculation.
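For reference, here is a rough sketch of the information content calculation for a DNA logo, assuming the usual definition in which the stack height is R = log2(4) - (H + e_n) bits and each letter's height is its frequency times R. The function names, the example frequencies, and the clamping of R at zero are assumptions made for illustration; a given logo tool may differ in details such as the small-sample correction.

```python
import math


def stack_height_bits(freqs, n_sequences=None):
    """Information content R = log2(4) - (H + e_n) of one DNA column, in bits."""
    entropy = -sum(f * math.log2(f) for f in freqs.values() if f > 0)
    # Small-sample correction e_n = (4 - 1) / (2 * ln(2) * n).
    # Skipped (treated as 0) when the number of sequences is unknown.
    correction = 3 / (2 * math.log(2) * n_sequences) if n_sequences else 0.0
    # Clamping at 0 is an assumption of this sketch.
    return max(0.0, 2.0 - (entropy + correction))


def letter_heights_bits(freqs, n_sequences=None):
    """Height of each letter in the stack: frequency times stack height."""
    r = stack_height_bits(freqs, n_sequences)
    return {base: f * r for base, f in freqs.items()}


# Hypothetical column, drawn as a logo from 50 aligned sequences.
freqs = {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1}
heights = letter_heights_bits(freqs, n_sequences=50)

# Going backwards: dividing each letter height by the stack height
# returns the original frequencies.
total = sum(heights.values())
recovered = {base: h / total for base, h in heights.items()}
print(recovered)
```

Under this definition the number of sequences changes only the total stack height, so it scales every letter in a column by the same factor.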