How to calculate log scores from an alignment?
7.0 years ago

Hi,

I'm making a practice exam. I have the question, and the correct answer, but I don't know the calculation. Could someone help me with this?

I have 6 sequences:

Seq1  CACCGGATGA
Seq2  CGCCGGATGG
Seq3  CGAAAGGTCG
Seq4  CGTAGCATCG
Seq5  GCTATCATCA
Seq6  GCTAGCATCA

The question is to calculate some scores. For example: Score (G, C). This is the explanation:

[Image: the exam's worked explanation for Score (G, C); not reproduced here]

Some questions:

  1. How is the 31 of n(G&C) calculated?
  2. Why do they divide that count by 150? Is it always 150?
  3. How do they calculate Score (G,C) = 0.6?

Because, when I use the given formula for score(g,c), it is: 2 * 2log(31/150) / ( 2 * 0.28333 * 0.3 ) ) = -16

And not 0.6.

Could anyone help me with this? Thanks :)

log scores matrix alignment • 1.5k views

NO :) :) :) :) :) :)

Moved to a comment because obviously not an answer to this thread. Not helpful. Not respectful. There is absolutely no need to post this.

7.0 years ago

1.

n(C&G) is the number of times a C is aligned with a G (C substituted with G, or G with C), counted over every pair of sequences in each column. For example, the first column of the alignment has 4 C and 2 G, which gives 4 * 2 = 8 C <-> G substitutions. The second column has 3 G and 2 C, so 3 * 2 = 6 C <-> G substitutions; the third column has 2 C and 0 G, so 2 * 0 = 0; and so on. Summing these values over all 10 columns gives 8 + 6 + 0 + 0 + 0 + 9 + 0 + 0 + 8 + 0 = 31.

2.

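150 is the number of all possible substitutions (pairs of aligned letters) in this alignment. In each column, the 6 sequences give 6 * 5 / 2 = 15 possible pairs, so the whole alignment of 10 columns has 15 * 10 = 150 possible substitutions. The value depends on the number of sequences and columns, so it is not always 150.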

3.

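Score(C,G) = 2 * log2[(31/150) / (2 * 17/60 * 18/60)] = 2 * log2(0.2067/0.17) = 0.56, which rounds to 0.6. Here 17/60 and 18/60 are the frequencies of C and G among the 60 letters of the alignment (the 0.28333 and 0.3 in your formula). Note that the whole ratio goes inside the log.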
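
To check the three steps above by machine, here is a minimal Python sketch. It assumes the formula is the one used in this answer (twice the base-2 log of the observed pair frequency over 2 * f(C) * f(G)); the names `pair_count`, `n_cg`, and `score_cg` are my own and not from the exam.

```python
from collections import Counter
from math import log2

seqs = [
    "CACCGGATGA",
    "CGCCGGATGG",
    "CGAAAGGTCG",
    "CGTAGCATCG",
    "GCTATCATCA",
    "GCTAGCATCA",
]


def pair_count(seqs, x, y):
    """Count aligned x/y pairs (either order) over all columns; assumes x != y."""
    total = 0
    for column in zip(*seqs):  # zip(*seqs) walks the alignment column by column
        counts = Counter(column)
        total += counts[x] * counts[y]
    return total


# Step 1: n(C&G), summed over the 10 columns
n_cg = pair_count(seqs, "C", "G")

# Step 2: all possible pairs = columns * (6 choose 2)
n_seqs, length = len(seqs), len(seqs[0])
total_pairs = length * n_seqs * (n_seqs - 1) // 2

# Background frequencies of C and G over all 60 letters
letters = Counter("".join(seqs))
n_letters = sum(letters.values())
f_c = letters["C"] / n_letters
f_g = letters["G"] / n_letters

# Step 3: the whole ratio goes inside the log
score_cg = 2 * log2((n_cg / total_pairs) / (2 * f_c * f_g))

print(n_cg, total_pairs, letters["C"], letters["G"], round(score_cg, 2))
# prints: 31 150 17 18 0.56
```

Calling `pair_count(seqs, "A", "T")` or any other pair of different letters gives the n(x&y) needed for the other scores in the exam question.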
7.0 years ago

1- n(x&y) refers to the number of times x and y are aligned

2- 150 is the total number of all possible pairs of nucleotides in the 6 sequences of length 10: 10 * 6!/(2!(6-2)!) = 10 * 15 = 150

3- I think your problem is that you're not using the base-2 log, i.e. 2log means log2.
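
A quick way to see how much the base matters is to plug the numbers from this thread into the same formula with different logs. This is only a sketch using the values quoted above (31, 150, 17/60, 18/60); the variable names are mine.

```python
from math import comb, log, log2, log10

# 10 columns, each with "6 choose 2" possible pairs
total_pairs = 10 * comb(6, 2)

# Observed C/G pair frequency over the expected frequency 2 * f(C) * f(G)
ratio = (31 / total_pairs) / (2 * (17 / 60) * (18 / 60))

score_log2 = 2 * log2(ratio)    # base 2, what "2log" means
score_log10 = 2 * log10(ratio)  # base 10, a common calculator default
score_ln = 2 * log(ratio)       # natural log

print(total_pairs, round(score_log2, 2), round(score_log10, 2), round(score_ln, 2))
```

Only the base-2 version gives about 0.56, which rounds to the 0.6 in the answer key. The ratio itself is slightly above 1, so with the ratio kept inside the log the score is positive in every base.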
