Observed to Expected CpG
5.3 years ago

The observed-to-expected CpG ratio is calculated as follows:

Obs/Exp CpG = (Number of CpG * N) / (Number of C * Number of G), where N is the length of the sequence.

I don't understand the expected number of CpGs, which is:

Expected = (Number of C * Number of G) / N

Can someone give an example of, or the intuition behind, these formulas?

genome CpG • 4.1k views
4.9 years ago

Example: take the following 200-nucleotide sequence:

ccattcgactcatcacgctccccccccc cccccccccccttatccgttccgttcgacgtatttcgttgtctaatttctgacgtaactt gttccctgttaagtaccgtttatggcctatactccggtatttaaaacgacgacgattcca ccgtaaagccgtcaaccagatgaacgacctcgctcgttatatttttccggca

It contains 70 C, 31 G, and 19 CpG dinucleotides, so:

GC content = (70 + 31) / 200 = 0.505 = 50.5%

Obs/Exp CpG = (19 * 200) / (70 * 31) ≈ 1.75

Expected = (70 * 31) / 200 = 10.85
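To check numbers like these on any sequence, the calculation can be sketched in a few lines of Python. This is only an illustration of the formulas in this thread; the function name `cpg_obs_exp` is made up for the example, and it assumes whitespace should be ignored and case does not matter.

```python
def cpg_obs_exp(seq):
    """Return (observed CpG count, expected CpG count, obs/exp ratio)."""
    # Drop whitespace and ignore case so pasted, wrapped sequences work
    s = "".join(seq.split()).upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    if c == 0 or g == 0:
        raise ValueError("sequence needs at least one C and one G")
    # "CG" cannot overlap with itself, so str.count finds every CpG
    observed = s.count("CG")
    expected = c * g / n                 # Expected = (#C * #G) / N
    ratio = observed * n / (c * g)       # Obs/Exp = #CpG * N / (#C * #G)
    return observed, expected, ratio


# The 200-nt sequence from this answer, pasted with its original spaces
example = (
    "ccattcgactcatcacgctccccccccc "
    "cccccccccccttatccgttccgttcgacgtatttcgttgtctaatttctgacgtaactt "
    "gttccctgttaagtaccgtttatggcctatactccggtatttaaaacgacgacgattcca "
    "ccgtaaagccgtcaaccagatgaacgacctcgctcgttatatttttccggca"
)

obs, exp, ratio = cpg_obs_exp(example)
print(obs, exp, round(ratio, 2))
```

A ratio above 1 means CpG occurs more often than the C and G frequencies alone would predict; a ratio below 1 means it is depleted.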

4.0 years ago
lukelahood • 0

This is how I think about the formula for "expected":

Let's say you have a single C in a chain of 200 nucleotides. The probability that the next nucleotide is a G (and thus that you have a CpG dinucleotide) is #G/200. With only one C, this probability is also the expected number of CpGs. So if 50 of the nucleotides are G, the chance of getting a CpG is 50/200 = 0.25, and on average you'd expect 0.25 CpGs.

However, there is usually more than one C in a nucleotide chain. Every C is another chance at a CpG, so you multiply the probability above by the number of Cs. This gives #G/200 * #C, which is (Number of C * Number of G) / N with N = 200.
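One way to test this intuition is to shuffle a sequence with a fixed number of Cs and Gs many times and average the CpG count. The sketch below is only an illustration: the function name `mean_cpg_in_shuffles`, the number of trials, and the choice to fill all non-C/G positions with A are assumptions made for the example. It uses 70 C and 31 G in 200 nucleotides, the same composition as the example sequence in the other answer.

```python
import random


def mean_cpg_in_shuffles(n_c, n_g, length, trials=2000, seed=0):
    """Average CpG count over random shuffles of a fixed base composition."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Only the placement of C and G matters, so every other base is an A
    bases = list("C" * n_c + "G" * n_g + "A" * (length - n_c - n_g))
    total = 0
    for _ in range(trials):
        rng.shuffle(bases)
        total += "".join(bases).count("CG")
    return total / trials


simulated = mean_cpg_in_shuffles(70, 31, 200)
formula = 70 * 31 / 200  # (#C * #G) / N
print(simulated, formula)
```

The simulated average should land close to the formula value, which is what "expected" means here: the CpG count you would see on average if the same bases were arranged at random.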
