Question

Probability of having same starting base on complementary strands

0

6.7 years ago

Gene_MMP8 ▴ 240

This question came up in a lecture on "Introduction to transcription". Our professor asked what the probability is of having the same starting base on the forward and the reverse strand. In other words, if I have

5'_______3'  
ATTGCCATAT  
TAACGGTATA  
3'_______5'

What are the odds of that happening (and likewise for the other bases T, G, C)? My answer is as follows:
P(A) = P(T) = P(G) = P(C) = 1/4, so P(A on 5' and A on 3') = P(A)·P(A) = 1/16,
and P(G on 5' and G on 3') = P(G)·P(G) = 1/16, and so on. Adding these up over all four bases gives 1/4. Am I correct?
What puzzles me is this: the probability of having the same base on the reverse strand = 1/4 = the probability of having any one base.
Is there any significance to this?

sequence • 1.0k views

ADD COMMENT • link updated 6.7 years ago by Devon Ryan 104k • written 6.7 years ago by Gene_MMP8 ▴ 240
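
As a quick check of the setup in the question, here is a minimal Python sketch (not part of the original post). It reads both strands 5'→3', confirms that the example sequence starts with A on both strands, and enumerates all sequences of a short, arbitrarily chosen length under the question's assumption of equally likely bases. The helper names `reverse_complement` and `same_start` are illustrative only.

```python
from itertools import product

# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, read 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def same_start(seq):
    """True if both strands begin with the same base when read 5'->3'."""
    return seq[0] == reverse_complement(seq)[0]

print(reverse_complement("ATTGCCATAT"))  # ATATGGCAAT
print(same_start("ATTGCCATAT"))          # True

# Enumerate every sequence of a small length (4 is an arbitrary choice)
# and count how many start with the same base on both strands.
length = 4
hits = sum(same_start("".join(s)) for s in product("ACGT", repeat=length))
print(hits / 4**length)  # 0.25
```

The condition holds exactly when the last base of the forward strand is the complement of its first base, which is why the count comes out to one quarter of all sequences.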

1

This model assumes that the probability of each base is 1/4. From a biological standpoint, transcription start sites cluster around binding motifs, so the nucleotides there are far from randomly distributed. To model this accurately, one would probably need to correct for factors such as GC content. So I would say the naive probability you propose will not be accurate. You might have a look at papers on motif enrichment and how they model nucleotide occurrence.

ADD REPLY • link 6.7 years ago by ATpoint 82k

0

Thanks for replying. Will look into factors like GC content.

ADD REPLY • link 6.7 years ago by Gene_MMP8 ▴ 240

score 2 · Answer 1 · 2017-08-24

2

6.7 years ago

Devon Ryan 104k

Yes, assuming all bases are equally probable, the probability of having a base and its complement at opposite ends of a given interval is 25%. And yes, this equals the probability of any single base only because each base has probability 25%; it would be different otherwise. Of course, this is unlikely to match what happens in any organism, for the reasons ATpoint mentioned.

ADD COMMENT • link 6.7 years ago by Devon Ryan 104k
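
To illustrate how the 25% figure depends on the base frequencies, here is a minimal sketch (an illustration, not something from the thread). It assumes the two end positions are independent draws from the same base composition, so the probability is the sum over bases of P(base) × P(complement). The function names and the 40% GC content are hypothetical choices for the example, and `freqs_from_gc` additionally assumes A = T and G = C within a strand.

```python
from fractions import Fraction

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def same_start_probability(freqs):
    """Sum over bases of P(base at one end) * P(its complement at the other end).

    Assumes both ends are drawn independently from the same distribution.
    """
    return sum(freqs[b] * freqs[COMPLEMENT[b]] for b in "ACGT")

def freqs_from_gc(gc):
    """Base frequencies for a given GC fraction, assuming A = T and G = C."""
    at = 1 - gc
    return {"A": at / 2, "T": at / 2, "G": gc / 2, "C": gc / 2}

uniform = {b: Fraction(1, 4) for b in "ACGT"}
print(same_start_probability(uniform))                        # 1/4
print(same_start_probability(freqs_from_gc(Fraction(2, 5))))  # 13/50
```

With equal frequencies the sum is 4 × 1/16 = 1/4; with a GC fraction of 2/5 it becomes 13/50, so the match with "the probability of any one base" is specific to the uniform case.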

0

Thanks for clearing that up.

ADD REPLY • link 6.7 years ago by Gene_MMP8 ▴ 240

0

Out of curiosity, what other biases should be considered for this kind of problem?

Is there any influence of the genetic code? (I was thinking of the proportion of each base at each codon position.)

ADD REPLY • link 6.7 years ago by Titus ▴ 910

0

The proportion of a base on a strand needn't equal the proportion at a given end of some interval (consider transcription factor binding sites as an example where this is very much not the case).

ADD REPLY • link 6.7 years ago by Devon Ryan 104k
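
The point about end positions having their own base proportions can be sketched as follows (an illustration under stated assumptions, not something from the thread). The two distributions below are invented numbers for a hypothetical interval whose first position favours A and whose last position favours T; the only assumption carried over is that the two ends are independent.

```python
from fractions import Fraction

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def same_start_probability(first, last):
    """P(first base is b and last base is complement(b)), summed over b.

    `first` and `last` are position-specific base distributions on the
    forward strand; the two positions are assumed independent.
    """
    return sum(first[b] * last[COMPLEMENT[b]] for b in "ACGT")

# Hypothetical, made-up distributions for the two ends of an interval.
tenth = Fraction(1, 10)
first = {"A": 7 * tenth, "C": tenth, "G": tenth, "T": tenth}
last = {"A": tenth, "C": tenth, "G": tenth, "T": 7 * tenth}

print(same_start_probability(first, last))  # 13/25
```

Here the result is 13/25, far from 1/4, even though nothing has been said about the overall composition of the strand.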