How often to expect this particular consensus sequence?
9.7 years ago
gms2005gms • 0

How often should I expect to see the consensus sequence GGNGC, where N is any base, with fewer than 120 nucleotides separating one occurrence from the start of the next occurrence of the same sequence? I really have no clue where to start. Should I take into account all four possible sequences obtained by replacing N with each nucleotide?

consensus-sequence • 1.8k views
  1. Is this a homework question?
  2. What do you know? That is, do you know the actual 5mer frequency or do you have to assume that they're all equally distributed? This question alone should give you a hint on how to get started.
9.7 years ago
Cytosine ▴ 460

Sounds like a statistics homework. I'll try to give it a shot, but I'm no statistics expert...

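Assuming every nucleotide is drawn independently with equal probability (1/4 each), you can calculate the chance of this consensus sequence occurring at a given position with the info you provided. Since N matches any base, only the four fixed positions count, which gives (1/4)^4 = 1/256.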
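As a rough sketch of that first step, the per-position probability can be computed like this. The function name `motif_probability` and the `base_probs` argument are made up for illustration, and the uniform 1/4 frequencies are an assumption; if you know the real base composition of your genome, plug that in instead.

```python
from fractions import Fraction


def motif_probability(motif, base_probs=None):
    """Probability that `motif` starts at one given position.

    Assumes independent positions. 'N' matches any base, so it
    contributes the sum of all base probabilities (i.e. 1).
    """
    if base_probs is None:
        # Assumption: all four bases equally likely.
        base_probs = {base: Fraction(1, 4) for base in "ACGT"}

    prob = Fraction(1)
    for symbol in motif:
        if symbol == "N":
            prob *= sum(base_probs.values())
        else:
            prob *= base_probs[symbol]
    return prob


p_motif = motif_probability("GGNGC")
print(p_motif)  # 1/256
```

This also answers the question about N: summing over the four possible replacements is the same as treating that position as always matching.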

After you calculate the probability for this sequence, you can use it to calculate the chance of seeing the sequence again within 120 nucleotides, assuming again that the nucleotides are independent and equally likely.

That value is the probability of the given sequence appearing within a window of 120 nucleotides under the random model. If the proportion you observe in your test data is higher, then I suppose that means the sequence is enriched.
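A minimal sketch of the window calculation, under some loud assumptions: the per-position probability is 1/256 (uniform, independent bases), each of the 120 positions is treated as an independent possible start, and the window length of 120 is my reading of the question. The independence part is only approximate, because GGNGC can overlap itself (for example GGGGCGC contains two copies). The names `prob_at_least_one` and `prob_exactly_k` are just illustrative.

```python
from math import comb

P_MOTIF = 1 / 256  # assumed per-position probability of GGNGC
WINDOW = 120       # assumed number of candidate start positions


def prob_at_least_one(p, n):
    """P(one or more motif starts among n independent positions)."""
    return 1 - (1 - p) ** n


def prob_exactly_k(p, n, k):
    """Binomial P(exactly k motif starts among n independent positions)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


print(f"expected starts per window: {WINDOW * P_MOTIF:.3f}")
print(f"P(at least one start):      {prob_at_least_one(P_MOTIF, WINDOW):.3f}")
print(f"P(exactly one start):       {prob_exactly_k(P_MOTIF, WINDOW, 1):.3f}")
```

Note that "at least one" and "exactly one" give different numbers, so it matters which of the two you compare against your observed data. For the question as asked (another copy starting within 120 nt), "at least one" looks like the relevant one.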
