
15 months ago

Lasha

For each DNA sequence there are 6 possible reading frames, so 6 different proteins can be encoded (3 for genes transcribed in the parallel direction + 3 for genes transcribed in the anti-parallel direction). I'm trying to make a rough estimate: for 2 independent proteins of the same length, what is the chance that there exists at least 1 DNA sequence which codes for both? Basically, I have 2 questions: (i) is my estimate correct given my simplified assumptions, or are some assumptions "too rough"? (ii) can I include it in my research, and would this data be useful to you as a reader of that research (the research is about intrinsic regions of dual-coding genes)?
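To make the "6 frames" statement concrete, here is a minimal Python sketch of six-frame translation. It assumes the standard genetic code and uppercase/lowercase A, C, G, T input only; the function names (`reverse_complement`, `translate`, `six_frames`) and the frame labels `+1..+3` / `-1..-3` are my own illustrative choices, not part of any estimate described in this post.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from position 0; a trailing partial codon is dropped."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Return the 6 translations: 3 offsets on the given strand, 3 on its reverse complement."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frames("ATGGCCTAA").items():
        print(name, protein)
```

For the toy input `ATGGCCTAA`, frame `+1` reads ATG GCC TAA and gives `MA*`; the other five frames give different short peptides, which is the sense in which one stretch of DNA can hold more than one protein sequence.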

It's hard to completely understand... but obviously you aren't going to encode a protein without a start codon, so you need to factor that into the math.

I can explain if you are stuck at a certain point. I'm ignoring the start codon because the regions I'm going to evaluate are sub-intervals of the coding sequences (CDS), so the only requirement is that the sequence encodes both protein sequences.

But so you have a 60% chance that 3 nt encode the same protein?