Question: What's the difference between CDS and ORF?
6
22 months ago by
shiy0560
shiy0560 wrote:

What's the difference between the terms CDS and ORF?

ADD COMMENTlink modified 22 months ago by Christian710 • written 22 months ago by shiy0560
9
22 months ago by
Gjain3.9k
Worcester, MA
Gjain3.9k wrote:

Hi,

In more detail:

ORFs:

The region of a nucleotide sequence from the start codon (ATG) to the stop codon is called the open reading frame.

Gene finding, especially in prokaryotes, starts by searching for open reading frames (ORFs). An ORF is a stretch of DNA that begins with a start codon (usually, but not always, ATG) and ends with one of the three termination codons (TAA, TAG, TGA). Depending on the starting point, there are six possible ways (three on the forward strand and three on the complementary strand) of translating any nucleotide sequence into an amino acid sequence according to the genetic code. These are called reading frames.
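To make the six reading frames concrete, here is a minimal Python sketch of a naive ORF scan. It is only an illustration under simplifying assumptions: ATG is the only start codon considered, the standard stop codons are used, and the function names (`reverse_complement`, `orfs_in_frame`, `six_frame_orfs`) are made up for this example rather than taken from any library.

```python
# Illustrative sketch only; all names here are hypothetical.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame):
    """Yield ATG...stop ORFs (stop codon included) in one reading frame."""
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i                      # open an ORF at the first ATG
        elif start is not None and codon in STOPS:
            yield seq[start:i + 3]         # close it at the first in-frame stop
            start = None


def six_frame_orfs(seq):
    """Return (strand, frame, orf) tuples for all six reading frames."""
    seq = seq.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for orf in orfs_in_frame(s, frame):
                results.append((strand, frame, orf))
    return results


print(six_frame_orfs("CCATGAAATAGCC"))
```

A real gene finder would also filter by minimum length and handle alternative start codons; this sketch only shows how the same sequence is read in three frames on each strand.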

Eukaryotic gene finding is a different task altogether, because eukaryotic genes are not continuous: they are interrupted by intervening noncoding sequences called 'introns'. Moreover, the organization of genetic information differs between eukaryotes and prokaryotes.

CDS:

The coding sequence (CDS) is the actual region of DNA that is translated to form protein. While an ORF may contain introns as well, the CDS refers to those nucleotides (concatenated exons) that can be divided into codons which are actually translated into amino acids by the ribosomal translation machinery.
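The "concatenated exons" idea can be sketched in a few lines of Python. The toy gene, the exon coordinates (0-based, end-exclusive) and the function names `splice` and `translate` are assumptions made up for this illustration; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def splice(gene, exons):
    """Concatenate exon slices (0-based, end-exclusive) into a CDS."""
    return "".join(gene[start:end] for start, end in exons)


def translate(cds):
    """Translate a CDS codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy gene: exon 1 + intron (GT...AG) + exon 2.
gene = "ATGGCC" + "GTAAGTTTTCAG" + "AAATGA"
exons = [(0, 6), (18, 24)]

cds = splice(gene, exons)
print(cds)             # the spliced coding sequence
print(translate(cds))  # the encoded peptide
```

Only the exon slices end up in the CDS; the intron is present in the genomic sequence but contributes nothing to the translated product.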

Mainly: CDS means only that the sequence is known to be transcribed and, therefore, it is coding for something -- neither gene nor protein has to be known. Any full mRNA sequence (obtained from cDNA sequencing) will have a full coding sequence. An ORF is usually predicted from the DNA sequence and is not proven to be transcribed.

Sources:

ADD COMMENTlink modified 1 day ago • written 22 months ago by Gjain3.9k
1

I like your graphics;)

ADD REPLYlink written 22 months ago by Leszek2.9k

Thanks, I thought some visuals would be nice.

ADD REPLYlink written 22 months ago by Gjain3.9k

The image is not accessible anymore. Any chance of uploading another version to another hosting service, perhaps? This is quite a popular post.

ADD REPLYlink written 1 day ago by Istvan Albert ♦♦ 39k

Thanks for bringing this up. I have updated the image. This should be permanent.

ADD REPLYlink written 1 day ago by Gjain3.9k

Thanks, this post is one of the most accessed ones on Biostar, most likely thanks to that image.

ADD REPLYlink written 1 day ago by Istvan Albert ♦♦ 39k

You're welcome... I am happy to contribute. 

ADD REPLYlink written 1 day ago by Gjain3.9k
3
22 months ago by
Leszek2.9k
Barcelona, Spain
Leszek2.9k wrote:

CDS - coding DNA sequence -> only the sequence that is translated into protein
ORF - open reading frame -> entire gene sequence: 5'-UTR + transcript (all exons + introns) + 3'-UTR

ADD COMMENTlink written 22 months ago by Leszek2.9k
3

CDS is right, but ORF is wrong - Gjain's definition below is correct: an ORF is just a nucleotide sequence from a start to a stop codon.

ADD REPLYlink written 22 months ago by Gareth Morgan210
1

I like the brevity of your answer.

ADD REPLYlink modified 22 months ago • written 22 months ago by Gjain3.9k

An ORF is the part of the mRNA sequence, starting at an initiation codon (usually AUG), that terminates either at a stop codon (UAA, UAG or UGA in the standard genetic code) or at the end of the sequence if no stop codon is found in the same phase; the latter case means that the mRNA sequence is incomplete. Usually, the AUG codon is embedded in a longer, less well-defined sequence (for example, the Kozak sequence in vertebrates).
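As a rough Python sketch of this definition: the function below starts at the first AUG and reads until an in-frame stop or the end of the sequence, flagging the second case as incomplete. Taking the first AUG and ignoring its sequence context (such as a Kozak sequence) is a simplifying assumption, and the name `mrna_orf` is made up for this example.

```python
RNA_STOPS = {"UAA", "UAG", "UGA"}


def mrna_orf(mrna):
    """Return (orf, complete) for the ORF starting at the first AUG.

    `complete` is False when no in-frame stop codon is found, i.e. the
    ORF runs to the end of a (presumably truncated) mRNA sequence.
    Returns (None, False) if there is no AUG at all.
    """
    mrna = mrna.upper().replace("T", "U")   # accept cDNA-style input too
    start = mrna.find("AUG")
    if start == -1:
        return None, False
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in RNA_STOPS:
            return mrna[start:i + 3], True
    # No stop in phase: keep whole codons only, up to the end of the sequence.
    end = start + (len(mrna) - start) // 3 * 3
    return mrna[start:end], False


print(mrna_orf("GGAUGGCCUAAGG"))  # stop codon found in phase
print(mrna_orf("GGAUGGCCAAAG"))   # runs off the end of the sequence
```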

ADD REPLYlink written 17 months ago by Mycroft3490
2
22 months ago by
Dave Lunt1.7k
Hull, UK
Dave Lunt1.7k wrote:

ORF (Open Reading Frame) is best seen as a hypothesis of a protein coding region. It is the stretch of DNA between a start codon and the next stop codon. It is not a hypothesis of the whole protein coding region in eukaryotes (due to introns). CDS should be the whole coding region.

Both those start/stop 'codons' could be just randomly found in an intergenic region that does not actually code for any protein, so not every ORF means a protein. An ORF will be found between the actual start codon of a protein-coding gene and the next stop codon. It is quite possible that this stop codon will be found in an intron, in which case the ORF includes an exon and part of an intron. Since introns are mostly just random sequence, a stop codon could just occur by chance. If the intron by chance does not contain a stop 'codon' (i.e. 3 nucleotides TAA/TAG/TGA in the same reading frame as the exon), then the ORF will continue until it meets a stop codon: either randomly in the next intron, or else a genuine stop at the end of the gene.

If the intron without a stop is not a multiple of 3 nucleotides, then it will introduce a frameshift, and the next stop could easily occur within the next exon. If it is a multiple of 3 it will introduce false amino acids into the ORF as it continues through the intron and into the exon. These sorts of errors are not uncommon in gene annotation, since intron detection is complex, and if it 'reads through' the intron might not be annotated until cDNA sequences are compared to the genome sequence.
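The three intron scenarios described above can be reproduced with toy sequences. This is a minimal Python sketch: the exons and introns are invented for illustration, and `orf_from` is a hypothetical helper that reads unspliced genomic DNA from the true start codon to the first in-frame stop.

```python
STOPS = {"TAA", "TAG", "TGA"}


def orf_from(seq, start=0):
    """Read codons from `start` up to and including the first in-frame stop.

    If no stop is found, return the whole codons up to the end of `seq`.
    """
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOPS:
            return seq[start:i + 3]
    end = start + (len(seq) - start) // 3 * 3
    return seq[start:end]


exon1 = "ATGGCC"        # ATG GCC
exon2 = "CTGAAGTAA"     # CTG AAG TAA (genuine stop at the end)
true_cds = exon1 + exon2

introns = {
    "in-frame stop in intron":     "GTATGACTTCAG",  # GTA TGA ...
    "no stop, length 12 (3n)":     "GTAAGTTTTCAG",  # read through in frame
    "no stop, length 11 (not 3n)": "GTAAGTTTCAG",   # shifts the frame
}

print("true CDS:", true_cds)
for label, intron in introns.items():
    gene = exon1 + intron + exon2
    print(label, "->", orf_from(gene))
```

In the first case the ORF ends inside the intron; in the second it reaches the genuine stop but carries four intron-derived codons; in the third the frameshift makes exon 2 be read out of phase, so the ORF stops early at a TGA that is not a codon in the real CDS.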

If you want to see a demonstration of these ideas, try getting a sequence from GenBank for a gene that contains a leader sequence (5'-UTR), exons, introns and a 3'-UTR. The CDS will be annotated as such and will consist only of exonic regions. Take this gene sequence and run it through NCBI ORFfinder, which will outline all the potential ORFs. Some of these, but not all, will be the actual coding parts.

ADD COMMENTlink written 22 months ago by Dave Lunt1.7k
0
22 months ago by
Christian710
Vienna
Christian710 wrote:

I would define an open reading frame (ORF) as any stretch of nucleotide sequence from a start codon to a stop codon (coding or not coding for protein), whereas a coding sequence (CDS) is a nucleotide sequence that is believed to code for protein. A CDS can correspond to an individual exon of a protein-coding gene or represent the complete (spliced) sequence of a protein-coding transcript.

ADD COMMENTlink written 22 months ago by Christian710