Question: What Do CDS and Exon Mean in a GTF File?
6.7 years ago by
dfernan650
United States
dfernan650 wrote:

Hi,

I have an hg19 GTF file that I sorted by chromosome, start, and end position, and grouped by gene_id.

Here is an example of a few lines of the file:

lines 45-47

chr1  refGene transcript  367659  368597  . + . gene_id "OR4F29"; transcript_id "NM_001005221"; gene_name "OR4F29";
chr1  refGene exon  367659  368597  . + . gene_id "OR4F29"; transcript_id "NM_001005221"; exon_number "1"; exon_id "NM_001005221.1"; gene_name "OR4F29";
chr1  refGene CDS 367659  368594  . + 0 gene_id "OR4F29"; transcript_id "NM_001005221"; exon_number "1"; exon_id "NM_001005221.1"; gene_name "OR4F29";

lines 143-146

chr1  refGene transcript  861121  879961  . + . gene_id "SAMD11"; transcript_id "NM_152486"; gene_name "SAMD11";
chr1  refGene exon  861121  861180  . + . gene_id "SAMD11"; transcript_id "NM_152486"; exon_number "1"; exon_id "NM_152486.1"; gene_name "SAMD11";
chr1  refGene exon  861302  861393  . + . gene_id "SAMD11"; transcript_id "NM_152486"; exon_number "2"; exon_id "NM_152486.2"; gene_name "SAMD11";
chr1  refGene CDS 861322  861393  . + 0 gene_id "SAMD11"; transcript_id "NM_152486"; exon_number "2"; exon_id "NM_152486.2"; gene_name "SAMD11";

I was wondering what CDS means. As I understand it, the transcript is one isoform of the gene and the exons are the exon positions of that isoform, but what does the CDS mean?

Sorry about such a basic question, but I got confused.

gtf rna-seq • 17k views
6.7 years ago by
Istvan Albert ♦♦ 81k
University Park, USA
Istvan Albert ♦♦ 81k wrote:

Don't feel bad; this is not a basic question at all. The terminology is not nearly as obvious as it should be, and the exact definitions carry many subtle details. The Sequence Ontology specifies the meaning of each term:

CDS: "A contiguous sequence which begins with, and includes, a start codon and ends with, and includes, a stop codon."

http://www.sequenceontology.org/browser/current_cvs/term/SO:0000316

Exon: "A region of the transcript sequence within a gene which is not removed from the primary RNA transcript by RNA splicing."

http://www.sequenceontology.org/browser/current_cvs/term/SO:0000147


Ironically, I have come to believe that this definition is incorrect. The CDS should not include the stop codon, and most annotations that label a feature as CDS do not include it. IMHO that is the correct behavior: the stop codon is not translated into an amino acid, so it is not actually coding.

written 5.3 years ago by Istvan Albert
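
The point about the stop codon can be checked against the OR4F29 lines in the question. The following is a minimal Python sketch, not something posted in the original thread. The helper name `parse_features` is hypothetical, the attribute column is shortened, and splitting on arbitrary whitespace is an assumption made because the pasted lines are not tab-delimited (real GTF files are tab-separated).

```python
# Exon and CDS lines for OR4F29, copied from the question (attributes shortened).
GTF_TEXT = """\
chr1 refGene exon 367659 368597 . + . gene_id "OR4F29"; transcript_id "NM_001005221";
chr1 refGene CDS 367659 368594 . + 0 gene_id "OR4F29"; transcript_id "NM_001005221";
"""


def parse_features(text):
    """Return {feature_type: [(start, end), ...]} from GTF-like text.

    Assumption: fields are split on any whitespace, since the lines
    pasted in the thread are space-separated rather than tab-separated.
    """
    features = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split(None, 8)  # 8 fixed columns + attribute string
        feature, start, end = fields[2], int(fields[3]), int(fields[4])
        features.setdefault(feature, []).append((start, end))
    return features


features = parse_features(GTF_TEXT)
exon_start, exon_end = features["exon"][0]
cds_start, cds_end = features["CDS"][0]

# GTF coordinates are 1-based and inclusive, hence the +1.
cds_len = cds_end - cds_start + 1
trailing = exon_end - cds_end  # exonic bases after the end of the CDS

print(cds_len, cds_len % 3, trailing)
```

For these two lines the CDS is 936 bases long (a whole number of codons, 312) and ends 3 bases before the exon does. That is consistent with the stop codon being left out of the CDS feature, assuming this single-exon transcript has no annotated 3' UTR.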
6.7 years ago by
Sangwoo Kim380
UC San Diego
Sangwoo Kim380 wrote:

IMHO, an exon can contain both UTR and CDS, and the CDS is the part that is actually translated into protein. In your SAMD11 example, the exonic region upstream of 861322 (all of exon 1 and the first 20 bases of exon 2) is thought to be 5' UTR, which is transcribed into mRNA but is not translated.
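
To make the relationship between exon, CDS, and 5' UTR concrete, here is a rough Python sketch using only the SAMD11 coordinates shown in the question. The function name `five_prime_utr` is hypothetical, and the logic assumes a plus-strand transcript; a minus-strand transcript would need the mirror-image calculation from the CDS end.

```python
# Coordinates from the SAMD11 lines in the question (1-based, inclusive).
# Only the first two exons and the first CDS segment were shown there.
exons = [(861121, 861180), (861302, 861393)]
first_cds_start = 861322


def five_prime_utr(exons, cds_start):
    """Return exon segments lying upstream of cds_start.

    Assumption: the transcript is on the + strand, so "upstream"
    means smaller coordinates. Introns are never part of the UTR.
    """
    utr = []
    for start, end in sorted(exons):
        if end < cds_start:
            # Whole exon lies before the CDS: entirely UTR.
            utr.append((start, end))
        elif start < cds_start:
            # Exon straddles the CDS start: only its first part is UTR.
            utr.append((start, cds_start - 1))
    return utr


utr = five_prime_utr(exons, first_cds_start)
utr_len = sum(end - start + 1 for start, end in utr)

print(utr, utr_len)
```

With these coordinates the 5' UTR comes out as all of exon 1 (60 bases) plus the first 20 bases of exon 2, 80 bases in total. The intron between the two exons is not counted, because it is spliced out of the mRNA.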


This clarifies things. The 5' UTR, CDS, and 3' UTR together make up the exons.

written 3.3 years ago by epigene470
6.7 years ago by
fo3c430
.eu
fo3c430 wrote:

CoDing Sequence
