Question: Is the union of cDNA sequences the exome?
Bioaln (France) wrote, 2.0 years ago:

Hello all, I've recently started doing DNA-related analysis and was wondering: if I take the cDNA file from

https://www.ensembl.org/info/data/ftp/index.html (cDNA)

does it represent, for example, the human exome? If not, what are the differences, and how can one obtain the missing information?

Thank you!

dna-seq cdna exome • 576 views
lieven.sterck (VIB, Ghent, Belgium) wrote, 2.0 years ago:

The cDNA file will contain all mRNAs of the human genome. Each entry is CDS + UTRs (where available), so the file represents the part of the genome that is transcribed into protein-coding mRNAs.


To be more precise, cDNA covers all transcribed RNAs: mRNAs, as you say, but also ncRNAs, pseudogenes, rRNAs, etc.

Reply written 2.0 years ago by Nicolas Rosewick

I was thinking that as well, but since they also offer a separate ncRNA FASTA file, I would assume the cDNA file focuses on protein-coding transcripts. It is indeed possible, though, that it contains everything that is transcribed.

Reply written 2.0 years ago by lieven.sterck

Their site describes the file as "cDNA = cDNA sequences for Ensembl or ab initio predicted genes", which does not give much additional information.

Reply modified 2.0 years ago • written 2.0 years ago by lieven.sterck

So, technically this is the exome, i.e., the set of all (known) exons?

Reply written 2.0 years ago by Bioaln

I would say yes indeed.

It depends, however, on how you define 'exons': the file may contain mainly or only the exons that are part of an mRNA, and thus not include the non-translated ones (not sure if you're interested in those as well).

Reply modified 2.0 years ago • written 2.0 years ago by lieven.sterck

Currently I am not, so this seems to hold! Thanks.

Reply written 2.0 years ago by Bioaln
Benn (Netherlands) wrote, 2.0 years ago:

It is not exactly clear to me what you want to do with it and why, but if you look at the same link (https://www.ensembl.org/info/data/ftp/index.html), in the column "gene sets" you will find GTF and GFF3 annotation files with all exons (as coordinates).
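If the goal is an "exome" in the sense of the union of all annotated exons, the exon coordinates from such a file would need to be merged, because exons shared by several transcripts of a gene appear more than once. The following is only a minimal sketch of that merging step; the intervals are made-up toy values for a single chromosome and strand, and `merge_intervals` is a hypothetical helper, not part of any Ensembl tool.

```python
def merge_intervals(intervals):
    """Merge overlapping or directly adjacent 1-based, inclusive intervals."""
    merged = []
    for start, end in sorted(intervals):
        # Extend the previous interval if this one overlaps or touches it.
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


# Toy exon coordinates (hypothetical), as if collected from several transcripts.
exons = [(100, 200), (150, 260), (400, 450), (451, 500), (900, 950)]

merged = merge_intervals(exons)
raw_bases = sum(end - start + 1 for start, end in exons)
union_bases = sum(end - start + 1 for start, end in merged)

print(merged)       # [(100, 260), (400, 500), (900, 950)]
print(raw_bases)    # 364
print(union_bases)  # 313
```

The difference between the two totals is the sequence that would be counted twice if the cDNA entries were simply concatenated.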

Just to show the difference between exons, mRNA, and CDS, here is the information from such an annotation file for the mouse genome. Let's have a look at the transcript ENSMUST00000130201 (gene ENSMUSG00000033845):

grep "ENSMUST00000130201" ensGene.gff3
chr1    ensGene mRNA    4773206 4785710 .   -   .   Name=ENSMUST00000130201;Parent=ENSMUSG00000033845;ID=ENSMUST00000130201;Alias=ENSMUSG00000033845
chr1    ensGene exon    4773206 4774516 .   -   .   Name=ENSMUST00000130201.exon4;Parent=ENSMUST00000130201;ID=ENSMUST00000130201.exon4
chr1    ensGene exon    4777525 4777648 .   -   .   Name=ENSMUST00000130201.exon3;Parent=ENSMUST00000130201;ID=ENSMUST00000130201.exon3
chr1    ensGene exon    4782568 4782733 .   -   .   Name=ENSMUST00000130201.exon2;Parent=ENSMUST00000130201;ID=ENSMUST00000130201.exon2
chr1    ensGene exon    4783951 4784105 .   -   .   Name=ENSMUST00000130201.exon1;Parent=ENSMUST00000130201;ID=ENSMUST00000130201.exon1
chr1    ensGene exon    4785573 4785710 .   -   .   Name=ENSMUST00000130201.exon0;Parent=ENSMUST00000130201;ID=ENSMUST00000130201.exon0
chr1    ensGene three_prime_UTR 4773206 4774451 .   -   .   Name=ENSMUST00000130201.utr4;Parent=ENSMUST00000130201;ID=ENSMUST00000130201.utr4
chr1    ensGene five_prime_UTR  4785678 4785710 .   -   .   Name=ENSMUST00000130201.utr0;Parent=ENSMUST00000130201;ID=ENSMUST00000130201.utr0
chr1    ensGene CDS 4785573 4785677 .   -   0   Name=ENSMUST00000130201.cds0;Parent=ENSMUST00000130201;ID=ENSMUST00000130201.cds0
chr1    ensGene CDS 4783951 4784105 .   -   0   Name=ENSMUST00000130201.cds1;Parent=ENSMUST00000130201;ID=ENSMUST00000130201.cds1
chr1    ensGene CDS 4782568 4782733 .   -   1   Name=ENSMUST00000130201.cds2;Parent=ENSMUST00000130201;ID=ENSMUST00000130201.cds2
chr1    ensGene CDS 4777525 4777648 .   -   0   Name=ENSMUST00000130201.cds3;Parent=ENSMUST00000130201;ID=ENSMUST00000130201.cds3
chr1    ensGene CDS 4774452 4774516 .   -   2   Name=ENSMUST00000130201.cds4;Parent=ENSMUST00000130201;ID=ENSMUST00000130201.cds4

You'll see that the exons together cover the complete spliced mRNA, UTRs included, whereas the CDS features cover only the coding portion.
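To make the exon/CDS difference concrete, the feature lengths can be summed per type. This is a rough sketch in Python: the coordinates are copied from the records above (the attributes column is left out for brevity), and `feature_lengths` is a hypothetical helper name.

```python
from collections import defaultdict

# First eight GFF3 columns of the ENSMUST00000130201 records shown above.
GFF3_LINES = """\
chr1 ensGene mRNA 4773206 4785710 . - .
chr1 ensGene exon 4773206 4774516 . - .
chr1 ensGene exon 4777525 4777648 . - .
chr1 ensGene exon 4782568 4782733 . - .
chr1 ensGene exon 4783951 4784105 . - .
chr1 ensGene exon 4785573 4785710 . - .
chr1 ensGene three_prime_UTR 4773206 4774451 . - .
chr1 ensGene five_prime_UTR 4785678 4785710 . - .
chr1 ensGene CDS 4785573 4785677 . - 0
chr1 ensGene CDS 4783951 4784105 . - 0
chr1 ensGene CDS 4782568 4782733 . - 1
chr1 ensGene CDS 4777525 4777648 . - 0
chr1 ensGene CDS 4774452 4774516 . - 2
""".splitlines()


def feature_lengths(lines):
    """Sum the lengths (end - start + 1) of the records per feature type."""
    totals = defaultdict(int)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF3 comments/directives
        fields = line.split()
        feature, start, end = fields[2], int(fields[3]), int(fields[4])
        totals[feature] += end - start + 1
    return dict(totals)


totals = feature_lengths(GFF3_LINES)
print(totals["mRNA"])  # 12505 (genomic span, introns included)
print(totals["exon"])  # 1894  (spliced transcript length)
print(totals["CDS"])   # 615   (coding part only)
print(totals["three_prime_UTR"] + totals["five_prime_UTR"])  # 1279
```

For this transcript, CDS plus UTRs (615 + 1279) equals the summed exon length (1894), which is the length one would expect for the corresponding cDNA entry, while the mRNA feature span is much longer because it includes introns.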


Thank you for this information. Indeed, this helps me identify the exons.

Reply written 2.0 years ago by Bioaln