Question: Why is the sum of all exons so much longer than the CDS?
Jeremy Leipzig wrote:

Is this because of non-coding RNA? I thought the two totals would have been at least comparable.

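library(GenomicFeatures)
# Build a TxDb from the Ensembl human annotation (needs network access)
txdb <- makeTxDbFromEnsembl("Homo Sapiens", server = "useastdb.ensembl.org")

# Total CDS length after merging overlapping ranges
gr <- cds(txdb)
sum(width(reduce(gr)))
[1] 41901692

# Total exon length after merging overlapping ranges
gr <- exons(txdb)
sum(width(reduce(gr)))
[1] 153094341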
written 6 weeks ago by Jeremy Leipzig

Yes, there are a bunch of non-coding RNA families (rRNA, tRNA, miRNA, snRNA, lncRNA, ...) whose exons can be in your exon list but not in your CDS.

written 6 weeks ago by JC
Eric Lim wrote:

The 3' and 5' UTRs count as exon sequence but not as CDS, and so do non-coding species.

written 6 weeks ago by Eric Lim
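
To make the UTR and non-coding contributions concrete, here is a minimal sketch on toy data. It assumes the GenomicRanges package is installed; the coordinates and object names (coding_exons, coding_cds, noncoding_exons) are made up for illustration and do not come from the Ensembl annotation.

```r
library(GenomicRanges)

# Hypothetical coding transcript: three exons, with the CDS starting
# inside exon 1 and ending inside exon 3 (the rest is UTR).
coding_exons <- GRanges("chr1",
                        IRanges(start = c(100, 300, 500),
                                end   = c(199, 399, 699)),
                        strand = "+")
coding_cds <- GRanges("chr1",
                      IRanges(start = c(150, 300, 500),
                              end   = c(199, 399, 549)),
                      strand = "+")

# Hypothetical non-coding transcript: one exon and no CDS at all.
noncoding_exons <- GRanges("chr1", IRanges(start = 1000, end = 1299),
                           strand = "+")

all_exons <- c(coding_exons, noncoding_exons)

exon_bp <- sum(width(reduce(all_exons)))   # 700
cds_bp  <- sum(width(reduce(coding_cds)))  # 200

# Exonic sequence that is not CDS: UTRs plus non-coding exons.
non_cds    <- setdiff(all_exons, coding_cds)
non_cds_bp <- sum(width(non_cds))          # 500
```

In this toy case the 500 bp of non-CDS exon sequence is 200 bp of UTR plus 300 bp of non-coding exon, so the exon total is 3.5x the CDS total even though only one of the two transcripts is non-coding.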

how about that!

sum(width(unlist(fiveUTRsByTranscript(txdb))))
[1] 21299710
sum(width(unlist(threeUTRsByTranscript(txdb))))
[1] 86927764
written 6 weeks ago by Jeremy Leipzig
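
One caveat on the UTR totals above: they are summed per transcript without reduce, so a UTR shared by several transcripts is counted once for each of them. That makes them not directly comparable to the reduced exon and CDS totals. A minimal sketch of the difference on toy data, assuming GenomicRanges is installed (the two transcripts and their coordinates are hypothetical):

```r
library(GenomicRanges)

# Two hypothetical transcripts of one gene with overlapping 3' UTRs.
utr3_by_tx <- GRangesList(
  tx1 = GRanges("chr1", IRanges(start = 1000, end = 1999), strand = "+"),
  tx2 = GRanges("chr1", IRanges(start = 1000, end = 2499), strand = "+")
)

# Per-transcript sum: the shared 1000 bp are counted twice.
per_tx_bp <- sum(width(unlist(utr3_by_tx)))          # 2500

# Reduced sum: each genomic position is counted once.
unique_bp <- sum(width(reduce(unlist(utr3_by_tx))))  # 1500
```

On the real txdb the analogous call would be sum(width(reduce(unlist(threeUTRsByTranscript(txdb))))), and likewise for fiveUTRsByTranscript. I have not run it here, so no figure is given.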

Yeah, the mean 3' UTR is around 40% of the length of a transcript, and a non-trivial number of UTRs make up more than 75% of their transcript.

written 6 weeks ago by i.sudbery

4-5x is right in the neighborhood of the ratio between the average 5' and 3' UTR lengths they report. :)

Table 1 in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC139023/

written 6 weeks ago by Eric Lim
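
For reference, the ratios can be worked out from the numbers already posted in this thread. This sketch assumes the "4-5x" refers to the 3' to 5' UTR length ratio. The variable names are just labels for the four totals quoted above.

```r
# Totals quoted in the thread (base pairs)
cds_bp  <- 41901692    # sum(width(reduce(cds(txdb))))
exon_bp <- 153094341   # sum(width(reduce(exons(txdb))))
utr5_bp <- 21299710    # per-transcript 5' UTR sum
utr3_bp <- 86927764    # per-transcript 3' UTR sum

exon_to_cds  <- exon_bp / cds_bp    # about 3.65
utr3_to_utr5 <- utr3_bp / utr5_bp   # about 4.08

non_cds_bp <- exon_bp - cds_bp      # 111192649
utr_bp     <- utr5_bp + utr3_bp     # 108227474
```

The two last figures are close, but keep in mind that the UTR totals were computed per transcript without reduce while the exon and CDS totals were reduced, so the match is only indicative.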
swbarnes2 wrote:

Also, Ensembl counts an exon as unique even if it overlaps other exons, so a lot of sequence gets counted over and over again because it belongs to multiple slightly different exons.

written 6 weeks ago by swbarnes2

That’s why I used reduce

written 6 weeks ago by Jeremy Leipzig
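
A small illustration of what reduce does to overlapping exons, again on made-up coordinates and assuming GenomicRanges is installed. One thing worth knowing is that reduce on a GRanges is strand-aware by default (ignore.strand = FALSE), so overlapping exons on opposite strands are still counted separately unless you ask otherwise.

```r
library(GenomicRanges)

# Three hypothetical 150 bp exons: two overlapping on the + strand,
# and one on the - strand covering the same region as the second.
exons <- GRanges("chr1",
                 IRanges(start = c(100, 150, 150),
                         end   = c(249, 299, 299)),
                 strand = c("+", "+", "-"))

# No merging: every exon counted in full.
naive_bp <- sum(width(exons))                                     # 450

# Default reduce: merges overlaps within each strand only.
stranded_bp <- sum(width(reduce(exons)))                          # 350

# Strand ignored: each genomic position counted once.
unstranded_bp <- sum(width(reduce(exons, ignore.strand = TRUE)))  # 200
```

Whether the stranded or unstranded total is the right one depends on whether you want to count transcribed sequence per strand or covered genomic positions.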