2.7 years ago
Sora Yoon
I have just found an example where biomart's transcript_length is not equal to transcript_end - transcript_start.
        ensembl_gene_id     mgi_symbol  chromosome_name  strand  start_position  end_position  gene_biotype    transcript_start  transcript_end  strand.1  transcript_length
128537  ENSMUSG00000037860  Aim2        1                1       173178445       173293606     protein_coding  173178445         173293606       1         2839
128538  ENSMUSG00000037860  Aim2        1                1       173178445       173293606     protein_coding  173246762         173255347       1         383
128539  ENSMUSG00000037860  Aim2        1                1       173178445       173293606     protein_coding  173248164         173287285       1         744
Does anyone know why this discrepancy occurs?
Thanks
Thanks. So that means a large portion of the Aim2 gene is intronic.
Yes, this is almost always the case. Exons make up 2% of the human genome, but transcripts, including introns, cover 40% of the human genome (ref).
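To put numbers on this for the three Aim2 transcripts in the question, here is a rough sketch in Python. It assumes that transcript_length is the spliced length (exons only, UTRs included) and that the start/end coordinates are 1-based and inclusive, so the genomic span is end - start + 1. The function names genomic_span and non_exonic_fraction are made up for this illustration and are not part of biomart.

```python
# Rows copied from the table in the question:
# (transcript_start, transcript_end, transcript_length)
transcripts = [
    (173178445, 173293606, 2839),
    (173246762, 173255347, 383),
    (173248164, 173287285, 744),
]


def genomic_span(start, end):
    """Bases covered on the genome, assuming 1-based inclusive coordinates."""
    return end - start + 1


def non_exonic_fraction(start, end, transcript_length):
    """Share of the genomic span not accounted for by transcript_length.

    Assumes transcript_length is the summed exon length of the transcript.
    """
    span = genomic_span(start, end)
    return (span - transcript_length) / span


for start, end, length in transcripts:
    span = genomic_span(start, end)
    frac = non_exonic_fraction(start, end, length)
    print(f"span={span} transcript_length={length} non-exonic={frac:.1%}")
```

Under those assumptions, each transcript's genomic span is far larger than its transcript_length, which is what you would expect if most of the span is intron.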