What is the genic size (exon+intron) of a genome?
7.8 years ago

Hi,

I wonder if there is a quick way to get the total genic length (exon+intron) of a genome. My guess is that it would be possible to parse a GTF file, get the start and end position of each gene, merge overlapping genes, and then compute the total size. But I wonder if there is another, easier way to do this (or if the number is already available somewhere for well-known species, e.g. human, mouse, ...).

Thanks

gene genome • 1.9k views
7.8 years ago

I can't think of an easier way than parsing a GTF (or similar file), as you say. Surely for human or mouse there is already some table out there which Google should be able to find.

By the way, the BioNumbers database (http://bionumbers.hms.harvard.edu/default.aspx) is useful for this sort of question.

7.8 years ago
igor 13k

Keep only the gene coordinates from the GTF file, converted to BED format. Use bedtools merge (or similar) to combine overlapping regions. Sum up the sizes of the merged regions.
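If bedtools is not at hand, the same three steps (extract coordinates, merge overlaps, sum sizes) can be sketched in plain Python. This is only an illustration under a few assumptions: the function name `genic_length` is made up for this example, the GTF is assumed to contain rows whose third column is `gene` (some GTFs only carry `transcript`/`exon` rows, in which case pass a different `feature`), and strand is ignored, so overlapping genes on opposite strands are counted once.

```python
import io


def genic_length(gtf_lines, feature="gene"):
    """Sum the merged length of all `feature` rows in an iterable of GTF lines.

    Hypothetical helper for illustration; not part of bedtools or any library.
    """
    intervals = {}  # chromosome -> list of (start, end), 0-based half-open like BED
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5 or fields[2] != feature:
            continue
        # GTF coordinates are 1-based inclusive; convert to 0-based half-open.
        start, end = int(fields[3]) - 1, int(fields[4])
        intervals.setdefault(fields[0], []).append((start, end))

    total = 0
    for regions in intervals.values():
        regions.sort()
        cur_start, cur_end = regions[0]
        for start, end in regions[1:]:
            if start <= cur_end:
                # Overlapping or directly adjacent: extend the current region.
                cur_end = max(cur_end, end)
            else:
                total += cur_end - cur_start
                cur_start, cur_end = start, end
        total += cur_end - cur_start
    return total


# Tiny made-up annotation: two overlapping genes on chr1, one exon row
# (ignored), and one gene on chr2.
example = io.StringIO(
    'chr1\tsrc\tgene\t101\t200\t.\t+\t.\tgene_id "A";\n'
    'chr1\tsrc\texon\t101\t150\t.\t+\t.\tgene_id "A";\n'
    'chr1\tsrc\tgene\t151\t300\t.\t-\t.\tgene_id "B";\n'
    'chr2\tsrc\tgene\t1\t50\t.\t+\t.\tgene_id "C";\n'
)
print(genic_length(example))  # 200 bp merged on chr1 + 50 bp on chr2 = 250
```

For a real annotation, something like `with open("annotation.gtf") as fh: print(genic_length(fh))` should work, where `annotation.gtf` is a placeholder path. Dividing the result by the assembly length gives the genic fraction.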

7.8 years ago
Sinji ★ 3.2k

Doesn't bio-epic have this ability? Or maybe I'm misunderstanding what you're looking for. Here's the link, check the script named effective_genome_size.py.


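The effective genome size is the portion of the genome that reads can actually be mapped to, not the genic length (i.e., the length taken up by genes). The genic size is only a fraction of the effective size: exons alone are probably a couple percent, though the share is considerably larger once introns are included.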


Ahh! Great, thanks for letting me know!
