Why does a GTF file contain CDS regions of length 0?
6.8 years ago
wangshx ▴ 10

I extracted the CDS regions from a GTF file (Ensembl v75). When I checked the lengths of the CDS regions, I was puzzled by the following result:

library(data.table)

annotation <- fread('Homo_sapiens.GRCh37.75.gtf')

> summary(annotation[annotation$V3=="CDS",]$V5 - annotation[annotation$V3=="CDS",]$V4)    

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    0.0    77.0   115.0   149.2   162.0 21692.0

More than 100 CDS regions have a length of 0. Why? Please help if you know. Thanks a lot.

R genome annotation • 1.4k views
6.8 years ago

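GTF coordinates are 1-based and inclusive at both ends, so the length of a feature is end - start + 1, not end - start. The records where your subtraction gives 0 have start equal to end, so those are 1-base-long microexons.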
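To make the off-by-one concrete, here is a minimal sketch in base R. The toy rows and their coordinates are made up for illustration and are not taken from the Ensembl file; only the column names V3, V4 and V5 follow the question's table.

```r
# Toy GTF-like rows; the coordinates are hypothetical, chosen for illustration.
toy <- data.frame(
  V3 = c("CDS", "CDS", "exon"),  # feature type
  V4 = c(100L, 200L, 100L),      # start (1-based, inclusive)
  V5 = c(100L, 277L, 300L)       # end (1-based, inclusive)
)

cds <- toy[toy$V3 == "CDS", ]

# end - start undercounts every feature by one base
naive_len <- cds$V5 - cds$V4

# inclusive length: both the start and the end base belong to the feature
inclusive_len <- cds$V5 - cds$V4 + 1L

print(data.frame(cds, naive_len, inclusive_len))
```

Applying the same `+ 1` to the V4 and V5 columns of the `annotation` table from the question should move the reported minimum from 0 to 1.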


It is news to me that 1-base-long microexons exist.
