Question: How to separate protein-coding and non-coding in gtf file
2.2 years ago by
Vasu450
Vasu450 wrote:

Hi,

In a GTF file I see a "gene_type" attribute with many different values, listed below. Which of these values count as protein-coding, which as non-coding, and which as lncRNA?

3prime_overlapping_ncRNA
IG_C_gene
IG_C_pseudogene
IG_D_gene
IG_J_gene
IG_J_pseudogene
IG_V_gene
IG_V_pseudogene
IG_pseudogene
Mt_rRNA
Mt_tRNA
TEC
TR_C_gene
TR_D_gene
TR_J_gene
TR_J_pseudogene
TR_V_gene
TR_V_pseudogene
antisense_RNA
bidirectional_promoter_lncRNA
lincRNA
macro_lncRNA
miRNA
misc_RNA
non_coding
polymorphic_pseudogene
processed_pseudogene
processed_transcript
protein_coding
pseudogene
rRNA
ribozyme
sRNA
scRNA
scaRNA
sense_intronic
sense_overlapping
snRNA
snoRNA
transcribed_processed_pseudogene
transcribed_unitary_pseudogene
transcribed_unprocessed_pseudogene
translated_processed_pseudogene
unitary_pseudogene
unprocessed_pseudogene
vaultRNA

I see the gene_type protein_coding. Are those entries the only protein-coding genes, or should I also include other gene_type values? Which values fall under non-coding, and which under lncRNA?

2.2 years ago by
Belgium, Brussels
Nicolas Rosewick8.8k wrote:

If you are only interested in protein-coding genes, then yes, you can simply keep the entries whose gene_type is protein_coding.

Here is an explanation of the different biotypes found in Ensembl, as suggested by i.sudbery: https://www.gencodegenes.org/gencode_biotypes.html
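A minimal sketch of that filter in Python, assuming a GENCODE-style GTF in which the ninth tab-separated column holds attributes such as gene_type "protein_coding". The function names and the file names in the usage comment are made up for illustration. Some GTFs (Ensembl's, for example) call the attribute gene_biotype instead, which is why the key is a parameter.

```python
import re

# Matches quoted GTF attributes such as: gene_type "protein_coding";
# Unquoted numeric attributes (e.g. level 2;) are ignored on purpose.
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(attr_field):
    """Return the quoted key/value pairs of a GTF attribute column as a dict."""
    return dict(ATTR_RE.findall(attr_field))


def filter_gtf_lines(lines, wanted_types, key="gene_type"):
    """Yield comment lines plus records whose `key` attribute is in wanted_types."""
    for line in lines:
        if line.startswith("#"):
            yield line  # keep header/comment lines as they are
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # skip malformed records without an attribute column
        if parse_attributes(fields[8]).get(key) in wanted_types:
            yield line


# Hypothetical usage (file names are placeholders):
# with open("annotation.gtf") as fin, open("protein_coding.gtf", "w") as fout:
#     fout.writelines(filter_gtf_lines(fin, {"protein_coding"}))
```

Passing a different set of values, chosen after reading the biotype descriptions linked above, gives the other subsets in the same way.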


I might be wrong, but I think VEGA is now retired, and the up-to-date reference for the biotypes found in recent (GENCODE-based) Ensembl builds is https://www.gencodegenes.org/gencode_biotypes.html

written 2.2 years ago by i.sudbery

After checking the Ensembl website, I see you are right. I have edited my answer.

written 2.2 years ago by Nicolas Rosewick