Question: Where did the 'antisense' biotype go in the transition from the mm9 to the mm10 genome build in Ensembl?
3 months ago by
adrija.kalvisa0 wrote:

I have been comparing the Ensembl GTF annotations for the mm9 and mm10 genome builds and noticed that several biotypes occur in only one of the two files.

transcript_biotype for mm10 (but not for mm9): IG_C_pseudogene, IG_D_pseudogene, IG_LV_gene, IG_pseudogene, IG_V_pseudogene, lncRNA, non_stop_decay, nonsense_mediated_decay, processed_pseudogene, retained_intron, ribozyme, scaRNA, scRNA, sRNA, TEC, TR_C_gene, TR_D_gene, TR_J_gene, TR_J_pseudogene, TR_V_gene, TR_V_pseudogene, transcribed_processed_pseudogene, transcribed_unitary_pseudogene, transcribed_unprocessed_pseudogene, translated_processed_pseudogene, translated_unprocessed_pseudogene, unitary_pseudogene, unprocessed_pseudogene

Biotype for mm9 (but not for mm10): 3prime_overlapping_ncrna, antisense, lincRNA, ncrna_host, non_coding, processed_transcript, sense_intronic, sense_overlapping
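For anyone who wants to reproduce this kind of comparison, here is a minimal sketch; it is not necessarily how the lists above were produced. It assumes that newer Ensembl GTFs carry a transcript_biotype attribute in column 9, while older releases (such as 67) store the biotype in column 2, and that rows with the feature type "gene" should be skipped because they have no transcript biotype. The helper names list_biotypes and compare_biotypes are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: list the distinct transcript biotypes found in a plain-text GTF.
# Assumption: if column 9 has a transcript_biotype attribute, use it;
# otherwise fall back to column 2, where older Ensembl releases keep the biotype.
list_biotypes() {
    awk -F'\t' '
        /^#/         { next }   # skip header comment lines
        $3 == "gene" { next }   # gene rows carry no transcript biotype
        {
            if (match($9, /transcript_biotype "[^"]+"/)) {
                s = substr($9, RSTART, RLENGTH)
                gsub(/transcript_biotype "|"/, "", s)
                print s
            } else {
                print $2
            }
        }' "$1" | LC_ALL=C sort -u
}

# Print the biotypes found only in the first file, then only in the second.
compare_biotypes() {
    local a b
    a=$(mktemp)
    b=$(mktemp)
    list_biotypes "$1" > "$a"
    list_biotypes "$2" > "$b"
    echo "only in $1:"
    LC_ALL=C comm -23 "$a" "$b"
    echo "only in $2:"
    LC_ALL=C comm -13 "$a" "$b"
    rm -f "$a" "$b"
}
```

With the two files generated below, this would be called as `compare_biotypes mm9_ensembl.gtf mm10_ensembl.gtf`. Both sort and comm run under LC_ALL=C so that they agree on the ordering of mixed-case biotype names.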

So my question is: what happened to the biotypes that went missing? For example, what happened to the "antisense" biotype from mm9? I looked at a few selected transcripts with the "antisense" biotype in mm9 (such as Nespas-003, Gm16119-002, 1300015D01Rik-003, C130080G10Rik-003), and I cannot find these transcript names in mm10 anymore.
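One way to follow up on a transcript name that has disappeared is to list everything annotated under the same gene symbol in each file and compare the results by eye. The sketch below assumes that both GTFs carry gene_name, transcript_id and transcript_name attributes in column 9 and that the gene symbol is the same in both annotations, which may not hold for every gene. The helper name transcripts_for_gene is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: print transcript_id, transcript_name and gene_name (tab-separated)
# for every transcript whose gene_name matches exactly.
# usage: transcripts_for_gene GENE_NAME FILE.gtf
transcripts_for_gene() {
    awk -F'\t' -v gene="$1" '
        # Return the value of one column-9 attribute, or "NA" if it is absent.
        function attr(key,   s) {
            if (match($9, key " \"[^\"]+\"")) {
                s = substr($9, RSTART, RLENGTH)
                sub(key " \"", "", s)
                sub(/"$/, "", s)
                return s
            }
            return "NA"
        }
        /^#/ { next }   # skip header comment lines
        attr("gene_name") == gene && attr("transcript_id") != "NA" {
            print attr("transcript_id") "\t" attr("transcript_name") "\t" attr("gene_name")
        }' "$2" | LC_ALL=C sort -u
}
```

For example, `transcripts_for_gene Nespas mm9_ensembl.gtf` followed by `transcripts_for_gene Nespas mm10_ensembl.gtf` would show which transcript IDs and names each file lists for that gene, if the symbol is present in both.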

mm9 gtf file was generated like this:


zless Mus_musculus.NCBIM37.67.gtf.gz | grep -v "NT_" | perl -ane 'print "chr$_";' > mm9_ensembl.gtf

mm10 gtf file was generated like this:


zless Mus_musculus.GRCm38.97.chr.gtf.gz | grep -v "^#" | perl -ane 'print "chr$_";' > mm10_ensembl.gtf
ensembl gtf mm10 mm9