Why are there more lincRNA genes than transcripts in Mus_musculus.GRCm38.100.gtf?
2.3 years ago

Hi All,

There are 52,446 annotated genes (ENSMUSG IDs) and 142,699 transcripts (ENSMUST IDs) in Mus_musculus.GRCm38.100.gtf. It makes sense that there are WAY more transcripts than genes. My question, however, is: why are there more genes than transcripts for lincRNA in the genome? Specifically there are 9847 genes and 8357 transcripts.

-Jen

Genes Transcripts Genome Annotation • 1.1k views

How can there be a gene without a transcript? Some reproducible code available?

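library(rtracklayer)

# Import the Ensembl release 100 mouse annotation as a GRanges object
gtf <- import("http://ftp.ensembl.org/pub/release-100/gtf/mus_musculus/Mus_musculus.GRCm38.100.gtf.gz")

# Keep every feature row whose gene is annotated as lincRNA
lincrna <- gtf[gtf$gene_biotype=="lincRNA"]

# Count distinct gene and transcript IDs.
# Note: gene-level rows carry no transcript_id (NA), and unique() keeps NA as a value,
# so n_tx may be one higher than the number of real transcript IDs.
knitr::kable(data.frame(n_genes=length(unique(lincrna$gene_id)),
                        n_tx=length(unique(lincrna$transcript_id))))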

| n_genes| n_tx|
|-------:|----:|
|    5629| 8358|

More tx than genes, as it biologically must be.
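The same count can be cross-checked without R. The following is only a minimal sketch, assuming an uncompressed Ensembl-style GTF in which `gene` and `transcript` feature rows carry `gene_id`, `transcript_id` and `gene_biotype` attributes; the function name `count_biotype` is made up for illustration. Counting transcript IDs only on `transcript` rows avoids picking up rows that have no transcript ID at all.

```shell
# Count distinct gene and transcript IDs for one gene_biotype in a GTF.
# Usage: count_biotype <gtf-file> [biotype]   (hypothetical helper, default biotype: lincRNA)
count_biotype() {
  local gtf="$1" biotype="${2:-lincRNA}"
  awk -F'\t' -v bt="$biotype" '
    /^#/ { next }                                        # skip header comment lines
    index($9, "gene_biotype \"" bt "\"") == 0 { next }   # keep only the requested gene_biotype
    # gene rows contribute gene IDs, transcript rows contribute transcript IDs
    $3 == "gene"       && match($9, /gene_id "[^"]+"/)       { genes[substr($9, RSTART, RLENGTH)] = 1 }
    $3 == "transcript" && match($9, /transcript_id "[^"]+"/) { txs[substr($9, RSTART, RLENGTH)] = 1 }
    END {
      ng = 0; for (g in genes) ng++
      nt = 0; for (t in txs) nt++
      printf "n_genes=%d n_tx=%d\n", ng, nt
    }
  ' "$gtf"
}
```

For the gzipped download, something like `zcat Mus_musculus.GRCm38.100.gtf.gz | count_biotype - lincRNA` should work, since awk reads standard input when the file operand is `-`.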

#STAR
STAR --genomeDir star \
--readFilesCommand zcat \
--readFilesIn Rep2_Data/2_1.fq.gz Rep2_Data/2_2.fq.gz \
--outSAMtype BAM SortedByCoordinate \
--limitBAMsortRAM 16000000000 \
--outSAMunmapped Within \
--twopassMode Basic \
--outFilterMultimapNmax 1 \
--quantMode TranscriptomeSAM \
--runThreadN 16 \
--outFileNamePrefix Rep2_star_output/STAR_TP2/
#RSEM
RSEM-1.3.1/rsem-calculate-expression --bam -p 24 \
--paired-end --forward-prob .5 \
Rep2_star_output/STAR_TP2/Aligned.toTranscriptome.out.bam \
rsem/GRCm38 Rep2_rsem_output/RSEM_TP2/rsem >& \
Rep2_rsem_output/RSEM_TP2/rsem.log

To identify which biotype is associated with each gene or transcript, I ran the IDs through ensembl.org/biomart. Not sure if that is what is causing this?

2.3 years ago
JC 13k

why are there more genes than transcripts for lincRNA in the genome? Specifically there are 9847 genes and 8357 transcripts.

A gene contains one or more transcripts. For each lincRNA there is a gene record, and many have only one associated transcript.
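To see how many lincRNA genes really have a single transcript, the transcripts can be tallied per gene. This is a rough sketch under the same assumptions as any Ensembl-style GTF (one `transcript` row per transcript, with `gene_id` and `gene_biotype` in column 9); `tx_per_gene` is a hypothetical helper name, not part of any tool.

```shell
# Print a two-column table: transcripts per gene, number of genes with that many.
# Usage: tx_per_gene <gtf-file> [biotype]   (hypothetical helper, default biotype: lincRNA)
tx_per_gene() {
  local gtf="$1" biotype="${2:-lincRNA}"
  awk -F'\t' -v bt="$biotype" '
    /^#/ { next }                                        # skip header comment lines
    $3 != "transcript" { next }                          # one row per transcript
    index($9, "gene_biotype \"" bt "\"") == 0 { next }   # keep only the requested gene_biotype
    match($9, /gene_id "[^"]+"/) { per_gene[substr($9, RSTART, RLENGTH)]++ }
    END {
      for (g in per_gene) hist[per_gene[g]]++            # tally genes by transcript count
      for (n in hist) printf "%d\t%d\n", n, hist[n]
    }
  ' "$gtf" | sort -n
}
```

Because every gene in this table has at least one transcript row, the transcript total can equal the gene total but cannot fall below it.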
