Question: How many gene transcripts are there in Rat genome (latest assembly)?
22 months ago by
natarajan.padmaja20 wrote:

I am working on an RNA-Seq project with sequencing data from rat samples. I would appreciate some help with obtaining the latest list of rat (Rattus norvegicus) transcripts as a FASTA file. According to the Ensembl database, the rat genome (Rnor_6.0 assembly) has about 41,000 gene transcripts, as opposed to over 135,000 mouse gene transcripts. Could anyone confirm whether this is the latest information for rat? I have also looked at the NCBI site for rat transcripts, and there I see over 69,000 transcripts (in the RefSeq categories NM, NR, XM and XR).

rna-seq assembly • 833 views
22 months ago by
Hannover Medical School
colindaven1.9k wrote:

Not sure about your exact question. As you say, I'd expect the mouse genome to be annotated in much more detail than the rat, due to the number of groups working on it.

I have found the Rat Genome Database to be very good but have only been working on genomics so far, not RNA-seq.

To see the differences between the various rat annotations, I would strongly recommend mapping them to the genome with GMAP and/or importing them into a genome browser for visual comparison at multiple loci.

22 months ago by
United States
genomax78k wrote:

Ensembl's stats are available on this page. Actual file can be downloaded here (filter for Rat).

22 months ago by
luxeredias10 wrote:

Dear OP, I'm going through the exact same process right now!

I was also a little skeptical about the number of transcripts in the rat transcriptome, given the much larger number in the mouse one. However, I believe colindaven's answer explains it: fewer groups use rat as a model, so less is known about its transcriptome.

As for genomax's answer, that is also where I obtained my transcriptome to index Salmon (is that also why you need this transcriptome?). The stats link genomax posted says the rat transcriptome has 41,078 transcripts. However, the "Rattus_norvegicus.Rnor_6.0.cdna.all.fa" file has only 31,715 sequences, and the non-coding RNA file has 9,331. Together, the cdna and ncrna files amount to 41,046 sequences, which is close to (32 short of) the number of transcripts listed for the rat transcriptome. I believe this could be how Ensembl arrived at the number on the stats page for the rat genome/transcriptome.
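For anyone who wants to reproduce these counts, here is a minimal Python sketch that counts FASTA records by counting header lines. The helper names (`open_text`, `count_fasta_records`, `combined_count`) are made up for illustration, and the ncrna file name in the usage comment is an assumption; only the cdna file name comes from the post above.

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_fasta_records(path):
    """Count sequences by counting header lines (lines starting with '>')."""
    with open_text(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def combined_count(paths):
    """Sum the record counts over several FASTA files."""
    return sum(count_fasta_records(p) for p in paths)


# Example usage (the second file name is an assumption, adjust to your download):
# combined_count([
#     "Rattus_norvegicus.Rnor_6.0.cdna.all.fa",
#     "Rattus_norvegicus.Rnor_6.0.ncrna.fa",
# ])
```

With the counts quoted above, the sum would be 31,715 + 9,331 = 41,046.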

You will also find an abinitio transcript file on the Ensembl FTP site, which has 59,821 sequences. I do not know how this file relates to the others, so if anyone could help clear that up it would be great!
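One rough way to see how two of these files relate is to compare their sequence IDs. The sketch below is only an illustration: `read_ids` and `compare_ids` are hypothetical helpers, and it assumes the ID is the first whitespace-delimited token of each header line. It says nothing about whether sequences with different IDs overlap on the genome.

```python
import gzip


def read_ids(path):
    """Collect the first token of every FASTA header line as a set of IDs."""
    opener = gzip.open if str(path).endswith(".gz") else open
    ids = set()
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:  # skip headers that carry no ID at all
                    ids.add(fields[0])
    return ids


def compare_ids(path_a, path_b):
    """Report how many IDs are unique to each file and how many are shared."""
    a, b = read_ids(path_a), read_ids(path_b)
    return {"only_a": len(a - b), "only_b": len(b - a), "shared": len(a & b)}
```

If `shared` comes back as 0, the two files use separate ID namespaces and a sequence-level comparison (for example, mapping both to the genome as colindaven suggests) would be needed instead.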

All in all, I used the shorter cdna FASTA (31,715 transcripts) to index Salmon, but I have also built indexes from the abinitio FASTA and from the concatenation of the cdna and ncrna files. I'll do further analyses, running Salmon with each one, to see where I get to.
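For the concatenated cdna + ncrna file, a small sketch of one way to build it while guarding against duplicate sequence IDs. `concatenate_fasta` is a hypothetical helper, not part of Salmon or Ensembl, and the output is written uncompressed.

```python
import gzip


def concatenate_fasta(inputs, output):
    """Write the input FASTA files back to back into `output`.

    Raises ValueError if the same sequence ID (first header token) appears
    more than once. Returns the number of sequences written.
    """
    seen = set()
    with open(output, "wt") as out:
        for path in inputs:
            opener = gzip.open if str(path).endswith(".gz") else open
            with opener(path, "rt") as handle:
                last = "\n"
                for line in handle:
                    if line.startswith(">"):
                        fields = line[1:].split()
                        seq_id = fields[0] if fields else ""
                        if seq_id in seen:
                            raise ValueError(f"duplicate sequence ID: {seq_id}")
                        seen.add(seq_id)
                    out.write(line)
                    last = line
                # Make sure the next file's first header starts on a new line.
                if not last.endswith("\n"):
                    out.write("\n")
    return len(seen)
```

The combined file could then be handed to Salmon's indexer, e.g. `salmon index -t combined.fa -i combined_index`, where both file names are placeholders.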


