Question: How many gene transcripts are there in the Rat genome (latest assembly)?
5 months ago, natarajan.padmaja wrote:

I am working on an RNA-Seq project with sequencing data from rat samples, and I would appreciate some help obtaining the latest list of rat (Rattus norvegicus) transcripts as a FASTA file. According to the Ensembl database, the rat genome (Rnor_6.0 assembly) has just over 41,000 gene transcripts, as opposed to more than 135,000 for mouse. Could anyone confirm whether this is the latest information for rat? I have also looked at the NCBI site, where I see over 69,000 rat transcripts (in the RefSeq categories NM, NR, XM and XR).

rna-seq assembly • 279 views
5 months ago, colindaven (Hannover Medical School) wrote:

I'm not sure what your exact question is. As you say, I'd expect the mouse genome to be much better defined than the rat genome, given the number of groups working on it.

I have found the Rat Genome Database to be very good but have only been working on genomics so far, not RNA-seq.

To see the differences between the various rat annotations, I would strongly recommend mapping them to the genome with GMAP and/or importing them into a genome browser for visual comparison at multiple loci.

5 months ago, genomax (United States) wrote:

Ensembl's stats are available on this page. The actual file can be downloaded here (filter for Rat).

5 months ago, luxeredias wrote:

Dear OP, I'm going through the exact same process right now!

I was also a little skeptical about the number of transcripts in the rat transcriptome, given the much larger number in the mouse one. However, I believe colindaven's answer explains it: fewer groups use rat as a model, so less is known about its transcriptome.

As for genomax's answer, that is also where I obtained the transcriptome I used to build my Salmon index (is that also why you need this transcriptome?). The stats link genomax posted says the rat transcriptome has 41,078 transcripts. However, the "Rattus_norvegicus.Rnor_6.0.cdna.all.fa" file has only 31,715 sequences. The non-coding RNA file (here: ftp://ftp.ensembl.org/pub/release-92/fasta/rattus_norvegicus/ncrna/Rattus_norvegicus.Rnor_6.0.ncrna.fa.gz) has 9,331 sequences. Together, the cdna and ncrna files amount to 41,046 sequences, which is close to the 41,078 transcripts reported on the stats page (a difference of 32). I believe this could be how Ensembl arrived at the number on the stats page for the rat genome/transcriptome.
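
The sequence counts quoted above can be checked by counting FASTA header lines. This is only a minimal Python sketch: it assumes the files have already been downloaded locally, and the function name `count_fasta_records` is made up for illustration.

```python
import gzip


def count_fasta_records(path):
    """Count sequences in a FASTA file by counting lines that start with '>'."""
    # Read gzip-compressed files transparently; otherwise treat as plain text.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def total_fasta_records(paths):
    """Sum the record counts of several FASTA files."""
    return sum(count_fasta_records(p) for p in paths)
```

For example, calling `total_fasta_records` on local copies of the cdna and ncrna files should give the combined figure, provided the files are the same release as the ones discussed here.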

You will also find an ab initio transcript file on the Ensembl FTP site (ftp://ftp.ensembl.org/pub/release-92/fasta/rattus_norvegicus/cdna/Rattus_norvegicus.Rnor_6.0.cdna.abinitio.fa.gz), which has 59,821 sequences. I do not know how this file relates to the others, so if anyone could help clear that up, it would be great!

All in all, I used the shorter cdna FASTA (31,715 transcripts) to index Salmon, but I have also built indexes using the abinitio FASTA and the concatenation of the cdna and ncrna files. I'll do further analyses, running Salmon with each one, to see where I get to.
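
For anyone wanting to reproduce the concatenated cdna + ncrna file, here is a rough Python sketch of one way to do it. The function names are hypothetical and the input files are assumed to be local; the second function is just a sanity check that no transcript ID ends up in the combined file twice.

```python
import gzip


def concatenate_fasta(inputs, output):
    """Write the (decompressed) contents of several FASTA files into one file."""
    with open(output, "wt") as out:
        for path in inputs:
            opener = gzip.open if str(path).endswith(".gz") else open
            with opener(path, "rt") as handle:
                last = ""
                for line in handle:
                    out.write(line)
                    last = line
                # Make sure the next file's first header starts on a new line.
                if last and not last.endswith("\n"):
                    out.write("\n")


def duplicate_ids(fasta_path):
    """Return IDs (first word of each header) that appear more than once."""
    seen, dupes = set(), set()
    with open(fasta_path, "rt") as handle:
        for line in handle:
            if not line.startswith(">"):
                continue
            parts = line[1:].split()
            tid = parts[0] if parts else ""
            if tid in seen:
                dupes.add(tid)
            seen.add(tid)
    return dupes
```

The combined file can then be passed to `salmon index -t <combined.fa> -i <index_dir>`, where both names are placeholders.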

Best,

Thomaz
