Question: Trouble with choosing the right reference transcriptome file for mm10
3 months ago by
wyoo083

Hello, I am having trouble with Salmon quant and Kallisto quant. I used the Mus_musculus.GRCm38.ncrna.fa.gz file as the reference transcriptome (the transcripts FASTA file) for both Salmon quant and Kallisto quant. I was expecting almost identical TPM values, but the two tools gave me different TPM values. I have no idea which one went wrong, or perhaps both did. I left all other parameters at their defaults. I need a reference transcriptome for mm10 (Mus musculus) for my RNA-Seq analysis. Where can I find the correct reference transcriptome file for mm10? (I have read that NCBI, among others, also offers reference transcriptomes, but I just don't know which file to use.)

In addition, is there a way to check whether my TPM values are correct (a quality check for Salmon or Kallisto)?

Thank you so much for helping!

The "ncrna" (non-coding RNA) in the file name already hints at what the file contains, so I would change the file you use. But this is completely independent of the question of why you think that two different programs should deliver identical results for a complex task implemented with different assumptions.

Ido Tamir
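To see how far the two tools actually disagree, one option is to compare their TPM columns directly. The sketch below assumes the default output tables, Salmon's quant.sf (columns Name and TPM) and Kallisto's abundance.tsv (columns target_id and tpm). The function names are hypothetical, and the chosen metrics (TPM sums and a Pearson correlation on log2(TPM + 1)) are only one reasonable sanity check, not an official QC procedure of either tool.

```python
import csv
import math


def read_tpm(path, id_col, tpm_col):
    """Read a tab-separated quantification table into {transcript_id: TPM}."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row[id_col]: float(row[tpm_col]) for row in reader}


def pearson(xs, ys):
    """Plain Pearson correlation; returns NaN if either vector is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def compare_tpm(salmon_path, kallisto_path):
    """Summarise agreement between a Salmon and a Kallisto run."""
    salmon = read_tpm(salmon_path, "Name", "TPM")
    kallisto = read_tpm(kallisto_path, "target_id", "tpm")

    # Only transcripts present in both tables can be compared.
    shared = sorted(set(salmon) & set(kallisto))
    log_s = [math.log2(salmon[t] + 1.0) for t in shared]
    log_k = [math.log2(kallisto[t] + 1.0) for t in shared]

    return {
        "shared_transcripts": len(shared),
        # TPM is scaled so that each sample sums to about one million.
        "salmon_tpm_sum": sum(salmon.values()),
        "kallisto_tpm_sum": sum(kallisto.values()),
        "pearson_log2_tpm": pearson(log_s, log_k) if len(shared) > 1 else float("nan"),
    }
```

Calling compare_tpm("salmon_out/quant.sf", "kallisto_out/abundance.tsv") (both paths are placeholders) returns a small dictionary. If both runs used the same reference, the two TPM sums should each be close to 1,000,000 and the correlation should be high, even though individual values will not be identical.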

If you are looking for a transcriptome file, as your title says, you should take a look at the files available from GENCODE. There are versions available for all transcripts as well as for the protein-coding and long non-coding classes. Look under "Fasta files" on the linked page.

genomax

I second Ido's comment and would like to add that you are probably looking for the cDNA file from Ensembl, which you can find here: mouse cDNA from Ensembl version 100. This should work for you.

General FTP download server from Ensembl

Edit: Changed the download file, since genomax pointed out that you need the transcriptome file.

caggtaagtat
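A quick way to confirm what a downloaded transcriptome FASTA actually contains is to tally the biotypes in its header lines. The sketch below assumes Ensembl-style headers that carry a transcript_biotype: field. GENCODE FASTA files use a different, pipe-delimited header, so their records would be counted as "unknown" here. The function names are hypothetical.

```python
import gzip
import re
from collections import Counter

# Ensembl FASTA headers contain a field such as "transcript_biotype:snRNA".
BIOTYPE = re.compile(r"transcript_biotype:(\S+)")


def open_fasta(path):
    """Open a plain or gzip-compressed FASTA file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def count_biotypes(path):
    """Count transcripts per biotype based on the FASTA header lines."""
    counts = Counter()
    with open_fasta(path) as handle:
        for line in handle:
            if not line.startswith(">"):
                continue  # skip sequence lines
            match = BIOTYPE.search(line)
            counts[match.group(1) if match else "unknown"] += 1
    return counts
```

Running count_biotypes on the ncrna file should show only non-coding biotypes, while the cDNA file should be dominated by protein_coding entries, which makes it easy to spot that the wrong reference was indexed.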
