I am using Kallisto to quantify transcript expression, with a transcriptome file (cDNA FASTA) as the reference.
I know the cDNA file can be downloaded directly in FASTA format from the Ensembl FTP site.
However, NCBI has a newer version of the assembly, so I would like to use the cDNA file from NCBI instead, but I am not sure which NCBI file corresponds to it.
After reading the README file, it seems the following file is what I want, but I am not sure it is the same thing:
*_rna.fna.gz file FASTA format of accessioned RNA products annotated on the genome assembly; Provided for RefSeq assemblies as relevant (Note, RNA and mRNA products are not instantiated as a separate accessioned record in GenBank but are provided for some RefSeq genomes, most notably the eukaryotes.) The FASTA title is provided as sequence accession.version plus description.
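To make the README description concrete, here is a minimal Python sketch of how I understand the title line format ("sequence accession.version plus description"). The helper names (`open_fasta`, `parse_title`, `iter_titles`) and the example record are hypothetical and only for illustration. The sketch assumes the accession.version is the first whitespace-delimited token of the title line.

```python
import gzip
import io


def open_fasta(path):
    """Open a FASTA file as text, transparently handling .gz files."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def parse_title(line):
    """Split a FASTA title line into (accession.version, description).

    Assumption: the accession.version is the first whitespace-delimited
    token after '>', and the rest of the line is the description.
    """
    fields = line[1:].strip().split(None, 1)
    if not fields:
        return "", ""
    accession = fields[0]
    description = fields[1] if len(fields) > 1 else ""
    return accession, description


def iter_titles(handle):
    """Yield (accession.version, description) for every record in a FASTA handle."""
    for line in handle:
        if line.startswith(">"):
            yield parse_title(line)


# Hypothetical record for illustration only, not copied from a real file.
example = io.StringIO(
    ">XM_000000.1 hypothetical example transcript, mRNA\nACGTACGT\n"
)
titles = list(iter_titles(example))
print(titles)
```

On a real download, the same loop could be run with `open_fasta(...)` on the `*_rna.fna.gz` file to list the transcript IDs that would end up in the quantification output, which should make it easier to compare them with the IDs in the Ensembl cDNA file.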
Can anyone help me clarify this question?
Which NCBI file corresponds to the proteome or transcriptome file (the cDNA FASTA in Ensembl)? Is it the *_rna.fna.gz file?
Which cDNA reference (NCBI or Ensembl) is better for Kallisto quantification?