I have paired-end RNA-Seq data from rats with a 2 conditions x 4 replicates experimental design. My primary goal is to find differentially expressed genes (DEGs) between the two conditions.
After QC and trimming, I have the clean FASTQ files.
I want to follow the Nature Protocols paper "Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown". So first I need a HISAT index and a GTF annotation file.
I first downloaded the prebuilt index that HISAT provides on its website (R. norvegicus, UCSC rn6 genome), so now I need a matching UCSC GTF file. But UCSC does not seem to provide that file.
So I tried to get the rat GTF file from Ensembl instead: Rattus_norvegicus.Rnor_6.0.92.gtf.gz. I assumed the next step was to build a HISAT index from the Ensembl genome, but I got lost in the ftp://ftp.ensembl.org/pub/release-92/fasta/rattus_norvegicus/ folder and did not know which file to use to build the index. I tried /pub/release-92/fasta/rattus_norvegicus/dna/Rattus_norvegicus.Rnor_6.0.dna.toplevel.fa.gz and built an index from it, but the result was much bigger than the UCSC index HISAT provides, and my PC complained about running out of memory several times. So I suppose I picked the wrong file again.
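For reference, the index build described above can be sketched roughly as follows in Python. This is only an illustration, not the exact invocation: the index prefix `rnor6_ensembl`, the thread count, and the explicit decompression step are assumptions, and `hisat2-build` is assumed to be on the PATH.

```python
import gzip
import shutil
import subprocess
from pathlib import Path


def build_index_command(fasta: Path, prefix: str, threads: int = 4) -> list[str]:
    """Return the hisat2-build command line as an argument list."""
    # hisat2-build <reference.fa> <index_prefix>; -p sets the thread count.
    return ["hisat2-build", "-p", str(threads), str(fasta), prefix]


def gunzip(src: Path) -> Path:
    """Decompress a .gz file next to the original and return the new path."""
    dst = src.with_suffix("")  # drops only the trailing ".gz"
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


if __name__ == "__main__":
    # File name taken from the Ensembl release-92 FTP folder mentioned above.
    fasta_gz = Path("Rattus_norvegicus.Rnor_6.0.dna.toplevel.fa.gz")
    fasta = gunzip(fasta_gz)
    # "rnor6_ensembl" is a hypothetical output prefix chosen for this sketch.
    subprocess.run(build_index_command(fasta, "rnor6_ensembl"), check=True)
```

Building the command as a list keeps the file names and the prefix easy to swap if a different FASTA from the same folder turns out to be the right input.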
Could someone tell me what I am doing wrong? What should I do now? Thanks!