I used the following commands:
hisat2_extract_splice_sites.py Homo_sapiens.GRCh38.80.gtf > splice_sites.txt
hisat2_extract_exons.py Homo_sapiens.GRCh38.80.gtf > exons.txt
hisat2-build referenceData/fasta/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa \
    --ss referenceData/hisat2_index/splice_sites.txt \
    --exon referenceData/hisat2_index/exons.txt \
    referenceData/hisat2_index/GRCh38.hisat2

Settings:
  Output files: "referenceData/hisat2_index/GRCh38.hisat2.*.ht2"
  Line rate: 7 (line is 128 bytes)
  Lines per side: 1 (side is 128 bytes)
  Offset rate: 4 (one in 16)
  FTable chars: 10
  Strings: unpacked
  Local offset rate: 3 (one in 8)
  Local fTable chars: 6
  Local sequence length: 57344
  Local sequence overlap between two consecutive indexes: 1024
  Endianness: little
  Actual local endianness: little
  Sanity checking: disabled
  Assertions: disabled
  Random seed: 0
  Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
  referenceData/fasta/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa
Reading reference sizes
  Time reading reference sizes: 00:00:22
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
  Time to join reference sequences: 00:00:16
  Time to read SNPs and splice sites: 00:00:02
It has been running for over an hour. So far, these output files have been created in my directory:
GRCh38.hisat2.0.rf (27 GB)
GRCh38.hisat2.1.ht2 (8.2 KB)
GRCh38.hisat2.2.ht2 (0 bytes)
GRCh38.hisat2.3.ht2 (11.3 KB)
GRCh38.hisat2.4.ht2 (736 MB)
GRCh38.hisat2.7.ht2 (13.1 MB)
GRCh38.hisat2.8.ht2 (2.6 MB)
It hasn't given any errors yet, but I'm worried. This is my first time analyzing RNA-seq data. Does anyone know what's going on?