Dear all, I am trying to create an index file using the splice site and exon information from the .gtf file. However, when I run the hisat2 Python scripts on Ubuntu, I get empty output files. I verified the same hisat2 commands with a Saccharomyces gtf file, and they produce the respective files properly. I therefore suspected that my .gtf file differs in format from the Saccharomyces gtf file. However, when I compared the two, my file seemed correct in format too.
Here is a preview of my .gff3 file:
chloro . exon 5158 6606 . - . ID=Ljchlorog3v0000040.1.exon.1;Parent=Ljchlorog3v0000040.1;sequencetype=Protein coding
chloro . CDS 5158 6585 . - 0 ID=Ljchlorog3v0000040.1.CDS.1;Parent=Ljchlorog3v0000040.1;sequencetype=Protein coding
I suspect that the description in the last column has something to do with the failed splice site extraction, since extracting splice sites may require an exon number in that column. I wonder whether, instead of exon.1 in the last column, I need to state it as "exon number = 1" for the hisat2 Python command to extract the splice site information.
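To show what I mean about the last column, here is a minimal Python sketch that reads the attribute column in either style. The helper parse_attributes is my own and not part of hisat2, the GTF-style string is a made-up illustration rather than a line from the Saccharomyces file, and I am only assuming that the hisat2 scripts look for GTF-style keys such as gene_id and transcript_id.

```python
def parse_attributes(column9):
    """Parse the last column, written as GFF3 (key=value) or GTF (key "value")."""
    attrs = {}
    for item in column9.strip().split(";"):
        item = item.strip()
        if not item:
            continue
        if "=" in item:
            # GFF3 style: ID=...;Parent=...
            key, _, value = item.partition("=")
        else:
            # GTF style: gene_id "..."; transcript_id "...";
            key, _, value = item.partition(" ")
        attrs[key.strip()] = value.strip().strip('"')
    return attrs


# Last column of the exon line from my file.
gff3_attrs = parse_attributes(
    "ID=Ljchlorog3v0000040.1.exon.1;Parent=Ljchlorog3v0000040.1;"
    "sequencetype=Protein coding"
)

# Hypothetical GTF-style attributes, for comparison only.
gtf_attrs = parse_attributes(
    'gene_id "geneA"; transcript_id "geneA.1"; exon_number "1";'
)

# Assumed keys; I have not confirmed which ones hisat2 actually requires.
for name, attrs in (("my file", gff3_attrs), ("GTF example", gtf_attrs)):
    missing = [k for k in ("gene_id", "transcript_id") if k not in attrs]
    print(name, "keys:", sorted(attrs), "missing:", missing)
```

If that assumption holds, my lines would lack both keys, and changing exon.1 to an exon number alone might not be enough.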
FYI: I created the exons file by simply selecting the exon rows and subtracting 1 from the positions in columns 4 and 5.
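For reference, the manual step I describe is roughly equivalent to the Python sketch below. The function name extract_exons and the three-column output (sequence name, start, end) are my own choices for illustration and may not match what the hisat2 exon script writes. The sketch also assumes tab-separated columns.

```python
def extract_exons(lines):
    """Yield (seqid, start - 1, end - 1) for every exon row."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) != 9 or fields[2] != "exon":
            continue  # keep only well-formed exon rows
        # Subtract 1 from the positions in columns 4 and 5.
        yield fields[0], int(fields[3]) - 1, int(fields[4]) - 1


# The two preview lines from my file, with tabs between columns.
sample = [
    "chloro\t.\texon\t5158\t6606\t.\t-\t.\t"
    "ID=Ljchlorog3v0000040.1.exon.1;Parent=Ljchlorog3v0000040.1;"
    "sequencetype=Protein coding",
    "chloro\t.\tCDS\t5158\t6585\t.\t-\t0\t"
    "ID=Ljchlorog3v0000040.1.CDS.1;Parent=Ljchlorog3v0000040.1;"
    "sequencetype=Protein coding",
]

for seqid, start, end in extract_exons(sample):
    print(f"{seqid}\t{start}\t{end}")
```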
Please help. Regards.