Question

Annotation (.gff) and .fasta files as index in Hisat2

0

4.7 years ago

luanaaprogerio • 0

Hello, I'm trying to index a reference genome that NCBI provides as both a .gff and a .fasta file, but hisat2-build only accepts one file format. How can I combine these two files to create a single, complete reference index?

alignment genome • 3.4k views

updated 4.7 years ago by h.mon 35k • written 4.7 years ago by luanaaprogerio • 0

1

That does not really make sense. The fasta file provides the actual DNA sequence, while the annotation file lists the positions of genomic elements such as exons, transcripts, coding sequences, etc. One typically uses the GTF to extract splice sites, e.g. with hisat2_extract_splice_sites.py. What is the aim of your analysis?

4.7 years ago by ATpoint 82k

score 1 · Answer 1 · 2019-08-14

If you want to incorporate the annotation into the index, you have to use the --ss and --exon options of hisat2-build.

--ss <path>      Note this option should be used with the following --exon option.
                 Provide a list of splice sites (in the HISAT2's own format) as follows
                 (four columns):

                 chromosome name <tab> zero-offset based genomic position of the flanking base on the left side of an intron <tab> zero-offset based genomic position of the flanking base on the right <tab> strand

                 Use hisat2_extract_splice_sites.py (in the HISAT2 package) to extract
                 splice sites from a GTF file.

--exon <path>    Note this option should be used with the above --ss option. Provide a
                 list of exons (in the HISAT2's own format) as follows (three columns):

                 chromosome name <tab> zero-offset based left genomic position of an exon <tab> zero-offset based right genomic position of an exon

                 Use hisat2_extract_exons.py (in the HISAT2 package) to extract exons
                 from a GTF file.
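
To make the "zero-offset" wording in the help text concrete, here is a minimal Python sketch of the coordinate convention. It is not the code of the HISAT2 scripts, only an illustration under these assumptions: the exons belong to one transcript, they are given in GTF coordinates (1-based, inclusive), and they do not overlap. The function names and the example transcript are made up for illustration.

```python
def splice_sites_from_exons(chrom, strand, exons):
    """Return four-column splice-site rows for one transcript.

    exons: list of (start, end) pairs in GTF coordinates (1-based, inclusive).
    """
    exons = sorted(exons)
    rows = []
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        # Last base of the upstream exon and first base of the downstream
        # exon flank the intron; subtract 1 to get zero-offset positions.
        rows.append((chrom, prev_end - 1, next_start - 1, strand))
    return rows


def exon_rows(chrom, exons):
    """Return three-column exon rows with zero-offset left/right positions."""
    return [(chrom, start - 1, end - 1) for start, end in sorted(exons)]


# Hypothetical transcript with three exons on chr1, plus strand.
example_exons = [(100, 200), (301, 400), (501, 600)]

for row in splice_sites_from_exons("chr1", "+", example_exons):
    print("\t".join(str(field) for field in row))
```

In practice the bundled hisat2_extract_splice_sites.py and hisat2_extract_exons.py should be used, since they work across all transcripts in the annotation rather than a single hand-written list.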

Note that these scripts expect a GTF file, so you may need to convert the GFF to GTF first.
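
Putting the answer together, the whole workflow could look roughly like the sketch below, which only assembles and prints the commands (a dry run) instead of executing them. Several things here are assumptions not taken from the thread: gffread is used for the GFF to GTF conversion (any other converter would do), the file names and the index prefix are placeholders, and the helper functions are hypothetical.

```python
import shlex


def index_commands(gff, fasta, prefix):
    """Build the commands for an annotation-aware HISAT2 index.

    Each entry is (argv, stdout_path); stdout_path is the file the
    command's standard output should be redirected to, or None.
    """
    gtf = prefix + ".gtf"
    ss = prefix + ".ss"
    exon = prefix + ".exon"
    return [
        # Assumption: gffread performs the GFF -> GTF conversion (-T writes GTF).
        (["gffread", gff, "-T", "-o", gtf], None),
        # The HISAT2 helper scripts write to standard output.
        (["hisat2_extract_splice_sites.py", gtf], ss),
        (["hisat2_extract_exons.py", gtf], exon),
        # --ss and --exon are given together, followed by the fasta and the index prefix.
        (["hisat2-build", "--ss", ss, "--exon", exon, fasta, prefix], None),
    ]


def as_shell(commands):
    """Render the commands as shell lines, with redirection where needed."""
    lines = []
    for argv, stdout_path in commands:
        line = shlex.join(argv)
        if stdout_path is not None:
            line += " > " + shlex.quote(stdout_path)
        lines.append(line)
    return lines


# Placeholder file names; replace with the files downloaded from NCBI.
for line in as_shell(index_commands("genome.gff", "genome.fasta", "genome_index")):
    print(line)
```

The printed lines can be pasted into a terminal once the tools are installed. The fasta file is still the only sequence input to hisat2-build; the annotation only contributes the splice-site and exon lists.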