Hello, I am working with RNA-seq data and building a STAR index for the Gossypium hirsutum reference genome. STAR expects the annotation in GTF format, but my annotation file is GFF3. From what I have read, to use a GFF3 file I need to remove --sjdbOverhang 50 and replace --sjdbGTFfile with --sjdbGTFfeatureExon.
My questions are:
Will removing --sjdbOverhang 50 from the command affect the quality of the index? As I understand it, --sjdbOverhang needs to be specified for detecting possible splice sites.
Does replacing --sjdbGTFfile with --sjdbGTFfeatureExon mean that only the exons will be indexed rather than the whole genome? Please explain what --sjdbGTFfeatureExon means.
This is the command I used:
STAR --runMode genomeGenerate --genomeDir indexes/Gh --genomeFastaFiles /mnt/e/fizza\ data/S017679/trimmed/Ghirsutumv1.1_genome.fasta \
--sjdbGTFfeatureExon exonfile /mnt/e/fizza\ data/S017679/trimmed/Ghirsutumv1.1_genome_repeat.gff3 \
--outFileNamePrefix Gh_align
Regards, Fizzah