Using an Augustus gene model GFF as STAR input
13 months ago

Hi all, I'm trying to run a differential gene expression (DGE) analysis on a non-model organism that has a published Augustus GFF file and an accompanying FASTA file. When I try to generate the STAR genome index, I keep getting this error:

terminate called after throwing an instance of 'std::out_of_range'
what():  vector::_M_range_check
/home/alice/tmp.sh: line 27: 31318 Aborted                 STAR --runMode genomeGenerate --genomeDir $genomeDir --sjdbGTFfile$GTF --genomeFastaFiles \$REFERENCE --runThreadN 4


I suspect that the sequence names in the GFF and FASTA files do not match. Below is a small sample of what each file looks like.

GFF:

# start gene scaffold1.g8
scaffold1       AUGUSTUS        gene    71594   80958   0.07    -       .       ID=scaffold1.g8
scaffold1       AUGUSTUS        transcript      71594   80958   0.07    -       .       ID=scaffold1.g8.t1;Parent=scaffold1.g8
scaffold1       AUGUSTUS        transcription_end_site  71594   71594   .       -       .       Parent=scaffold1.g8.t1


..

# coding sequence = [atggggaatcgtggaatggaagatttaatccctatcgtaaacaagttgcaagatgcatttgcacaaattggtatagagt


..

# cacgacctccacctgtaccaagtcgaccttag]

# protein sequence = [MGNRGMEDLIPIVNKLQDAFAQIGIESPIDLPQIAVVGGQSAGKSSVLENFVGRDFLPRGSGIVTRRPLVLQLSYANT


..

# ASSANKPSMPSRPVSVAVAPRPQLDNPSVPRRPAPTRPVPSQPPQPARPPPVPSRP]


FASTA:

>scaffold1 len=5136627
TATACTACATATGATTTTTTATAGATAAAAAATCATATGTAGTatatttattgcaaaaaaaaactcacatataacatatttatttcaataaaaaactatt
gagtgatttttctgaaatcccccttccagaattgacagtggttttaaatgcatgtcttttactaccctaagctattcaaaaggaaatagcttacaaattt


..
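One way to test the naming suspicion is to compare the sequence names used in the two files directly. The snippet below is only a rough sketch, not a fix for the STAR error. It assumes that the FASTA name is the text between `>` and the first whitespace, and that the GFF sequence ID is the first column of every non-comment line. The function names are made up for illustration, and the sample lines are abbreviated from the excerpts above.

```python
def fasta_names(lines):
    """Collect sequence names: the first token after '>' on each header line."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                names.add(tokens[0])
    return names


def gff_seqids(lines):
    """Collect column 1 of every GFF line that is not blank or a '#' comment."""
    seqids = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        seqids.add(line.split(None, 1)[0])
    return seqids


def compare_names(fasta_lines, gff_lines):
    """Return (names only in the GFF, names only in the FASTA), each sorted."""
    fasta = fasta_names(fasta_lines)
    gff = gff_seqids(gff_lines)
    return sorted(gff - fasta), sorted(fasta - gff)


# Abbreviated sample lines; the FASTA header is written with the leading '>'
# that the format requires.
fasta_sample = [
    ">scaffold1 len=5136627",
    "TATACTACATATGATTTTTTATAGATAAAAAATCATATG",
]
gff_sample = [
    "# start gene scaffold1.g8",
    "scaffold1\tAUGUSTUS\tgene\t71594\t80958\t0.07\t-\t.\tID=scaffold1.g8",
]

only_in_gff, only_in_fasta = compare_names(fasta_sample, gff_sample)
print("In GFF but not in FASTA:", only_in_gff)
print("In FASTA but not in GFF:", only_in_fasta)
```

For the real files, open file handles can be passed in place of the lists, for example `compare_names(open("genome.fa"), open("genes.gff"))`, where both file names are placeholders. If both lists come back empty, the names agree under these assumptions and the cause of the error probably lies elsewhere.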

Please let me know if you have any idea how to make the two files consistent, or any other advice on how to get this working.

