4.2 years ago
bioinf2305
I generated PASA gene structures to use as training genes for SNAP, Augustus and other gene prediction software. However, SNAP reported errors for every training gene on the minus strand ("skipped due to errors"), while it accepted the genes on the plus strand. I am confused by this behaviour of SNAP. Did I make a mistake in the file conversion? The following is a toy dataset:
>ctg4
Eterm 690335 690538 asmbl_142988.p1
Exon 627558 627721 asmbl_142988.p1
Exon 618407 618609 asmbl_142988.p1
Exon 617865 617952 asmbl_142988.p1
Exon 611927 611968 asmbl_142988.p1
Exon 608393 608512 asmbl_142988.p1
Exon 606645 606704 asmbl_142988.p1
Exon 599937 600005 asmbl_142988.p1
Exon 599526 599669 asmbl_142988.p1
Exon 592777 592899 asmbl_142988.p1
Exon 589983 590188 asmbl_142988.p1
Exon 589422 589511 asmbl_142988.p1
Einit 585167 586707 asmbl_142988.p1
Einit 884878 885007 asmbl_143060.p1
Exon 926290 926373 asmbl_143060.p1
Exon 932048 932092 asmbl_143060.p1
Exon 935718 935879 asmbl_143060.p1
Exon 937104 937257 asmbl_143060.p1
Eterm 938144 938274 asmbl_143060.p1
In the above example, the first gene is on the minus strand, while the second gene is on the plus strand.
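One way to check how the converted records would be read is to infer each gene's strand from its coordinate order. The sketch below assumes, as I understand the ZFF convention used by SNAP, that a feature is on the minus strand when its begin coordinate is greater than its end coordinate. The function name `infer_strands` and the shortened `toy` string are only for illustration and are not part of SNAP or PASA.

```python
def infer_strands(zff_text):
    """Map each gene ID to the set of strands implied by its features.

    Assumption: in ZFF, begin > end marks a minus-strand feature and
    begin <= end marks a plus-strand feature.
    """
    strands = {}
    for line in zff_text.splitlines():
        line = line.strip()
        # Skip blank lines and sequence headers such as ">ctg4".
        if not line or line.startswith(">"):
            continue
        label, begin, end, gene_id = line.split()[:4]
        strand = "+" if int(begin) <= int(end) else "-"
        strands.setdefault(gene_id, set()).add(strand)
    return strands


# First and last exon of each gene from the toy dataset, copied as posted.
toy = """>ctg4
Eterm 690335 690538 asmbl_142988.p1
Einit 585167 586707 asmbl_142988.p1
Einit 884878 885007 asmbl_143060.p1
Eterm 938144 938274 asmbl_143060.p1
"""

for gene_id, found in infer_strands(toy).items():
    print(gene_id, sorted(found))
```

On these records both genes come out as "+", because every line is written with begin < end. If the assumption about ZFF holds, the minus-strand gene is therefore not encoded as a minus-strand gene in the converted file.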
Any comments on this issue will be highly appreciated.