Hi everyone!
I'm still struggling with MAKER.
What I want to do is follow, step by step, the MAKER tutorial for SNAP training (the one described here: http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial#Training_ab_initio_Gene_Predictors ).
My input file is a genome assembly I made, with several contigs.
In the MAKER tutorial, they ask you to convert the generated GFF into a ZFF, plus a few other tasks. My problem is that the tutorial works on only one locus, in one file, whereas my results contain thousands of directories, each one holding its own GFF plus several other output files.
For example:
Loutre:~/Documents/maker/cleaned_suzukii_90_size.maker.output/cleaned_suzukii_90_size_datastore$ ls
00 0B 16 22 2D 38 43 4F 5A 66 71 7C 87 92 9D A9 B4 BF CA D5 E0 EB F6 (...) DF EA F5
Each of these directories can contain several subdirectories, and their number varies from one directory to another:
Loutre:~/Documents/maker/cleaned_suzukii_90_size.maker.output/cleaned_suzukii_90_size_datastore$ cd 0B/
Loutre:~/Documents/maker/cleaned_suzukii_90_size.maker.output/cleaned_suzukii_90_size_datastore/0B$ ls
28 67 7C 92 D0 F8
Loutre:~/Documents/maker/cleaned_suzukii_90_size.maker.output/cleaned_suzukii_90_size_datastore/0B$ cd 28
Loutre:~/Documents/maker/cleaned_suzukii_90_size.maker.output/cleaned_suzukii_90_size_datastore/0B/28$ ls
tig00000634
Loutre:~/Documents/maker/cleaned_suzukii_90_size.maker.output/cleaned_suzukii_90_size_datastore/0B/28$ cd tig00000634/
Loutre:~/Documents/maker/cleaned_suzukii_90_size.maker.output/cleaned_suzukii_90_size_datastore/0B/28/tig00000634$ ls
run.log theVoid.tig00000634 tig00000634.gff tig00000634.maker.augustus_masked.proteins.fasta tig00000634.maker.augustus_masked.transcripts.fasta tig00000634.maker.non_overlapping_ab_initio.proteins.fasta tig00000634.maker.non_overlapping_ab_initio.transcripts.fasta
The files I'm interested in are the plain tig00(number).gff ones. But I don't want to train SNAP on each contig separately like this. I want to train SNAP on the whole assembly, and I hope I don't have to process each .gff file individually, even if a script does it for me, because the following steps of the SNAP training require launching MAKER again on the whole genome with the HMM model produced by SNAP.
What I want is an easy way to convert all these GFF files into a single output GFF that corresponds to the MAKER output for the whole assembly. I can't find anything like this in the MAKER documentation, but I'm sure that MAKER users know a way to do what I want.
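To make the question concrete, here is a rough Python sketch of the naive workaround I can imagine: walk the datastore, pick up each <contig>/<contig>.gff, and concatenate them. The function names are my own, and it assumes that each per-contig GFF starts with a "##gff-version" line and may end with a "##FASTA" section that should be dropped. I have not checked whether plain concatenation like this gives a valid input for the ZFF conversion (feature IDs might clash between contigs, for instance), which is why I'd prefer a proper MAKER way of doing it.

```python
import os


def find_contig_gffs(datastore_dir):
    """Yield the path of every <contig>/<contig>.gff under the datastore."""
    for root, dirs, files in os.walk(datastore_dir):
        dirs.sort()  # visit subdirectories in a reproducible order
        contig = os.path.basename(root)
        gff_name = contig + ".gff"
        if gff_name in files:
            yield os.path.join(root, gff_name)


def merge_gffs(gff_paths, out_path):
    """Naively concatenate GFF files into one; return how many were merged.

    Assumptions (mine, not taken from the MAKER documentation): a single
    '##gff-version 3' header is enough, and everything from a '##FASTA'
    line onwards in each input file can be skipped.
    """
    merged = 0
    with open(out_path, "w") as out:
        out.write("##gff-version 3\n")
        for path in gff_paths:
            with open(path) as handle:
                for line in handle:
                    if line.startswith("##FASTA"):
                        break  # skip the embedded sequence section
                    if line.startswith("##gff-version"):
                        continue  # header already written once
                    if not line.endswith("\n"):
                        line += "\n"
                    out.write(line)
            merged += 1
    return merged
```

I would call it as merge_gffs(find_contig_gffs("cleaned_suzukii_90_size_datastore"), "all_contigs.gff"), where all_contigs.gff is just a name I made up for the merged file.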
Do you have any advice? Thanks for your help!