I've been advised to use BUSCO rather than CEGMA for assessing some de novo transcriptome assemblies, but I'm seeing inconsistent results across repeated runs of BUSCO and across different numbers of threads (1-6). Even when rerunning with the same parameters, I'm getting results that vary by up to 2 BUSCOs. I can't find anything suggesting that BUSCO is non-deterministic, so I'm wondering whether this is a bug.
I'm using BUSCO v1.22 (the latest version), HMMER v3.1b2, Python v3.4.3 and BLAST 2.2.29+ against the eukaryotic dataset in transcriptome mode:
python3 BUSCO_v1.22.py -o test -in transcripts.fasta -l /db/busco/eukaryota -m trans -c 2
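For anyone wanting to reproduce the thread sweep, here is a minimal sketch that builds one command line per thread count using only the flags shown in the command above. The helper name `busco_command` is hypothetical, and the output names `test_pe1` to `test_pe6` are an assumption inferred from the run directories in the grep output; the sketch only prints the commands and does not run BUSCO.

```python
import shlex


def busco_command(threads, out_prefix="test_pe"):
    """Build the BUSCO v1.22 command line for a given thread count.

    Only flags that appear in the original command are used. The output
    name (prefix + thread count) is an assumption, not a BUSCO convention.
    """
    return [
        "python3", "BUSCO_v1.22.py",
        "-o", "{}{}".format(out_prefix, threads),
        "-in", "transcripts.fasta",
        "-l", "/db/busco/eukaryota",
        "-m", "trans",
        "-c", str(threads),
    ]


if __name__ == "__main__":
    # Print one command per thread count (1-6) for copy-pasting into a shell.
    for n in range(1, 7):
        print(" ".join(shlex.quote(arg) for arg in busco_command(n)))
```

Keeping every argument except `-c` and `-o` fixed means any difference between runs can only come from the thread count or from run-to-run variation.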
Grepping for the complete BUSCOs I see this:
./run_test_pe1/short_summary_test_pe1: 302 Complete BUSCOs
./run_test_pe2/short_summary_test_pe2: 300 Complete BUSCOs
./run_test_pe3/short_summary_test_pe3: 302 Complete BUSCOs
./run_test_pe4/short_summary_test_pe4: 301 Complete BUSCOs
./run_test_pe5/short_summary_test_pe5: 301 Complete BUSCOs
./run_test_pe6/short_summary_test_pe6: 302 Complete BUSCOs
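To put a number on the variation, here is a small sketch that parses grep output in the format shown above and reports the spread in complete BUSCO counts. The function names `complete_counts` and `spread` are hypothetical, and the regular expression assumes each line looks like `path: N Complete BUSCOs`, exactly as in the lines above.

```python
import re

# The grep output from the six runs, copied verbatim from above.
GREP_OUTPUT = """\
./run_test_pe1/short_summary_test_pe1: 302 Complete BUSCOs
./run_test_pe2/short_summary_test_pe2: 300 Complete BUSCOs
./run_test_pe3/short_summary_test_pe3: 302 Complete BUSCOs
./run_test_pe4/short_summary_test_pe4: 301 Complete BUSCOs
./run_test_pe5/short_summary_test_pe5: 301 Complete BUSCOs
./run_test_pe6/short_summary_test_pe6: 302 Complete BUSCOs
"""

# Assumed line shape: "<path>: <count> Complete BUSCOs"
LINE_RE = re.compile(r"^(?P<path>\S+):\s*(?P<count>\d+)\s+Complete BUSCOs")


def complete_counts(text):
    """Map each summary file path to its complete BUSCO count."""
    counts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group("path")] = int(match.group("count"))
    return counts


def spread(counts):
    """Difference between the highest and lowest count across runs."""
    values = list(counts.values())
    return max(values) - min(values)


if __name__ == "__main__":
    counts = complete_counts(GREP_OUTPUT)
    for path in sorted(counts):
        print(path, counts[path])
    print("spread:", spread(counts))
```

This only compares the totals. It says nothing about which individual BUSCOs flip between runs, which would need the per-BUSCO output of each run.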
Anyone else seen this? Any comments appreciated.