OrthoFinder developer here!
The very first thing to say is that I've recently added the option to use DIAMOND instead of BLAST, and I'm really impressed with it. From now on I would recommend virtually always using DIAMOND with OrthoFinder: it's about 60x faster in the tests I've done and the resulting OrthoFinder accuracy is virtually identical. You can see public benchmark results here:
Back to your question: as a very rough guess, I'd say less than 5 days to finish, and probably quicker.
To add a bit more detail, I am a little surprised the BLAST calculations took that long with the kind of computing power you've got there. For example, I ran an analysis on 128 fungal genomes earlier this week, which I think should be pretty similar to your analysis as the total number of sequences is approximately the same (~990,000 versus ~1,280,000). I used only 1 node with 16 cores but used DIAMOND instead of BLAST. The total run time was under 19 hours to get all the orthogroups, gene trees, orthologues etc.!
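If it helps, here is a minimal sketch of how a run like that could be launched from Python. The directory name `proteomes` and the helper names are just placeholders for illustration, and the flags (`-f` for the FASTA directory, `-S` for the search program, `-t` for the number of search threads) are as I'd expect them in the command-line help, so please check `orthofinder -h` for the version you have installed.

```python
import subprocess


def build_orthofinder_command(fasta_dir, threads=16, search="diamond"):
    """Build the argument list for an OrthoFinder run (nothing is executed here).

    fasta_dir: directory with one protein FASTA file per species (placeholder).
    threads:   number of threads for the sequence search, e.g. 16 on a 16-core node.
    search:    sequence search program passed to -S; "diamond" instead of BLAST.
    """
    return [
        "orthofinder",
        "-f", str(fasta_dir),
        "-S", search,
        "-t", str(threads),
    ]


def launch(fasta_dir, threads=16):
    """Run OrthoFinder and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_orthofinder_command(fasta_dir, threads), check=True)


# Print the command rather than running it, so it can be checked first.
print(" ".join(build_orthofinder_command("proteomes", threads=16)))
```

Calling `launch("proteomes")` would then start the actual run, assuming `orthofinder` and `diamond` are both on your PATH.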
I am working on the performance as we speak, so there are already improvements over previous versions. For example, the latest version uses a new method for getting orthologues from the gene trees instead of dlcpar, which has improved the accuracy and speed of this step significantly, so it is definitely worth considering if you're not using it already. There'll be a new paper coming out very soon which will provide details on these methods, but feel free to email me or message here if I can help with any specific problems with the analysis you're currently running.
All the best