I'm going to annotate a large number of variants (about 50 million) derived from whole-genome sequencing of a given population. For the output, we can specify either "Pick one line or block of consequence data per variant" or "Pick one line or block of consequence data per variant allele", as explained here. Could you let me know which one should be selected? I would also appreciate any experience or comments on reducing the running time.
P.S. Regarding speed, Emily from Ensembl kindly suggested that I use a buffer size of 5000 and 4 forks, depending on the system. I'm looking for other experiences that you may have gained in your own work.
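
To make the question concrete, here is a rough sketch of the command I have in mind, assembled in Python so the settings are easy to change. The flags are VEP's command-line options as I understand them (`--pick` / `--pick_allele` for the two output choices, `--buffer_size` and `--fork` for the speed settings). The file names and the helper function `build_vep_command` are placeholders of my own, and `--cache --offline` assumes a locally installed cache, so please treat this as an illustration rather than a tested setup.

```python
import shlex


def build_vep_command(input_vcf, output_file, buffer_size=5000, forks=4,
                      per_allele=False):
    """Assemble a VEP command line as a list of arguments (hypothetical helper)."""
    # --pick keeps one consequence block per variant;
    # --pick_allele keeps one per variant allele.
    pick_flag = "--pick_allele" if per_allele else "--pick"
    return [
        "vep",
        "--input_file", input_vcf,
        "--output_file", output_file,
        "--cache", "--offline",               # assumes a local cache is installed
        "--buffer_size", str(buffer_size),    # variants held in memory per batch
        "--fork", str(forks),                 # number of parallel worker processes
        pick_flag,
    ]


# Placeholder file names; print the command so it can be inspected or pasted into a shell.
cmd = build_vep_command("population.vcf.gz", "population.vep.txt")
print(shlex.join(cmd))
```

Passing `per_allele=True` switches the command to `--pick_allele`, which is the second option I am asking about.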