The user guide says:
But I found another explanation saying that -Z should be set to twice the size of the genome.
I am confused now. How should I set -Z? I would also like to know what the E-value is.
To calculate Z, please visit this link and use the following command:
esl-seqstat my_reference.fasta
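If you want to cross-check the total that esl-seqstat reports, here is a minimal Python sketch that counts the residues in a FASTA file. The function names `total_residues` and `to_megabases` are made up for this illustration. Whether -Z expects the size in millions of nucleotides, and whether both strands should be counted (the "genome size times 2" advice from the question), are assumptions to verify against the user guide for your tool and version.

```python
def total_residues(lines):
    """Sum the sequence lengths over all records in FASTA-formatted lines."""
    total = 0
    for line in lines:
        line = line.strip()
        # Skip blank lines and header lines (those starting with '>').
        if not line or line.startswith(">"):
            continue
        total += len(line)
    return total


def to_megabases(n_residues, both_strands=False):
    """Convert a residue count to millions of residues.

    both_strands=True doubles the count first. Whether your tool wants
    that is an assumption; check its user guide.
    """
    if both_strands:
        n_residues *= 2
    return n_residues / 1_000_000


# Usage (file name taken from the command above):
# with open("my_reference.fasta") as handle:
#     print(to_megabases(total_residues(handle)))
```

Passing an open file handle works because the function only needs an iterable of lines.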
E-value: The E-value is the statistical significance of the hit: the number of hits we’d expect to score this highly in a database of this size (measured by the total number of nucleotides) if the database contained only nonhomologous random sequences. The lower the E-value, the more significant the hit.
-Z: This option sets the database size used in the E-value calculation, so that the reported E-values are accurate.
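To see why the database size matters, it can help to treat the E-value as proportional to the size of the search space: the same hit in a database twice as large gets roughly twice the E-value. The sketch below assumes that simple linear relationship. The function `rescale_evalue` is a hypothetical helper, not part of any tool, and the exact calculation your program performs may differ.

```python
def rescale_evalue(evalue, old_size, new_size):
    """Rescale an E-value from one search-space size to another.

    Assumes the E-value is directly proportional to the search-space
    size, which is a simplification made for illustration.
    """
    if old_size <= 0 or new_size <= 0:
        raise ValueError("search-space sizes must be positive")
    return evalue * new_size / old_size
```

For example, under this assumption a hit reported with E = 0.01 against a 100 Mb database would come out at about 0.02 if the size were set to 200 Mb.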
I hope that this helps.
I believe the question about -Z has been answered already. As for the E-value, it is basically the number of matches you would expect to get purely by chance that are of a similar quality as the current match (or better). So if you have an E-value of 1, you could expect to get up to one extra match (purely by chance) whose alignment to the query is as good as the match you're looking at currently.
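One way to build intuition for this: if you assume chance hits follow a Poisson distribution with mean equal to the E-value (a common modelling assumption, not something stated in this thread), the probability of seeing at least one such chance hit is 1 - exp(-E). A minimal sketch, with `prob_at_least_one_chance_hit` as a made-up name:

```python
import math


def prob_at_least_one_chance_hit(evalue):
    """Probability of one or more chance hits at least this good.

    Assumes the number of chance hits is Poisson-distributed with
    mean equal to the E-value.
    """
    if evalue < 0:
        raise ValueError("E-value must be non-negative")
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for tiny E.
    return -math.expm1(-evalue)
```

Under that assumption, an E-value of 1 corresponds to roughly a 63% chance of at least one equally good match arising by chance, while for very small E-values the probability is nearly equal to the E-value itself.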