I have multiple sequence alignments (MSAs) of clusters of sequences. How can I assess whether an MSA is informative enough (that is, diverse enough) to apply sequence analysis tools, such as those for finding conserved positions or specificity-determining positions?
When I use Scorecons (http://www.ebi.ac.uk/thornton-srv/databases/valdarprograms/scorecons_server_help.html), it gives me a Dops score, a measure of alignment diversity, which helps to an extent (100 for a very informative or diverse cluster, 0 when all sequences in the MSA are identical). However, I do not know the lowest Dops score that is still acceptable as a threshold. If there are other ways of assessing this, please let me know how you tackle the problem.
I assume that without sufficient diversity the results of the sequence analysis would not be meaningful, right?
I would say that the scope of the analysis (in your case, the diversity of the MSA) depends on the question you're really trying to answer. In biology (bioinformatics) there are no absolute thresholds for anything. So in principle, I wouldn't ask whether an alignment is informative enough, but whether it contains all the sequences I'm interested in and no others. Another test I often use is:
1. Convert the MSA to an HMM.
2. Run it against NR or UniProt.
3. Inspect the results to see how many obvious false positives come up with a good E-value (and, conversely, which sequences should be there with a good E-value but are missing).
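The inspection step can be partly automated. Below is a minimal sketch, assuming the search was done with HMMER (for example `hmmbuild family.hmm alignment.sto` followed by `hmmsearch --tblout hits.tbl family.hmm uniprot.fasta`; the tool choice and all file names are my assumptions, not part of the recipe above). It reads the `--tblout` table, where column 1 is the target name and column 5 the full-sequence E-value, and reports significant hits you did not expect and expected sequences that are missing. The sequence names, the toy table, and the 1e-5 cutoff are purely illustrative.

```python
from typing import Dict, Iterable, Set, Tuple


def parse_tblout(lines: Iterable[str]) -> Dict[str, float]:
    """Return {target name: full-sequence E-value} from hmmsearch --tblout text."""
    hits: Dict[str, float] = {}
    for line in lines:
        # Skip blank lines and the '#' header/footer lines HMMER writes.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        # Column 1: target name; column 5: full-sequence E-value.
        hits[fields[0]] = float(fields[4])
    return hits


def compare_hits(
    hits: Dict[str, float],
    expected: Set[str],
    evalue_cutoff: float = 1e-5,  # illustrative cutoff, not a recommendation
) -> Tuple[Set[str], Set[str]]:
    """Return (unexpected significant hits, expected sequences not significant)."""
    significant = {name for name, evalue in hits.items() if evalue <= evalue_cutoff}
    unexpected = significant - expected
    missing = expected - significant
    return unexpected, missing


# Toy table in --tblout layout; names and numbers are made up for illustration.
TOY_TBLOUT = """\
# target name  accession  query name  accession  E-value  score  bias
seqA  -  myfam  -  1e-50  170.2  0.1  2e-50  169.9  0.1  1.0  1  0  0  1  1  1  1  toy entry
seqB  -  myfam  -  3e-20   70.5  0.0  4e-20   70.1  0.0  1.0  1  0  0  1  1  1  1  toy entry
seqX  -  myfam  -  1e-12   45.3  0.2  2e-12   44.8  0.2  1.0  1  0  0  1  1  1  1  toy entry
seqC  -  myfam  -  0.5      5.1  0.0  0.7      4.6  0.0  1.0  1  0  0  1  1  0  0  toy entry
"""

# Hypothetical list of sequences you believe belong to the family.
EXPECTED = {"seqA", "seqB", "seqC", "seqD"}

unexpected, missing = compare_hits(parse_tblout(TOY_TBLOUT.splitlines()), EXPECTED)
print("significant but unexpected:", sorted(unexpected))
print("expected but not significant:", sorted(missing))
```

The two lists are only candidates for a closer look: whether an unexpected hit is a true false positive, or a family member you did not know about, still needs a human judgment.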
The visual inspection suggested by cacaucenturion is also worth trying, although it requires a bit of experience (knowledge-based intuition, as some say).
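If you want a rough number to go with the visual impression, one simple option is the mean pairwise identity of the alignment. The sketch below is my own illustration, not how Scorecons computes Dops, and it does not imply any cutoff. It assumes the aligned sequences are already loaded as equal-length strings with `-` for gaps, and it ignores columns where either sequence has a gap.

```python
from itertools import combinations
from typing import List


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0  # no shared ungapped columns to compare
    return sum(x == y for x, y in compared) / len(compared)


def mean_pairwise_identity(aligned: List[str]) -> float:
    """Average pairwise identity over all sequence pairs in the alignment."""
    pairs = list(combinations(aligned, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Toy alignment, for illustration only.
toy_alignment = ["ACDE", "ACDF", "A-DF"]
print(round(mean_pairwise_identity(toy_alignment), 3))
```

A value close to 1 means the sequences are nearly identical, so conservation scores will say little; how low it needs to be depends, again, on the question you are asking.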